s1krishnan – Dashboards, Scorecards & Visualization

Layered Donuts with extra fats and oils – not too healthy!

Source: https://www.ers.usda.gov/webdocs/publications/82220/eib-166.pdf?v=42762

This visualization is from a report by the USDA (United States Department of Agriculture). The report is about U.S. Trends in Food Availability and Dietary Assessment of Loss-Adjusted Food Availability, 1970-2014. The graph under consideration for this blog is the second figure in the report and the first figure under the Findings category. It is a layered donut chart that shows the per person calorie consumption for each food category: Fruit and vegetables, Grains, Added fats and oils, Meat, eggs & nuts, Added sugar & sweeteners, and Dairy. The donut has two layers: the inner layer shows per person calorie consumption for each category in 1970, and the outer layer shows per person calorie consumption for each category in 2014 (except for Added fats and oils, for which the outer layer shows consumption for 2010). Each category is color coded to mark out its region in the donut layers, and the number within each region indicates the calories consumed per person for that category.

Things I liked about the visualization:

Visualization is simple and easy to understand. It depicts what the figure title says clearly, calorie consumption per person for 1970 and 2014 for each food group.
The graph is well labeled. The calories consumed in each category are clearly visible against the color of the region for that category.
The colors chosen are bright and appropriate. There are no overlapping shades or similar colors that would make it difficult to understand the consumption of each food group.
The regions are aligned for both years, one in front of the other. This makes it easy to see the difference in calorie consumption for each group for the 2 years.

Things I did not like about the visualization:

The calorie consumption for Added fats and oils is shown for the year 2010 and not 2014, as data for that category was available only until 2010. But it has been placed in the outer layer of the donut with the rest of the categories, which depict calorie consumption for 2014, and the title says that the outer ring depicts calories by food group for 2014. This is misleading.
The use of a layered donut chart looks like it is intended to give an idea of the proportion of each food group consumed in both years, with respect to each other. But there is no indication of the percentage of each food group consumed. One has to try to figure out the proportion of each food group from the relative portions of the donut, which may not give viewers an exact idea of the proportions.
I am not sure I like the idea of layered donuts for comparing the food group consumption for both years. The difference in proportions is not clear by just comparing the concentric rings. The choice of visualization does not do justice to the intent of the visualization.
Not including the percentage value of each food group may mislead the audience, as they may equate an increase in calories consumed between the two years with an increase in the proportion of that food group consumed, which may not always be true (as in the case of Added sugar and sweeteners).
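
The point about calories versus proportions can be checked with a little arithmetic. The sketch below uses made-up calorie figures (they are not the USDA numbers) and a hypothetical helper, shares, to show how a food group's calories can go up while its share of the total goes down.

```python
def shares(calories):
    """Return each food group's percentage share of total calories."""
    total = sum(calories.values())
    return {group: 100 * value / total for group, value in calories.items()}

# Hypothetical figures for illustration only; these are NOT the USDA values.
year_a = {"Added sugar": 300, "Grains": 400, "Other": 1300}  # total 2000
year_b = {"Added sugar": 330, "Grains": 570, "Other": 1500}  # total 2400

share_a = shares(year_a)
share_b = shares(year_b)

# Print calories and percentage share side by side for each group.
for group in year_a:
    print(f"{group}: {year_a[group]} -> {year_b[group]} kcal, "
          f"{share_a[group]:.1f}% -> {share_b[group]:.1f}%")
```

In this made-up example the "Added sugar" calories rise from 300 to 330, yet the share drops because the total grew faster. A chart that labels only calories hides exactly this kind of shift.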

Critical Analysis of the visualization:

Let’s analyze the subjective dimension of the visualization:

Beautiful: I would definitely not call this a very beautiful visualization. The use of a layered donut chart does not seem to be the optimal choice for the purpose. It is not easy to compare the proportion of consumption of one category to the others for one particular year. It is also not easy to compare the change in proportion of a food group consumed between the two years. For instance, it is not easy to determine the exact percentage of Dairy consumption in 1970. A stacked bar graph for both years, or line graphs, would have helped visualize these details better.
Functional: The functionality of the visualization could be improved. The current visualization labels only the calories consumed in both years, leaving the viewer to decipher the relative proportion of food group consumed by the size of the donut rings, which is not easy. The functionality could have been improved by separately visualizing the change in relative proportion of food groups consumed with respect to each other, along with the calories consumed. As change in proportion of food group consumed is equally important in deriving any useful insights, it should have been included as well.
Insightful: The visualization does a fair job in terms of being insightful. The ordinary audience (someone like me) will not necessarily have knowledge of the changing trends in food group consumption, and this visualization gives a good idea of how food consumption trends have changed between 1970 and 2014. But this can also be improved. One way would be to include age-specific consumption of food groups, which would give a much clearer idea of which age groups have shifted more from eating healthy food like fruits to consuming more calories of added fats and oils by eating more junk food. The second way is to include the dietary guidelines for consumption of each food group. This would make the visualization more insightful, as the information on consumption of food groups would then have a context.
Enlightening: The entire report is about changing trends in food availability and dietary assessment. But this graph only shows the change in calories consumed among the different food groups. Just by looking at the changing calorie consumption of each food group, one cannot initiate any change, as there is no clear indication of the impact of changing calorie consumption on one’s diet or health. The graph also does not give much information about changing food availability that would help in making useful decisions about food choices.

Does this graph have a claim?

No, I don’t think so. As mentioned earlier, this graph serves as a blanket visualization for the report, which explains the change in food group availability trends from 1970 to 2014. As it is the first graph and is expected to give an overview of all the information that is explained further in the following graphs, it needs to give an overview of the intended claim of the report. The report is intended to inform the audience about changing food availability trends and to assess the diet of Americans. But by just looking at this graph, we do not see any claims about food availability or diet assessment.

Validation of data:

We do not know the change in consumption of each category over all the years between 1970 and 2014. Providing the information of just the start and end year may mislead the audience if the trends for the years in between show significant variations, as data can be cherry picked to make it look the way you want. The omission of data on years between the start and end years raises a question on the shown trends, since the results of one particular year can well be an anomaly and thus not indicative of a trend.
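
To see why two endpoints alone can hide what happened in between, here is a minimal sketch. The series is invented for illustration (it is not data from the report), and endpoint_change and largest_swing are hypothetical helper names.

```python
def endpoint_change(series):
    """Difference between the last and first values only."""
    return series[-1] - series[0]

def largest_swing(series):
    """Largest move between consecutive observations (absolute value)."""
    return max(abs(b - a) for a, b in zip(series, series[1:]))

# Hypothetical calorie values for one food group across five years.
# These are NOT taken from the USDA report.
values = [400, 460, 380, 470, 410]

print("Start-to-end change:", endpoint_change(values))
print("Largest swing in between:", largest_swing(values))
```

With these invented values the start-to-end change is small, while the moves in the intervening years are several times larger. A chart showing only the first and last year would look like a mild, steady increase.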

Redesign of the visualization:

As I mentioned earlier, the use of a layered donut chart is not the right choice for this visualization. The visualization intends to give an idea of change in calorie consumption per food group per person. I have redesigned the given visualization to give a better understanding of change in calorie consumption between 1970 and 2014. I have also visualized the change in proportion of each food group consumed, to give a clear idea of the change in trends of food group consumption.

Link to redesigned visualization:

https://docs.google.com/a/scu.edu/document/d/1Y4gUVDlOQPDrc7cviZpLI77NWSlO61DDwsa2vkNlyho/edit?usp=sharing

References:

Why not to use Pie/Donut charts:

http://geographymaterials.blogspot.com/2015/08/advantages-and-disadvantages-of-pie.html

Using the right chart for comparing values over time:

https://www.cardinalsolutions.com/blog/2016/05/data-visualization-best-practices-part-two-mistakes-to-avoid

Family and Living Arrangements in America

Source: https://www.census.gov/prod/2013pubs/p20-570.pdf

I found this article while searching for some data on the census.gov website. This visualization is from a paper on Families and Living Arrangement trends in the United States in the year 2012. It is the first graph in the article, which includes a number of graphs depicting various trends in American Family and Living Arrangements. The graph conveys the changing trends in different Household types from 1970 to 2012. It is a Stacked Bar Graph where each stack in the bar for a given year depicts the percentage share of that particular household type, with the stacks in a bar adding up to 100%. We have stacked bars of household types for the years 1970, 1980, 1990, 1995, 2000, 2005, 2010 and 2012.

The things I liked about the graph:

The graph is extremely easy to understand. The title of the graph is Household types, 1970 to 2012 and the graph shows exactly that. There is no confusion as to what is in the graph. It is a fairly simple graph, conveying what it is supposed to convey.
Each stack in the individual bars is labeled with its percentage. Hence it is not very difficult to figure out the exact share of each household type by looking at the graph.
The X-axis and Y-axis are both labeled clearly, and there are no missing values or confusion regarding the scales.
The different colors used to identify the different household types help in understanding the share of each household type in the whole bar.

Things I did not like about the graph:

In the paper, the first sentence below this graph, marked in red, says “The share of households that married couples maintained has fallen since 1970, while the share of non family households has increased”. Although this statement does appear to be true from the graph, the change does not look drastic, especially if you consider the years from 1990 to 2012. The change in these years looks gradual rather than drastic. I feel that if this statement was intended to be conveyed by the visualization, then it should be obviously evident and should not take multiple looks to understand.
The gap between the years for consecutive bars is not consistent. The gap between each of the first three bars is 10 years, then the gap becomes 5 years for the next 4 bars, and it ends with a 2-year gap between the last and the second-to-last bar. This inconsistency may convey the wrong trends if the household type share for the missing years is considerably different from the depicted trend.
Some bars do not add up to a perfect 100. As the graph is about the percentage share of each household type for each year, the individual shares for a year should add up to 100%. For the years 1980 and 1995, the total adds up to 99.9%, and for the years 1990, 2005 and 2012 it adds up to 100.1%.
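
One possible explanation for the totals of 99.9% and 100.1%, which I have not confirmed against the paper, is that each label was rounded to one decimal place separately. The sketch below uses invented shares (not the Census figures) and a hypothetical helper, labeled_total, to show how rounded labels can miss 100 even when the underlying values add up exactly.

```python
def labeled_total(raw_shares, digits=1):
    """Round each share the way a chart label would, then add the labels up."""
    labels = [round(share, digits) for share in raw_shares]
    return labels, round(sum(labels), digits)

# Hypothetical unrounded shares; these are NOT the Census figures.
# Before rounding they add up to 100.
raw = [40.26, 30.26, 19.26, 10.22]

labels, total = labeled_total(raw)
print("Labels on the bar:", labels)
print("Sum of the labels:", total)
```

Here three of the four values round up, so the labels total 100.1 although the raw shares total 100. If that is what happened in the paper, a short footnote saying that percentages may not sum to 100 due to rounding would have removed the doubt.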

Critical Analysis of the visualization:

Beautiful: The visualization is clear and easy to understand. But I believe the use of a stacked bar graph is not appropriate for this particular visualization. The aim of the visualization is to portray the changing trends in household types over the years. Visualization best practices suggest that line charts track changes or trends over time and show the relationship between two or more variables. Thus, a line graph, with each household type depicted separately and differentiated by color, would give a much clearer view of the changing trends over the years.
Enlightening: In my view, the visualization by itself is not very enlightening. An enlightening visualization is one which initiates a change in the audience. This visualization on household types is definitely informative. It gives us an idea of the changing trends over the years. But it does not make the audience take any specific action. Are there any relevant impacts due to changing household trends? This is not clear, and hence there are no changes that one can make based on this information.
I am also not sure why the start year is 1970; neither the visualization nor the article explains the significance of the chosen time period. As we discussed in class on the validation of visualizations, people can cherry pick the data to make it look the way they want. Had a longer period been chosen, the changing trends might have looked different from what the graph shows now. There should be no question about the validity of the data.

Redesign:

As I discussed, a stacked bar graph is not the ideal graph for showing trends that change over time. A line graph would be a better choice. The redesigned graph can be viewed at:

https://docs.google.com/a/scu.edu/document/d/1JzAz4AQXBmJlT5V5ZTKkNtYEsmhBsZ6zbFF4odZ9oAQ/edit?usp=sharing

References:

1)Choosing the right visualization for your purpose:

https://www.gooddata.com/blog/5-data-visualization-best-practices

2)Scaling an axis properly:

https://blog.graphiq.com/data-visualization-best-practices-91a35f1b29fa

3) When are 100% Stacked Bar Graphs useful:

https://www.perceptualedge.com/blog/?p=2239

The World’s most dangerous cities!

Visualization Link: http://www.economist.com/blogs/graphicdetail/2017/03/daily-chart-23

I came across this article while researching for my Individual Project. I was intrigued by the visualization and decided to read and understand it better. However, when I attempted to read the visualization without reading the article, I found it hard to get any valuable insights. Only after reading the article a couple of times and then looking back at the visualization did I understand what the author wished to convey.

To explain this visualization in brief, the legend indicates that homicides are measured per 100,000 population, and this measure is called the Homicide Rate. The regions are color coded, with each region given a color. The Latin America and Caribbean region and all countries and cities that fall under it are in red; similarly, the African region is yellow and North America is blue. The visualization gives the Homicide Rates for the most dangerous cities and for the ten most dangerous countries to which they belong. On the left we see the ten most dangerous countries listed, with their Homicide Rates over time shown in 10 small graphs, one below the other. On the right side of each country's small graph, we see the cities in that country placed along the X-axis as per their Homicide Rate, which is denoted on the Y-axis. For example, Victoria in Mexico has a Homicide Rate of 60. The solid black vertical line on the X-axis indicates the National Homicide Rate for the country. For example, the national Homicide Rate for Mexico is close to 16-18. The size of the circles indicates the range of homicides in the city. For example, Acapulco has between 100 and 1000 homicides. All these numbers are for the year 2016 or the latest available. I think this fairly explains the contents of the visualization.

There was something intriguing about the visualization that drew my attention to it. For starters, I like the fact that they have tried to be as thorough as possible in explaining the reasons these cities are considered most dangerous. The article helps quite a bit in understanding the visualization and the reason for splitting the cities by region: Latin America & Caribbean, North America and Africa. The details of cocaine cultivation and transport give us a context for the reasons for homicides. Another aspect that I liked is that they have given us multiple levels of homicide information for the cities: the National Homicide Rate, the City Homicide Rate in comparison to the national Homicide Rate, and the raw homicide count for each city. I feel this is helpful because it gives us information in more than one dimension, such as how many people died due to homicides in 2016 and how many people per 100,000 were killed in that year. The color coding corresponds with the Homicide Rate, with red indicating top contributors and blue indicating the least, which is in alignment with common color perceptions like red being associated with danger and yellow signifying moderation. Using the size of circles to indicate the number of homicides in the city is also a good choice, as we can easily get an idea of which cities have more homicides than others.

The reason I chose this visualization for the blog is that it gives us a lot of useful information, but it does not do so in the most effective way. I believe that if not for certain flaws, it would have been a very useful visualization for understanding homicides. The most obvious flaw is too much information in one visualization. For instance, the City Homicide Rate, the National Homicide Rate and the City Homicide count range are all present in one single line. This could lead to confusion and result in the reader forming wrong conclusions due to misinterpretation. For example, if we see Cape Town, we are immediately drawn towards its big circle, and we might form a biased opinion that Cape Town is more dangerous than, say, a city like San Salvador. But in fact, the Homicide Rate for San Salvador is higher than that of Cape Town. That is, the number of people killed relative to the population is much higher in San Salvador than in Cape Town. Thus, presenting these two variables (the homicide rate and the actual homicide count) together is not a good idea. Apart from this there are a few other flaws. For example, the graphs of the countries on the left indicating the national Homicide Rate look incomplete and crammed to fit the available space. Apart from the first and last graphs, for El Salvador and Jamaica, none of the graphs in between have the upper limit marked on the Y-axis; the audience is expected to infer that the remaining graphs also have the same upper Y-axis limit of 100. The graphs themselves are too small, which makes it difficult to read the Homicide Rate at a particular point in time.
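
The difference between a homicide count and a homicide rate is easy to work through. Below is a minimal sketch with two invented cities (the numbers are not from the Economist chart) and a hypothetical helper, homicide_rate, that applies the per-100,000 definition given in the legend.

```python
def homicide_rate(homicides, population):
    """Homicides per 100,000 residents."""
    return homicides / population * 100_000

# Hypothetical cities for illustration; NOT figures from the Economist chart.
big_city = {"homicides": 2400, "population": 4_000_000}
small_city = {"homicides": 900, "population": 1_000_000}

big_rate = homicide_rate(**big_city)
small_rate = homicide_rate(**small_city)

print(f"Big city:   {big_city['homicides']} homicides, rate {big_rate:.1f}")
print(f"Small city: {small_city['homicides']} homicides, rate {small_rate:.1f}")
```

The big city would get the larger circle because it has more homicides, but the small city has the higher rate. This is the same trap as the Cape Town and San Salvador comparison: circle size pulls the eye towards the count while the position encodes the rate.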

The article mentions that 43 of the 50 most dangerous cities in the world belong to the three regions mentioned in the graph. But if you count the city names on the graph, you will find that not all 43 names are present. Also, there are some circles in the graph which do not have names, especially among the cities in Brazil. Only four city names are mentioned, but we can see many more than four circles. The reader may wonder whether the additional circles represent homicides in cities not named in the graph or whether they mean something else. This is a big inconsistency that may leave the audience confused about what the visualization wants to convey. Also, the claim that these countries and cities are the most dangerous is not well supported with data. There is no mention of the global median Homicide Rate or of how high the Homicide Rates of these countries are in comparison to it.

I believe the entire visualization could have been broken down into at least 3 individual visualizations and told as a story with interactive filters.

The first visualization could have consisted of information just about the Homicide Rates of the 10 most dangerous countries (currently conveyed through the tiny graphs on the left side of the visualization). It could have included details of how homicides in these countries vary over time and the reasons contributing to them, thus giving a sense of why the homicide rates are so high in these countries.
The second visualization should have been for the city Homicide Rates in comparison with the national rate. In this visualization, we could simply plot the Homicide Rates of the cities and the national Homicide Rate for the country alone, without introducing the circles of different sizes, which cause confusion and potentially mislead the reader. It would thus give us an idea of how each city fares in comparison to its national rate and in comparison to the others.
The third visualization could be a Map Chart with all the cities and their countries, with the size of circles indicating the number of homicides in each city. Using the size of circles to indicate the number of homicides and the map itself to plot these cities and countries would help visualize the author's claim that the Latin American and Caribbean regions remain the World’s most dangerous regions. I believe this breakdown would make it easy to understand the individual pieces of information and the story of how these pieces together indicate the most dangerous parts of the world when it comes to homicides.

A similar visualization, found in a 2014 Huffington Post article, gives similar insights on the 10 countries with the highest murder rates. Using a map clearly conveys the regional dominance of the Americas in homicides that is mentioned in our visualization as well.

Along with the breakdown of visualizations, another improvement would be to include the global median Homicide Rate, which would give a clear idea of how much higher the rates in these dangerous countries are.

References: http://www.huffingtonpost.com/2014/04/10/worlds-highest-murder-rates_n_5125188.html

Weak graph for a strong claim- Washington Post on Global Warming!

Sneha Krishnan

https://www.washingtonpost.com/news/wonk/wp/2013/07/09/you-cant-deny-global-warming-after-seeing-this-graph/?utm_term=.ab92a5adbdd7

If there is one imminent threat to mankind that worries me, it is Global Warming. For years now, we have seen scientists talk about its perils and consequences for our planet, and in particular about the jeopardy to life on Earth. There is proof enough to believe that our actions are leading to the rising Global temperature. I often find it a huge irony that we complain about rising temperatures, drought and hot summers while relaxing in air conditioned houses or driving comfortable air conditioned cars and listening to the news on the radio. In reality though, it is our actions that are causing the gradual Global temperature rise, thanks to increased CO2 emission levels due to fossil fuel burning.

I stumbled upon this Washington Post article and was instantly attracted by its title. Though it is a little old, I read the article to understand the findings shared by the World Meteorological Organization. The graph in the link describes the Global temperature in degrees Celsius by decade (a period of 10 years) from 1881 to 2010. As explained in the article, they have used decades rather than individual years to measure the Global temperature, as temperatures in a single year could also rise due to reasons other than climate change.

What I liked about the graph is that they have considered a time span of 13 decades to show the changing Global temperature. This gives us an idea of the Global temperature not just in the past few decades, but of its progressive increase over several decades. The fact that they have measured temperatures over decades rather than individual years is also very helpful, as climate change and Global temperature rises are more evident over a longer period than over short intervals. Had this been for individual years, the difference in temperature between successive years would not be as evident, which could have led us to believe that the temperature rise is not significant and that we have nothing to worry about. The color coding for every five decades also helps us understand the relative changes in temperature over a span of five decades. It helps us notice the steep increase in temperature from one color to the next, signifying a rapid increase in temperature between those decades. The graph is simple and informative; it helps convey the relevant information to a novice reader who needs to know the impending dangers of global warming and be a responsible citizen. The hope is that they would consider carpooling the next time they take their car to work, or decide to bike to the nearby grocery store.

While I liked the graph for depicting the increasing temperatures over the last 13 decades, I found certain aspects confusing and not adding value to the graph. For instance, I did not understand why the Y-axis starts from 13.400 degrees Celsius. It is not clear to me as a novice reader whether it indicates the temperature before the starting decade of 1881 (i.e. 1871 to 1880) or whether it is considered the normal Global temperature. Also, the purpose of the dashed line that runs through the center of the graph (a little above 13.950 on the Y-axis) is not clear. I am not sure if it is there to indicate something significant like the mean temperature (though the mean temperature is 13.95) or some other aspect. There is also inconsistency in the numbers in the graph: the temperatures on the Y-axis have 3 digits after the decimal point, while the temperatures on top of the bars have just 2.

A significant drawback, in my view, is that the graph does not convey enough information to support the claim made by the title. There is not enough information to explain the relationship between the rise in Global temperature and the factors contributing to Global Warming. If the graph had also shown those factors (like CO2 emission levels) by decade, with a significant increase in them as well, then it would support the claim of the article better. I also feel the horizontal dashed lines marking individual temperatures on the Y-axis do not add much value, as the temperature for each decade is mentioned on top of the bars.

In my opinion, the graph could be more useful and serve the claim better if certain things were done differently. The graph does not convey much information, especially regarding the correlation between rising temperatures and the factors contributing to them. Also, a layman may not know how much of an increase in the Global temperature is enough to damage the planet. The total temperature difference (between the starting and ending decades in the graph) is less than 1 degree Celsius, and this may not seem significant unless the consequences of that increase are explained. I would have included all the above information in the visualization. I would have used better visualization techniques and added a second graph to show the correlation between increasing Global temperature and the factors contributing to it. I would have also explained the scientific relation between those factors and the global temperature, so as to not leave room for doubt in the reader’s mind.