Blog 2: Vehicles in fatal crashes

This is a very cool graph: a calendar view of the number of vehicles involved in fatal crashes in 2010. The rows, labelled on the left side of the graph, indicate the month of the accident, and the columns indicate the actual date. The shade of each square tells how many vehicles were involved in fatal crashes on that day.

The author doesn’t state his claim clearly. I’m not sure whether he wants to claim that the number of vehicles involved in fatal crashes is closely related to the date in general (for example, we can see from the graph that most of the vehicles involved in fatal crashes were in crashes on weekends), or that more vehicles are involved on holidays, such as New Year’s Day.

Also, although the author uses valid data from the National Highway Traffic Safety Administration, he still needs to account for conditions other than the date to convince the audience. These conditions could be weather, geography or unpredicted disasters. For instance, heavy snow in December may increase the number of vehicles involved in fatal crashes. However, Boston may have heavy snow in December while California may have none in that season. Therefore, the evidence in the graph is not very convincing on its own.

Apart from that, the author does a good job of showing the distribution of the number of vehicles involved. It’s very easy to see that the darkest squares have the largest numbers and the white squares have no vehicles involved.

Overall, this is a good visualisation. What I really like about this graph is that the calendar view is very creative and the idea is new to the audience. The audience will be interested and will find it easy to get the author’s idea. The author will be able to change how the audience thinks about the number of vehicles involved in fatal crashes by date.

Reference:

http://www.coolinfographics.com/blog/2012/1/11/calendar-visualization-of-fatal-car-crashes.html

The 25 Top Causes of Car Accidents in the US

Western Movies with Bewildering Plots

The Western is a movie genre which tells stories set primarily in the latter half of the 19th century in the American Old West, often centring on the life of a nomadic cowboy or gunfighter armed with a revolver and a rifle who rides a horse (citation from Wikipedia). An article in The Hollywood Reporter on February 28, 2017 (http://www.hollywoodreporter.com/heat-vision/shadow-superheroes-westerns-are-quietly-popular-971841) discusses the resilience of the Western genre across six decades, from the 1960s to the present day. In the article, the author publishes a plot of the year-by-year count of American-produced Western films, with data drawn from Box Office Mojo (shown below).

This stylised stacked bar plot is hard to comprehend from direct inspection and requires additional effort to understand what it is trying to convey. The plot is confusing in the following ways:

A stacked bar plot is used when both the total in each category and its composition are relevant. It is great for visual aggregation of each category. In the above plot, however, all the stacked bars visually aggregate to the same total even though they are numerically different. In addition, each bar represents a particular year in each decade (the first bar represents year zero in all decades, the second bar represents year one, etc.), which is not the information relevant to the article.
The labels at the top of the plot appear to indicate the starting point of each decade, but this only holds true for the first bar. There are bars associated with a particular label that begin even before the labelling threshold.
The information is not displayed effectively. Any user has to put in extra effort to interpret what is being presented, whereas users expect to grasp a visualisation at a glance.
The colour palette used in the stacked bars is a collection of small variants of one colour, which makes it difficult to distinguish the composition of each bar.
Time in the stacked bar graph changes at different resolutions along different dimensions: years increase in single units vertically and in decades horizontally. Having one unit of measurement increase along multiple dimensions at different resolutions only adds to the confusion.

Re-creating the graph:

Elimination of stacked bars: Grouped bars are preferred to stacked bars in this case because the aggregate information is not relevant to us. Grouped bars, on the other hand, allow us to compare data within a decade and across decades, which is more useful.
Clear labelling: The decades are represented with crisp, distinct colours, which make it easy for the user to quickly find the data for the decade they are interested in. The information in the plot is presented in a sleek yet detailed manner, with the labels on the data points making it more accessible.
Time in one dimension: By grouping the bars, we also ensure that time as a measure stays in one dimension with changing resolutions (single years are represented as parts of decades).
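
The regrouping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name group_by_decade is hypothetical, and the yearly counts are made-up placeholders, not the Box Office Mojo figures from the article. It shows how a flat year-to-count table could be arranged so that each decade becomes one group of bars and time stays on a single axis.

```python
from collections import defaultdict


def group_by_decade(counts_by_year):
    """Arrange {year: count} as {decade: [(year, count), ...]} in time order.

    Each decade becomes one group of bars and each year one bar in the
    group, so time runs along a single axis.
    """
    groups = defaultdict(list)
    for year in sorted(counts_by_year):
        decade = (year // 10) * 10  # e.g. 1967 -> 1960
        groups[decade].append((year, counts_by_year[year]))
    return dict(groups)


# Placeholder counts for illustration only (NOT the article's data).
example_counts = {1968: 5, 1969: 7, 1970: 4, 1971: 6}

grouped = group_by_decade(example_counts)
for decade, bars in grouped.items():
    print(f"{decade}s:", bars)
```

A plotting library could then draw one cluster of bars per decade key, with one colour per decade.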

References:

http://www.hollywoodreporter.com/
http://1010wcsi.com/how-to-fix-each-of-the-7-mistakes-that-ruin-a-good-infographic/

The World’s most dangerous cities!

Visualization Link: http://www.economist.com/blogs/graphicdetail/2017/03/daily-chart-23

I came across this article while researching for my Individual Project. I was intrigued by the visualization and decided to read and understand it better. However, when I tried to interpret the visualization without reading the article, I found it hard to get any valuable insights. After reading the article a couple of times and then looking back at the visualization, I understood what the author wished to convey.

To explain this visualization in brief, the legend indicates that homicides are measured per 100,000 population, a measure called the Homicide Rate. The regions are color coded, with each region given its own color: the Latin America and Caribbean region, and all countries and cities that fall under it, are red; the African region is yellow; and North America is blue. The visualization gives the Homicide Rates for the most dangerous cities and for the ten most dangerous countries to which they belong. On the left we see the ten most dangerous countries listed, with their Homicide Rates over time shown in 10 small graphs, one below the other. To the right of each country’s small graph, we see the cities in that country placed along the X-axis according to their Homicide Rate, which is denoted on the Y-axis. For example, Victoria in Mexico has a Homicide Rate of 60. The solid black vertical line on the X-axis indicates the National Homicide Rate for the country. For example, the national Homicide Rate for Mexico is close to 16-18. The size of the circles indicates the range of homicides in the city. For example, Acapulco has between 100 and 1000 homicides. All these numbers are for 2016 or the latest available year. I think this explains the contents of the visualization fairly well.

There was something intriguing about the visualization that drew my attention to it. For starters, I like the fact that they have tried to be as thorough as possible in explaining the reasons these cities are considered most dangerous. The article helps quite a bit in understanding the visualization and the reason for splitting the cities by region: Latin America & Caribbean, North America and Africa. The details of cocaine cultivation and transport give us context for the reasons behind the homicides. Another aspect that I liked is that they have given us multiple levels of homicide information for the cities: the National Homicide Rate, the City Homicide Rate in comparison to the national Homicide Rate, and the raw homicide count for each city. I feel this is helpful because it gives us information in more than one dimension, such as how many people died due to homicides in 2016 and how many people per 100,000 were killed by homicide in that year. The color coding corresponds with the Homicide Rate, with red indicating the top contributors and blue indicating the least, which is in alignment with common color perceptions such as red being associated with danger and yellow signifying moderation. Using the size of the circles to indicate the number of homicides in the city is also a good choice, as we can easily get an idea of which cities have more homicides than others.

The reason I chose this visualization for the blog is that it gives us a lot of useful information, but it does not do so in the most effective way. I believe that, if not for certain flaws, it would have been a very useful visualization for understanding homicides. The most obvious flaw is too much information in one visualization. For instance, the City Homicide Rate, the National Homicide Rate and the city homicide count range are all present on one single line. This could lead to confusion and result in the reader forming wrong conclusions due to misinterpretation. For example, if we look at Cape Town, we are immediately drawn towards its big circle, and seeing that we might form a biased opinion that Cape Town is more dangerous than, say, a city like San Salvador. But in fact, the Homicide Rate for San Salvador is higher than that of Cape Town. Thus, the share of the population dying by homicide in San Salvador is much higher than in Cape Town. Presenting these two variables (the homicide rate and the actual homicide count) together is therefore not a good idea. Apart from this, there are a few other flaws. For example, the graphs of the countries on the left indicating the national Homicide Rate look incomplete and crammed in to fit the available space. Apart from the first and last graphs, for El Salvador and Jamaica, none of the graphs in between have the upper limit marked on the Y-axis; the audience is expected to infer that the remaining graphs also have the same Y-axis upper limit of 100. The graphs themselves, being too small, are difficult to read, which makes it hard to figure out the Homicide Rate at a particular point in time.
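
To make the difference between a homicide count and a Homicide Rate concrete, here is a minimal Python sketch. The function name homicide_rate and the two cities are hypothetical, and the figures are invented for illustration; they are not the Economist’s data. The only thing taken from the visualization is the definition of the rate as homicides per 100,000 population.

```python
def homicide_rate(homicides, population):
    """Homicides per 100,000 population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return homicides * 100_000 / population


# Hypothetical figures for illustration only (not the Economist's data).
big_city = {"homicides": 2_000, "population": 4_000_000}
small_city = {"homicides": 900, "population": 1_000_000}

big_rate = homicide_rate(**big_city)      # more homicides, lower rate
small_rate = homicide_rate(**small_city)  # fewer homicides, higher rate
print(big_rate, small_rate)
```

The city with the larger count would get the bigger circle, even though the smaller city has the higher rate, which is exactly the kind of misreading described above.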

The article mentions that 43 of the 50 most dangerous cities in the world are in the three regions shown in the graph. But if you count the city names on the graph, you will find that not all 43 are present. Also, there are some circles in the graph which do not have names, especially among the cities in Brazil: only four city names are shown, but we can see many more than four circles. Seeing this, the reader could wonder whether the additional circles represent homicides in cities not named in the graph or whether they mean something else. This is a big inconsistency that may leave the audience confused as to what the visualization wants to convey. Also, the claim that these countries and cities are the most dangerous is not well supported with data. There is no mention of the global median Homicide Rate or of how high the Homicide Rates of these countries are in comparison to it.

I believe the entire visualization could have been broken down into at least 3 individual visualizations and told as a story with interactive filters.

The first visualization could have consisted of information just about the Homicide Rates of the 10 most dangerous countries (currently conveyed through the tiny graphs on the left side of the visualization). It could have included details of how homicides in these countries vary over time and the reasons behind them, thus giving a sense of why the homicide rates are so high in these countries.
The second visualization should have been for the city Homicide Rates in comparison with the national rate. In this visualization, we could simply plot the Homicide Rates of the cities and the national Homicide Rate for the country alone, without introducing the circles of different sizes, which cause confusion and potentially mislead the reader. It would thus give us an idea of how each city fares in comparison to its national rate and in comparison to the other cities.
The third visualization could be a map chart with all the cities and their countries, with the size of the circles indicating the number of homicides in each city. Using the size of the circles to indicate the number of homicides and the map itself to plot these cities and countries would help visualize the author’s claim that the Latin American and Caribbean regions remain the world’s most dangerous regions. I believe this breakdown would make it easy to understand the individual pieces of information and the story of how these pieces together indicate the most dangerous parts of the world when it comes to homicides.

Above is a similar visualization, found in a 2014 Huffington Post article, that gives similar insights on the 10 countries with the highest murder rates. As we can see, using the map clearly conveys the regional dominance of the Americas in homicides, which is mentioned in our visualization as well.

Along with the breakdown of visualizations, another improvement would be more information on the global median Homicide Rate, which would give a clear idea of how much higher the rates in these dangerous countries are than the global median.

References: http://www.huffingtonpost.com/2014/04/10/worlds-highest-murder-rates_n_5125188.html

From Clutter to Clarity…

Visualization is a powerful tool that can help tell a story, simplify a complicated data set and make it easy to identify the patterns behind those complicated numbers through visual representation. However, if not used judiciously, it can very easily overcomplicate simple things. Visualization is a means to an end and not an end in itself. The goal is not to create a stunning visualization but to create a visualization that conveys the intended meaning in an effective way. The key word is: “EFFECTIVE”.

Designing an effective data visualization comes down to a lot of small details that can be the difference between an effective and a lousy visualization. Paying attention to detail, identifying and understanding your audience, and making sure the various elements are aligned and consistent are some of those details.

http://s32.photobucket.com/user/nsrivastava/media/Blog2_image_zpsypuxv39s.png.html

Figure 1: Viz_1

The above example perfectly illustrates the overuse of the visualization tools at one’s disposal. The image intends to show the amount of paid paternity leave guaranteed to people in a given set of countries relative to the US, which has none.

However, there are multiple issues with the above visualization that make it ineffective. Let’s examine the following three major issues:

Clutter

Clutter means overcomplicating things when there is no need. In the above example, the different-sized pie chart pieces do not add any value to the visualization, as they do not provide any insights that are not otherwise available. They just add to the confusion, diverting the audience’s attention.

Color

Color can be used in a number of ways to convey a point, provide emphasis or compare and contrast different data points. It can also be employed to direct your audience’s eyes to where you want them to go. Color should be used strategically to drive your point home and not simply to beautify the visualization. In the above example, the color, instead of making things simpler, complicates them. At first glance, the orange color representing Australia, Venezuela, Kenya and Denmark makes them look like a single country. If one goes by color, it looks like there are only 6 countries being compared.

Consistency

The unit for the guaranteed paternity leave changes between days and weeks. For half the countries the number represents weeks, and for the rest it represents days. This inconsistency can lead to confusion. For example, what does the zero in the center of the circle with the map of the US indicate? Is it 0 weeks or 0 days? Also, representing the US as a circle in the center while the rest of the countries are represented as pie chart pieces is itself inconsistent. It may confuse the audience into thinking that the US is not a country or that it is in some way different from the rest of the countries. However, all that the visualization intends is to represent the numbers relative to the US.

The following is an attempt to solve the above issues with a different presentation of the same data.

http://i32.photobucket.com/albums/d27/nsrivastava/Blog2_img2_zpswgfbq5ip.png

Figure 2: Viz_2

This second visualization better represents the data for the following reasons:

Distinct colors for each country and a clear legend allow the audience to clearly distinguish each country.
The sorted bar graph clearly shows the US at the lowest level, with 0 days of guaranteed paternity leave, and Iceland leading the pack with the highest amount of paid leave at 90 days.
The paid paternity leave is represented in days for all the countries, so the information is consistent across the visualization, making it easy to compare.
The numbers on the bars clearly indicate the actual figures, leaving no room for ambiguity or confusion.
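
The unit fix and the sorting behind such a bar graph could be sketched as follows. This is a rough illustration, not the code used to produce Figure 2: the helper to_days is hypothetical, the conversion assumes calendar weeks of 7 days, and “Country X” is an invented entry used only to show a weeks-to-days conversion. Only the US (0) and Iceland (90 days) values come from the text.

```python
DAYS_PER_WEEK = 7  # assumption: calendar weeks, not working weeks


def to_days(amount, unit):
    """Convert a leave entitlement to days; unit is 'days' or 'weeks'."""
    if unit == "days":
        return amount
    if unit == "weeks":
        return amount * DAYS_PER_WEEK
    raise ValueError(f"unknown unit: {unit}")


# US (0) and Iceland (90 days) are from the text; "Country X" is a
# hypothetical entry added to show a weeks-to-days conversion.
leave = [
    ("Iceland", 90, "days"),
    ("US", 0, "days"),
    ("Country X", 2, "weeks"),
]

# Normalise every entry to days, then sort ascending for the bar order.
in_days = sorted(
    ((country, to_days(amount, unit)) for country, amount, unit in leave),
    key=lambda item: item[1],
)
for country, days in in_days:
    print(f"{country}: {days} days")
```

Once every value is in the same unit, the sorted order and the labels on the bars follow directly from the data.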

References:

https://icharts.net/blog/data-expert-spotlight/data-visualization-essential-info-industry-thought-leader

http://viz.wtf/image/158594346945

http://www.flexmanage.com/2017/03/15/5-ways-for-powerful-data-visualizations/

200 Years that changed the World

Bikram Patnaik

Visualization Link: Money buys Life

‘THE RICH LIVE LONGER EVERYWHERE, BUT FOR THE POOR GEOGRAPHY MATTERS’

Did it ever strike you that the place our ancestors called home could have been a matter of life or death for them? Today, everyone knows that rich people generally live longer than poor people because they can afford high-tech medical care, but what about people two centuries ago?

We will try to find answers to these questions and discuss whether our main claim holds up by exploring this amazing interactive visualization. The visualization reviews data from 200 countries and compares life expectancy with wealth over the past 200 years. The vertical axis shows the average life span in each country, ranging from 25 to 85 years, where higher up means longer lives and good health, and lower down means shorter lives and sickness. The horizontal axis shows the average income per person (GDP per capita), expressed in dollars per person per year, where the right is rich and the left is poor. It’s interesting to see a bubble chart used for this, as it is primarily used to represent data that has three or more data series (in this case income, life span and population size), each containing a set of values.

UNDERSTANDING THE DATA:

Let’s explore the visualization and understand it better. At first look we can figure out that each country in the world is a bubble, the size of the bubble represents the population size, and the color represents the region of the world (see the top right side). We start circa 1800, when all the countries had a life expectancy of less than 45 years and an income of less than $4500. We can see that the United Kingdom and the Netherlands were among the richest countries, but people there had short lives. Underdeveloped healthcare systems and poor sanitation are some of the reasons why all the countries had shorter life spans and, most importantly, these act as a warrant for our claim.

Now, as we click ‘play’, the years start to roll by. Slowly, incomes start to increase, mainly in Europe and North America, because of the industrial revolution. As a result, these regions pulled away from the rest of the world. But, surprisingly, health didn’t get much better. In 1900, only Western countries were getting richer and healthier. Between World War I (1914) and World War II (1945), the difference between the rich and poor countries increased, and it was only after World War II that most countries started to change in terms of wealth. The Arab countries became the richest, and countries like China and India prospered as a result of their emerging economic growth.

Now in 2017, we observe a continuous world, with high-income countries (Qatar, Norway, USA) having a long life span and low-income countries (Ethiopia, Niger, Liberia) having a shorter one. Interestingly, though, all the countries are estimated to have a life expectancy of more than 45 years, which was the maximum life expectancy in 1800. Although the difference between high-income and low-income countries is huge, the longevity of their respective citizens has improved significantly.

DRAWBACKS:

Undoubtedly the visualization is amazing in itself, but there are a few snags which could alter the statistics if taken into consideration. First, the data compares countries only with one another. It does not let us look at the differences in income within the regions of a given country, which would give insights into its growth or decline. Second, when talking about the population size of a country we only take into account its current citizens, but there is a significant inflow of immigrants into these countries every year who contribute to the economy. So there is a high probability that accounting for them would give us a different picture altogether.

FROM A CRITIC’S VIEWPOINT:

The number of different parameters presented on the interactive dashboard is overwhelming. For a new user it becomes hard and confusing; instead, a simple drop-down could be introduced to give the audience the flexibility to play around with their desired set of parameters. The vertical lines on the chart need to be evenly spaced, and the text for the year should be at the top to avoid any kind of visual conflict. Also, toggling to the ‘Map’ tab gives us an elliptical view of the globe, and the bubble of each country doesn’t line up very well with its geographical location. This could be fixed by displaying a flat world map and placing the bubbles accurately.

ALTERNATIVE APPROACH/MODIFICATION:

Though it’s visually appealing, there are certain hiccups with this bubble chart visualization as well. It can be further enhanced and made simpler by adopting certain techniques.

The bubbles are opaque, which makes it hard to clearly pick out countries with smaller populations. As an alternative, I would recommend using translucency and highlighting the boundaries of the bubbles. These are powerful tools for dealing with overplotting, as you can see in the visualization below.
As we discussed earlier, the visualization doesn’t show the differences within a country. It can be modified by introducing an additional feature in which selecting a country, say the United States, gives you an overview of all the data values for its 50 states, along with an appropriate color contrast. The modified version looks like the visualization below.

CONCLUSION:

I feel that this kind of visualization is really helpful for conveying a large amount of numeric information quickly to your audience, provided that viewers are visually literate. An important part of a bubble chart visualization is making sure that it is clear what each element of the chart means (color, circumference, how it fits on the scale); otherwise the whole meaning can be lost. A similar approach could help organizations analyze their sales with respect to their customer base. It would help them come up with business metrics and promotional plans for their consumers.

Reference: Harvard Gazette, The New York Times 1, The New York Times 2, MIT News

Smart City.. but not so Smart Dashboard

“Smart city” is no longer just a buzzword. With advances in technology and devices communicating with each other, thereby generating huge volumes of data, we can derive insights that help build a smart city. I came across one such smart city dashboard, with feeds showing the current health of London.

http://citydashboard.org/

The dashboard shows obvious stuff, like weather information, pollution level and tube status. There is also a feed from Twitter showing what’s trending in London. There are chunks of other data, like the air pollution level and the FTSE index. All this data looks good from a 10,000 ft level, but to better understand why this dashboard was conceptualized, we need to ask two important questions:

What goals are we trying to achieve by measuring all kinds of data?
What data will be most useful to citizens? And how do we deliver relevant data to the right audiences?

The obvious answer to the first question would be to have a common platform which provides its users access to important data. To be successful in its purpose, the portal needs a real-time feed of data, and I have observed several lags in the data feeds. As all these machines produce more data, how do we ensure that it can be readily understood and reused by all audiences?

Now let’s look at the other question. The essence of any dashboard lies in identifying its audiences. If the dashboard is used by a regular commuter, the subway data might be useful: they already know when and which train to catch, so the running status alone should work fine. For a tourist, however, this data is useless; they would want detailed information about the subway service. I am also not sure how the FTSE index is useful information here. Below are some more limitations of the dashboard:

Too much information is presented in a small space, and the result looks extremely cluttered and distracting.
Not all of the information is relevant to every group of users.
The color theme is quite distracting, serves no real purpose and draws focus away from the data itself. Aesthetically, the dashboard is not pleasing.
There are too many variations in visualization style. There are boxes, a line chart and a temperature widget all in the same place.
There is no clear focus on any aspect. Users see a lot of different numbers without getting much insight.

What can be done to make this better:

Identify what data is relevant and deliver it to the relevant audience. One way of doing this is to let users customize the dashboard according to their preferences.
Present some historical trends that could help users when the dashboard is unable to get a live feed.
Improve the look and feel of the dashboard by using pleasing colors and uniform visualizations and by removing unnecessary widgets.
Add a note stating briefly what each component does. This would improve the overall usability of the application.

References: https://www.opendatasoft.com/2016/10/05/smart-city-dashboards/

Google search in China

Google has revolutionized the way we search for content on the internet. Offering a variety of services like Search, Maps, Apps etc., Google has made life easy for most of us. “Make Google your friend” is a favorite quote of many. Though Google is a well-accepted name and a big-brand company, its use and acceptance are restricted in China. There are many reasons (including political ones) why Google has not succeeded in China. Recently I came across a blog showing the market leaders (in terms of revenue) for “Search” in China. The author wanted to show that Google is NOT the leader in China. Here is the picture.

What I like about this visualization

1] Clear numbers showing that “Baidu” has 79% market share. Google is far behind (only 11.9% share).

What I did not like about the visualization

1] No context – This shows the value in Q3 of 2014. However, no information is given about why 2014 is taken as the reference/context. Readers like me are kept in the dark about Google’s performance over a period of time: is it increasing or decreasing? Similarly, what is Baidu’s performance over the years? Just one year’s data does not give us the whole picture. I feel this is incomplete information.

2] Is this an exploratory or an explanatory visualization? – This diagram forgets one of the core principles of visualization. The user does not get any idea whether this diagram just explores the data or explains Google’s presence in China.

3] Color selection – Red indicates alerts, alarms or bad things. Here “Baidu”, which is the market leader, is assigned red (which is surprising). After a lot of thinking, I came to a conclusion (which may be wrong): since China’s flag is red and the owner of this visualization wanted to show the strong presence of Baidu in China, he kept the same color as the flag.

4] What do the rings show? – Maybe larger rings show higher importance. But in that case the larger rings should be placed at the bottom, with the smaller ones on top.

5] Inefficient use of space – The reader knows that the blog is about China, so why show its flag again? I think this is a waste of space.

I found one more bad visualization of the same data, which is shown below.

Circle – Still not the perfect visualization

The above visualization has the following problems:

1] Shows the figures only for 1 year. No comparison over years.

2] It becomes difficult to compare values. A lot of space is wasted (the circle is empty in the middle).

How I would create this visualization

1] Data speaks a thousand words – I would strongly prefer a comparative graph showing Google’s and Baidu’s performance over the years, so my context would stretch over a number of years. This would clearly show the increase or decrease in market share of the different companies. See the bar graph below. The user can clearly understand the performance of Baidu and Google over the years. It clearly shows that Google has been losing market share from 2012 onwards (16.2% in 2012 to 12% in 2014).

2] Less space, more information – Use the available space wisely!

3] Color combinations – Use a standard color combination (with a clear distinction between Baidu and Google). I would still not prefer “yellow” for Google, but it’s much better than the above graph.
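
The decline quoted in point 1] can be expressed as a small calculation. The sketch below is only illustrative: the function share_change is a hypothetical helper, and the only inputs are the two Google figures given in the text (16.2% in 2012 and 12% in 2014). It separates the change in percentage points from the relative change, which a multi-year graph would make visible at a glance.

```python
def share_change(start_pct, end_pct):
    """Return (percentage-point change, relative change in percent)."""
    points = end_pct - start_pct
    relative = points / start_pct * 100
    return round(points, 1), round(relative, 1)


# Google's share in China as quoted in the text: 16.2% (2012), 12% (2014).
points, relative = share_change(16.2, 12.0)
print(f"{points} percentage points, {relative}% relative change")
```

The same helper could be applied to each company and each pair of years once the full series is available.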

References for blog – http://visual.ly/baidu-statistics-and-trends

References for blog – https://www.chinainternetwatch.com/7375/china-search-engine-market-q1-2014/

References for core principles of visualization – https://www.tableau.com/blog/stephen-few-data-visualization

State Tax Ratings

Justin Mungal

Tax is a powerful tool for implementing effective public policy. Few legislative mandates share its efficacy in shaping, seemingly overnight, corporate behavior. It is inextricably tied to the notion of the common good insofar as it pools society’s financial resources to fund that vision of social welfare and human well-being. Aside from technological innovation, it stands as one of the greatest shapers of our modern economy. For that reason, there is a large vested interest in shaping the tax code, and many a think tank has arisen around the D.C. metropolitan area in order to have a voice at that table of national discussion.

The Tax Foundation recently released its 2017 State Business Tax Climate Index. Their visualization shows a map of the fifty United States of America, color coded blue for the ten worst business tax climates, orange for the ten best, and grey otherwise. The individual rankings (1-50) are also printed in white on each state. The visualization’s goal appears to be to create a KPI based on the results of their study, in which they rank states according to 100 variables grouped into five categories: corporate taxes, individual income taxes, sales taxes, unemployment insurance taxes, and property taxes. The stated goal of the KPI is to enable tax policy makers to compare their state’s tax system to those of other American states. The rationale for comparing state tax systems is that most business decisions to move based on tax incentives are interstate decisions rather than international ones. Thus, the ability to retain business stakeholders is based on the favorability of one’s state tax structure relative to another state’s. Furthermore, by ranking every single state according to 100 variables, states can build themselves a roadmap to improvement based on the differences in tax structure in higher-ranking states.

While the visualization is the poster child of the Tax Foundation’s report, I find it exceptionally uninformative. Directly below the visualization they have printed the numerical rankings of each state. This tabular representation of the same data is much more straightforward and easier to digest. For example, if I were a tax policy maker from North Dakota, ranked #29, I would have difficulty finding the next best state (i.e. #28) from which I could learn how to improve my state tax structure. Indeed, one must scour the map until finally locating #28 on the state of Mississippi. Contrast this with the tabular data and its column of overall rankings, where the next best state is easily spotted (loading the table into Excel and sorting the data by overall rank would make it even easier). I find no benefit to studying the visualization over the table, as the map essentially scatters the data across physical locations whereas the table organizes it. The only benefit of the map is that it adds eye-catching color to the eighty-page report.
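To make the table-lookup point concrete, here is a minimal Python sketch of finding the next best state by overall rank. Only the two ranks mentioned above (North Dakota at 29, Mississippi at 28) come from the report; the function name `next_best_state` and the dictionary layout are assumptions for illustration, and a fuller version would load all fifty rows from the report’s table.

```python
from typing import Dict, Optional

# Overall ranks quoted in this post; the other 48 states are omitted here.
overall_rank: Dict[str, int] = {
    "North Dakota": 29,
    "Mississippi": 28,
}


def next_best_state(state: str, ranks: Dict[str, int]) -> Optional[str]:
    """Return the state ranked one place better than `state`.

    Returns None when that rank is not present in `ranks`.
    """
    target = ranks[state] - 1
    # Invert the table so a rank number points back to its state.
    by_rank = {rank: name for name, rank in ranks.items()}
    return by_rank.get(target)


print(next_best_state("North Dakota", overall_rank))
```

With the table in this form the lookup is a single step, which is the kind of question the map makes the reader answer by scanning every state.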

What I find most disappointing about the Tax Foundation’s visualization is that the report itself is very well done and informative. Skimming the internet for similar visualizations, I found the Pew Charitable Trusts’ and Wallet Hub’s maps of state tax data.

The Pew Charitable Trusts’ map has more interesting bins by which states are colored, and Wallet Hub’s visualization delivers a heatmap; both maps work interactively, showing the individual state ranking when the cursor is placed over the state. While the Pew and Wallet Hub reports cover different domains of data, they show that a unique perspective on visualizing state tax rankings is possible. By comparison, the Tax Foundation’s visualization falls short, as it does not offer any new perspective on the table of data immediately below it but rather obscures those same results for the sake of “eye candy.”

Given that the Tax Foundation’s report is high quality, I believe there is room for hope that their visualization can be improved. Moreover, I personally think that the table of results below the visualization is well organized and sufficiently summarizes their findings. That said, I would layer another data set on top of that data to make the visualization illuminating. For example, the map could be color coded to indicate the states to which the most businesses relocated due to tax incentives, with the original report rankings either printed numerically as they are now or shown interactively as in the Pew map. This layering of data in the visualization would build upon the table and create a convincing argument as to why a state may want to change its tax code and which state’s tax code it should model its own after. Such a visualization would give state tax policy makers a clearer roadmap to economic success.

Resources:

<https://taxfoundation.org/2017-state-business-tax-climate-index-released-today/>

http://www.pewtrusts.org/en/multimedia/data-visualizations/2014/fiscal-50#ind0

https://wallethub.com/edu/best-worst-states-to-be-a-taxpayer/2416/

Belief in Evolution Vs National Wealth

Akshar Takle

From Calamities of Nature comes this bizarre graph relating national wealth (gross domestic product) and belief in evolution, with each dot representing a country (countries in the same region share the same dot color).

X-axis: GDP per capita

Y-axis: Share of people who believe in the theory of evolution

Share of people that thinks that humans have evolved from other animals vs GDP per capita of the country. pic.twitter.com/QZu80BkBlT

— Max Roser (@MaxCRoser) April 22, 2017

But this “enlightening” graph is probably misleading us. The visualization lacks a story, and hence a clear motive or goal.

What does the chart convey? Does being rich make you believe in evolution theory?

To be convincing, the relationship would need to suggest that countries that are wealthier, and whose inhabitants are doing better, have less impetus to be religious and hence reject evolution less. The missing links that could form a story would be: GDP per capita -> percentage of people who are educated -> percentage of people who follow scientific discoveries and evidence rather than religious beliefs -> belief in evolution.

We are not even talking about the elephant in the room: whether a person’s religion or dogma accepts or condemns evolution. There can be a sizable number of people who are poor and not well educated and who still believe in evolution because their religion or dogma has nothing to say about it.

The graph does not show a robust picture. Most of the countries shown in the graph are ones where the Abrahamic religions predominate. What about the other countries? There are more than 100 countries with GDP per capita below $10,000. It would be interesting to see how the relationship applies to the rest of the world.

There are also some doubts about the data, specifically how it was collected. What was the sample like? Which age groups were included? Younger people would likely be more accepting of and open to evolution than the previous generation. Also, studies conducted by Pew Research and NRK show that 60% of Americans and 80% of Norwegians believe in evolution. That places the USA (which is currently an outlier) between Sweden and the Netherlands. A poll conducted by global research company Ipsos for Reuters News finds that four in ten (41%) identify as “‘evolutionists’ and believe that human beings were in fact created over a long period of time of evolution growing into fully formed human beings they are today from lower species such as apes.” Three in ten (28%) global citizens refer to themselves as “creationists and believe that human beings were in fact created by a spiritual force such as the God they believe in and do not believe that the origin of man came from evolving from other species such as apes.” Almost one third (31%) of the global population indicate they “simply don’t know what to believe and sometimes agree or disagree with theories and ideas put forward by both creationists and evolutionists.” This makes us seriously question the data itself.

While the correlation is really interesting and fun, it doesn’t really get to the point. Many of us would love to take away that accepting evolution would make us rich, but the graph gives no grounds for that conclusion.

References:

http://www.calamitiesofnature.com/archive/?c=559

http://www.pewforum.org/2013/12/30/publics-views-on-human-evolution/

http://www.ipsos-na.com/news-polls/pressrelease.aspx?id=5217

Drink More Water, Save Some Money

Introduction

Three California cities, San Francisco, Oakland, and Albany, debated last year whether to pass a penny-per-ounce tax on sugary drinks (an update: in November 2016, the proposition passed in all three cities). The tax would affect various groups, including consumers (higher income vs. lower income), beverage companies, and the government. The visualization below was done by The Pew Charitable Trusts and studies the percentage change in sales of sugary drinks and water in Berkeley (where a soda tax was imposed previously) versus SF/Oakland.

Figure: Soda % change in comparison to last year, Berkeley vs. Oakland/SF

Impression

One of the biggest principles that I have learned thus far in the class is that there is always an argument behind what you want to achieve with the data, and that different audiences call for different objectives.

I believe this organization (The Pew Charitable Trusts) took a pro-soda-tax stance. Knowing the organization’s perspective on this issue, this visualization is fairly effective at conveying its belief. The chart clearly shows that in the five months after Berkeley passed the soda tax, sales of sugary drinks decreased compared to the same period in the previous year. The only drink that experienced growth in sales is water.

Improvement

In my opinion, this graph conveys its message effectively in general. However, we could incorporate other aspects of the data set to target the needs of two groups: the beverage companies and the government.

Beverage Companies:

The article pointed out that the soda tax would have a greater impact on lower-income households, because price tends to be the deciding factor in which product to buy. As a matter of fact, Berkeley saw a 21% drop in sugary drink consumption in the month after the tax was implemented.

Another concept we learned in the class is that the choice does not always have to be 0 or 1 (e.g., global warming or not, pass the soda tax proposition or not). In this case, beverage companies could focus not only on opposing the proposition but also on how to lose the least from it.

From Berkeley’s stats, we know lower-income households’ consumption might decrease significantly after the soda tax. I would revise the graph to analyze data from each store and identify the stores located in lower-income neighborhoods. We could then compare historical sales in those stores and come up with strategies accordingly.

Government:

The purpose of the tax was to increase government income. However, with people shifting to buying water (which is not taxed), the government might not get much out of this proposition. Therefore, for the government, the analysis could be a prediction of SF/Oakland’s tax revenue using Berkeley’s historical performance as a baseline.
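As a rough sketch of that baseline calculation, the Python below projects tax revenue from the penny-per-ounce rate and the 21% consumption drop reported for Berkeley. The baseline volume of 1,000,000 ounces is a made-up placeholder, not a real sales figure, and the function name `projected_revenue` is hypothetical. The sketch also assumes SF/Oakland consumers would respond the way Berkeley’s did, which may not hold.

```python
TAX_PER_OUNCE = 0.01   # penny-per-ounce tax, in dollars
BERKELEY_DROP = 0.21   # drop in sugary drink consumption reported for Berkeley


def projected_revenue(baseline_ounces: float, drop: float = BERKELEY_DROP) -> float:
    """Estimate tax revenue after consumption falls by `drop` (a fraction from 0 to 1)."""
    taxed_ounces = baseline_ounces * (1 - drop)
    return taxed_ounces * TAX_PER_OUNCE


# Hypothetical pre-tax volume in ounces, purely for illustration.
baseline = 1_000_000
no_response = projected_revenue(baseline, drop=0.0)
with_response = projected_revenue(baseline)

print(f"Revenue if behavior does not change: ${no_response:,.2f}")
print(f"Revenue with a Berkeley-sized drop:  ${with_response:,.2f}")
```

The gap between the two figures is the revenue the government would give up as consumers shift to untaxed water, which is the quantity the projection is meant to surface.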

References

http://www.pewtrusts.org/en/research-and-analysis/blogs/stateline/2016/10/17/sparring-over-soda-tax-cities-set-referendums

https://ballotpedia.org/San_Francisco,_California,_Soda_and_Sugary_Beverages_Tax,_Proposition_V_(November_2016)