Facebook Friendships

This visualization is both interesting and aesthetically strong. As discussed in class last week, it covers most of the important aesthetic concepts: getting it right in black and white (almost), no unjustified 3D, and resolution over immersion.

Facebook worldwide friendships mapped

However, it is not as simple as it looks. A lot of background analysis went into preparing this visualization. Let’s see how to decipher it.

Firstly, weights were defined for each pair of cities as a function of the distance between them and the number of friends between them. Then the cities were connected with lines based on the number of friendships. The city pairs with the most friendships between them were drawn on top of the others. The color ramp has been used beautifully so that each line is colored according to its weight, which means that the stronger the connection, the more visually prominent the line.
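The weighting and layering steps described above can be sketched roughly as follows. This is only a minimal illustration: the formula in pair_weight, the linear ramp in ramp_intensity, the use of great-circle distance and the example city data are all assumptions made here, since the exact function behind the visualization is not given.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def pair_weight(friends, distance_km):
    # Hypothetical formula (an assumption, not the original one):
    # more friendships raise the weight, longer distance lowers it.
    return friends / (1.0 + distance_km)

def drawing_order(pairs):
    """Sort (city_a, city_b, friends) tuples weakest-first, so that the
    strongest connections are drawn last and end up on top."""
    return sorted(pairs, key=lambda p: pair_weight(p[2], haversine_km(p[0], p[1])))

def ramp_intensity(weight, max_weight):
    """Map a weight to a 0..1 brightness on a simple linear color ramp."""
    return weight / max_weight if max_weight > 0 else 0.0

# Made-up example data: approximate (lat, lon) coordinates and friendship counts.
london, paris, sydney = (51.5, -0.1), (48.9, 2.4), (-33.9, 151.2)
pairs = [(london, paris, 500), (london, sydney, 500)]
order = drawing_order(pairs)  # the London-Paris pair comes last, so it is drawn on top
```

With equal friendship counts, the nearer pair gets the higher weight under this assumed formula, so it would be drawn later and in a brighter shade.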

However, there are some fundamental problems with this visualization. Firstly, there is no legend or text explaining what the visualization is about. The audience needs some way to know what the color, thickness and degree of shading of the connecting lines are meant to convey. Secondly, a few areas on the map show no lines and are dark. This may be because Facebook has not reached those locations, because usage is low in those countries, or because data is unavailable for them; the infographic does not make this clear.

The visualization could be improved by making it more interactive. A highly visual dashboard like this should let the audience perform basic analytical tasks such as drilling down and examining the underlying data. For example, if one wants to zoom in and see the number of friendships within a country or with another particular country, one should be able to do that.

Immigration and banned countries

Recently President Trump issued an executive order to ban immigrants from seven countries. This visualization is simple yet powerful in conveying how the order will affect immigrants who are already living in the USA, along with their education, salaries, etc.

Demographics for immigrants from banned countries

The immigrants from these seven countries constitute about 2% of the total population of the USA. The dashboard shows the percentage of these immigrants at each level of education and compares it to the US national average. It can be seen that the share of immigrants from Iran, Libya and Syria with advanced degrees is higher than the US national average.

Further analysis shows that residents from Iran and Syria are more likely than the general population to be engineers, managers and teachers. These immigrants are also scattered across almost every state. While the US median salary for such jobs is $54,645 pa, the salary of Iranians in the same job bracket is over $65,000.

The dashboard also shows that the figures for Iranian residents are higher than those for the other six banned countries. This is because the number of immigrants from Iran grew from the 1980s to 2010: the higher the number of immigrants, the higher the absolute number of managers, engineers and people holding blue collar jobs will be. As discussed in lectures, in this case expressing the figures as a ‘percentage’ or ‘average’ is a better representation of these statistics.
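A small worked example shows why percentages are the fairer comparison here. The group names and counts below are invented purely for illustration and are not taken from the dashboard.

```python
def share(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total

# Hypothetical figures: (total immigrants, number working as engineers).
groups = {
    "Country A": (300_000, 60_000),
    "Country B": (50_000, 12_000),
}

for name, (total, engineers) in groups.items():
    print(f"{name}: {engineers} engineers, {share(engineers, total):.1f}% of the group")
```

Country A has five times as many engineers in absolute terms, yet Country B has the higher share, which is the comparison that absolute counts hide.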

Further, the share of immigrants who are now citizens has been appropriately depicted in percentages; most immigrants have now become residents of the United States. About 10,000 of these immigrants have also served in the US Army. The residents are also scattered geographically, with no specific area of concentration.

As per news reports from the NY Times, more than 856,000 people have been affected by this ban, but only 3 of the countries were known to be linked to violent attacks since 2001. Most of the accused have been from countries not listed in the ban, and many were born in the United States.

We will have to wait and see how the ruling will actually affect immigrants, visa holders and permanent residents.

Real time Web Monitor

Today’s blog is about real-time information on web traffic and web attacks worldwide.

This monitoring is performed by a company called Akamai. It constantly monitors internet conditions worldwide on these two parameters and presents them on a graph in real time.

https://www.akamai.com/us/en/solutions/intelligent-platform/visualizing-akamai/real-time-web-monitor.jsp

These two graphs serve the following purposes:

  • Monitoring greatest web traffic
  • Cities with the slowest web connections, also known as latency
  • Geographic areas with the most web traffic, also known as traffic density

This visualization is interactive, and one can look at the network traffic and attacks by country.

Analyzing this visualization, one can observe that the highest network traffic is in the UK and the rest of Europe. However, the maximum number of attacks is in California, with an average of 1,423,212 attacks per 24 hours.

However, it seems that this monitoring tool focuses on only certain areas and does not provide a comprehensive overview of the attacks in Canada or in South American and African countries. Further, no information is available for network traffic in the Indian subcontinent. This does not mean that there is no network traffic in those areas; it means that comprehensive data is not available for all countries.

Simple but Misleading

This graph shows the valuation of Facebook, Inc. Though it is a simple bar graph, you can identify many problems in this visual.

http://blogs-images.forbes.com/naomirobbins/files/2011/11/press-005-021.jpg

The very first thing you will notice is the truncated vertical axis. Normally we judge the value of each bar in a bar graph by its length. Here the second bar appears to be twice as high as the first bar, so one could wrongly conclude that the valuation doubled from December to January. Every bar graph needs a zero on its scale.
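The effect of a truncated axis can be checked with simple arithmetic. The sketch below uses made-up values (50 and 60, with the axis starting at 40) rather than the actual figures from the Forbes graph, and compares the ratio the eye sees with the true ratio.

```python
def apparent_ratio(first, second, baseline):
    """Ratio of the drawn bar heights when the axis starts at `baseline`."""
    return (second - baseline) / (first - baseline)

# Hypothetical valuations in billions of dollars, not read from the graph.
first, second = 50, 60

truncated = apparent_ratio(first, second, baseline=40)  # axis starts at 40
honest = apparent_ratio(first, second, baseline=0)      # axis starts at zero

print(f"Truncated axis: second bar looks {truncated:.1f}x the first")
print(f"Zero-based axis: second bar is {honest:.1f}x the first")
```

In this invented case a 20% increase is drawn as a doubling, which is the kind of distortion a zero baseline avoids.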

Second, the horizontal axis is evenly spaced, but the dates are not, which confuses the reader. There is one bar for December, one for January, none for February, two for March, and so on. Therefore, the trend that results from following the tops of the bars is distorted. In particular, the high valuation of $84 billion appears to hold for a long period, when in fact the total time at this value was less than a month (June 22 to July 19).

The last problem is the excessive use of the dollar sign on the vertical axis and on the data labels. Instead of showing it twenty times in the graph, the designers could have stated the scale once as ‘dollar units’. This is not a serious problem on its own, but together these issues make the graph misleading.

Source: http://www.forbes.com/sites/naomirobbins/2011/11/17/whats-wrong-with-this-graph/#7acf99f199d4

 

Should we trust what we see?

The following graph is from Bloomberg (2013), which for many is a trusted source. Unfortunately, even this trusted source has misused the power of statistics to mislead people.

Looking at this graph, an ordinary reader would be highly concerned by the slope depicting a sharp decline in median income for U.S. men, but in truth there are more flaws in the graph than in the fact depicted.

The first flaw is incomplete information. The designer has shown only 2 data points and nothing about what happened in the intervening years. On investigating the U.S. Census data further, one can see that median income was actually stable between 1972 and 1999, which is contrary to what the designer has depicted. Also, for ages 45-54 there was actually an increase in median income until 2000, and only after that did income decline.

The second flaw is with the y-axis. The designer has deliberately truncated the y-axis so as to magnify the gap. If the same graph is redrawn with the y-axis starting from zero, the decline looks much smaller and our perspective on the problem changes.

Lastly, on investigating the data further, we find that from 1947 to 1972 there was a steady increase in median income, and since 1972 (end of the Gold standard) there has been a slow decline. The designer has deliberately chosen 1972 and 2012 to catch the attention of readers. The same news could be changed to “Income for men has risen” by using 1947 and 2012 as the data points.
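The effect of choosing the endpoints can be illustrated with a tiny calculation. The index values below are invented to mimic the shape described (a rise until 1972, then a slow decline); they are not the actual Census figures.

```python
def percent_change(series, start_year, end_year):
    """Percentage change in the series between two chosen years."""
    start, end = series[start_year], series[end_year]
    return 100.0 * (end - start) / start

# Hypothetical median income index (1972 = 100), for illustration only.
income_index = {1947: 60, 1972: 100, 2012: 90}

decline = percent_change(income_index, 1972, 2012)
rise = percent_change(income_index, 1947, 2012)

print(f"1972 to 2012: {decline:+.0f}%")
print(f"1947 to 2012: {rise:+.0f}%")
```

The same series supports a headline of falling income or rising income depending only on which starting year is picked.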

References:
Image & Article Source: https://www.bloomberg.com/news/articles/2013-12-31/for-u-s-men-40-years-of-falling-income
Other Source: https://medium.com/i-data/misleading-with-statistics-c63780efa928#.qaw475rwg

Interactive graphs

Interactive graphs are very useful for interpreting the relative sizes and proportions of different subdivisions, because they enable users to explore the data themselves.

To take an example, figure 1 is Google’s work displaying a variety of music genres waxing and waning in popularity over the recent 5 years. It is based on user data collected from Google Play Music about the number and genres of items in users’ libraries. The genres covered range from folk and country to R&B and rock. Every genre is represented by a stripe. What is interesting is that the user can click on a stripe to get a deeper insight into that single genre.

6-music_timeline

The combination of data and timeline is worth noticing. Similarly, Twitter’s engineers are good at this. The graph depicts the hottest keywords on Twitter during President Obama’s 2014 State of the Union. Very powerful.

7-SOTU_2014

reference: http://blog.udacity.com/2015/01/15-data-visualizations-will-blow-mind.html