Simple is not always better!

https://www.census.gov/dataviz/visualizations/035/

Analysis

The Description

This graph explores variations in high school educational attainment within selected race and Hispanic-origin groups, by gender and nativity, across regions of the US.

Purpose of the visualization

Compare and Contrast: Attainment of a high school diploma (or equivalent level of education) is generally very high in the U.S., so this graph focuses on the percentage of the population 25 and older who do not have a high school education.

What’s good?

  • Clear and concise heading and legends, and no unnecessary embellishments.

What’s Not-so-good?

Aesthetics:

  • No consistent color palette is used.
  • The visualization is static and shows a lot of information but does not highlight any insight or actionable information. Basic annotations and highlights would help drive the message home.
  • There is a lot of unused white space.

Information:

  • Several additional points are not readily viewable in the graph but become visible once the data is presented in a more detail-oriented format. For example, there are notable differences between the foreign-born and native populations in many groups. In the West, 57 percent of Hispanic foreign-born males had less than a high school education compared with 19 percent of Hispanic native-born males. Nineteen percent of Asian foreign-born females had less than a high school diploma compared with 5 percent of Asian native-born females. However, this is not easily understood from the visual. In short, this visualization needs a fair amount of additional detail to achieve the intended purpose of the data.
  • The information is spread out, which makes it hard to compare and contrast different classifications.

A better Version:

 

http://i32.photobucket.com/albums/d27/nsrivastava/Blog4_zpsqo6uvgfr.png

Cleaning up the data to create a new category that combines gender and nativity makes it possible to merge the charts into a single graph, which presents a consolidated view of the data.
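The clean-up step described above can be sketched in a few lines of Python. This is only an illustration, not the workflow used to build the chart: the field names (`group`, `gender`, `nativity`, `pct_no_hs`) and the helper names are assumptions, and the four rows are the percentages quoted earlier in this post (region omitted for brevity).

```python
# Illustrative sketch: field and function names are assumptions, not the
# original workbook's. Percentages are the ones quoted earlier in the post.
records = [
    {"group": "Hispanic", "gender": "Male", "nativity": "Foreign born", "pct_no_hs": 57},
    {"group": "Hispanic", "gender": "Male", "nativity": "Native born", "pct_no_hs": 19},
    {"group": "Asian", "gender": "Female", "nativity": "Foreign born", "pct_no_hs": 19},
    {"group": "Asian", "gender": "Female", "nativity": "Native born", "pct_no_hs": 5},
]


def add_category(rows):
    """Return copies of the rows with gender and nativity merged into one label."""
    return [
        {**row, "category": f"{row['nativity']} {row['gender'].lower()}s"}
        for row in rows
    ]


def to_series(rows):
    """Pivot rows into {category: {group: pct}} so one chart can hold them all."""
    series = {}
    for row in rows:
        series.setdefault(row["category"], {})[row["group"]] = row["pct_no_hs"]
    return series


combined = add_category(records)
series = to_series(combined)
for category, values in series.items():
    print(category, values)
```

With a single `category` field, each combination of gender and nativity becomes one series on a shared axis instead of a separate chart.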

Using filters, many more insights can be gained than the original visual allows. A few such insights are:

  1. Foreign-born females and males in the West region have the highest percentage of people without a high school education.
  2. Among the native born, the Northeast lags behind, with the highest percentage of the population without a high school education.
  3. Overall, there is a difference in the average percentage of people without a high school education between the native-born and foreign-born populations.

What can be improved further?

Spatial Context

Depending on the information to be conveyed, it makes sense to use a map when the visualization depicts geographical information. Visual cues are easy to read and understand. Since the data is for US regions, showing this information on a US map divided into 4 regions helps the audience connect and identify with the information.

Icons, shapes, and symbols

A picture is worth a thousand words. The use of icons, shapes, and/or symbols can improve a visualization’s readability and also help capture the attention of the audience. There’s a thin line between graphics that enhance a data visualization and junk, but when done tastefully, graphics have the ability to provide much more information than words alone.

Symbols and icons not only make the visualization more engaging, but they also provide the advantage of reducing, and often eliminating, language barriers. In the above visualization, using the universal symbols for male and female can help even those with language difficulties identify and compare the percentages for males and females.

Conclusion

The best type of visualization for the piece of information or data set being presented should be chosen carefully. While ease of understanding should always be a consideration, it is also extremely important that the visual conveys all the relevant information and provides the gist of what the underlying data is trying to showcase.

References:

https://extension.org/2017/04/11/7-elements-of-good-data-visualization/

 

 

 

 

 

Visual Problem Solving

 

https://www.globaldatavault.com/blog/information-destruction-history/

Analysis

Let us analyze the above visualization against the objective and the subjective dimensions of visual problem solving.

OBJECTIVE DIMENSION:

What is the purpose of the visualization?

This visualization was created by a company that offers digital backup and disaster recovery services. It shows the significant information losses suffered by human civilization throughout history. The intended purpose is to convey to the audience the need to protect information against loss from disasters such as wars, floods, fire etc.

Who is the audience?

The audience for this visualization would be the service provider’s target customer base, which could be any large corporation that stores or possesses a huge amount of data/information.

How will the visualization help the audience?

The visualization is meant to emphasize that disasters are a big threat to data and that having a backup and data recovery plan is important. The service provider aims to use it to get the attention of potential customers and interest them in its offerings. However, the visualization fails to achieve this purpose. It works only as a simple representation of certain facts and as a way of increasing general knowledge; it is not relevant in the context of the current digital age. If you don’t offer users the right context, they can’t do anything with a data visualization. The way information is stored in today’s digital world is entirely different from the early days, when libraries were the only means of storing and accessing information. With the Internet, cloud technology and everything virtual, war and fire are no longer the biggest threats to information. Today’s data is vulnerable to being stolen, destroyed or compromised by disgruntled employees, competitors, terrorists, criminals and malicious hackers, and the above visualization does not show any of these aspects.

SUBJECTIVE DIMENSION

Is it Truthful? No. There is always a certain amount of subjectivity in any visualization, as one chooses what data to show and how to show it. By focusing on one part of the data, one might inadvertently obscure another. The above visualization presents the destruction of libraries (the main source of information/data storage) and the corresponding loss of data across major cities from 600 BC to 2013. Some questions can be asked about what is shown and what is not: Was information truly lost in those fires? What about copies of the destroyed books that were kept elsewhere? Are these the only major incidents of data destruction due to disasters? What about the loss of information due to other disasters such as earthquakes, floods etc.?

Is it Functional? No. The visual looks cluttered and busy. One major flaw that keeps it from being functional is that the title, “Information destruction through History”, leads one to expect a visual that shows progression over time, whereas the visual displays a world map, which just adds to the confusion. The lines connecting each location on the map to the corresponding information also create a clumsy look.

Is it Beautiful? Yes and no. At first glance, the visualization looks interesting and may capture the attention of the “corporate audience”, but it can also backfire as it may not look serious enough. The symbol used to show destruction by fire is clearly understood. However, the symbol used for “bombing” looks more like a torch, which again can represent fire. ‘Aesthetics’ depend on the specific audience at whom the visual is targeted, and their preferences should always be kept in mind while choosing the look and feel of the visualization.

Is it Insightful? No. The important criterion for a visualization is whether through its use we can see something that would have been harder to see otherwise or that could not have been seen at all. A simple representation in the form of numbers could have provided the same insight: that a lot of historical data was lost during and because of wars.

Is it enlightening? No. The visualization does not help answer any specific question, nor does it unearth any new information that could not have been found had the data not been presented the way it is depicted in the above visualization.

Conclusion

Good data visualization should enable decision makers to grasp difficult concepts or identify new patterns. There are many ways to visualize data; new tools and chart types appear constantly, and each strives to create more attractive and informative charts than before. However, the key to making an effective visualization is to focus on the principle that it should clarify and summarize the main message rather than confuse and overload the reader with superfluous information.

 

References:

http://www.datapine.com/blog/misleading-data-visualization-examples/#

https://flowingdata.com/2011/09/23/5-misconceptions-about-visualization/

https://www.elsevier.com/connect/a-5-step-guide-to-data-visualization

From Clutter to Clarity…

Visualization is a powerful tool that can help tell a story, simplify a complicated data set and make it easy to identify the patterns behind those complicated numbers through visual representation. However, if not used judiciously, it can very easily overcomplicate simple things. Visualization is a means to an end and not an end in itself. The goal is not to create a stunning visualization but to create one that conveys the intended meaning in an effective way. The key word is: “EFFECTIVE”

Designing an effective data visualization comes down to a lot of small details that can be the difference between an effective and a lousy visualization. Attention to detail, identifying and understanding your audience, and making sure the various elements are aligned and consistent are some such details.

http://s32.photobucket.com/user/nsrivastava/media/Blog2_image_zpsypuxv39s.png.html

Figure 1: Viz_1

The above example perfectly illustrates what it means to overuse the visualization tools at one’s disposal. The image intends to show the amount of paid paternity leave guaranteed to people in a given set of countries relative to the US, which has none.

However, there are multiple issues with the above visualization which make it ineffective. Let’s examine the following three major issues:

  1. Clutter

Clutter means overcomplicating things when there is no need. In the above example, the different-sized pie chart pieces do not add any value to the visualization, as they do not provide any insight that is not otherwise available. They just add to the confusion, diverting the audience’s attention.

  2. Color

Color can be used in a number of ways to convey a point, provide emphasis or compare and contrast different data points. It can also be employed to direct your audience’s eyes to where you want them to go. Color should be used strategically to drive your point home and not simply to beautify the visualization. In the above example, the color complicates things instead of making them simpler. At first glance, the orange color representing Australia, Venezuela, Kenya and Denmark makes them look like a single country. If one goes by color, it looks like there are only 6 countries being compared.

  3. Consistency

The unit used for the guaranteed paternity leave switches between days and weeks. For half the countries the number represents weeks, and for the rest it represents days. This inconsistency can lead to confusion. For example, what does the zero in the center of the circle with the map of the US indicate? Is it 0 weeks or 0 days? Representing the US as a circle in the center while the rest of the countries are represented as pie chart pieces is also inconsistent. It may confuse the audience into thinking that the US is not a country or that it is in some way different from the rest of the countries. However, all that the visualization intends is to represent the numbers relative to the US.

The following is an attempt to solve the above issues with a different interpretation of the same data.
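The unit problem is straightforward to fix before charting. The sketch below is a minimal illustration, not the method used for the redesign: `to_days` and `DAYS_PER_UNIT` are assumed names, and “Country A” is a hypothetical entry included only to show a weeks-to-days conversion. The US and Iceland values are the ones stated in this post.

```python
# Illustrative sketch: the function and constant names are assumptions.
DAYS_PER_UNIT = {"days": 1, "weeks": 7}


def to_days(value, unit):
    """Convert a leave length to days so every country uses the same unit."""
    try:
        return value * DAYS_PER_UNIT[unit]
    except KeyError:
        raise ValueError(f"unknown unit: {unit!r}") from None


# (label, value, unit); "Country A" is a hypothetical entry, not real data.
raw = [("US", 0, "days"), ("Country A", 2, "weeks"), ("Iceland", 90, "days")]
leave_days = {label: to_days(value, unit) for label, value, unit in raw}
print(leave_days)
```

Once every value is in days, a zero is unambiguous and the figures can be compared directly.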

http://i32.photobucket.com/albums/d27/nsrivastava/Blog2_img2_zpswgfbq5ip.png

Figure 2: Viz_2

This second visualization better represents the data for the following reasons:

  1. A distinct color for each country and a clear legend allow the audience to clearly distinguish each country.
  2. The sorted bar graph clearly shows the US at the lowest level, with 0 days of guaranteed paid paternity leave, and Iceland leading the pack with the highest amount of paid leave at 90 days.
  3. Paid paternity leave is represented in number of days for all the countries, so the information is consistent across the visualization and easy to compare.
  4. The numbers on the bars clearly indicate the actual figures, leaving no room for ambiguity or confusion.
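The sorting and labeling described in the list above can be sketched as a plain-text bar chart. This is an assumption-laden illustration rather than the tool used for Figure 2: `sorted_bars` is a made-up helper, the US (0) and Iceland (90) values come from the text, and “Country A” is a hypothetical placeholder.

```python
# Illustrative sketch: names are assumptions. The US (0) and Iceland (90)
# values come from the text; "Country A" is a hypothetical placeholder.
leave_days = {"Iceland": 90, "US": 0, "Country A": 14}


def sorted_bars(data, width=30):
    """Return one text bar per country, longest first, each labeled with its value."""
    longest = max(data.values()) or 1  # avoid dividing by zero if all are 0
    name_width = max(len(name) for name in data)
    lines = []
    for name, days in sorted(data.items(), key=lambda item: item[1], reverse=True):
        bar = "#" * round(days / longest * width)
        lines.append(f"{name:<{name_width}} | {bar} {days} days")
    return lines


for line in sorted_bars(leave_days):
    print(line)
```

Sorting puts the extremes at the ends of the chart, and printing the value next to each bar removes the need to read it off an axis.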

 

 

References:

https://icharts.net/blog/data-expert-spotlight/data-visualization-essential-info-industry-thought-leader

http://viz.wtf/image/158594346945

http://www.flexmanage.com/2017/03/15/5-ways-for-powerful-data-visualizations/

 

Data Visualization: A HIT OR A MISS!

Data visualization allows us all to see and understand our data more deeply. That understanding breeds good decisions. It can be a great way to drive numbers home and give them a visual weight mere statistics don’t have. At least, that’s what happens when the visuals make sense. However, sometimes visualizations may look good but are simply unnecessary and miss the point completely. As an example, consider the following visualization from a Washington Post article, showing 100 years of hurricanes hitting and missing Florida.

https://www.washingtonpost.com/graphics/national/one-hundred-years-of-hurricanes/

The above visual aims to depict every single hurricane that hit or missed Florida over a period of 100 years. Each line in the visual represents a hurricane. However, it is unclear what the visual is trying to achieve, as it doesn’t show the number of storms that missed or hit Florida over the past 100 years. Let’s analyze the visual on two main visualization criteria that it completely fails on:

CLAIM: All visualizations must answer a question, make a claim or provide some insight that wasn’t available or accessible without the visual representation. The article using the above visualization claims that Florida is the landmass of choice for storms. However, the visual doesn’t provide any support for the claim. It doesn’t show the number of storms that have hit or missed Florida, nor any correlation with time or location.
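A simple tally would be enough to support or refute such a claim. The sketch below shows one way to count hits and misses per decade. The records are entirely hypothetical (they are not real storm data), years are expressed as offsets within the 100-year window to avoid implying real events, and `tally_by_decade` is an assumed name.

```python
from collections import Counter

# Hypothetical records for illustration only: (year_offset, hit_florida),
# where year_offset runs from 0 to 99 within the 100-year window.
# These are NOT real storm data; a real analysis would load actual tracks.
storms = [(3, True), (7, False), (12, True), (15, True), (48, False)]


def tally_by_decade(records):
    """Count hits and misses per decade from (year_offset, hit) pairs."""
    counts = Counter()
    for offset, hit in records:
        decade = offset // 10
        counts[(decade, "hit" if hit else "miss")] += 1
    return counts


counts = tally_by_decade(storms)
for (decade, outcome), n in sorted(counts.items()):
    print(f"decade {decade} {outcome}: {n}")
```

A table or bar chart built from counts like these would show both the totals and any trend over time, which the overlapping lines cannot.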

VISUAL AESTHETICS: From a cursory look, it simply looks like a child was let loose with a pen in his hand and told to have a go at it. “Florida”, which is the center of the discussion, is not even visible, with the white base and white lines demarcating Florida and its neighbors. The lines depicting the path of each hurricane all overlap and do not provide any helpful information for predicting the path of future hurricanes. The darkened line depicts the latest

A Better Depiction… “Tracking the Paths”

http://pparker.org/hurricanes/hurricane_history.htm

The above visualization depicts similar information about storms that have hit Florida over a period of time and the paths they followed. Despite being visually unappealing, it is still much better than the previous one, as it provides useful information that can be acted upon to make certain decisions. The visual clearly represents the year of each storm (category 3 and above) and the path it took. The highlighted region in the center of the map depicts the counties most affected by the storms and thus provides useful information. For example, while developing evacuation plans, the highlighted counties can be prioritized.

In conclusion, while it is important for a visualization to be appealing, with pretty colors, fancy charts, and cool pictures, in order to capture the interest of its audience, a visualization that doesn’t give quick insights that aid decision making is not really effective and defeats its very purpose.