Why does visual exploration mostly need experts to create visualizations?

Visual exploration deals with open-ended, data-driven visualizations and usually needs experts such as data scientists and business intelligence analysts, although new tools have begun to engage general managers in it as well. It’s exciting to try because it often produces insights that can’t be gleaned any other way.

Because we don’t know what we are looking for during this exploration, these visuals tend to plot data more inclusively. In extreme cases, this kind of project may combine multiple data sets or load dynamic, real-time data into a system that updates automatically. Statistical modeling also benefits from visual exploration.

Exploration also lends itself to interactivity: managers can adjust parameters, inject new data sources, and continually revisualize. Complex data sometimes also suits specialized and unusual visualizations, such as force-directed diagrams that show how networks cluster, or topographical plots.

Analytical, programming, data management, and business intelligence skills are more crucial here than the ability to create presentable charts. Managers need people with these skills to help set up systems that wrangle data and create visualizations that fit their analytic goals, which is why visual exploration mostly needs experts to create the visualizations.

Source: https://hbr.org/2016/06/visualizations-that-really-work

International Animal Trade

This visualization reveals the secrets of animal trading around the world in 2013. It was created by National Geographic and Fathom Information Design, which believe the visualization is useful for researchers and could help policy makers view animal trade in a different light.

The visualization uses packed circles, an idiom similar to the classic bubble chart, with the difference that it is interactive. The data is shown hierarchically, with large animal groups, such as birds and mammals, shown initially. Clicking on a circle drills down to specific species.

The graph uses only one mark and three channels. Each mark represents a specific species and is surrounded by a larger mark that represents a group of species. The size channel encodes the volume of trade, color differentiates the animal species, and hue represents the purpose of trade. However, one important piece of information, the rate of change of trade, is not encoded here. I hope that element will show up in the next version, which could provide the audience with more insights.
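
A note on the size channel: when circle size encodes a quantity, the usual convention is to make the circle’s area, not its radius, proportional to the value, so the radius grows with the square root of the value. The Python sketch below illustrates that convention only. Whether the National Geographic graphic scales its circles this way is an assumption, and the function name and the trade volumes are hypothetical.

```python
import math


def radius_for_value(value, max_value, max_radius=50.0):
    """Return a radius such that the circle's AREA is proportional to value."""
    if value < 0 or max_value <= 0:
        raise ValueError("value must be >= 0 and max_value must be > 0")
    # Area grows with radius squared, so take the square root of the ratio.
    return max_radius * math.sqrt(value / max_value)


# Hypothetical trade volumes, invented purely for illustration.
volumes = {"group A": 400, "group B": 100, "group C": 25}
largest = max(volumes.values())
radii = {name: radius_for_value(v, largest) for name, v in volumes.items()}

for name, r in radii.items():
    print(f"{name}: volume={volumes[name]}, radius={r:.1f}")
```

With this scaling, a group that trades a quarter of the largest volume gets half the radius, so its circle covers a quarter of the area.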

Reference:

http://news.nationalgeographic.com/2015/06/150615-data-points-infographic-animal-trade/

 

Pastries: Eat Them, but Don’t Use Them for Charts

Donut charts and pie charts are very similar, and some would even say they are the same type of chart. I am in the camp that says they are the same, so I approach donut charts the same way I approach pie charts: I do not use them.

The thing about donut charts and pie charts is that pie charts are actually the better of the two. Donut charts are basically pie charts with a hole in the middle. While this might not seem like a big deal, it reduces the amount of information you see and makes the chart even more confusing. For example, in the figure above, removing the center of each circle leaves only a thin ring to carry the information, so it becomes harder to see the size of each section and how the sections compare with each other. Of course, each section’s size is also shown as a number, but that could just as easily have been done in a bar chart. So, for future reference, stay away from all charts that have a name related to baked goods. It’s far more satisfying to eat them than to use them as a visualization tool.

Reference:

http://www.vizwiz.com/2012/06/donut-charts-are-worse-than-pie-charts.html

 

Telling Stories or Solving Problems

The data visualization landscape can be divided into two broad categories:

Hedonistic Visualization: It only shows how cool something is or presents “interesting to know” information. Here is an example:

https://fivethirtyeight.com/features/when-donald-trump-attacks-gop/

Narrative Visualization: It supports a narrative (often journalistic, sometimes scientific). Data journalism is all about telling a simple story in the most attractive way to entertain the audience. The problem is that there is a high chance of exaggeration in order to bring out an interesting story. While data journalism is a tough art, the other extreme, narrating a scientific story, is harder. To demonstrate the hypotheses they have generated, scientists face the challenge of knowing when to stop before they overload the audience with information. Do they have an option? Will the scientists be able to convince the audience if they condense their visualizations?

How about Problem Solving Visualizations?

This is a less commonly known category that is gaining importance. Singapore’s MRT Circle Line was hit by a spate of mysterious disruptions in recent months, causing much confusion and distress to thousands of commuters. Data scientists at GovTech’s Data Science Division in Singapore used visualization to discover the origin of this recurring problem. The article below discusses this interesting case:

https://blog.data.gov.sg/how-we-caught-the-circle-line-rogue-train-with-data-79405c86ab6a#.sebeshx7o

The important point to note here is that, for most problems in automated systems, solutions really need to come from human reasoning (in this case, through a series of visualizations). Hence, visualizations are very powerful problem-solving tools.

References:

http://fellinlovewithdata.com/

https://blog.data.gov.sg/how-we-caught-the-circle-line-rogue-train-with-data-79405c86ab6a#.sebeshx7o

When Trump Attacks!

 

Interactive Investor Dashboard

 

CrunchBase is one of the most widely used databases of technology companies, people, and investors. Financial companies and entrepreneurs use this dashboard to analyze investments, acquisitions, and start-ups in real time and to make the smartest investments possible.

Good points about the dashboard:

  1. The dashboard conveys all the important KPIs for its audience (investors) in a very simple manner.
  2. It serves its purpose of being interactive. Investors can filter the global CrunchBase data by country, category, type of investment, investment round, etc.
  3. It has a summary at the top of the dashboard, so the user gets an overall view of what is happening in the data. Here, for example, the summary shows important factors such as total money raised, the number of investments, the median amount raised, and the number of startups, along with their percentage growth, which gives an overview of the yearly change in the data.
  4. It uses one color legend for the entire dashboard.

Improvements:

  1. Instead of showing funding by round in a pie chart, the designer could use a different idiom, such as a bar chart, which makes it easier to compare funding across rounds.
  2. Instead of placing the color legend at the bottom, the designer could have placed it on the right-hand side, so that the user of the dashboard does not have to scroll every time they want to see which color represents which category.

 

Reference:  https://www.sisense.com/dashboard-examples/investor/

 

Most Popular Programming Languages

GitHub recently released a graph showing the changes in popularity of different programming languages on its platform from 2008 to 2015. GitHub is an online repository hosting service offering revision control and source code management on the web for its users. Many budding developers have been using GitHub over the years to host their project source code. Given GitHub’s popularity, various coding boot camps and academies are now using this graph to decide on their course curricula.

The graph helps us understand which languages have been on GitHub since 2008 and how their rankings have changed. We can see that Ruby and JavaScript have been among the top languages since 2008, with JavaScript still ruling the charts. We can also see that languages like Perl and Objective-C have dropped off the chart over the last few years, while CSS and C# have gained a place on it. Java has also gained a lot of popularity on GitHub, rising from 7th position in 2008 to become the second most popular language by 2015.

The graph, in my opinion, is very informative and can be a helpful resource for its audience. It is simple, easy to understand and very clear in its claim. Students who want to learn a programming language with the end goal of securing a good job can use this graph to choose amongst the top few languages.

Reference: https://mybroadband.co.za/news/software/136148-most-popular-programming-languages-on-github-2008-to-2015.html

https://www.fullstackacademy.com/blog/is-the-programming-language-taught-at-a-coding-bootcamp-important

Too Late to Start?

The average human life span is around 79-80 years, so what is the right age to begin a new venture and achieve success? While looking for the ages at which entrepreneurs started their companies and made it big, I found this very interesting visualization.

From the biographies of the top 100 founders on the Forbes list, the authors found that 35 is the most common age at which to start one of the top companies in the world. The result is a bell curve: just as most people in school get grades somewhere in the middle, most people in life succeed in mid-life, at about 35 for the current generation.

 

 

The above visualization is interactive and puts together all the right things that we need in an interactive viz. Clicking a circle highlights it and shows the age, the name, and the company started by the entrepreneur. The most impressive thing about the viz is the claim that it makes. It is the right way to target your audience and deliver your message, and it does everything right down to the last detail.

 

This was just an interesting find that I wanted to share with the class. With the quarter coming to an end, we have figured out most of the dos and don’ts for visualizations. This is one such viz that made me think of how far I have come from where we started.

 

http://fundersandfounders.com/too-late-to-start-life-crisis/

Data visualization and its analysis – Descriptive, Predictive and Prescriptive.

As data science students, all of us have heard the term predictive analysis, which is basically forecasting future data based on the trends and patterns of the past few years. As decision makers, stakeholders want to know what lies in store for the company’s future. Data visualizations of a company’s performance, market value, and stock prices are all indicators of what could happen next.

But there are two other dimensions to the analysis of any data or visualization. Both are seldom mentioned in the market, yet they have considerable importance in the field of data analytics. These are descriptive and prescriptive analytics.

Descriptive analytics uses data aggregation and data mining to dig into past data and understand “what exactly happened”. Prescriptive analytics, on the other hand, uses simulation to find alternatives and possible outcomes and answer “what can we do”.
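
The three kinds of analytics can be contrasted in a few lines of Python. This is only a toy sketch under stated assumptions: the sales figures, the candidate actions, and their uplift and cost values are invented, and a straight-line trend stands in for whatever predictive model a real team would use.

```python
from statistics import mean

# Hypothetical yearly sales figures, invented for illustration only.
sales = [100.0, 110.0, 120.0, 130.0]


def describe(history):
    """Descriptive: summarize what already happened."""
    return {"mean": mean(history), "growth": history[-1] - history[0]}


def predict_next(history):
    """Predictive: extend a least-squares straight line by one period."""
    n = len(history)
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(history)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, history))
             / sum((x - x_bar) ** 2 for x in xs))
    return (y_bar - slope * x_bar) + slope * n


def prescribe(history, options):
    """Prescriptive: simulate each candidate action and pick the best outcome."""
    baseline = predict_next(history)
    outcomes = {name: baseline * uplift - cost
                for name, (uplift, cost) in options.items()}
    return max(outcomes, key=outcomes.get), outcomes


# Hypothetical actions as (sales multiplier, cost of the action).
options = {"do nothing": (1.0, 0.0),
           "ad campaign": (1.10, 10.0),
           "discount": (1.05, 2.0)}

print(describe(sales))
print(predict_next(sales))
print(prescribe(sales, options)[0])
```

The descriptive step only looks backward, the predictive step extrapolates the trend, and the prescriptive step compares simulated outcomes of different actions to recommend one.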

Most companies conduct descriptive or predictive analytics on their data, mostly because they are trying to figure out what went wrong and what its future effects will be. They also hire professional experts to suggest recovery strategies, make recommendations, and provide the best advice. The field of prescriptive analytics, however, is relatively new and is slowly getting its long-due attention. We rely on professionals and their years of knowledge and experience to prescribe the best move for our companies, but predictive models, computational modeling, and algorithms are gaining recognition as a more reliable and suitable way of approaching a business problem. They say to err is human, and it has been proven right time and again! With all the amazing progress in data analytics, it is time to move beyond relying on human expertise alone and use prescriptive analysis.

References: https://channels.theinnovationenterprise.com/articles/data-analytics-top-trends-in-2017

The American Workday in One Graph: When Are They Really Working?

This visualization is based on the American Time Use Survey conducted by the government. It shows how people spend their days, specifically at what times they work.

I found this interesting mainly because of its use of interactivity. The distribution of work schedules is also aptly displayed as a histogram. Using two filters, the user can analyze how much work schedules differ between occupations. It seems overcrowded at first sight, but the use of highlighting and shading makes it easy to perceive.
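
To make the histogram idea concrete, here is a minimal Python sketch of how a “share of workers on the job at each hour” series could be computed. It is not how NPR built the chart. The (start, end) shift format, the function name, and the sample shifts are all assumptions made up for illustration.

```python
def share_working_by_hour(shifts):
    """Return 24 fractions: the share of workers on the job during each hour.

    Each shift is (start_hour, end_hour) on a 24-hour clock, with end exclusive.
    A shift may wrap past midnight, e.g. (16, 0) means 4 p.m. to midnight.
    """
    if not shifts:
        raise ValueError("need at least one shift")
    counts = [0] * 24
    for start, end in shifts:
        if not (0 <= start < 24 and 0 <= end < 24):
            raise ValueError("hours must be integers in the range 0..23")
        hour = start
        while hour != end:
            counts[hour] += 1
            hour = (hour + 1) % 24  # wrap around midnight
    return [count / len(shifts) for count in counts]


# Hypothetical workers: three day shifts and one evening shift.
shifts = [(9, 17), (9, 17), (8, 16), (16, 0)]
shares = share_working_by_hour(shifts)

for hour, share in enumerate(shares):
    print(f"{hour:02d}:00 {'#' * round(share * 20)} {share:.0%}")
```

Computing one such series per occupation and overlaying them would give the kind of comparison the two filters provide.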

We can see that most of these occupations fall under the conventional 9 a.m. to 5 p.m. work shift, but emergency services (police officers, firefighters) have a higher share of work until midnight.

Another interesting thing is that we can see who takes the lunch break most seriously and who the workaholics are. And obviously, lunchtime is peak time for chefs and food service workers.

I think this graph could be made more appealing if it also showed a comparison between countries; it would be interesting to see cultural differences in work time. Another limitation of this data is that, for white-collar work, the line between life and work can be blurred. For example, lunch or dinner with a client can be considered part of work. This throws a wrench into how work hours are measured overall.

Source: http://www.npr.org/sections/money/2014/08/27/343415569/whos-in-the-office-the-american-workday-in-one-graph?/templates/story/story_php=

Aesthetics or Content; What is Important?

For today’s blog, I have picked an infographic by TIME that was published close to Women’s Equality Day in 2015. The graph shows how women are represented in politics 95 years after getting the right to vote.

To some, this visualization might be very engaging, but I see many pitfalls in this graph.
Firstly, is the designer targeting the right audience? Is a reader who reads this article just to enjoy the pictures, with little concern for content and information, the appropriate audience for this visualization?
Secondly, the infographic shows eight different measures of women’s participation in government, and each measure is expressed as a percentage of female versus male participation. Since they are all the same kind of measure, I do not understand the need to plot them differently.
Thirdly, in the process of making the chart engaging, the designer has exceeded the boundaries of a single screen. Information is more powerful when seen together at the same time; this not only saves the viewer’s valuable time but also paints a complete picture and reveals important connections that may not be visible otherwise.
Fourthly, the choice of media is inappropriate: just to create variety, the designer has added pie charts, which are a bad choice, as already discussed in class.
Lastly, the color choice is misleading. At first glance, it makes you think the graph has something to do with Democrats versus Republicans, although it has nothing to do with them.

In the end, I feel a simple bar chart with all eight measures would have been an excellent visualization choice. Sorting the data would also make the visualization more meaningful, as the viewer could then judge the areas where women’s representation is best or worst.
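
As a rough illustration of that suggestion, the Python sketch below renders one text bar per measure, sorted from the highest share to the lowest. The measure names and percentages are placeholders, not the figures from the TIME infographic, and the function name is hypothetical.

```python
def sorted_bars(shares, width=40):
    """Return one text bar per measure, sorted from highest to lowest share."""
    label_width = max(len(name) for name in shares)
    ranked = sorted(shares.items(), key=lambda item: item[1], reverse=True)
    lines = []
    for name, pct in ranked:
        bar = "#" * round(pct / 100 * width)  # bar length on a common 0-100% scale
        lines.append(f"{name.ljust(label_width)} {bar} {pct:.0f}%")
    return lines


# Placeholder measures and percentages, NOT data from the infographic.
shares = {"measure A": 20.0, "measure B": 50.0, "measure C": 5.0}

for line in sorted_bars(shares):
    print(line)
```

Because every bar shares the same scale and the rows are ordered, the best and worst areas can be read off from the top and bottom of the list.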

References:
http://time.com/4010645/womens-equality-day/
http://www.datarevelations.com/tag/stephen-few
https://www.perceptualedge.com/articles/Whitepapers/Common_Pitfalls.pdf