History of Popular Music: A Musicophile’s Treat

Bikram Patnaik

Google Play Music timeline 

What would you do if you wanted to listen to some old classic country music on a Friday evening after a hectic day at work?

Generally we turn to various websites, blogs, TV channels and radio stations that list the top 20 tracks or albums of a particular year. Google Play Music makes this easier with its new Music Timeline visualization, which gives us a bird’s-eye view of our past musical favorites and a chance to revisit them. It helps us visualize which music has stood the test of time, and how genres and artists have risen and fallen in popularity.

The Music Timeline uses data from Google Play Music users’ libraries to categorize artists by genre, and the genres are then subdivided. What it provides, then, is a rough-and-ready map of the popularity of genres and artists over the years. The X-axis shows the passage of time from the mid-’50s to 2010, while the Y-axis scales the popularity of each genre. The visualization uses a stacked area chart, which is typically used to display changes over time when it is important to show that the values sum to a whole. For a music-loving audience, it is hard to think of a better representation.

UNDERSTANDING THE DATA:

Let’s dive deeper by understanding how it works. At a glance, the timeline is a treat for the eyes with its subtle color combinations. You can see straight away that in the early ’50s the ‘Jazz’ and ‘Vocal/Easy-Listening’ genres were very popular. But as time passed, ‘Rock’ and ‘Pop’ picked up pace and overshadowed the other genres. In the early ’80s ‘Jazz’, along with other genres, plateaued as people developed a different taste in music. The emergence of new artists like Snoop Dogg, Eminem and Nirvana resulted in the rise of ‘Alternative/Indie’ and ‘Hip-hop/Rap’ and made them a craze.

Once you’ve drilled down to your selected genre, the timeline takes the form of a distinctive audio wave showing the flow of popular individual artists and bands, and displays a short bio and relevant albums. For example, by clicking on the Pop stripe, we can see the progression from ’50s Pop to ’60s Pop to Adult Contemporary within the growth of the overall genre, as well as some of the most popular artists that made up each subgenre. This helps audiophiles find the artist of their choice and buy relevant songs.

Another interesting feature is that we can search for a particular artist to see the trajectory of their career through the decades. Take Michael Jackson, who started his music career in the ’70s but didn’t reach stardom until the release of his famous albums Thriller and Bad in the ’80s, and whose legacy continues to this day. The same feature also applies to music albums.

DRAWBACKS:

Now, talking about the loopholes in this visualization: the data collected is restricted to Google Play Music users’ libraries and doesn’t take into account users of other music platforms like Apple Music, Spotify, Pandora or SoundCloud. If we collated the data from all sources, there is a high probability that it would give us a different picture altogether. Moreover, Google Play Music did not exist in the 1950s, and its use by older generations, who mostly prefer traditional ways of listening to music (cassette players and CD players), is questionable. I leave that up to you to decide. But certainly the variety and ease of use of the Music Timeline outweigh its flaws.

FROM A CRITIC’S VIEWPOINT:

While exploring a particular genre, say ‘ROCK’, and diving deep into its sub-genres, if you look closely you will see the word ‘ROCK’ embossed in the background throughout the entire audio wave. This is overwhelming, especially when you have multiple sub-divisions each highlighting its own name. A color contrast is much needed to distinctly identify the sub-categories. Lastly, the font size needs to be standardized across the timeline to reduce the chance of missing words.

ALTERNATIVE APPROACH:

Though it’s visually appealing, there are certain hiccups with stacked area charts as well. Ideally one should be able to interpret each individual series by its height, but unfortunately most readers interpret the curve at the top of each area as indicating quantity (as in line graphs). So, as an alternative, I would recommend individual line charts with an additional line in a stark color for the total, or we could use interactivity and gray out the series in the background, as in this amazing visualization of housing prices from The New York Times.
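To see why the top edge of a layer can mislead, here is a small sketch of how a stacked area chart computes its layer boundaries. The genre values are made up for illustration and are not taken from the Music Timeline, and the helper name stacked_tops is hypothetical:

```python
# Hypothetical popularity values for three genres over four time steps.
series = {
    "Jazz": [5, 4, 3, 2],
    "Rock": [1, 3, 6, 8],
    "Pop": [2, 2, 2, 2],  # constant throughout
}


def stacked_tops(series):
    """Return the upper boundary of each layer in a stacked area chart.

    Layers are stacked in dictionary order, so each layer's top edge is
    the running sum of its own value and every layer beneath it.
    """
    names = list(series)
    steps = len(series[names[0]])
    tops = {name: [] for name in names}
    for t in range(steps):
        running = 0
        for name in names:
            running += series[name][t]
            tops[name].append(running)
    return tops


tops = stacked_tops(series)
print(tops["Pop"])  # → [8, 9, 11, 12]
```

Pop stays at 2 throughout, yet the top edge of its layer climbs from 8 to 12 because the layers beneath it change. A reader who follows the curve rather than the thickness of the band would wrongly conclude that Pop is growing.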

CONCLUSION:

I feel that this kind of visualization is really helpful when we have to organize huge (live) data sets across a defined timeline. A similar approach could help organizations analyze their product sales along with their respective popularity. It would allow them to come up with business metrics targeting valuable customers.

Reference:

https://www.theguardian.com/music/musicblog/2014/jan/17/google-play-music-timeline-punksoul

The NYT

Mapping Migration in the United States

Due to recent discussions regarding immigrants, I wanted to know the percentage of immigrants in each state, and in due course stumbled upon this 2014 report on mapping migration in The New York Times. Here is a visualization that tells us “where people who lived in each state in 2012 were born” through a Voronoi treemap.

Color represents the percentage of people born and living in the same state. For example, 61% of Texans were born in Texas.

Treemaps are a well-known method for the visualization of hierarchical data but are limited to rectangular shapes. Additional challenges with treemaps are zero values and size distortion as the number of subdivisions goes up. These issues are eliminated by Voronoi treemaps, which allow subdivisions within polygons and so make it possible to create treemap visualizations within areas of arbitrary shape.

Critique:  The objective is to convey the proportion of different categories of origin for the people in each state. This is communicated using a geographical Voronoi treemap. Geographical maps are generally useful for representing spatially intensive data (the variable being aggregated over a region). Since geographic size has no correlation to population, a geographic map is not an ideal way to represent population in general. For example, New Jersey has about 10 times the population of Montana, though the sizes of the states on the map suggest the opposite. The current chart goes a step further and shows proportions of population origin within each state. I am aware it is not always possible to make each shape exactly the right size; however, I think it is misleading to use the state shape and relative area to communicate these results. Also, two shades of gray, dark and light, are used to represent people born outside the US and people born in the state (and currently living in the same state) respectively. In my opinion, different colors should have been used rather than different shades of the same color.

How would I change it:  Given the objective of the chart, we need a visualization that maintains regional relevance while showing the proportion of population origins in each state. To indicate population-origin proportions, a treemap with subdivided rectangular areas (not Voronoi) works. To incorporate regional relevance, each state could be represented in an area cartogram with a size proportional to its population, instead of using the geographical shape of each state. An area cartogram is a map that resizes each region according to a chosen economic, social, political, or environmental factor. An example of an area cartogram, an electoral results map (the scaling factor being the election results of Democrats and Republicans), is shown below.

If a cartogram is used to represent the migration map for the USA, the visualization would consist of the largest square for California, followed by Texas, and smaller squares for the other states. Each square (representing a state) would be divided into rectangular areas with sizes proportional to each population origin category. I would use two distinct colors to represent people born outside the US and people born in the state (and currently living in the same state).

PS: If the objective was solely to visualize population origin categories with no regard to geography, a bar graph would suffice. I would have the state names along the X axis, the percentage along the Y axis and the bars placed side by side for each state. Thus, each state would have 6 bars (of different colors) representing population origin.

References:

Mapping Migration in the United States (NYTimes, by Gregor A & Robert G, 08/15/2014)

Voronoi Treemaps

Migrations maps critique

Engaging audience with better graphs

http://www.dailymail.co.uk/news/article-2062634/One-American-women-medication-mental-disorder.html

While surfing the internet, I came across a news article which claimed that more American women were taking medication to combat mental disorders. The article also included some statistics depicting the percentage of men, women, boys and girls taking medications in 2001 vs 2010. Later, it also quoted some famous personalities who fought serious mental ailments.

I had the following thoughts while reading this article. (P.S This blog only talks about the first bar graph in the article – Percent of Population using Mental Health Medications)

  • It would have been more engaging if the data was time-sequenced from 2001 to 2010. Seeing the change in the numbers from 2001 through 2010 would give me a deeper understanding of the situation.
  • I really liked the way in which they segregated the data into four categories: men, women, girls and boys. But having said that, just by looking at the graph, I cannot tell which age groups were used to form those divisions. For example, what category would a 23-year-old female fall into? It would have made more sense to me if the X axis also included this information below each of those labels.
  • Labelling the bars with numbers helps in understanding the percent increase or decrease at a glance. However, this makes the Y axis and the gridlines redundant. Besides, the colors used are too dull to catch any attention. Using bright colors and a bigger font would help reach a larger audience. As we learnt in class, 3D effects don’t really help in conveying a message; they just make the graph look like a hodgepodge.
  • The placement of the legend also plays a role in helping readers quickly learn what the visualization is about. Placing the legend at the top right corner, where there is sufficient space, would have been appropriate.
  • Lastly, incorporating some additional data, perhaps occupation or relationship status, would also help readers gauge the causes of the increase in the number of people, and especially women, turning to medication for mental illness.

While it’s clear that the article focuses on a serious subject, the graphs included did not seem to be very intuitive and informative to me. Applying better visualization techniques would have been more effective in proving their point!

Weak graph for a strong claim- Washington Post on Global Warming!

Sneha Krishnan

https://www.washingtonpost.com/news/wonk/wp/2013/07/09/you-cant-deny-global-warming-after-seeing-this-graph/?utm_term=.ab92a5adbdd7

 

If there is one imminent threat to mankind that worries me, it is Global Warming. For years now, we have seen scientists talk about its perils and consequences for our planet, and in particular about the jeopardy to life on Earth. There is proof enough to believe that our actions are leading to the rising Global temperature. I often find it a huge irony that we complain about rising temperatures, drought and hot summers while relaxing in air conditioned houses or driving comfortable air conditioned cars and listening to the news on the radio. In reality, though, it is our actions that are causing the gradual Global temperature rise, thanks to increased CO2 emission levels due to fossil fuel burning.

I stumbled upon this Washington Post article and was instantly attracted by its title. Though it is a little old, I read the article to understand the findings shared by the World Meteorological Organization. The graph in the link describes the Global temperature in degrees Celsius for decades (periods of 10 years) from 1881 to 2010. As explained in the article, they have used decades rather than individual years to measure the Global temperature, as temperatures in a single year could also rise due to reasons other than climate change.

What I liked about the graph is that it covers a time span of 13 decades to show the changing Global temperature. This gives us an idea of the Global temperature not just in the past few decades, but of its progressive increase over many decades. The fact that temperatures are measured over decades rather than individual years is also very helpful, as climate change and Global temperature rises are more evident over long periods than over short intervals. Had the graph used individual years, the difference in temperature between successive years would not be very evident, which could have led us to believe that the temperature rise is not significant and that we have nothing to worry about. The color coding for each five decades also helps us understand the relative changes in temperature in a span of five decades. It helps us notice the steep increase in temperature from one color to the other, signifying a rapid increase in temperatures between those decades. The graph is simple and informative; it conveys the relevant information to a novice reader who needs to know the impending dangers of global warming and be a responsible citizen. The hope is that such readers would consider carpooling the next time they take their car to work, or decide to bike to get groceries from the nearby store.

While I liked the graph for depicting the increasing temperatures over the last 13 decades, I found certain aspects confusing and not adding value. For instance, I did not understand why the Y-axis starts at 13.400 degrees Celsius. It is not clear to me as a novice reader whether this indicates the temperature before the starting decade of 1881 (i.e. 1871 to 1880) or is considered the normal Global temperature. Also, the purpose of the dashed center line that runs through the graph (a little above 13.950 on the Y-axis) is not clear. I am not sure whether it is meant to indicate something significant like the mean temperature (though the mean temperature is 13.95) or some other aspect of the data. There is also an inconsistency in the numbers: the temperatures on the Y-axis have 3 digits after the decimal point, while the temperatures on top of the bars have just 2.

A significant drawback, in my view, is that the graph does not convey enough information to support the claim made by the title. There is not enough information to explain the relationship between the rise in Global temperature and the factors contributing to Global Warming. If the graph also showed those factors (like CO2 emission levels) for each decade, with a significant increase in them as well, it would support the claim of the article better. I also feel the horizontal dashed lines marking individual temperatures on the Y-axis do not add much value, as the temperature for each decade is already shown on top of the bars.

In my opinion, the graph could be more useful and serve the claim better if certain things were done differently. The graph does not convey much information, especially regarding the correlation between rising temperatures and the factors contributing to them. Also, a layman may not know how much of an increase in the Global temperature is enough to damage the planet. The total temperature difference (between the starting and ending decades in the graph) is less than 1 degree Celsius, which may not seem significant unless the consequences of that increase are explained. I would have included all the above information in the visualization. I would have used better visualization techniques and added a second graph to show the correlation between increasing Global temperature and the factors contributing to it. I would also have explained the scientific relation between those factors and the global temperature, so as not to leave room for doubt in the reader’s mind.

 

Don’t make your executives do the math !

With an abundance of data, it can be a tedious task to look through your numbers, interpret them and take important decisions. Visualization is an effective way of describing the patterns in data. When visualizations are created for top level executives or Sales/Marketing heads, they will not play around with them, and will accept (or reject) whatever visualization is created for them.

One of the most common scenarios for visualization is sales related information. The “line graph” below shows monthly Sales and Target figures for a given year.

Year wise Sales performance

What I like about the graph

1] Time wise trends in Sales achieved and the expected Target (comparison)

2] Different colors to distinguish between Sales and Target

3] A good scale which covers the data properly

What I don’t like about the graph

1] No details about numbers. Though the line graph above gives a general idea of the months in which the target was reached or missed, it fails to give the numbers (executives will be interested in “how much” rather than a general idea).

How can we add value to above visualization

1] Include details. Show numbers. Add bar graphs

Crisp and clear numbers.

The above visualization combines a “bar graph” and a “line graph” in a single picture and gives us information about

1] How much was sales as compared to targets (Percentage up or down) – For CEO

2] Actual sale figures (line graph) – For Sales head

3] Red and Green color combination tells us good/bad news
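The calculation behind such a chart is small enough to sketch. The monthly figures below are hypothetical (the numbers behind the original graph are not given), and the function name variance_report is made up for this example:

```python
# Hypothetical monthly figures; the original graph's numbers are not given.
sales = {"Jan": 120, "Feb": 95, "Mar": 130}
target = {"Jan": 100, "Feb": 100, "Mar": 125}


def variance_report(sales, target):
    """Return (month, actual sales, % over/under target, color) per month."""
    rows = []
    for month in sales:
        pct = (sales[month] - target[month]) / target[month] * 100
        # Green when the target was met or beaten, red when it was missed.
        color = "green" if pct >= 0 else "red"
        rows.append((month, sales[month], round(pct, 1), color))
    return rows


report = variance_report(sales, target)
for month, actual, pct, color in report:
    print(f"{month}: sales={actual}, {pct:+.1f}% vs target ({color})")
```

Doing this once in the chart's data preparation means the percentage and the good/bad signal are already on the page, so nobody in the audience has to work them out from two lines.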

Conclusion – It is important to identify the target audience and include details accordingly. Do not make your executives do the math! The more questions they have to ask after looking at a visualization, the more scope there is for improvement.

References  – https://www.klipfolio.com/blog/dashboard-design-mistake-forcing-users-to-do-the-math

Thames-Pulse : Such Live Data Artwork – Much Sophisticated to Understand 

Akshar Takle

Organised by river charity Thames21 and made by artist Jason Bruges, the artwork shows different displays according to whether the water quality in Thames is improving, declining, or stable compared with the previous day’s data.

Check out the video below:

The external lights on the Sea Containers building, next to the Thames in London, are meant to change according to live data on the water quality of the river. The artwork provides a striking visual display of the health of the water flowing past, using data from samples that are taken daily.

But how is one supposed to interpret this sophisticated live action data viz?

Let’s see what data is meant to be conveyed about the health of the river:

The artwork displays one of three patterns based on whether the water quality is improving, static or declining compared to the previous week’s data reading.

  • Declining water quality: lighting is largely green and static
  • Static: lighting becomes more animated: a blue ‘wave’ sweeps across the building
  • Improving: pink and blue lights pulsate furiously up and down the frontage.
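The mapping from readings to light patterns described above can be sketched as a small function. The numeric scores, the assumption that a higher score means cleaner water, and the name pulse_pattern are all hypothetical; the article does not describe how the artwork actually scores water quality:

```python
def pulse_pattern(previous, current, tolerance=0.0):
    """Pick a display pattern from two water-quality readings.

    Assumes a higher score means cleaner water; the real scoring used
    by the artwork is not described in the article.
    """
    change = current - previous
    if change > tolerance:
        return "improving: pink and blue lights pulsating"
    if change < -tolerance:
        return "declining: largely green and static"
    return "static: blue wave sweeping across the building"


print(pulse_pattern(previous=5, current=7))
print(pulse_pattern(previous=7, current=5))
print(pulse_pattern(previous=5, current=5))
```

Written out this way, the design problem is easy to spot: the calmest, greenest output is attached to the worst outcome.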

I think the visualization fails to convey the intended message to the audience largely because of wrong choice of colors.

Color plays a crucial role in transmitting a psychological message to the audience.

  • Green is a color generally linked to nature, peace, well-being and freshness. If it is used to represent declining water quality, people can easily interpret it as the opposite.
  • Similarly, a shade of red color speeds up the heart rate and conveys a message of danger.

In conclusion, for a visualization to be effective, the user experience should also be taken into consideration, not merely the aesthetics.

References:

http://londonist.com/london/best-of-london/now-you-can-see-how-healthy-the-thames-is

http://www.thames21.org.uk/Pulse/

 

 

The Devil of the Data Visualization World

A pie chart is one of the least efficient ways of communicating data to an audience. Generally speaking, pie charts can be used to show how one part relates to the whole; however, they are often misleading and inaccurate. Let us evaluate one such chart.

The above chart conveys the total percentage of active and inactive bitcoin addresses. About 65% of all bitcoins in existence are associated with addresses that had some activity within the past year, and these have been grouped by when they were last used. This data is quite interesting and relevant to the hot electronic payment system domain; however, the visualization violates several aspects of good design and fails to communicate effectively. Below are the details of my evaluation, and I will conclude by suggesting a better chart for this data.

Point 1: There are many (10) categories in the pie, and I found it difficult to identify the proportions correctly and compare across these categories. One might argue that the slices have been labelled with their values, but we are still forced to look across different sections to make any comparison. The text labels are also small and overly complex.

Point 2: Making the chart 3D adds insult to injury. Humans are bad at judging relative sizes, and adding a third dimension makes it almost impossible. The section nearest to me on the screen, “Last six months to one year” (15%), looked the biggest to me. However, the “Last 1-3 months” section has a bigger value (24%).

3D vs 2D pie chart

Point 3: The color and aesthetics are not good. The choice of colors is very amateurish, and the colors bear no real relation to each other. Also, a pie chart’s circular structure uses up too much space while not allowing the labels to line up. The time spans of the slices also differ, which confuses the audience further.

How would I change it:

A plain and simple representation of data is the most effective way of communication. We tend to over-engineer a chart which changes the essence of the information that needs to be communicated.

A table can be used instead to communicate the information. Even a bar graph is a useful tool for this data set. When we want to compare categories, in this case the usage of bitcoin addresses, we should typically put them as close together as possible and align them along a common baseline to make the comparison easy. A bar graph helps here because our eyes compare the end points of the bars, and it is very easy to assess relative size.

The look and feel of the dashboard can be improved by changing the font and simplifying the text labels. The text labels should not be very long. Regarding the color scheme, ideally a darker color creates more impact. Both the 24% and the 2% sections in the pie chart have darker shades, which doesn’t adhere to a good color scheme. So instead of multiple colors, a single color in varying shades should be used.

References:

http://www.dashboardinsight.com/articles/digital-dashboards/building-dashboards/the-case-against-3d-charts-in-dashboards.aspx

 

Blog 1 — Hits for the search term ‘Obesity’

I found this graph in the article called “Obesity: Are We Food Obsessed?”. The author wants to use this graph to support a statement by Professor Greg Whyte that, when dealing with obesity, people focus more on diet than on physical activity, although the two are equally important.

This graph shows the search rate for words related to obesity. It is very clean and simple, which gives the audience a direct view at first glance. For example, we can easily see that diet has the highest rate. However, parts of the graph are unclear, which could mislead the audience.

First, there are two items on the top right: Pubmed and Google. I think these are the two search engines the author wants to focus on. But these two items have two different units, seconds and millions. In my view, millions is for the hit rate. What seconds stands for confused me a lot.

Second, there are four colors in the graph: two for Pubmed and two for Google. I’m also wondering what the difference is between the two colors for the same item. It’s unclear. In other words, the author didn’t give enough information to explain the meaning of the colors.

Third, the title of this graph is missing some parts. The author uses an ellipsis after the word ‘Obesity’, which may leave the audience wondering whether the author is focusing on anything else.

Overall, this is a good graph that shows the results on hits for the search terms. It tells the audience what the author is looking for, why he thinks that way and how the results support the statement. This graph also leads the audience to reconsider the habit of focusing more on diet than on physical exercise when losing weight, which might not be right.

Source : http://blogs.discovermagazine.com/neuroskeptic/2012/03/24/obesity-are-we-food-obsessed/#.WOwDslPys_W

Post-Drought Employment of Santa Cruz County

By: Jacob McConnell

In the local Santa Cruz newspapers, journalists express their joy as the winter storms slow and the Spring season begins. After a historic multiyear drought in the Bay Area, the winter of 2016-2017 brought massive amounts of much needed rain and, within a few short months, officially put an end to the worst water shortage in state history. I, along with the rest of Santa Cruz County, am grateful for the relieving effects of the successful winter. While I do not intend to undermine the severity of the drought, I have begun noticing articles and statistics that seem questionable in their claims about the actual positive impact the rain has had on our community.

What caught my attention were a few graphs of the labor statistics of Santa Cruz County intended to highlight the increase in labor force at the end of the drought. Below is a snapshot of the Santa Cruz County Labor force depicted over the first two months of 2017.

At a quick glance, the graph displays a steep and steady increase in employment. The line actually rises 4-fold within the presented graph. However, simply by looking at the numbers on the y-axis one can notice that the actual number of employees who joined the workforce was less than 1,000. Out of the 131,250 workers, this is less than a 1% increase in employment. Admittedly, if kept steady, this would result in an increase of over 6% in employment for 2017, which would be very impressive.
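The gap between the apparent and the actual growth comes from the y-axis not starting at zero. A minimal sketch of that arithmetic, using hypothetical values (the axis floor and the start and end readings below are assumptions chosen for illustration, not figures read off the original graph):

```python
# All three values are hypothetical, chosen only to illustrate the effect.
axis_floor = 130_000  # lowest value shown on the truncated y-axis
start = 130_250       # workforce at the left edge of the graph
end = 131_000         # workforce at the right edge of the graph

# What the eye sees: height of the line above the bottom of the plot.
visual_ratio = (end - axis_floor) / (start - axis_floor)

# What actually happened: change relative to the starting workforce.
actual_pct = (end - start) / start * 100

print(f"line appears to grow {visual_ratio:.1f}x")
print(f"actual increase: {actual_pct:.2f}%")
```

With these numbers the line looks four times taller at the end than at the start, while the workforce itself grew by well under 1%. Redrawing the same data with the axis starting at zero would show an almost flat line.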

After calculating this potential rise in the workforce, I too began believing the heavy rain could quite possibly be creating thousands of local jobs. The news mentions how our agriculture, flood and beach clean-ups, creek visits, landscaping, construction, and tourism were all benefiting from this winter. Though it is hard to deny these claims, I noticed a trend in all the jobs being highlighted: they are all seasonal.

As a beach town centered around tourism and agriculture, some of Santa Cruz’s biggest producers of jobs are the beach boardwalk, the university, and the local farms. These organizations base large portions of their business on part-time and young workers. In order to analyze a trend in part-time seasonal work within Santa Cruz County, I pulled a graph of the 2016 labor force over the entire year.

By simply viewing this fully depicted graph, it can easily be concluded that there is a large spike in employment during the summer months. The start of 2016 is identical to 2017’s start shown in the original graph. It is hard to simply rule out the end of the drought as a job creator, but history shows it is more likely to be a seasonal trend in local employment rather than an actual positive impact the excessive rain had on the community.

Sources

https://data.bls.gov/pdq/SurveyOutputServlet

http://www.cityofsantacruz.com/departments/water/drought/weekly-water-conditions

http://www.cityofsantacruz.com/departments/water/drought/2015-water-supply-outlook

http://www.santacruzsentinel.com/article/NE/20170126/NEWS/170129769

Blog Post 1 – cchen2 – Poking a Monster Graph

Pokememory
How memory is used in each Pokemon generation

Overview

My first blog post is about this visualization of data usage in Pokemon Generations 1-6. For each generation, the graph breaks down the various actions a user can perform and the data each consumes in different modes (e.g. Playing, Battling, Catching…)

Impression

The trend of data usage across Pokemon generations is fairly consistent. With the exception of Gen 5, every generation uses more data than the one before it. At first glance, this visualization reminded me a lot of Tetris, the classic video game, and it took me a while to grasp how to read the graph. There are a couple of reasons why I had difficulty understanding it:

  1. Boxes with the same color can appear next to each other or be stacked on top of boxes of other colors. For example, in every Generation, blue boxes like EVs (bottom row) also appear at the very top (HP, Att, etc.)
  2. There are too many segments in the graph. From 18 segments in Gen 1 to 70 segments in Gen 6, this graph contains a lot of data for its readers to process.
Tetris

Possible Improvement

I would address the two points above.

  1. Arrange the blocks so that boxes of the same color are displayed together.
  2. Group similar segments into one category (e.g. group Nickname, OT Name, and OT ID into one category called Name, and group unused and unknown into one block), and add another graph to show the percentage breakdown of each segment within its category (e.g. Name group: Nickname – 60%, OT Name – 30%, OT ID – 10%).
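The grouping in the second point can be sketched in a few lines. The byte counts below are hypothetical, picked so that the Name group reproduces the 60/30/10 split used as an example above; they are not taken from the original graphic, and the function names are made up for this sketch:

```python
from collections import defaultdict

# Hypothetical sizes per field; not taken from the original graphic.
segments = {"Nickname": 12, "OT Name": 6, "OT ID": 2, "Unused": 3, "Unknown": 1}

# Which category each field is folded into.
groups = {
    "Nickname": "Name",
    "OT Name": "Name",
    "OT ID": "Name",
    "Unused": "Unused/Unknown",
    "Unknown": "Unused/Unknown",
}


def group_totals(segments, groups):
    """Sum the segment sizes per category for the simplified main graph."""
    totals = defaultdict(int)
    for field, size in segments.items():
        totals[groups[field]] += size
    return dict(totals)


def breakdown(segments, groups, category):
    """Percentage share of each field within one category (second graph)."""
    members = {f: s for f, s in segments.items() if groups[f] == category}
    total = sum(members.values())
    return {f: round(s / total * 100) for f, s in members.items()}


print(group_totals(segments, groups))
print(breakdown(segments, groups, "Name"))
```

The main graph would then plot only the category totals, which cuts the number of segments per generation, and the per-category breakdown carries the detail for readers who want it.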

Sources:

https://www.reddit.com/r/pokemon/comments/4cndbn/how_memory_is_used_in_each_pokemon_generation/

http://www.tetris24.com/