Discover beer and say cheers!

– Ekta Ratanpara

I am very fond of beer and like to try out different kinds. While researching which beer I should try next, I came across the “Beerviz” site, which was created by students at UC Berkeley.

Chord Graph showing similarity relations between beers

The site displays an interesting chord graph showing similarities between the different beer brands and types available throughout the world. It also has graphs showing how the data is distributed and the top five beers by type, i.e. Dark, Medium and Light. While I loved many of the features incorporated in the visualization, a few aspects are misleading. The image below shows the site's high-level analysis of beer popularity.

High-level analysis of beer popularity

I will try to summarize what I believe works well and what could be improved.

What works well:

  1. Choice of graph to display similarities between beers: A chord graph works well when inter-relationships between multiple types of data points need to be visualized. It makes it easy for the viewer to see relationships between different types of beers and their popularity.
  2. Categorization and filters: The two-level categorization of beers is really helpful for narrowing down the exact kind and type of beer you want to explore, along with its similarities. The website asks the user to select the malt of the beer, and the beer types are shown in the legend so the user can identify which color corresponds to which type. To narrow things down further, the site offers filters on beer attributes such as appearance, taste and aroma.
  3. Graphs showing high-level analysis of the data: In addition to showing similar beers in the chord graph, the site has a few graphs showing ratings by attribute, popularity, and top beers. These add value to the overall analysis by giving the user instant choices and helping them explore the similarity graph.

What can be improved and how:

  1. Factors to decide popularity: Instead of only the number of ratings, a combination of the number of ratings and the average rating should be used so that the user can make an informed decision. Showing popularity based on either measure alone is problematic; the xkcd comic listed in the references sums it up best. For example, suppose beer A has 10,000 ratings with an average rating of 1.5/5, and beer B has 5 ratings with an average rating of 4.9/5. If only the number of ratings is used, beer A looks better; if only the average rating is used, beer B looks better. Either measure on its own can lead to an incorrect conclusion. Instead, I would use a weighted average with a smoothing factor, or a constraint requiring a minimum number of ratings, to reduce the misleading effects of a ‘5-star’ rating system.
  2. Top 5 beers based on a combination of the number of ratings and the average rating: In the “About the Data” section, the top 5 beer graphs are based on the number of ratings. As mentioned in point 1, the number of ratings alone cannot be the deciding factor for popularity, and the same alternative can be applied here as well.
  3. The size of the chord graph: Some of the names on the graph are not displayed in full and are cut off in the UI, which creates a negative user experience. This issue should have been caught and resolved during testing. Alternatively, a drill-up/drill-down approach could be taken, where selecting a beer opens a new graph showing only the relations of the selected beer with other beers.
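
The weighted average with smoothing suggested in point 1 could be sketched roughly as follows. This is only an illustration and not how Beerviz computes anything: the `Beer` class, the function names, and the `prior_mean`, `prior_weight` and `min_ratings` parameters are all assumptions made for this example, and the default values are arbitrary. The idea is to pull every beer's average towards a neutral prior rating, so that a beer with very few ratings cannot reach the top on a handful of votes.

```python
from dataclasses import dataclass


@dataclass
class Beer:
    name: str
    num_ratings: int
    avg_rating: float  # on a 0-5 scale


def smoothed_rating(beer, prior_mean=3.0, prior_weight=50):
    """Weighted average of the beer's own rating and a neutral prior.

    prior_mean and prior_weight are assumed values: the prior behaves like
    `prior_weight` extra ratings, each equal to `prior_mean`.
    """
    total = prior_weight * prior_mean + beer.num_ratings * beer.avg_rating
    return total / (prior_weight + beer.num_ratings)


def rank_beers(beers, min_ratings=0, prior_mean=3.0, prior_weight=50):
    """Drop beers below min_ratings, then sort by smoothed rating (best first)."""
    eligible = [b for b in beers if b.num_ratings >= min_ratings]
    return sorted(
        eligible,
        key=lambda b: smoothed_rating(b, prior_mean, prior_weight),
        reverse=True,
    )


# The two beers from the example in point 1.
beer_a = Beer("A", num_ratings=10_000, avg_rating=1.5)
beer_b = Beer("B", num_ratings=5, avg_rating=4.9)

for beer in rank_beers([beer_a, beer_b]):
    print(beer.name, round(smoothed_rating(beer), 2))
```

With these assumed defaults, beer B drops from 4.9 to roughly 3.17 because five ratings carry little weight against the prior, while beer A stays near 1.5. Passing `min_ratings=10` would remove beer B from the ranking altogether, which corresponds to the minimum-ratings constraint.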

Overall, the visualization is quite attractive, but implementing the improvements above could greatly increase the usability of the dataset and of the information provided through the graphs.

References:

  1. Beerviz | Discover Beer & Say Cheers!
  2. Beerviz – Work Report 
  3. XKCD Comic on Problems with averaging star ratings