After effects of redesign and deception.

Over the past weeks, we have learned how to approach data: which charts to use, when to use them, and how to use them to highlight our process, data and visualizations. This post is a final run over the basics: what I learnt from this course, what I understand now, and how we can approach data so that we end up with the best visualizations for it.

There are at least three key concepts we need to understand when starting a data project:

  • Data acquisition should begin with a list of questions you want to answer.

This is the stage where we decide on the audience, the claim, and the warrants that could support the claim. It is then advisable to gather all the variables and records, rather than only the subset that answers the questions for the immediate story.

  • Data is often messy and needs to be cleaned.

When we acquire data from a source, we usually cannot confirm that all of it is clean, so a lot of time might be spent on cleaning. With multiple data sources, we may also find that the fields we need to join on have a mix of values, are misspelt, or have variations that lower the quality of the data.
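To make the joining problem concrete, here is a minimal sketch in Python. The two record lists, the `city` field and the helper names `normalize_key` and `join_on_city` are hypothetical and invented for illustration. The idea is to bring join keys into one canonical form before matching, and to keep the rows that still fail to match so they can be reviewed by hand.

```python
import unicodedata


def normalize_key(value):
    """Return a canonical join key: Unicode-normalized, trimmed, lower-cased, single-spaced."""
    text = unicodedata.normalize("NFKC", str(value))
    return " ".join(text.strip().lower().split())


def join_on_city(left, right):
    """Join two lists of dicts on a normalized 'city' key.

    Returns (matched, unmatched): merged rows, plus rows from `right`
    that found no partner and probably need manual cleaning.
    """
    index = {normalize_key(row["city"]): row for row in left}
    matched, unmatched = [], []
    for row in right:
        key = normalize_key(row["city"])
        if key in index:
            matched.append({**index[key], **row, "city": key})
        else:
            unmatched.append(row)
    return matched, unmatched


# Hypothetical records from two sources with inconsistent spellings.
population = [
    {"city": " New York ", "population": 8_400_000},
    {"city": "BOSTON", "population": 650_000},
]
budgets = [
    {"city": "new york", "budget": 100},
    {"city": "Boston ", "budget": 4},
    {"city": "Bostn", "budget": 1},  # misspelt: normalization alone cannot fix this
]

matched, unmatched = join_on_city(population, budgets)
print(len(matched), "rows joined")
print("needs review:", unmatched)
```

Normalization handles case and whitespace variations, but a genuine misspelling such as "Bostn" still ends up in the review list. That is the part of cleaning that tends to take the time.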

  • Data may have undocumented features.

When we start to understand the data, the first thing we need is the data dictionary. Some fields may be encoded differently from the usual convention, for example gender stored as M=1 and F=0. There may also be cases where a couple of fields in the data are not in the dictionary, which makes the data harder to understand.
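As a rough illustration, the data dictionary can be kept as a small lookup table and used both to decode coded values and to flag fields it does not cover. The sketch below is in Python. The dictionary contents, the sample record and the field name `flag_x` are assumptions made up for the example; only the M=1, F=0 coding comes from the text above.

```python
# Hypothetical data dictionary: field name -> mapping from stored code to meaning.
# None means the field is documented but has no code list (free values).
DATA_DICTIONARY = {
    "gender": {1: "M", 0: "F"},
    "age": None,
}


def undocumented_fields(record, dictionary):
    """Return the field names in `record` that the dictionary does not describe."""
    return sorted(set(record) - set(dictionary))


def decode(record, dictionary):
    """Replace coded values with their documented meaning; leave everything else as-is."""
    decoded = {}
    for field, value in record.items():
        codes = dictionary.get(field)
        decoded[field] = codes.get(value, value) if codes else value
    return decoded


# Hypothetical record containing one field the dictionary does not mention.
record = {"gender": 1, "age": 34, "flag_x": 7}

print(undocumented_fields(record, DATA_DICTIONARY))
print(decode(record, DATA_DICTIONARY))
```

Fields reported by `undocumented_fields` are the ones worth asking the data provider about before drawing any conclusions from them.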

After the time spent on research and analysis to build the visualizations for the individual projects, these are the lessons I have learnt and will carry into all future data projects.

 

http://datajournalismhandbook.org/1.0/en/understanding_data_2.html