Having found this weather data set for an area not far from where I live, I decided to test a few long-held (and totally unscientific) theories of mine and illustrate my findings:
- the weather had got generally hotter since the early 1990s
- the weather had got generally wetter since the early 1990s
- there had been a seasonal shift (as if the seasons had 'left shifted by a couple of months')
The data set contains monthly data points for every year since the 1850s, showing (amongst other things):
- total rainfall (mm)
- average max temperature per day (degrees Celsius)
- average min temperature per day (degrees Celsius)
See:
- data cleansing - trivial data cleansing activities
- mann_whitney - pre-calculation of Mann-Whitney U test scores
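The contents of the mann_whitney pre-calculation are not reproduced in this write-up, so the following is only a minimal sketch of how a Mann-Whitney U statistic can be computed for two samples (for example, historical vs recent values for one month). The function name `mannWhitneyU` and the returned field names are assumptions made for illustration, and the sketch stops at the U statistic without deriving a p-value.

```javascript
// Illustrative sketch only: the function and field names are hypothetical,
// not taken from the project's mann_whitney code.
function mannWhitneyU(sampleA, sampleB) {
  // Pool both samples, remembering which sample each value came from.
  const pooled = sampleA.map(v => ({ value: v, group: "A" }))
    .concat(sampleB.map(v => ({ value: v, group: "B" })));
  pooled.sort((x, y) => x.value - y.value);

  // Assign 1-based ranks; tied values share the average of their ranks.
  const ranks = new Array(pooled.length);
  let i = 0;
  while (i < pooled.length) {
    let j = i;
    while (j + 1 < pooled.length && pooled[j + 1].value === pooled[i].value) {
      j++;
    }
    const averageRank = (i + j) / 2 + 1;
    for (let k = i; k <= j; k++) {
      ranks[k] = averageRank;
    }
    i = j + 1;
  }

  // Sum the ranks that belong to sample A.
  let rankSumA = 0;
  pooled.forEach((p, idx) => {
    if (p.group === "A") rankSumA += ranks[idx];
  });

  const nA = sampleA.length;
  const nB = sampleB.length;
  const uA = rankSumA - (nA * (nA + 1)) / 2;
  const uB = nA * nB - uA;
  return { uA: uA, uB: uB, u: Math.min(uA, uB) };
}
```

For two completely separated samples such as `mannWhitneyU([1, 2, 3], [4, 5, 6])` the smaller statistic `u` is 0; values of `u` close to half of `nA * nB` suggest the two samples overlap heavily.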
I was very keen to provide a full-context chart showing data points for every single month/year. This was to enable the viewer to see for themselves whether there appeared to be a trend or just curious outliers.
I was also keen to show rain vs temperature on the same chart, (naively) thinking that if my hypotheses held then we would see a tendency for more recent data points to appear towards the top right. As it turns out, not all my hypotheses appear to hold, but the same visualisation technique shows this too.
Finally, partly for fun and partly because I wanted to be able to consider each month independently (which might help with the seasonal shift idea), I wanted to animate the visualisation across the months of the year.
Armed with these aspirations I looked for a baseline chart to specialise, selecting the following from the dimplejs examples:
http://dimplejs.org/advanced_examples_viewer.html?id=advanced_storyboard_control
I took a crudely iterative approach, taking feedback in person at key stages from family and friends (for context: a software engineer, an electrical engineer, a nurse and a retired insurance salesman).
I gave them minimal context to understand the visualisation and asked for general feedback as well as answers to a few direct questions:
- what stands out?
- what extra context do you need to understand the chart?
- what do you think of the layout, what could make it clearer?
- does the visualisation convey the intended message?
- This was a crude attempt to get something up and running. At this stage there was a scatter plot of rain against max temperature (the intention was to add a drop-down select box later to switch to min temperature).
- Additionally there was an 'interactive legend - bar chart' highlighting the month-by-month animation. I also used this to display monthly average rainfall, as I didn't think the main scatter plot represented this well.
- what stands out?
- Pleasingly, people did notice that the darker points I used to encode recent years tended to have higher temperatures.
- By contrast, there was no such observation for rainfall: values seemed almost random, with no obvious trend for recent years to be wetter. As it turns out this is true, that is to say my original hypothesis was unfounded. The data never lies!
- what extra context do you need to understand the chart?
- A more explanatory title.
- Better axis titles.
- Show recent vs historical figures in the rainfall bar chart.
- what do you think of the layout, what could make it clearer?
- Show min temperature values alongside max temperature
- does the visualisation convey the intended message?
- kind of shows a rise in max temperature
- doesn't say anything about changing seasons
- general comments
- not sure what the bar chart is supposed to mean
- it's a bit slow (I think they were being kind)
- I introduced the minimum temperature alongside maximum temperature on the scatter plot.
- I experimented with a superimposed line chart showing temperature change over the years.
- what stands out?
- awful line chart - get rid of it
- what extra context do you need to understand the chart?
- as before
- a legend showing which years were which colour
- what do you think of the layout, what could make it clearer?
- better now that min temperature is shown too
- why are there different shades of colour (at this stage I was using a colour scale to denote the year of each data point - i.e. it wasn't just one colour for historical and one for recent)
- general comments
- none
- Introduced new bar charts for max and min temperature respectively.
- Introduced side by side bars for historical vs recent data points
- Removed line chart
- Revised axis labels
- what stands out?
- Now clearer to see (because of bar charts) the increase in temperature month on month
- No obvious evidence of seasonal shift, because recent and historical have a similar distribution (me paraphrasing). Again, the data never lies!
- what extra context do you need to understand the chart?
- Relate the colours in the bar chart to the series in the scatter plot
- Still haven't improved the title
- Would be better with month names rather than numbers on bar chart x-axis
- Some 'stats' to prove there has/hasn't been change in temp/rainfall
- what do you think of the layout, what could make it clearer?
- put the max temp bar chart above min chart to match the scatter plot
- some debate as to whether the rain bar chart confused the picture and whether to remove or move elsewhere (ultimately decided to leave)
- labels difficult to read
- show values on hover
- increased label size
- changed ordering and colour of bar charts to better match scatter plot
- added Mann-Whitney U test values to give a more objective view of whether there has been change
- changed title
- Changed colour scale to binary colour scheme for recent vs historical encoding
By this stage I think I'd probably exhausted 'my focus group', so feedback was more limited!
- There was general acceptance that the visualisation was much better
- Still very slow on transitions
- Couldn't get popups on hover to work; I think this would have helped
- Couldn't change bar chart x-axis to month names over numbers and still have the animation work properly
- Didn't investigate performance issues, so I don't know whether this is just because there are a lot of data points or because of something I am doing wrong
- Couldn't find out how to change shapes on the scatter plot. Wanted to show min vs max temperature with different shapes
- Code refactoring - far too much code in one file
- Code refactoring - should have extracted styling from d3 to css
- Ideally I would have allowed the user to choose their own cutoff point (e.g. compare the last 50 years to the rest, or the last 10 years...)
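As a rough idea of what a user-chosen cutoff could involve, here is a minimal sketch that splits the monthly records into 'historical' and 'recent' at any cutoff year and averages one measure per calendar month for each group. It is not code from the project: the function name `monthlyMeansByEra` and the record fields (`year`, `month`, `maxTemp`) are assumptions, and the sample values in the usage note are made up rather than taken from the data set.

```javascript
// Illustrative sketch only: function name and record fields are hypothetical.
function monthlyMeansByEra(records, cutoffYear, field) {
  const groups = {}; // key "era|month" -> running total and count
  for (const r of records) {
    const value = r[field];
    // Skip missing or non-numeric readings rather than treating them as 0.
    if (typeof value !== "number" || Number.isNaN(value)) continue;

    // Years at or after the cutoff count as "recent".
    const era = r.year >= cutoffYear ? "recent" : "historical";
    const key = era + "|" + r.month;
    if (!groups[key]) {
      groups[key] = { era: era, month: r.month, total: 0, count: 0 };
    }
    groups[key].total += value;
    groups[key].count += 1;
  }

  // One row per (month, era), ordered by month and then era name.
  return Object.values(groups)
    .map(g => ({ era: g.era, month: g.month, mean: g.total / g.count }))
    .sort((a, b) => a.month - b.month || a.era.localeCompare(b.era));
}
```

Calling `monthlyMeansByEra(records, 1990, "maxTemp")` and then again with a different cutoff year gives two sets of rows that could feed the side-by-side bars, so the cutoff becomes a single parameter rather than something fixed in the data preparation.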
- dimplejs.org - for api documentation
- d3.js - for api documentation
- http://dimplejs.org/advanced_examples_viewer.html?id=advanced_storyboard_control - template visualisation
- udacity training materials
- http://stackoverflow.com/questions/31892129/d3-js-rotate-text-to-vertical - rotate text labels in d3
- http://www.w3schools.com/colors/colors_hex.asp - colour codes
- https://data.gov.uk/dataset/historic-monthly-meteorological-station-data - data set