
Posted on Nov 6, 2014 in London, R Spatial, Visualisation | 6 comments

Improving R Data Visualisations Through Design

When I start an R class, one of my opening lines is nearly always that the software is now used by the likes of the New York Times graphics department or Facebook to manipulate their data and produce great visualisations. After saying this, however, I have always struggled to give tangible examples of how an R output blossoms into a stunning and informative graphic. That is until now…

I spent the past year working hard with an amazing designer, Oliver Uberti, to create a book of 100+ maps and graphics about London. The majority of the graphics we produced for London: The Information Capital required R code in some shape or form. This was used to do anything from simplifying millions of GPS tracks to creating bubble charts or simply drawing a load of straight lines. We had to produce a graphic every three days to hit the publication deadline, so without the efficiency of copying and pasting old R code, or the flexibility to do almost any kind of plot, the book would not have been possible. So for those of you out there interested in the process of creating great graphics with R, here are 5 graphics shown from the moment they came out of R to the moment they were printed.

[Figure: commute_flows_before_after]
This graphic shows the origin-destination flows of commuters in Southern England. In R I used the geom_segment() command from the brilliant ggplot2 package to draw slightly transparent white lines between the centroids of the origins and destinations. I thought my R export looked pretty good on black, but we then imported it into Adobe Illustrator and Oliver applied a series of additional transparency effects to the lines to make them glow against the dark blue background (a colour we use throughout the book).
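As a rough illustration of the geom_segment() approach described above, here is a minimal sketch. The `flows` data frame and its column names are made up for the example (the real commuter data are not shown in this post), so treat it as a starting point rather than the code used for the book.

```r
library(ggplot2)

# Hypothetical origin-destination table: one row per flow, holding the
# centroid coordinates of the origin (ox, oy) and destination (dx, dy).
set.seed(1)
flows <- data.frame(
  ox = runif(200), oy = runif(200),
  dx = runif(200), dy = runif(200)
)

# Slightly transparent white lines on a black background, so that
# overlapping flows build up into brighter corridors.
p <- ggplot(flows) +
  geom_segment(aes(x = ox, y = oy, xend = dx, yend = dy),
               colour = "white", alpha = 0.1) +
  coord_equal() +
  theme_void() +
  theme(plot.background = element_rect(fill = "black", colour = NA))

# print(p) draws the plot; ggsave("flows.pdf", p) would export a vector
# file that can be opened in Illustrator.
```

Exporting to a vector format such as PDF is what makes the later restyling possible, because each line remains an editable object.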
[Figure: day_night_before_after]
This is a crop from a graphic we produced to show the differences between the daytime and nighttime population of London (the crop here shows nighttime). It reuses the code I used to produce my Population Lines print, but Oliver went to the effort of manually cleaning the edges of the plot (I couldn’t work out how to automatically clip the lines in ggplot2!) by following the red line I over-plotted. Colours were tweaked and labels added, all in Illustrator.
[Figure: treasures_before_after]
One of my favourite graphics in the book shows the number of pieces of work by each artist in the Tate galleries. We can only show a small section here, but full-sized it looks spectacular as it features a Turner painting at its centre. The graphic started life as a treemap that simply scaled the squares by the number of works by each artist. R has a very easy to use treemap() function in the treemap package. Oliver then painstakingly broke the exported graphic to bits, converted the squares to picture frames and arranged them on “the wall”.
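For anyone who has not used the treemap package, a minimal sketch looks something like this. The artist names and counts are invented for the example; only the treemap() call itself, with its `index` and `vSize` arguments, comes from the package.

```r
library(treemap)

# Made-up counts of works per artist, for illustration only.
works <- data.frame(
  artist  = c("Artist A", "Artist B", "Artist C", "Artist D"),
  n_works = c(120, 45, 30, 5)
)

# One rectangle per artist, with area proportional to the number of works.
tm <- treemap(works,
              index = "artist",
              vSize = "n_works",
              title = "Works per artist (illustrative data)")
```

Wrapping the call in pdf("treemap.pdf") and dev.off() would give a vector export in which each rectangle can be picked apart by hand.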
[Figure: cycle_before_after]
This map, showing cyclists in London by time of day, was created from code similar to this graphic. It is an example where very little needed to be done to the plot once exported: we only really needed to add the River Thames (this could have been done in R), add some labels and then optimise the colours for printing. Hundreds of thousands of line segments are plotted here and the graphic is an excellent illustration of R’s power to plot large volumes of data.
[Figure: relationship_status_before_after]
The graphic above (full size here) has been the most popular from the book so far. It takes 2011 Census data and maps people by marital status as well as showing the absolute numbers as a streamgraph. ggplot2 was used to create both the maps and the plot. We stuck to the exported colours for the maps and then manually edited the streamgraph colours. The streamgraph was created with the geom_ribbon() function in ggplot2.
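ggplot2 has no dedicated streamgraph geom, so the ribbons have to be stacked by hand. The sketch below shows one way to do that, under some loud assumptions: the data are random, the x-axis is taken to be age, and the stack is simply centred on zero. It is not the code used for the book.

```r
library(ggplot2)

# Illustrative counts by age and marital status (random numbers).
set.seed(42)
d <- expand.grid(age = 16:90,
                 status = c("Single", "Married", "Divorced", "Widowed"))
d$count <- round(500 + 1000 * runif(nrow(d)))

# Stack the groups within each age, then shift each stack down by half
# its total so that the whole shape is centred on zero.
d <- d[order(d$age, d$status), ]
d$ymax <- ave(d$count, d$age, FUN = cumsum)
d$ymin <- d$ymax - d$count
total  <- ave(d$count, d$age, FUN = sum)
d$ymin <- d$ymin - total / 2
d$ymax <- d$ymax - total / 2

# One ribbon per status, running from ymin to ymax at each age.
p <- ggplot(d, aes(x = age, ymin = ymin, ymax = ymax, fill = status)) +
  geom_ribbon()
```

Because the offsets are computed before plotting, the colours and ordering of the bands can be changed later without touching the geometry.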
[Figure: london_inspired_before_after]

All the graphics shown so far started life as databases containing, as a minimum, several thousand rows of data. In this final example we show a “small data” example: the lives of 100 Londoners who have earned a blue plaque on one of London’s buildings. The data were manually compiled, with each person having 3 attributes against their name: the age they lived to, the age when they created their most significant work, and the period of their life they lived in London. Thanks to ggplot2 I was able to use the code below to generate the coarse-looking plot above. Oliver could then take this and flip it before restyling and adding labels in Illustrator. The key thing here was that a couple of lines of code in R saved a day of manually drawing lines.

# We order by the age at which the person started living in London; this is the order field.

library(ggplot2)

ggplot(Data, aes(order, origin)) +
  geom_segment(aes(xend = order, yend = Age)) +
  geom_segment(aes(x = order, y = st_age, xend = order, yend = end_age), col = "red") +
  geom_segment(aes(x = order, y = st_age2, xend = order, yend = end_age2), col = "yellow") +
  coord_polar()
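The `Data` object itself is not included in this post. To try the call above, a stand-in with the same column names can be simulated. Everything about the values below is invented, and the meaning I give to each column (`origin` as age zero, `st_age`/`end_age` and `st_age2`/`end_age2` as the start and end ages of two highlighted periods) is an assumption based on the description of the plot.

```r
library(ggplot2)

# Simulated stand-in for `Data`; the values are random, not the real
# blue plaque records.
set.seed(7)
n <- 100
Age      <- sample(40:95, n, replace = TRUE)            # age lived to
st_age   <- floor(Age * runif(n, 0.1, 0.5))             # assumed: start of first period
end_age  <- st_age + floor((Age - st_age) * runif(n, 0.2, 1))
st_age2  <- floor(Age * runif(n, 0.3, 0.8))             # assumed: start of second period
end_age2 <- pmin(st_age2 + 2, Age)

Data <- data.frame(origin = 0, Age, st_age, end_age, st_age2, end_age2)

# Sort by the start of the first period and store the rank in `order`.
Data <- Data[order(Data$st_age), ]
Data$order <- seq_len(n)

p <- ggplot(Data, aes(order, origin)) +
  geom_segment(aes(xend = order, yend = Age)) +
  geom_segment(aes(x = order, y = st_age, xend = order, yend = end_age), col = "red") +
  geom_segment(aes(x = order, y = st_age2, xend = order, yend = end_age2), col = "yellow") +
  coord_polar()
```

Each person becomes one spoke: a black line for the full lifespan with the two coloured periods drawn on top, and coord_polar() wraps the spokes into a circle.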

 

Purchase London: The Information Capital.

6 Comments

  1. Great Stuff! I’ll surely buy the book.
    Just checking … Is the R Code available with the book?

    • I’d like to know that too. If not in the book, on the companion website?

      • I hope to post some examples here soon. My code is a bit of a mess :-S

  2. Well done, this is what we really need with R. As an applied statistician, scientist, engineer & small hi-tech business owner, I shall be using this when I talk about statistical graphics. Many thanks.

  3. Some nice examples of bringing beauty to R output. Great to see someone championing improvements to the aesthetic of R output. It is a great tool, but beauty has never been its strong point.

    I look forward to reading the book.

  4. Nice work! What is the spec of the machine you used to generate this graphic and how long did it take to render? My R session freezes over when I try to reproduce this example. I have a Core i7 machine too with 16 gigs of ram…
