Great Maps with ggplot2

The above map (and this one) were produced using R and ggplot2, and they demonstrate just how sophisticated R visualisations can be. We are used to seeing similar maps produced with conventional GIS platforms or software such as Processing, but I hadn’t yet seen one from the R community (feel free to suggest some in the comments). The map contains three layers: buildings, water and the journey segments. The most challenging aspect was changing the standard line ends in geom_segment from “butt” to “round” so that the lines appear continuous rather than showing “cracks” where segments meet (see below).
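The “cracks” come from how thick lines are capped. As a minimal sketch using the grid package (which ggplot2 draws with), the snippet below builds the same two connected segments with each kind of line end. The coordinates, line width and the helper name make_legs are made up for illustration.

```r
library(grid)  # ships with base R

# Two thick segments that meet at (0.5, 0.5), like consecutive journey legs.
# make_legs() is a hypothetical helper, not part of grid or ggplot2.
make_legs <- function(lineend) {
  segmentsGrob(
    x0 = c(0.1, 0.5), y0 = c(0.2, 0.5),
    x1 = c(0.5, 0.9), y1 = c(0.5, 0.2),
    default.units = "npc",
    gp = gpar(lwd = 20, col = "grey30", lineend = lineend)
  )
}

butt_legs  <- make_legs("butt")   # flat caps: a notch shows where the legs meet
round_legs <- make_legs("round")  # rounded caps: the joint is filled in

# Draw both versions, one above the other, when run interactively.
if (interactive()) {
  grid.newpage()
  pushViewport(viewport(layout = grid.layout(2, 1)))
  pushViewport(viewport(layout.pos.row = 1)); grid.draw(butt_legs);  popViewport()
  pushViewport(viewport(layout.pos.row = 2)); grid.draw(round_legs); popViewport()
  popViewport()
}
```

With flat caps each segment stops exactly at its end point, so two thick segments meeting at an angle leave a wedge-shaped gap on the outside of the bend. Round caps cover that gap.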

I am grateful to Hadley and the rest of the ggplot2 Google Group for the solution. You can see it here. From this point I layered the plots using the geom_polygon() command for the buildings and water bodies and my new function geom_segment2() for the journey segments. The segment data were simply the start and end latitudes and longitudes for each node in the road network and the number of times a cyclist passed between them. I have included the code below.
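For readers unsure how the journey data are laid out, here is a small sketch of the assumed format: one row per segment, semicolon-separated, no header row, in the column order used in the code (stlat, stlon, elat, elong, count). The coordinates and counts are invented for illustration.

```r
# Toy rows: start lat; start lon; end lat; end lon; count.
# All values are made up for illustration.
raw <- "51.5007;-0.1246;51.5033;-0.1195;120
51.5033;-0.1195;51.5055;-0.1132;85
51.5055;-0.1132;51.5081;-0.0759;40"

# The data carry no header row, so header = FALSE; fields are split on ";".
segs <- read.csv(text = raw, header = FALSE, sep = ";")
names(segs) <- c("stlat", "stlon", "elat", "elong", "count")

str(segs)
```

If your own file does have a header row, read it with header = TRUE instead. Otherwise the column names are read as data, the coordinates become text, and ggplot2 complains about a discrete value on a continuous scale.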

#Code supplied by james cheshire Feb 2012
#load packages and enter development mode
library('devtools')
dev_mode()
library(ggplot2)
library(proto)

#if your map data is a shapefile use maptools
library(maptools)
gpclibPermit()

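#create GeomSegment2 function
#(written for the proto-based ggplot2 of early 2012; the assignments and
#function signatures were lost from the listing and are reconstructed here,
#so treat them as assumptions)
GeomSegment2 <- proto(ggplot2:::GeomSegment, {
  objname <- "geom_segment2"
  draw <- function(., data, scales, coordinates, arrow = NULL, ...) {
    if (is.linear(coordinates)) {
      return(with(coord_transform(coordinates, data, scales),
        segmentsGrob(x, y, xend, yend, default.units = "native",
          gp = gpar(col = alpha(colour, alpha), lwd = size * .pt,
            lty = linetype, lineend = "round"),
          arrow = arrow)
      ))
    }
}})

geom_segment2 <- function(mapping = NULL, data = NULL, stat = "identity", position = "identity", arrow = NULL, ...) {
  GeomSegment2$new(mapping = mapping, data = data, stat = stat,
    position = position, arrow = arrow, ...)
}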

#load data: stlat/stlon are the start points, elat/elong are the end points of the lines
lon <- read.csv("bikes_london.csv", header=F, sep=";")
names(lon) <- c("stlat", "stlon", "elat", "elong", "count")

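#load spatial data. You need to fortify if loaded as a shapefile
#(the original calls and file names were lost; readShapePoly() from maptools is
#assumed, and "water.shp" / "buildings.shp" are placeholder file names)
water <- fortify(readShapePoly("water.shp"))
built <- fortify(readShapePoly("buildings.shp"))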

#This step removes the axes labels etc when called in the plot.
xquiet<-scale_x_continuous("", breaks=NA)
yquiet<-scale_y_continuous("", breaks=NA)
quiet<-list(xquiet, yquiet)

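#create base plot
#(right-hand side lost from the listing; assumed to map the segment start points)
plon1 <- ggplot(lon, aes(x=stlon, y=stlat))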

#ready the plot layers
pbuilt<-c(geom_polygon(data=built, aes(x=long, y=lat, group=group), colour="#4B4B4B", fill="#4F4F4F", lwd=0.2))
pwater<-c(geom_polygon(data=water, aes(x=long, y=lat, group=group), colour="#708090", fill="#708090"))

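#create plot
#(right-hand side lost from the listing; this is an assumed minimal layering of
#buildings, water and segments, and any size or colour scales in the original are not recovered)
plon2 <- plon1 + pbuilt + pwater + geom_segment2(data=lon, aes(x=stlon, y=stlat, xend=elong, yend=elat, size=count)) + quiet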

#done
plon2
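The listing above relies on ggplot2 internals from 2012. In more recent ggplot2 releases geom_segment() accepts a lineend argument directly, so a custom geom may not be needed. The following is a minimal sketch rather than the original method, assuming ggplot2 3.4 or later (for the linewidth aesthetic). It uses made-up toy data in place of the London shapefiles and bikes_london.csv, with an arbitrary line colour.

```r
library(ggplot2)  # assumes ggplot2 >= 3.4 (linewidth aesthetic)

# Made-up stand-ins for the real layers: one square "building" polygon
# and three journey segments with counts.
built <- data.frame(
  long  = c(-0.125, -0.120, -0.120, -0.125),
  lat   = c(51.501, 51.501, 51.504, 51.504),
  group = 1
)
lon <- data.frame(
  stlat = c(51.5007, 51.5033, 51.5055),
  stlon = c(-0.1246, -0.1195, -0.1132),
  elat  = c(51.5033, 51.5055, 51.5081),
  elong = c(-0.1195, -0.1132, -0.0759),
  count = c(120, 85, 40)
)

# Layers are drawn in the order they are added: polygons first, lines on top.
p <- ggplot() +
  geom_polygon(data = built, aes(x = long, y = lat, group = group),
               colour = "#4B4B4B", fill = "#4F4F4F") +
  geom_segment(data = lon,
               aes(x = stlon, y = stlat, xend = elong, yend = elat,
                   linewidth = count),
               colour = "#FFFF33", lineend = "round") +
  theme_void()  # hides axes, labels and grid lines

if (interactive()) print(p)
```

Because later layers are painted over earlier ones, the journey segments are added last so they sit above the buildings.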

40 Comments

  1. andrew dempsey

    WOW, that is an amazing map from R.

    Would it be too rude to ask you to post the code, or at least a tutorial / example version of it? Especially how you go about getting data for each of the layers and using it within the plot

    Do you query openstreetmap dynamically, or manually? etc.

    I can see doing some US postal analysis using a mix of open street map and our own internal data.

      1. andrew dempsey

        Thanks!

        Bear with me, I am a very junior R person (but excessively senior with BI tools and SQL)

        Your inputs are….
        bikes_london, which literally has each straight line from point A to point B with the count of cycles hired (which controls line weight). My equivalent to this would be a file of routes, likely replacing count with a route ID and having different colors per route.

        I would also need a similar file if I wanted to plot roads

        Shapefiles for water and buildings – silly question, where do you get such files? I would be looking for US data.

        Does the order of the layers matter? It looks like they are specified from the top down.

        Once again, this is a fantastic map.

      2. Raghu

        Great work! Is the “bikes_london.csv” dataset available to check how data is stored? How is stlat and stlon different from elat and elon? I want to see how the data is to be stored.

  2. Roger Bivand

    Very nice! Is the use of gpclibPermit() strictly necessary, and if so for what? It has a very restrictive license, which you see if you look at the source.

    1. James Author

      Thanks Roger, I’m glad you like it. gpclibPermit() is required for fortify()- I find it a convenient way to extract the polygon boundary coordinates into a format that works with ggplot(). There are obviously ways around this but I didn’t want to complicate the code too much here.

      1. Roger Bivand

        In fact, gpclibPermit() may only be needed on systems without rgeos installed, as unionSpatialPolygons() inside fortify.sp() will use rgeos rather than gpclib if available, even if gpclibPermit() has been issued. And ggplot2 could drop suggesting gpclib, because maptools already suggests it; it is only used in that one place.

        So typically it is only needed on OSX systems where the user doesn’t setRepositories(ind=1:2) before installing spatial things. My guess is that fortify.sp will use rgeos if available. Could you contact me off-page if you could let me check the call logic avoiding gpclib in your case?

        The concern about avoiding gpclib is motivated by movement towards licence-based filtering on CRAN, because institutions will want positive confirmation that their software train is not encumbered – rgeos is not encumbered, gpclib is, unfortunately.

  3. Tyler Rinker

    Thanks for sharing. This is terrific data visualization. It really speaks for itself. You’ve really captured the data side of things while appealing to the aesthetics. It’s inspired me to delve deeper into ggplot2 in combination with various mapping resources. Thanks again.

  4. Jeff

    Thanks for sharing.
    I’m trying to execute your code but I’m always getting this error when I try to print out the result:

    Error : Discrete value supplied to continuous scale

    Do you have any clue?

    Thanks

    1. Jeff

      Found!
      I had a header at the first line of my CSV file and according to this line of code:
      lon<- read.csv("bikes_london.csv", header=F, sep=";")
      header was set to (F)alse…

      Thanks

  5. maj

    Looks great!

    I tried to run your code, but apparently R does not know how to interpret the space between ‘GeomSegment2’ and ‘objname’ (and also between draw and if, I would guess) in the lines

    #create GeomSegment2 function
    GeomSegment2 objname <- “geom_segment2″
    draw if (is.linear(coordinates)) {

    Do I have to change anything about the code to make it run?

    I would appreciate your help.

  6. Rachel

    This is amazing, this has opened new doors as to how R can be used even in spatial visualisations. Appreciated the sharing of knowledge, thank you!
