In this practical you’ll practice customizing plots created using the ggplot2 package. By the end of this practical you will know how to:

  1. Use facetting to create multiple plots.
  2. Use scaling to alter the plot's dimensions.
  3. Alter and store themes to adjust a plot's appearance.
  4. Create multiple plots in one using grid.arrange().
  5. Create image files using ggsave().


A - Setup

  1. Open your BernRBootcamp R project. It should already have the folders 1_Data and 2_Code. Make sure that the data files listed in the Datasets section above are in your 1_Data folder.
  1. Open a new R script. At the top of the script, using comments, write your name and the date. Save it as a new file called plottingII_practical.R in the 2_Code folder.

  2. Using library(), load the packages for this practical listed in the functions section above.

## Plotting Practical

  1. For this practical, we’ll use the crime.csv data set, containing crime data of US counties across various states. Using read_csv(), load the data into R and store it as a new object called crime.
crime <- read_csv("1_Data/crime.csv")
  1. Take a look at the first few rows of the data set(s) by printing them to the console.
# A tibble: 1,071 x 36
   communityname state population householdsize pctUrban medIncome pctWSocSec
   <chr>         <chr>      <dbl>         <dbl>    <dbl>     <dbl>      <dbl>
 1 BerkeleyHeig… NJ         11980          3.1       100     75122       23.6
 2 Marpletownsh… PA         23123          2.82      100     47917       35.5
 3 Norwoodtown   MA         28700          2.6       100     42805       30.2
 4 Wacocity      TX        103590          2.62      100     17852       29.1
 5 Shermancity   TX         31601          2.54      100     24763       32.7
 6 SanPablocity  CA         25158          2.89      100     25479       23.0
 7 Glendalecity  CA        180038          2.62      100     34372       20.3
 8 Worthingtonc… OH         14869          2.67      100     49851       23.8
 9 Arlingtoncity TX        261721          2.6       100     35048       11.0
10 Marinacity    CA         26436          3.34      100     29043       10.7
# … with 1,061 more rows, and 29 more variables: pctWRetire <dbl>,
#   whitePerCap <dbl>, blackPerCap <dbl>, AsianPerCap <dbl>, HispPerCap <dbl>,
#   PctPopUnderPov <dbl>, PctNotHSGrad <dbl>, PctUnemployed <dbl>,
#   TotalPctDiv <dbl>, PersPerFam <dbl>, PctWorkMom <dbl>, NumImmig <dbl>,
#   PctImmigRecent <dbl>, PctNotSpeakEnglWell <dbl>, RentMedian <dbl>,
#   NumInShelters <dbl>, NumStreet <dbl>, PctForeignBorn <dbl>,
#   PctBornSameState <dbl>, LandArea <dbl>, PopDens <dbl>,
#   PctUsePubTrans <dbl>, murders <dbl>, robberies <dbl>, assaults <dbl>,
#   burglaries <dbl>, larcenies <dbl>, autoTheft <dbl>, arsons <dbl>

B - Create facets

  1. To begin with, create a basic scatter plot (using geom_point()) pitting each of the 7 crime indicators (murders, robberies, assaults, burglaries, larcenies, autoTheft, arsons) against the percentage of people using public transportation (PctUsePubTrans).
ggplot(data = crime,
       mapping = aes(x = XX, y = XX)) + 
  geom_point()

# and so on
  1. None of these looked very informative, right? This is because the counts of each of the crime measures are heavily right-skewed. You will learn more about scaling later, but for now add scale_y_log10() to fix this. Run the plots again.
ggplot(data = crime,
       mapping = aes(x = XX, y = XX)) + 
  geom_point() + 
  scale_y_log10()

# and so on
  1. This should have been more telling. Pretty much all crimes seem to be positively related to the percentage of individuals using public transportation. Interesting! But wasn’t it a bit of a pain to derive this insight by creating 7 separate plots? Let’s fix this using facets. To do this, first create a long version of the crime data set called crime_long, using the code below. (Note the use of crime_vars as a positive selector for pivot_longer().)
# vector of crime variables
crime_vars = c("murders","robberies","assaults","burglaries","larcenies","autoTheft","arsons")

# transform to long
crime_long <- crime %>% 
  pivot_longer(names_to = "crime_var",
               values_to = "frequency",
               cols = all_of(crime_vars)) # all_of() selects the columns named in the vector
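
To see what pivot_longer() does to the shape of the data, it can help to try it on a tiny made-up example first. The sketch below uses a hypothetical two-row tibble called toy (not part of the practical's data) laid out like crime, with one column per crime measure.

```r
library(tidyr)
library(tibble)

# Hypothetical toy data: one row per community, one column per crime measure
toy <- tibble(
  communityname = c("A", "B"),
  murders       = c(1, 0),
  arsons        = c(3, 2)
)

# Names of the columns to stack on top of each other
toy_vars <- c("murders", "arsons")

# Each community now gets one row per crime measure
toy_long <- pivot_longer(toy,
                         cols = all_of(toy_vars),
                         names_to = "crime_var",
                         values_to = "frequency")

toy_long
```

The two wide columns become a crime_var column holding the former column names and a frequency column holding the values, so 2 rows and 2 measures turn into 4 rows.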
  1. Using the crime_long data set, you can now make use of the amazing power of ggplot2’s facet functions, such as facet_wrap(). Use facet_wrap() to automatically plot crime frequency against the percentage of people using public transportation for each of the crime variables.
ggplot(data = crime_long,
       mapping = aes(x = XX, y = XX)) + 
  geom_point() + 
  scale_y_log10() + 
  facet_wrap(~ XX)
ggplot(data = crime_long,
       mapping = aes(x = PctUsePubTrans, y = frequency)) + 
  geom_point() + 
  scale_y_log10() + 
  facet_wrap(~ crime_var)

  1. This was much more efficient, right? Now explore the relationship of frequency to other variables, such as medIncome, TotalPctDiv, or PctNotHSGrad, for each of the crime measures. Which variables predict which kind of crime? Explore!
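
When trying out several predictors, retyping the whole plot each time gets tedious. One possible shortcut, sketched here under assumptions, is a small helper function that takes the name of the x variable as a string. The function plot_crime_by() and the made-up toy_long data are hypothetical and only illustrate the idea; with the real data you would pass crime_long instead.

```r
library(ggplot2)

# Hypothetical helper (not part of the practical): plot frequency against
# any predictor column, with one panel per crime measure
plot_crime_by <- function(data, x_var) {
  ggplot(data = data,
         mapping = aes(x = .data[[x_var]], y = frequency)) +
    geom_point() +
    scale_y_log10() +
    facet_wrap(~ crime_var)
}

# Tiny made-up stand-in for crime_long, only to show the call
toy_long <- data.frame(
  medIncome = c(20000, 40000, 60000, 20000, 40000, 60000),
  crime_var = rep(c("murders", "arsons"), each = 3),
  frequency = c(10, 5, 2, 30, 12, 4)
)

p <- plot_crime_by(toy_long, "medIncome")
p
```

The .data[[x_var]] construct looks up the column whose name is stored in x_var, so the same function works for medIncome, TotalPctDiv, or any other column.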

C - Customize plots using theme()

Now that we have an informative plot, let’s focus on making it a bit more “pretty”, using ggplot’s theme() function. The goal is to create a plot that looks like the plot below.

crime_facets <- ggplot(data = crime_long,
       mapping = aes(x = PctUsePubTrans, y = frequency)) + 
  geom_point() + 
  scale_y_log10() + 
  facet_wrap(~ crime_var) +
  theme(
    panel.background = element_rect(fill='white'),
    panel.grid.major = element_line(color = 'grey75',
                                    size = .25),
    panel.grid.minor = element_line(color = 'grey75',
                                    size = .1),
    strip.background = element_rect(fill='white'),
    strip.text = element_text(face='italic', size=12, hjust=1),
    axis.title.y = element_text(size=12,margin=margin(r = 10)),
    axis.title.x = element_text(size=12,margin=margin(t = 10)),
    panel.spacing = unit(1.1, "lines")) + 
  labs(x = '% public transportation', y = 'Crime frequency')

  1. To begin with, store one of the facetted plots of section B as crime_facets.
crime_facets <- XX
  1. Now let’s begin changing its appearance. First, change the color of the panel background to "white" using the panel.background argument and the element_rect() function.
crime_facets + 
  theme(
    panel.background = element_rect(fill = XX)
    )
crime_facets + 
  theme(
    panel.background = element_rect(fill = 'white')
    )

  1. Next, change the major and minor grid lines to color "grey75" and sizes .25 and .1, respectively, using the panel.grid.major and panel.grid.minor arguments and the element_line() function. (In ggplot2 3.4.0 and later, the size argument of element_line() is deprecated in favor of linewidth; size still works but triggers a warning.)
crime_facets + 
  theme(
    panel.background = element_rect(fill = XX),
    panel.grid.major = element_line(color = XX, size = XX),
    panel.grid.minor = element_line(color = XX, size = XX)
    )
crime_facets + 
  theme(
    panel.background = element_rect(fill = 'white'),
    panel.grid.major = element_line(color = 'grey75', size = .25),
    panel.grid.minor = element_line(color = 'grey75', size = .1)
    )

  1. Next, change the strip background - the background of the panel headers - to color "white" using the strip.background argument and the element_rect() function.
crime_facets + 
  theme(
    panel.background = element_rect(fill = XX),
    panel.grid.major = element_line(color = XX, size = XX),
    panel.grid.minor = element_line(color = XX, size = XX),
    strip.background = element_rect(fill = XX)
    )
crime_facets + 
  theme(
    panel.background = element_rect(fill = 'white'),
    panel.grid.major = element_line(color = 'grey75', size = .25),
    panel.grid.minor = element_line(color = 'grey75', size = .1),
    strip.background = element_rect(fill = 'white')
    )

  1. Next, change the font in the strip to "italic", align it to the right, and set its size to 12 using the strip.text argument and the element_text() function. See ?element_text.
crime_facets + 
  theme(
    panel.background = element_rect(fill = XX),
    panel.grid.major = element_line(color = XX, size = XX),
    panel.grid.minor = element_line(color = XX, size = XX),
    strip.background = element_rect(fill = XX),
    strip.text = element_text(face = XX, size = XX, hjust = XX)
    )
crime_facets + 
  theme(
    panel.background = element_rect(fill = 'white'),
    panel.grid.major = element_line(color = 'grey75', size = .25),
    panel.grid.minor = element_line(color = 'grey75', size = .1),
    strip.background = element_rect(fill = 'white'),
    strip.text = element_text(face = 'italic', size = 12, hjust = 1)
    )

  1. Next, set the font size of the axis labels to 12 as well, and add a margin of 10 to the top of the x-axis label and to the right of the y-axis label, using the axis.title.x and axis.title.y arguments and the element_text() and margin() functions. See ?margin.
crime_facets + 
  theme(
    panel.background = element_rect(fill = XX),
    panel.grid.major = element_line(color = XX, size = XX),
    panel.grid.minor = element_line(color = XX, size = XX),
    strip.background = element_rect(fill = XX),
    strip.text = element_text(face = XX, size = XX, hjust = XX),
    axis.title.x = element_text(size = XX, margin = margin(t = XX)),
    axis.title.y = element_text(size = XX, margin = margin(r = XX))
    )
crime_facets + 
  theme(
    panel.background = element_rect(fill = 'white'),
    panel.grid.major = element_line(color = 'grey75', size = .25),
    panel.grid.minor = element_line(color = 'grey75', size = .1),
    strip.background = element_rect(fill = 'white'),
    strip.text = element_text(face = 'italic', size = 12, hjust = 1),
    axis.title.x = element_text(size = 12, margin = margin(t = 10)),
    axis.title.y = element_text(size = 12, margin = margin(r = 10))
    )

  1. Finally, increase the spacing between the panels slightly by setting the space between to 1.1 "lines" using the panel.spacing argument and the unit function.
crime_facets + 
  theme(
    panel.background = element_rect(fill = XX),
    panel.grid.major = element_line(color = XX, size = XX),
    panel.grid.minor = element_line(color = XX, size = XX),
    strip.background = element_rect(fill = XX),
    strip.text = element_text(face = XX, size = XX, hjust = XX),
    axis.title.x = element_text(size = XX, margin = margin(t = XX)),
    axis.title.y = element_text(size = XX, margin = margin(r = XX)),
    panel.spacing = unit(XX, units = XX)
    )
crime_facets + 
  theme(
    panel.background = element_rect(fill = 'white'),
    panel.grid.major = element_line(color = 'grey75', size = .25),
    panel.grid.minor = element_line(color = 'grey75', size = .1),
    strip.background = element_rect(fill = 'white'),
    strip.text = element_text(face = 'italic', size = 12, hjust = 1),
    axis.title.x = element_text(size = 12, margin = margin(t = 10)),
    axis.title.y = element_text(size = 12, margin = margin(r = 10)),
    panel.spacing = unit(1.1, units = "lines")
    )

  1. Did you manage to reproduce the plot above? One other thing seems missing. Add appropriate labels using the labs() function.
crime_facets + 
  theme(
    panel.background = element_rect(fill = XX),
    panel.grid.major = element_line(color = XX, size = XX),
    panel.grid.minor = element_line(color = XX, size = XX),
    strip.background = element_rect(fill = XX),
    strip.text = element_text(face = XX, size = XX, hjust = XX),
    axis.title.x = element_text(size = XX, margin = margin(t = XX)),
    axis.title.y = element_text(size = XX, margin = margin(r = XX)),
    panel.spacing = unit(XX, units = XX)
    ) + 
  labs(x = XX, y = XX)
crime_facets + 
  theme(
    panel.background = element_rect(fill = 'white'),
    panel.grid.major = element_line(color = 'grey75', size = .25),
    panel.grid.minor = element_line(color = 'grey75', size = .1),
    strip.background = element_rect(fill = 'white'),
    strip.text = element_text(face = 'italic', size = 12, hjust = 1),
    axis.title.x = element_text(size = 12, margin = margin(t = 10)),
    axis.title.y = element_text(size = 12, margin = margin(r = 10)),
    panel.spacing = unit(1.1, units = "lines")
    ) + 
  labs(x = '% public transportation', y = 'Crime frequency')

D - Store and reuse themes

  1. Once you have managed to reproduce the target theme, save all of the theme settings in an independent object called crime_theme.
crime_theme <- theme(
  XX = XX,
  XX = XX
  )
crime_theme <- theme(
    panel.background = element_rect(fill = 'white'),
    panel.grid.major = element_line(color = 'grey75', size = .25),
    panel.grid.minor = element_line(color = 'grey75', size = .1),
    strip.background = element_rect(fill = 'white'),
    strip.text = element_text(face = 'italic', size = 12, hjust = 1),
    axis.title.x = element_text(size = 12, margin = margin(t = 10)),
    axis.title.y = element_text(size = 12, margin = margin(r = 10)),
    panel.spacing = unit(1.1, units = "lines")
    )
  1. Now create new plots with different variables on the x-axis and simply add crime_theme to apply your personalized theme.
new_crime_plot + crime_theme
  1. If you don’t like your theme, go back and make changes to it, and then apply your new theme onto your plots. Go explore! Try out other arguments of theme() (see ?theme), such as axis.ticks or strip.placement.
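
A stored theme is an ordinary R object, so it can be added to any number of plots and combined with built-in themes. The following is a minimal sketch using the mpg data that ships with ggplot2; the object name demo_theme and its settings are arbitrary illustration values, not the target theme of this practical.

```r
library(ggplot2)

# A small stored theme; the settings are arbitrary illustration values
demo_theme <- theme(
  panel.background = element_rect(fill = "white"),
  strip.text = element_text(face = "italic", size = 12)
)

# The same theme object can be added to different plots
p1 <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  facet_wrap(~ class) +
  demo_theme

p2 <- ggplot(mpg, aes(x = cty, y = hwy)) +
  geom_point() +
  demo_theme

# Themes can be layered: start from a built-in theme, then override parts of it
p3 <- p2 + theme_bw() + demo_theme
```

Order matters when layering: settings added later override earlier ones, which is why demo_theme comes after theme_bw() in p3.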

E - Scaling

When creating a plot, ggplot automatically chooses sensible settings in terms of x- and y-axis limits, geom sizes, or colors. However, all of these aspects of the plot can also be controlled manually or semi-manually using various scale_* functions.

  1. Before playing around with them, add one more element to your plot, which will help you to realize the importance of scaling. That is, color the points according to state by mapping the state variable onto the color argument and size the points according to the county’s population by mapping the population variable onto the size argument. Store the resulting plot in an object called crime_plot.
crime_plot <- 
  ggplot(data = crime_long,
       mapping = aes(x = XX, y = XX,
                     color = XX, size = XX)) + 
  geom_point() + 
  scale_y_log10() + 
  facet_wrap(~ XX)
crime_plot <- ggplot(data = crime_long,
       mapping = aes(x = PctUsePubTrans, y = frequency,
                     color = state, size = population)) + 
  geom_point() + 
  scale_y_log10() + 
  facet_wrap(~ crime_var)

  1. First, using scale_size() and the range argument, change the scaling of the points to reduce the degree of overlap among the points (see ?scale_size). Try out a few numbers (smaller than 10) to create a version of the plot with a decent trade-off between point size and point overlap.
crime_plot + scale_size(range = c(XX, XX))
crime_plot + scale_size(range = c(.5, 3))

  1. You may find that some of the larger points are still cropped off at the upper end of the panels. Fix this by increasing the y-axis limits using the scale_y_log10() function. Set the limits to 1 and 2e+5 (i.e., 200,000). (Note that R will tell you that this replaces the previous use of scale_y_log10(), which is what we intend.)
crime_plot + 
  scale_size(range = c(XX, XX)) + 
  scale_y_log10(limits = c(XX, XX))
crime_plot + 
  scale_size(range = c(.5, 3)) + 
  scale_y_log10(limits = c(1, 2e+5))
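
The lower limit in the solution is 1 rather than 0 because limits on a log scale are themselves log-transformed. A quick check in the console illustrates what happens to candidate limits:

```r
# On a log10 axis, each limit is log-transformed before it is used
candidate_limits <- c(0, 1, 2e5)
transformed <- log10(candidate_limits)

# 0 maps to -Inf, so it cannot serve as a finite axis limit
transformed
is.finite(transformed)
```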

  1. Next, change the colors to a different, possibly more appropriate color scheme. One way to do this is via scale_color_gradient() or similar functions. Another is to use a specific, pre-defined scheme, such as scale_color_colorblind() from the ggthemes package. Use the latter. You will see that the colors have much more contrast, making it easier to distinguish them based on luminance alone.
crime_plot + 
  scale_size(range = c(XX, XX)) + 
  scale_y_log10(limits = c(1, XX)) + 
  XX()
crime_plot + 
  scale_size(range = c(.5, 3)) + 
  scale_y_log10(limits = c(1, 2e+5)) + 
  scale_color_colorblind()

  1. Another approach to changing colors is to supply them manually, e.g., using scale_color_manual(). Try assigning your own choice of colors. You may pick them from colors() or generate them using, for instance, the viridis() function from the viridis package (you may need to run install.packages('viridis') before using it), which provides an optimized set of colors designed to be (1) colorful, (2) perceptually uniform, (3) robust to colorblindness, and (4) pretty. Take the latter approach, i.e., use the viridis() function to generate colors within scale_color_manual().
crime_plot + 
  scale_size(range = c(XX, XX)) + 
  scale_y_log10(limits = c(1, XX)) + 
  scale_color_manual(values = viridis(7))
crime_plot + 
  scale_size(range = c(.5, 3)) + 
  scale_y_log10(limits = c(1, 2e+5)) + 
  scale_color_manual(values = viridis(7))
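
It may help to look at what viridis() actually returns before handing it to scale_color_manual(). The sketch below inspects the generated colors and, as an alternative not used in the practical, shows ggplot2's own scale_color_viridis_d(), which picks the number of colors automatically. It uses the built-in mpg data; the object names are illustrative.

```r
library(ggplot2)
library(viridis)

# viridis(n) returns n hex color codes as a character vector
state_colors <- viridis(7)
state_colors

# Manual approach: the number of colors must cover the number of groups
# (mpg$class has 7 levels)
p_manual <- ggplot(mpg, aes(x = displ, y = hwy, color = class)) +
  geom_point() +
  scale_color_manual(values = state_colors)

# Alternative: ggplot2's discrete viridis scale chooses the colors itself
p_auto <- ggplot(mpg, aes(x = displ, y = hwy, color = class)) +
  geom_point() +
  scale_color_viridis_d()
```

With scale_color_manual(), supplying fewer colors than there are groups produces an error, so the count passed to viridis() has to match the data.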

  1. Alright, the plot is fairly pretty and readable now. But there is always more to be done and tastes differ, of course. Go explore!

F - Creating image files

  1. When you have found a plot that suits your taste, it’s time to save it as an image file. Store your favorite plot in a new object called crime_final.
crime_final <- ggplot(...) + ... # Include your plotting code here
  1. Run your crime_final object to see that it does indeed contain your plot.

  2. Save your plot to a .pdf file called crime_final using ggsave(). When you finish, find your plot in 3_Figures and open it to see how it looks!

# Save crime_final to a pdf file
ggsave(filename = "crime_final.pdf", 
       plot = crime_final,
       device = "pdf", 
       path = '3_Figures',
       width = 4, 
       height = 4, 
       units = "in")
  1. Play around with the width and height arguments to change the dimensions of the plot.

  2. Customize the code to create a .png image.
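
For the .png variant, one possible approach is sketched below. It writes to a temporary folder so that it runs anywhere; in the practical you would use your 3_Figures folder instead. The names demo_plot and out_dir are illustrative, and the dpi value is just a common choice for print-quality raster images.

```r
library(ggplot2)

# Small example plot based on the built-in mpg data
demo_plot <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

# Temporary folder used here only so the example is self-contained
out_dir <- tempdir()

# The .png extension tells ggsave() which graphics device to use;
# dpi controls the resolution of raster formats such as png
ggsave(filename = "demo_plot.png",
       plot = demo_plot,
       path = out_dir,
       width = 4,
       height = 4,
       units = "in",
       dpi = 300)

# Check that the file was written
file.exists(file.path(out_dir, "demo_plot.png"))
```

Unlike a pdf, a png is a raster image, so width, height, and dpi together determine the pixel dimensions of the file.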

X - Advanced: Maps

  1. ggplot2 also allows you to work with maps. The code below plots a simple globe representation of the world.
# map_data() needs the maps package, coord_map() the mapproj package
world <- map_data("world")
map <- ggplot() +
  geom_polygon(data = world,  aes(long, lat, group = group), fill = "grey") +
  coord_map("ortho", orientation = c(30, 55, 0))

# show the map
map

  1. You can add points to the map by creating a locations tibble containing longitude (lon) and latitude (lat) values and then using it in, e.g., geom_point().
# define locations tibble
locations <- tibble(
  city = c('Basel', 'New Delhi'),
  lon = c(7.58,  77.21),
  lat = c(47.55, 28.64))

# add locations to map
map + 
  geom_point(data = locations, 
             mapping = aes(x = lon, y = lat), 
             color = "steelblue", size = 4)

  1. In a similar fashion, you can also add lines. For lines, however, the map needs to be plotted flat, i.e., without the globe projection.
# define itineraries tibble
itineraries <- tibble(
  city = c('Basel-New Delhi'),
  lon_start = 7.58,
  lon_end = 77.21,
  lat_start = 47.55,
  lat_end = 28.64)

# flat map
map <- ggplot() +
  geom_polygon(data = world,  aes(long, lat, group = group), fill = "grey")

# add itinerary and locations to map
map + 
  geom_curve(data = itineraries,
             mapping = aes(x = lon_start, y = lat_start,
                           xend = lon_end, yend = lat_end),
                           col = 'steelblue', lty = 2, lwd = 1.2) +
  geom_point(data = locations, 
             mapping = aes(x = lon, y = lat), 
             color = "steelblue", size = 4)

X - Advanced: Interactive with plotly::ggplotly()

  1. With the ggplotly() function from the plotly package, you can turn any ggplot object into an interactive plot like the one below! Run the following code to see it in action.
# Create a standard ggplot object
crime_plot <- ggplot(data = crime_long,
                     mapping = aes(x = PctUsePubTrans, y = frequency,
                                   color = state, size = population)) + 
  geom_point()

# Make it interactive with ggplotly()!
ggplotly(crime_plot)
  1. Play around with your plot! See what happens when you hover over the points with your mouse. You can even zoom in by dragging your mouse.

  2. Try turning one of your favorite previous plots into an interactive plotly plot using the ggplotly() function!


# ggplot2 -----------------------

library(tidyverse) # Load tidyverse (contains ggplot2!)

# create a scatter plot of highway miles per gallon against engine displacement
ggplot(data = mpg, 
       mapping = aes(x = displ, y = hwy)) +
  geom_point()

# Store plot objects ------------

# store
my_plot <- ggplot(data = mpg, 
       mapping = aes(x = displ, y = hwy)) +
  geom_point()

# evaluate (aka plot)
my_plot

# Facets ------------

# create separate plots for each car class
my_plot <- my_plot + facet_wrap(~class)

# plot
my_plot

# Customize themes ------------

# change panel background to 'green'
my_plot +
  theme(
    panel.background = element_rect(fill='green')
    )

# change grid lines
my_plot +
  theme(
    panel.grid.major = element_line(color = 'red', size = 2),
    panel.grid.minor = element_line(color = 'blue', size = 1)
    )

# change strip background and text
my_plot +
  theme(
    strip.background = element_rect(fill = 'blue'),
    strip.text = element_text(face = 'bold', size = 12)
    )

# change axis titles
my_plot +
  theme(
    axis.title.y = element_text(size = 12, margin = margin(r = 10)),
    axis.title.x = element_text(size = 12, margin = margin(t = 10))
    )

# change panel spacing
my_plot +
  theme(
    panel.spacing = unit(2, "lines")
    )

# Store themes ------------

# create theme
my_theme <- theme(
  panel.background = element_rect(fill='green'),
  panel.grid.major = element_line(color = 'red', size = 2),
  panel.grid.minor = element_line(color = 'blue', size = 1),
  strip.background = element_rect(fill = 'blue'),
  strip.text = element_text(face = 'bold', size = 12),
  axis.title.y = element_text(size = 12, margin = margin(r = 10)),
  axis.title.x = element_text(size = 12, margin = margin(t = 10)),
  panel.spacing = unit(2, "lines")
  )

# apply theme
my_plot + my_theme # no parentheses

# Scaling ------------

# change x-axis scaling
my_plot + scale_x_continuous(limits = c(0, 10))

# change coloring (viridis() comes from the viridis package)
library(viridis)
ggplot(data = mpg, 
       mapping = aes(x = displ, y = hwy,
                     color = class)) +
  geom_point() +
  scale_color_manual(values = viridis(7))

# Create image files ------------

# create pdf of my_plot
ggsave(filename = "my_plot_name.pdf", 
       plot = my_plot,
       device = "pdf", 
       path = 'plotting_folder',
       width = 4, 
       height = 4, 
       units = "in")


| File | Rows | Columns |
|:-----|:-----|:--------|
| crime.csv | 1071 | 36 |

The crime data set is a subset of the Communities and Crime Unnormalized Data Set from the UCI Machine Learning Repository. Find variable descriptions below or at Communities and Crime Unnormalized Data Set.

Variable descriptions

| Variable | Description |
|:---------|:------------|
| communityname | Community name |
| state | US state (by 2 letter postal abbreviation) |
| population | population for community |
| householdsize | mean people per household |
| pctUrban | percentage of people living in areas classified as urban |
| medIncome | median household income |
| pctWSocSec | percentage of households with social security income in 1989 |
| pctWRetire | percentage of households with retirement income in 1989 |
| whitePerCap | per capita income for caucasians |
| blackPerCap | per capita income for african americans |
| AsianPerCap | per capita income for people with asian heritage |
| HispPerCap | per capita income for people with hispanic heritage |
| PctPopUnderPov | percentage of people under the poverty level |
| PctNotHSGrad | percentage of people 25 and over that are not high school graduates |
| PctUnemployed | percentage of people 16 and over, in the labor force, and unemployed |
| TotalPctDiv | percentage of population who are divorced |
| PersPerFam | mean number of people per family |
| PctWorkMom | percentage of moms of kids under 18 in labor force |
| NumImmig | total number of people known to be foreign born |
| PctImmigRecent | percentage of immigrants who immigrated within last 3 years |
| PctNotSpeakEnglWell | percent of people who do not speak English well |
| RentMedian | rental housing - median rent |
| NumInShelters | number of people in homeless shelters |
| NumStreet | number of homeless people counted in the street |
| PctForeignBorn | percent of people foreign born |
| PctBornSameState | percent of people born in the same state as currently living |
| LandArea | land area in square miles |
| PopDens | population density in persons per square mile |
| PctUsePubTrans | percent of people using public transit for commuting |
| murders | number of murders in 1995 |
| robberies | number of robberies in 1995 |
| assaults | number of assaults in 1995 |
| burglaries | number of burglaries in 1995 |
| larcenies | number of larcenies in 1995 |
| autoTheft | number of auto thefts in 1995 |
| arsons | number of arsons in 1995 |



| Package | Installation |
|:--------|:-------------|
| tidyverse | install.packages("tidyverse") |
| viridis | install.packages("viridis") |
| ggmap | install.packages("ggmap") |
| plotly | install.packages("plotly") |



| Function | Package | Description |
|:---------|:--------|:------------|
| facet_wrap() | ggplot2 | Create facets that wrap to fit the screen |
| facet_grid() | ggplot2 | Create facets along one or more variables in a grid |
| theme() | ggplot2 | Customize theme (see ?theme) |
| element_rect() | ggplot2 | Customize rect elements of theme |
| element_line() | ggplot2 | Customize line elements of theme |
| element_text() | ggplot2 | Customize text elements of theme |
| element_blank() | ggplot2 | Remove elements from theme |
| scale_x_*(), scale_y_*() | ggplot2 | Various functions to control the x- and y-axes |
| scale_size_*() | ggplot2 | Various functions to control sizes |
| scale_color_*() | ggplot2 | Various functions to control colors |
| scale_fill_*() | ggplot2 | Various functions to control fill colors |
| scale_alpha_*() | ggplot2 | Various functions to control color transparency |
| viridis() | viridis | Generate colors from the viridis palette |
| geom_polygon() | ggplot2 | Geom used to draw map elements |
| register_google() | ggmap | Register Google API |
| geocode() | ggmap | Extract geocode for location (e.g., city) |
| ggplotly() | plotly | Plotlify any ggplot plot (i.e., make it interactive) |



