class: center, middle, inverse, title-slide

# Plotting I
### Introduction to R
Bern R Bootcamp
### June 2020

---

layout: true

<div class="my-footer">
<span style="text-align:center">
<span>
<img src="https://raw.githubusercontent.com/therbootcamp/therbootcamp.github.io/master/_sessions/_image/by-sa.png" height=14 style="vertical-align: middle"/>
</span>
<a href="https://therbootcamp.github.io/">
<span style="padding-left:82px">
<font color="#7E7E7E">
www.therbootcamp.com
</font>
</span>
</a>
<a href="https://therbootcamp.github.io/">
<font color="#7E7E7E">
R Bootcamp Bern | June 2020
</font>
</a>
</span>
</div>

---

.pull-left4[

<br><br><br>

> ### As good as R is for statistics, it's as good if not better for data visualisation
> ### Nathaniel D. Phillips

]

.pull-right6[

<br>

<img src="https://raw.githubusercontent.com/therbootcamp/therbootcamp.github.io/master/_sessions/_image/ggplotgallery.png" width="100%" style="display: block; margin: auto;" />

]

---

.pull-left45[

# Base R Plotting

The <high>classic framework</high> of plotting.

It contains a <high>separate function for each 'type'</high> of plot, e.g. `barplot()` for a bar plot, `boxplot()` for a box plot, and `plot()` for a scatterplot.

<br>

```r
# Histogram in base R
hist(x = baselers$age,
     xlab = "Age",
     ylab = "Frequency",
     main = "Baselers Age")
```

]

.pull-right5[

<br><br><br>

<img src="PlottingI_files/figure-html/unnamed-chunk-5-1.png" style="display: block; margin: auto;" />

]

---

.pull-left45[

# Base R Plotting

The <high>classic framework</high> of plotting.

It contains a <high>separate function for each 'type'</high> of plot, e.g. `barplot()` for a bar plot, `boxplot()` for a box plot, and `plot()` for a scatterplot.

<br>

```r
# Boxplot in base R
boxplot(formula = height ~ sex,
        data = baselers,
        xlab = "Sex",
        ylab = "Height",
        main = "Box plot")
```

]

.pull-right45[

<br><br><br>

<img src="PlottingI_files/figure-html/unnamed-chunk-7-1.png" style="display: block; margin: auto;" />

]

---

.pull-left45[

# Base R Plotting

The <high>classic framework</high> of plotting.
It contains a <high>separate function for each 'type'</high> of plot, e.g. `barplot()` for a bar plot, `boxplot()` for a box plot, and `plot()` for a scatterplot.

<br>

```r
# Scatterplot in base R
plot(x = baselers$height,
     y = baselers$income,
     xlab = "Height",
     ylab = "Income",
     main = "Scatterplot")
```

]

.pull-right45[

<br><br><br>

<img src="PlottingI_files/figure-html/unnamed-chunk-9-1.png" style="display: block; margin: auto;" />

]

---

# Problems with Base R plotting

.pull-left35[

- Default plots look pretty <high>outdated</high>.<br>

- Plots can quickly require a <high>LOT of code</high>.<br>

- Plots can't easily be stored as <high>objects</high> to reference and update later.<br>

<p align="center"><high>Solution: `ggplot2` </high></p>

<img src="https://raw.githubusercontent.com/rstudio/hex-stickers/master/PNG/ggplot2.png" width="45%" style="display: block; margin: auto;" />

]

.pull-right55[

This plot would take <high>a lot of code in Base R</high> but <high>just 10 lines of code</high> in `ggplot2`, 5 of which control the labels.

<img src="PlottingI_files/figure-html/unnamed-chunk-11-1.png" style="display: block; margin: auto;" />

]

---

# Grammar of Graphics in `ggplot2`

.pull-left45[

The <high>Grammar of graphics</high> breaks down plots into several key pieces:

| Component| Description|
|:------|:----|
| data| Which data frame contains the data?|
| axes| What do the x-axis and y-axis represent?|
| color| What does color represent?|
| size| What does size represent?|
| geometries| What kind of geometric object do you want to plot?|
| facets| Should there be groups of plots?|

]

.pull-right45[

<img src="PlottingI_files/figure-html/unnamed-chunk-12-1.png" style="display: block; margin: auto;" />

]

---

# Our goal: Creating this plot

.pull-left45[

<high>Data</high>

- Use the `mpg` tibble

<high>Aesthetics</high>

- Engine displacement (`displ`) on the x-axis
- Highway miles per gallon (`hwy`) on the y-axis
- Color plotting elements by the `class` of car

<high>Geometric objects</high>

- Show data as points
- Add a regression line

<high>Labels and themes</high>

- Add plotting labels
- Use a black and white plotting theme

]

.pull-right45[

<img src="PlottingI_files/figure-html/unnamed-chunk-13-1.png" style="display: block; margin: auto;" />

]

---

# `ggplot()`

.pull-left45[

To <high>create a ggplot2 object</high>, use the `ggplot()` function.

`ggplot()` has two main arguments:

- `data` - A data frame (or `tibble`)
- `mapping` - A call to `aes()`

]

.pull-right45[

```r
ggplot(data = mpg)
```

<img src="PlottingI_files/figure-html/unnamed-chunk-14-1.png" style="display: block; margin: auto;" />

]

---

# `ggplot()`

.pull-left45[

An <high>aesthetic mapping</high> links a column in your data frame to a visual property of the objects in your plot.

Use `aes()` to define these mappings.

Common aesthetics are...

| Aesthetic| Description|
|:------|:----|
| `x`, `y`| Data mapped to coordinates|
| `color`, `fill`| Border and fill colors|
| `alpha`| Transparency|
| `size`| Size|
| `shape`| Shape|

]

.pull-right45[

```r
ggplot(data = mpg,
       mapping = aes(x = displ, y = hwy))
```

<img src="PlottingI_files/figure-html/unnamed-chunk-15-1.png" style="display: block; margin: auto;" />

]

---

# Adding elements to plots with `+`

.pull-left45[

Once you have specified the `data` argument and the global aesthetics with `mapping = aes()`, <high>add additional elements to the plot with `+`</high>.

The `+` operator works much like the pipe `%>%` in `dplyr`.
<high>It just means "and then..."</high>

```r
ggplot(data = mpg,
       mapping = aes(x = displ, y = hwy)) + # and then
  geom_point()
```

]

.pull-right45[

<img src="PlottingI_files/figure-html/unnamed-chunk-17-1.png" style="display: block; margin: auto;" />

]

---

# Geometric objects (`geom`)

.pull-left4[

A <high>`geom`</high> is a geometric object in a plot that represents data.

To add a geom to a plot, just include ` + geom_X()`, where X is the type of geom.

Common geoms are...

| geom| Output|
|:------|:----|
| `geom_point()`| Points|
| `geom_bar()`| Bars|
| `geom_boxplot()`| Boxplot|
| `geom_count()`| Points with size reflecting frequency|
| `geom_smooth()`| Smoothed line|

]

.pull-right45[

<img src="PlottingI_files/figure-html/unnamed-chunk-18-1.png" style="display: block; margin: auto;" />

]

---

.pull-left45[

<br>

## `geom_boxplot()`

<br>

```r
ggplot(data = mpg,
       mapping = aes(x = class, y = hwy)) +
  geom_boxplot()
```

<img src="PlottingI_files/figure-html/unnamed-chunk-19-1.png" style="display: block; margin: auto;" />

]

.pull-right45[

<br>

## `geom_violin()`

<br>

```r
ggplot(data = mpg,
       mapping = aes(x = class, y = hwy)) +
  geom_violin()
```

<img src="PlottingI_files/figure-html/unnamed-chunk-20-1.png" width="100%" style="display: block; margin: auto;" />

]

---

.pull-left45[

<br>

## `geom_bar()`

<br>

```r
ggplot(data = mpg,
       mapping = aes(x = class)) +
  geom_bar()
```

<img src="PlottingI_files/figure-html/unnamed-chunk-21-1.png" width="100%" style="display: block; margin: auto;" />

]

.pull-right45[

<br>

## `geom_count()`

<br>

```r
ggplot(data = mpg,
       mapping = aes(x = displ, y = hwy)) +
  geom_count()
```

<img src="PlottingI_files/figure-html/unnamed-chunk-22-1.png" width="100%" style="display: block; margin: auto;" />

]

---

# `aes()`

.pull-left45[

<high>`color`</high> geoms according to a variable.
```r
ggplot(data = mpg,
       mapping = aes(x = displ,
                     y = hwy,
                     color = class)) +
  geom_point()
```

<p align="center"> `mpg`</p>

| displ| hwy|class   | year|
|-----:|---:|:-------|----:|
|   3.0|  26|compact | 1999|
|   4.7|  12|suv     | 2008|
|   2.8|  26|compact | 1999|
|   4.7|  15|suv     | 1999|
|   4.0|  19|suv     | 1999|

]

.pull-right45[

<br>

<img src="PlottingI_files/figure-html/unnamed-chunk-25-1.png" style="display: block; margin: auto;" />

]

---

# What's next?

.pull-left45[

<img src="PlottingI_files/figure-html/unnamed-chunk-26-1.png" style="display: block; margin: auto;" />

]

.pull-right45[

<img src="PlottingI_files/figure-html/unnamed-chunk-27-1.png" style="display: block; margin: auto;" />

]

---

# `geom_smooth()`

.pull-left45[

`geom_smooth()` adds a <high>smoothed line</high>.

Change how the line is created with `method` (e.g., `method = "lm"`).

Color the line with `col`.

<br>

```r
ggplot(data = mpg,
       mapping = aes(x = displ,
                     y = hwy,
                     col = class)) +
  geom_point() +
  geom_smooth(col = "blue")
```

]

.pull-right45[

<img src="PlottingI_files/figure-html/unnamed-chunk-29-1.png" style="display: block; margin: auto;" />

]

---

# `geom_smooth()`

.pull-left45[

`geom_smooth()` adds a <high>smoothed line</high>.

Change how the line is created with `method` (e.g., `method = "lm"`).

Color the line with `col`.

<br>

```r
ggplot(data = mpg,
       mapping = aes(x = displ,
                     y = hwy,
                     col = class)) +
  geom_point() +
  geom_smooth(col = "blue",
              method = "lm")
```

]

.pull-right45[

<img src="PlottingI_files/figure-html/unnamed-chunk-31-1.png" style="display: block; margin: auto;" />

]

---

# Overriding aesthetics

.pull-left45[

Aesthetics that you set inside a geom <high>override</high> the general plotting aesthetics set in `ggplot()`.

This is what happens when you don't override...
<br>

```r
ggplot(data = mpg,
       mapping = aes(x = displ,
                     y = hwy,
                     col = class)) +
  geom_point() +
  geom_smooth() # no overriding
```

]

.pull-right45[

<img src="PlottingI_files/figure-html/unnamed-chunk-33-1.png" style="display: block; margin: auto;" />

]

---

# What's next?

.pull-left45[

<img src="PlottingI_files/figure-html/unnamed-chunk-34-1.png" style="display: block; margin: auto;" />

]

.pull-right45[

<img src="PlottingI_files/figure-html/unnamed-chunk-35-1.png" style="display: block; margin: auto;" />

]

---

# `labs()`

.pull-left45[

You can add <high>labels</high> to a plot with the `labs()` function.

`labs()` arguments are ...

- `title` - Main title
- `subtitle` - Subtitle
- `caption` - Caption below

```r
ggplot(...) +
  labs(x = "Engine Displ...",
       y = "Highway miles...",
       title = "MPG data",
       subtitle = "Cars with ...",
       caption = "Source...")
```

]

.pull-right45[

<img src="PlottingI_files/figure-html/unnamed-chunk-37-1.png" style="display: block; margin: auto;" />

]

---

# What's next?

.pull-left45[

<img src="PlottingI_files/figure-html/unnamed-chunk-38-1.png" style="display: block; margin: auto;" />

]

.pull-right45[

<img src="PlottingI_files/figure-html/unnamed-chunk-39-1.png" style="display: block; margin: auto;" />

]

---

# Themes with `theme_XX()`

.pull-left45[

A plotting <high>theme</high> controls many aspects of a plot's <high>overall look</high>, from the background and the grid lines to the label font and the spacing between plot labels and the plotting space.

Themes built into `ggplot2`:

`theme_bw()`<br>
`theme_minimal()`<br>
`theme_classic()`<br>
`theme_light()`<br>
`theme_void()`

Themes from the `ggthemes` package:

`theme_excel()`<br>
`theme_economist()`<br>
etc.

]

.pull-right45[

```r
ggplot(...) +
  theme_gray() # The default theme
```

<img src="PlottingI_files/figure-html/unnamed-chunk-41-1.png" style="display: block; margin: auto;" />

]

---

# Themes with `theme_XX()`

.pull-left45[

A plotting <high>theme</high> controls many aspects of a plot's <high>overall look</high>, from the background and the grid lines to the label font and the spacing between plot labels and the plotting space.

Themes built into `ggplot2`:

`theme_bw()`<br>
`theme_minimal()`<br>
`theme_classic()`<br>
`theme_light()`<br>
`theme_void()`

Themes from the `ggthemes` package:

`theme_excel()`<br>
`theme_economist()`<br>
etc.

]

.pull-right45[

```r
ggplot(...) +
  theme_light()
```

<img src="PlottingI_files/figure-html/unnamed-chunk-43-1.png" style="display: block; margin: auto;" />

]

---

# Themes with `theme_XX()`

.pull-left45[

A plotting <high>theme</high> controls many aspects of a plot's <high>overall look</high>, from the background and the grid lines to the label font and the spacing between plot labels and the plotting space.

Themes built into `ggplot2`:

`theme_bw()`<br>
`theme_minimal()`<br>
`theme_classic()`<br>
`theme_light()`<br>
`theme_void()`

Themes from the `ggthemes` package:

`theme_excel()`<br>
`theme_economist()`<br>
etc.

]

.pull-right45[

```r
ggplot(...) +
  theme_void()
```

<img src="PlottingI_files/figure-html/unnamed-chunk-45-1.png" style="display: block; margin: auto;" />

]

---

# Themes with `theme_XX()`

.pull-left45[

A plotting <high>theme</high> controls many aspects of a plot's <high>overall look</high>, from the background and the grid lines to the label font and the spacing between plot labels and the plotting space.

Themes built into `ggplot2`:

`theme_bw()`<br>
`theme_minimal()`<br>
`theme_classic()`<br>
`theme_light()`<br>
`theme_void()`

Themes from the `ggthemes` package:

`theme_excel()`<br>
`theme_economist()`<br>
etc.

]

.pull-right45[

```r
ggplot(...) +
  theme_excel()
```

<img src="PlottingI_files/figure-html/unnamed-chunk-47-1.png" style="display: block; margin: auto;" />

]

---

# Themes with `theme_XX()`

.pull-left45[

A plotting <high>theme</high> controls many aspects of a plot's <high>overall look</high>, from the background and the grid lines to the label font and the spacing between plot labels and the plotting space.

Themes built into `ggplot2`:

`theme_bw()`<br>
`theme_minimal()`<br>
`theme_classic()`<br>
`theme_light()`<br>
`theme_void()`

Themes from the `ggthemes` package:

`theme_excel()`<br>
`theme_economist()`<br>
etc.

]

.pull-right45[

```r
ggplot(...) +
  theme_economist()
```

<img src="PlottingI_files/figure-html/unnamed-chunk-49-1.png" style="display: block; margin: auto;" />

]

---

# Final result!

.pull-left45[

```r
ggplot(data = mpg,
       mapping = aes(x = displ,
                     y = hwy,
                     col = class)) +
  geom_point() +
  geom_smooth(col = "blue",
              method = "lm") +
  labs(x = "Engine Displ. in Liters",
       y = "Highway miles per gallon",
       title = "MPG data",
       subtitle = "Cars with higher...",
       caption = "Source: mpg data...") +
  theme_bw()
```

]

.pull-right45[

<img src="PlottingI_files/figure-html/unnamed-chunk-51-1.png" style="display: block; margin: auto;" />

]

---

class: middle, center

<h1><a href="https://dwulff.github.io/Intro2R_Unibe/_sessions/PlottingI/PlottingI_practical.html">Practical</a></h1>
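
---

# Appendix: Storing a plot as an object

Unlike a base R plot, a `ggplot2` plot is an ordinary R object, so it can be stored and extended later with `+`. The following is a minimal sketch of that idea, assuming `ggplot2` is installed; the object names `p_base` and `p_final` are illustrative and do not appear elsewhere in these slides.

```r
# Sketch only: p_base and p_final are hypothetical names chosen for illustration
library(ggplot2)

# Store the data, the global aesthetics and the points as an object
p_base <- ggplot(data = mpg,
                 mapping = aes(x = displ,
                               y = hwy,
                               col = class)) +
  geom_point()

# Update the stored object later by adding further elements with +
p_final <- p_base +
  geom_smooth(col = "blue",
              method = "lm") +
  theme_bw()

# Printing the object draws the plot
p_final
```

Because `p_base` is left unchanged, the same starting point can be reused to try out different geoms or themes.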