--- title: "afex_plot: Publication Ready Plots for Experimental Designs" author: "Henrik Singmann" date: "`r Sys.Date()`" show_toc: true output: rmarkdown:::html_vignette: toc: yes vignette: > %\VignetteIndexEntry{afex_plot: Publication Ready Plots for Experimental Designs} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r echo=FALSE} req_suggested_packages <- c("emmeans", "ggplot2", "cowplot", "ggbeeswarm", "ggpol") pcheck <- lapply(req_suggested_packages, requireNamespace, quietly = TRUE) if (any(!unlist(pcheck))) { message("Required package(s) for this vignette are not available/installed and code will not be executed.") knitr::opts_chunk$set(eval = FALSE) } ``` ```{r set-options, echo=FALSE, cache=FALSE} op <- options(width = 90) knitr::opts_chunk$set(dpi=72) ``` `afex_plot()` visualizes results from factorial experiments combining estimated marginal means and uncertainties associated with the estimated means in the foreground with a depiction of the raw data in the background. Currently, `afex_plot()` supports the following models: - ANOVAs estimated with `aov_car()`, `aov_ez()`, or `aov_4()` (i.e., objects of class `"afex_aov"`) - Linear mixed models estimated with `mixed()` (i.e., objects of class `"mixed"`) - Linear mixed models estimated with `lme4::lmer` (i.e., objects of class `"merMod"`) - Models with `emmeans` support. For some examples see `vignette("afex_plot_supported_models", package = "afex")` This document provides an overview of the plots possible with `afex_plot()`. It does so mostly using the `afex_plot()` examples, see `?afex_plot`. We begin by loading `afex` and [`ggplot2`](https://ggplot2.tidyverse.org/) which is the package `afex_plot()` uses for plotting. Loading `ggplot2` explicitly is not strictly necessary, but makes the following code nicer. Otherwise, we would need to prepend each call to a function from `ggplot2` needed for customization with `ggplot2::` (as is done in the examples in `?afex_plot`). We also load the [`cowplot`](https://cran.r-project.org/package=cowplot) package ([introduction](https://cran.r-project.org/package=cowplot/vignettes/introduction.html)) which makes combining plots (with functions `plot_grid()` and `legend()`) very easy. However, loading `cowplot` sets a different theme for `ggplot2` plots than the default grey one. Although I am not a big fan of the default theme with its grey background, we reset the theme globally using `theme_set(theme_grey())` to start with the default behavior if `cowplot` is not attached. Note that `cowplot` also has the cool `draw_plot()` function which allows embedding plots within other plots. We furthermore will need the following packages, however, we will not attach them directly, but only call a few selected functions using the `package::function` notation. - [`jtools`](https://cran.r-project.org/package=jtools) for `theme_apa()` - [`ggpubr`](https://cran.r-project.org/package=jtools) for `theme_pubr()` - [`ggbeeswarm`](https://cran.r-project.org/package=ggbeeswarm) for producing bee swarm plots with `geom_beeswarm` - [`ggpol`](https://cran.r-project.org/package=ggpol) for producing combined box plots and jitter plots using `geom_boxjitter` ```{r message=FALSE, warning=FALSE} library("afex") library("ggplot2") library("cowplot") theme_set(theme_grey()) ``` # Two-Way Within-Subjects ANOVA We begin with a two-way within-subjects ANOVA using synthetic data from Maxwell and Delaney (2004, p. 547). The data are hypothetical reaction times from a 2 x 3 Perceptual Experiment with factors `angle` with 3 levels and factor `noise` with 2 levels (see `?md_12.1` for a longer description). We first load the data and then fit the corresponding ANOVA. ```{r} data(md_12.1) (aw <- aov_ez("id", "rt", md_12.1, within = c("angle", "noise"))) ``` The ANOVA shows that both, the two main effect as well as the interaction, are significant. We therefore inspect the pattern underlying the interaction. There exist two different ways of plotting a 2-way interaction. Either of the two variables can be depicted on the x-axis. And before having looked at both cases, it is often not clear which visualization of the interaction is more instructive. Consequently, we plot both next to each other. For this we simply need to exchange which variable is the `x` factor and which is the `trace` factor. We then use `plot_grid()` to plot them next to each other. ## Basic Plot ```{r fig.width=9, fig.height=4} p_an <- afex_plot(aw, x = "angle", trace = "noise") p_na <- afex_plot(aw, x = "noise", trace = "angle") plot_grid(p_an, p_na) ## try adding: labels = "AUTO" ``` Before we can even take a look at the plot, we notice that creating the plots has produced two warnings. These warnings complain that the plots depict within-subject factors, but do not use within-subject error bars. However, the warnings also tell us the solution (i.e., adding `error = "within"`), which we will do in the following. The help page `?afex_plot` contains more information on which type of error bars are appropriate in which situation and how to interpret different type of error bars. For ANOVAs, `afex_plot()` will emit warnings if it thinks the error bars are not appropriate for the chosen factors. Comparing both plots, my impression is that the plot with `angle` on the `x`-axis tells the clearer story. We can see that when `noise` is `absent` there is hardly any effect of the increase of `angle`. However, if `noise` is `present` an increasing `angle` clearly leads to increased RTs. We therefore use this plot in the following. ## Exploring Graphical Options and Themes We now produce a new variant of the left plot using more appropriate error bars and change several other graphical details which make the plot publication ready. We use the `factor_levels` argument to `afex_plot()` for renaming the factor levels (for technical reasons the ANOVA functions in `afex` transform all factor levels to proper `R` variable names using `make.names()` which changed the labels from e.g., `4` to `X4`) and the `legend_title` argument for changing the title of the legend. We also change the labels on the `x` and `y` axis. ```{r} p_an <- afex_plot(aw, x = "angle", trace = "noise", error = "within", factor_levels = list(angle = c("0°", "4°", "8°"), noise = c("Absent", "Present")), legend_title = "Noise") + labs(y = "RTs (in ms)", x = "Angle (in degrees)") ``` As the additional output shows, changing the factor levels via `factor_levels` emits a `message` detailing old and new factor levels in the form `old -> new`. This message can be suppressed by wrapping the `afex_plot()` call into a `suppressMessages()` call or via `RMarkdown` settings. Note that we could have also used the `factor_levels` argument for changing the order of the factor levels by passing a named character vector (e.g., `factor_levels = list(angle = c(X8 = "8°", X4 = "4°", X0 = "0°"))`). This would change the order either on the x-axis or in the legend. As said above, I am not a big fan of the default grey theme of `ggplot2` plots. Consequently, we compare a number of different themes for this plot in the following. For all but `ggpubr::theme_pubr()`, we also move the legend to the bottom as this better allows the plot to cover only a single column in a two-column layout. `ggpubr::theme_pubr()` automatically plots the legend on top. ```{r fig.width=8.5, fig.height=6, dpi = 150} plot_grid( p_an + theme_bw() + theme(legend.position="bottom"), p_an + theme_light() + theme(legend.position="bottom"), p_an + theme_minimal() + theme(legend.position="bottom"), p_an + jtools::theme_nice() + theme(legend.position="bottom"), p_an + ggpubr::theme_pubr(), p_an + theme_cowplot() + theme(legend.position="bottom"), labels = "AUTO" ) ``` The first row, panels A to C, shows themes coming with `ggplot2` and the second row, panels D to F, shows themes from additional packages. In my opinion all of these plots are an improvement above the default grey theme. For the themes coming with `ggplot2`, I really like that those shown here have a reference grid in the background. This often makes it easier to judge the actual values the shown data points have. I know that many people find this distracting, so many of the contributed themes do not have this grid. One thing I really like about the last two themes is that they per default use larger font sizes for the axes labels. One way to achieve something similar for most themes is to change `base_size` (see example below). One general criticism I have with the current plots is that they show too many values on the y-axis. In the following I plot one more variant of this plot in which we change this to three values on the y-axis. We also increase the axes labels and remove the vertical grid lines. ```{r fig.width=3.5, fig.height=3, dpi = 100, out.width='50%'} p_an + scale_y_continuous(breaks=seq(400, 900, length.out = 3)) + theme_bw(base_size = 15) + theme(legend.position="bottom", panel.grid.major.x = element_blank()) ``` We can also set this theme for the remainder of the `R` session with `theme_set()`. ```{r} theme_set(theme_bw(base_size = 15) + theme(legend.position="bottom", panel.grid.major.x = element_blank())) ``` ## Saving Plots and Plot Sizes To get our plot into a publication, we need to export it as a graphics file. I would generally advise against exporting plots via the `RStudio` interface as this is not reproducible. Instead I would use some of the following functions which save the document in the current working directory. Note that following [Elsevier guidelines](https://www.elsevier.com/about/policies-and-standards/author/artwork-and-media-instructions/artwork-sizing), a single column figure should have a width of 9 cm (~ 3 inch) and a two column figure should have a width of 19 cm (~ 7.5 inch). For Word or similar documents I would export the plot as a `png` (never `jpg`): ```{r, eval=FALSE} ggsave("my_plot.png", device = "png", width = 9, height = 8, units = "cm", dpi = 600) ## the larger the dpi, the better the resolution ``` For `LaTeX` I would export as `pdf`: ```{r, eval=FALSE} ggsave("my_plot.pdf", device = "pdf", width = 9, height = 8, units = "cm") ``` ## Data in the Background `afex_plot()` per default plots the raw data in the background. It does so using an [alpha blending](https://en.wikipedia.org/wiki/Alpha_compositing#Alpha_blending) of `0.5`. Thus, overlapping points appear darker. Examples of this can be seen in the previous graphs where some data points in the background appear clearly darker than others. The darker points indicate values for which several data points lie exactly on top of each other. `afex_plot()` provides the possibility to change or alter the graphical primitive, called `geom` in `ggplot2` parlance, used for plotting the points in the background. This offers a vast array of options for handling overlapping points or, more generally, how to display the raw data in the background. Let's take a look at some of these examples in the following. ### Single Geom The simplest way for showing the data in the background is using a single geom, like in the default setting. The following figure shows eight different variants of a single geom. The plot in the top left shows the default variant. The first alternative variant adds vertical jitter to the points to make the overplotting clearer. The other alternatives use other geoms. Note that depending on the specific variant we change a few further plot options to obtain a visually pleasing result. For example, the `dodge` argument controls the spread of points belonging to different levels of the `trace` factor at each x-axis position. 1. Default background geom. 1. Add jitter on the y-axis to points which avoids perfect overlap. 3. Size of points show number of data points at a given y-axis position: `geom_count`. For this geom, adding a call to `scale_size_area()` can sometimes be beneficial. 3. Violin plot: `geom_violin` 4. Box plot: `geom_boxplot`. Note that for this plot we have added `linetype = 1` to `data_arg`, which avoids that the outline of the box plots is affected by the `linetype` mapping (compare this with the violin plot where the outline of the violin differs across levels of the angle factor). 2. Display points using a bee swarm plot, which displaces overlapping points on the x-axis: `ggbeeswarm::geom_beeswarm` 2. Display points using a variant of the bee swarm plot in which points are jittered horizontally to show the shape of the distribution in the same way as the violin plot: `ggbeeswarm::geom_quasirandom` 5. Combine box plot with jittered points: `ggpol::geom_boxjitter` ```{r fig.width=8.5, fig.height=16, dpi = 125} p0 <- afex_plot(aw, x = "noise", trace = "angle", error = "within") p1 <- afex_plot(aw, x = "noise", trace = "angle", error = "within", dodge = 0.3, data_arg = list( position = ggplot2::position_jitterdodge( jitter.width = 0, jitter.height = 25, dodge.width = 0.3 ## needs to be same as dodge ))) p2 <- afex_plot(aw, x = "noise", trace = "angle", error = "within", dodge = 0.5, data_geom = geom_count) p3 <- afex_plot(aw, x = "noise", trace = "angle", error = "within", data_geom = geom_violin, data_arg = list(width = 0.5)) p4 <- afex_plot(aw, x = "noise", trace = "angle", error = "within", data_geom = geom_boxplot, data_arg = list(width = 0.3, linetype = 1)) p5 <- afex_plot(aw, x = "noise", trace = "angle", error = "within", dodge = 0.5, data_geom = ggbeeswarm::geom_beeswarm, data_arg = list( dodge.width = 0.5, ## needs to be same as dodge cex = 0.8)) p6 <- afex_plot(aw, x = "noise", trace = "angle", error = "within", dodge = 0.5, data_geom = ggbeeswarm::geom_quasirandom, data_arg = list( dodge.width = 0.5, ## needs to be same as dodge cex = 0.8, width = 0.05 ## choose small value so data points are not overlapping )) p7 <- afex_plot(aw, x = "noise", trace = "angle", error = "within", dodge = 0.7, data_geom = ggpol::geom_boxjitter, data_arg = list( width = 0.5, jitter.params = list(width = 0, height = 10), outlier.intersect = TRUE), point_arg = list(size = 2.5), error_arg = list(linewidth = 1.5, width = 0)) plot_grid(p0, p1, p2, p3, p4, p5, p6, p7, ncol = 2, labels = 1:8) ``` ### Multiple Geoms We can also use multiple geoms to plot the data in the background. To do so, we need to pass a list of geoms to `data_geom`. We can then also set by-geom additional arguments by passing a list of arguments to `data_arg`. For example, we can combine a violin plot, drawing the outline of the shape of the distribution, with `geom_quasirandom`, showing each individual data point in the same shape. ```{r fig.width=3.5, fig.height=3, dpi = 100, out.width='50%'} afex_plot(aw, x = "noise", trace = "angle", error = "within", dodge = 0.5, data_geom = list( geom_violin, ggbeeswarm::geom_quasirandom ), data_arg = list( list(draw_quantiles = c(0.25, 0.5, 0.75)), list(dodge.width = 0.5, width = 0.05) )) ``` ## Adding Color to Plots So far, all plots were shown in black and white only. However, it is easy to include color. We do so for some of the plots from above. To achieve this, we have to change the value of the `mapping` argument. ```{r fig.width=8.5, fig.height=8, dpi = 125} p2 <- afex_plot(aw, x = "noise", trace = "angle", error = "within", dodge = 0.5, mapping = c("shape", "color"), data_geom = ggbeeswarm::geom_beeswarm, data_arg = list( dodge.width = 0.5, ## needs to be same as dodge cex = 0.8)) p3 <- afex_plot(aw, x = "noise", trace = "angle", error = "within", mapping = c("linetype", "shape", "fill"), data_geom = ggplot2::geom_violin, data_arg = list(width = 0.5)) p4 <- afex_plot(aw, x = "noise", trace = "angle", error = "within", mapping = c("shape", "fill"), data_geom = ggplot2::geom_boxplot, data_arg = list(width = 0.3)) p5 <- afex_plot(aw, x = "noise", trace = "angle", error = "within", dodge = 0.7, mapping = c("shape", "fill"), data_geom = ggpol::geom_boxjitter, data_arg = list( width = 0.5, jitter.params = list(width = 0, height = 10), outlier.intersect = TRUE), point_arg = list(size = 2.5), line_arg = list(linetype = 0), error_arg = list(linewidth = 1.5, width = 0)) plot_grid(p2, p3, p4, p5, ncol = 2) ``` ## Plotting Order and Error Bars For graphical elements in the foreground, `afex_plot()` first plots all graphical elements belonging to the same factor level before plotting graphical elements belonging to different factor levels. This provides a consistent graphical impression for each factor level that is particularly relevant in case color is mapped. In case we have overlapping lines and error bars or use thick lines, we sometimes do not want that the error bars also receive different line types. In this case, we can simply pass `linetype = 1` to `error_arg` to overwrite the corresponding mapping. This is shown in the right plot. ```{r fig.width=8.5, fig.height=4, dpi = 150} p1 <- afex_plot(aw, x = "noise", trace = "angle", mapping = c("color"), error = "within", point_arg = list(size = 5), line_arg = list(size = 2), error_arg = list(linewidth = 2)) p2 <- afex_plot(aw, x = "noise", trace = "angle", mapping = c("color", "shape", "linetype"), error = "within", point_arg = list(size = 5), line_arg = list(size = 2), error_arg = list(linewidth = 2, width = 0, linetype = 1)) plot_grid(p1, p2, ncol = 2) ``` ## One-way Plots Without Trace Factor If `afex_plot()` is called without a trace factor, a one-way plot is created. We can customize this plot in very much the same way. Per default a one-way plot contains a legend if `mapping` is not empty (i.e., `""`). We show this legend for the left plot, but suppress it for the right one via `theme(legend.position="none")`. ```{r fig.width=7, fig.height=3.5, message=FALSE} po1 <- afex_plot(aw, x = "angle", mapping = "color", error = "within", point_arg = list(size = 2.5), error_arg = list(linewidth = 1.5, width = 0.05)) po2 <- afex_plot(aw, x = "angle", error = "within", data_geom = ggpol::geom_boxjitter, mapping = "fill", data_alpha = 0.7, data_arg = list( width = 0.6, jitter.params = list(width = 0.05, height = 10), outlier.intersect = TRUE ), point_arg = list(size = 2.5), error_arg = list(linewidth = 1.5, width = 0.05)) + theme(legend.position="none") plot_grid(po1, po2) ``` One-way plots can also be split across different panels by specifying a `panel` factor: ```{r fig.width=7, fig.height=3.5, message=FALSE} afex_plot(aw, x = "angle", panel = "noise", error = "within", data_geom = ggpol::geom_boxjitter, mapping = "fill", data_alpha = 0.7, data_arg = list( width = 0.6, jitter.params = list(width = 0.05, height = 10), outlier.intersect = TRUE ), point_arg = list(size = 2.5), error_arg = list(linewidth = 1.5, width = 0.05)) + theme(legend.position="none") ``` Sometimes we still want to add a line connecting the estimated marginal means. As `afex_plot()` returns a `ggplot2` object, we can do this easily by adding a `geom_line` object to the call. As we want to add a line through all of the shown points in the foreground, we need to add the corresponding groups aesthetics to this call: `geom_line(aes(group = 1))`. We can add further arguments to this call, as shown in the left panel below. ```{r fig.width=7, fig.height=3.5, message=FALSE} plot_grid( po1 + geom_line(aes(group = 1), color = "darkgrey", size = 1.5), po2 + geom_line(aes(group = 1)) ) ``` # 3-Way Mixed Model ## Data and Model ```{r, echo=FALSE} load(system.file("extdata/", "output_afex_plot_mixed_vignette_model.rda", package = "afex")) ``` To exemplify the support for linear mixed models, we will use the data from Freeman and colleagues also discussed in the [mixed model vignette](https://cran.r-project.org/package=afex/vignettes/afex_mixed_example.html). These data are lexical decision and word naming latencies for 300 words and 300 nonwords from 45 participants presented in Freeman et al. (2010). The dependent variable we are interested in is `log` RTs. We look at the same model also discussed in the vignette, with factors `task` (between participants, but within items), `stimulus` (within participants, but between items), `density` (within participants, but between items), and `frequency` (within participants, but between items), for a total of almost 13,000 observations. We fit the model with crossed-random effects for participants (`id`) and `item`s using the final model, `m9s`, as discussed in the mixed model vignette. ```{r, eval=FALSE} data("fhch2010") # load fhch <- droplevels(fhch2010[ fhch2010$correct,]) # remove errors m9s <- mixed(log_rt ~ task*stimulus*density*frequency + (stimulus+frequency||id)+ (task|item), fhch, method = "S", control = lmerControl(optCtrl = list(maxfun = 1e6)), expand_re = TRUE) ``` Note that going forward, we disable calculation of degrees of freedom for `emmeans` as this speeds up computation and/or avoids messages we are currently not interested in. ```{r} emmeans::emm_options(lmer.df = "asymptotic") ``` The ANOVA table of the mixed model indicates that the three-way interaction `task:stimulus:frequency` is significant on which we focus in the following. ```{r, eval=TRUE} m9s ``` ## Which Data to Plot in the Background For mixed models, one important decision is the random-effects grouping factor(s) based on which the raw data plotted in the background is aggregated. This decision is necessary, because without such a factor, there would only be one observation for each cell of the design (unless the full design is considered). In the default setting, with `id` missing, the combination of all random-effects grouping factors is used. ```{r fig.width=7, fig.height=3.5, eval=TRUE} afex_plot(m9s, x = "stimulus", trace = "frequency", panel = "task") ``` In the present case, a message informs us that the data is aggregated over both random-effects grouping factors. However, this leads to way too many data points in the background. Let us compare this plot with plots in which we use each of the two random-effects grouping factors in turn. ```{r fig.width=7, fig.height=3.5, eval=TRUE} plot_grid( afex_plot(m9s, x = "stimulus", trace = "frequency", panel = "task", id = "id"), afex_plot(m9s, x = "stimulus", trace = "frequency", panel = "task", id = "item"), labels = c("ID", "Item") ) ``` The by-id plot looks usable. However, the by-item plot has still way too many data-points to be informative. Some other ways of displaying the raw data, such as violin plots or box plots, seems preferable for it. ## Ways of Plotting Data in the Background We compare violin plots or box plots for the by-item data in the next plot. For the box plot, we increase the width of the error bars and use a consistent line type to distinguish them more easily from the graphical elements of the box plot. We could probably further improve these plots by, for example, adding colors or using some of the other customizations discussed above for the ANOVA example. ```{r fig.width=7, fig.height=3.5, eval=TRUE} plot_grid( afex_plot(m9s, x = "stimulus", trace = "frequency", panel = "task", id = "item", dodge = 0.8, data_geom = geom_violin, data_arg = list(width = 0.5)), afex_plot(m9s, x = "stimulus", trace = "frequency", panel = "task", id = "item", dodge = 0.8, data_geom = geom_boxplot, data_arg = list(width = 0.5), error_arg = list(linewidth = 1.5, width = 0, linetype = 1)) ) ``` ## Error Bars for Mixed Models The default error bars for `afex_plot()` are based on the statistical model (i.e., the mixed model in the present case). These error bars can only be used to judge whether or not two means differ from each other, if the corresponding factor (or factors) are independent samples factors (i.e., not repeated-measures factors for any of the random-effects grouping factors). Of course, in addition to this the requirement of approximately equal sample size and variance also needs to hold. In the present case, all of the factors are repeated-measures factors with respect to one of the random-effects grouping factors. Consequently, the default error bars cannot be used for "inference by eye" for any of the factors. This is also easy to see when looking at all pairwise comparisons between means for each of the panels/tasks. This shows that for the `naming` task all comparisons are significant. In visual contrast with that, the two error bars for the `low` versus `high` `word`s are overlapping very strongly. ```{r, eval=FALSE} pairs(emmeans::emmeans(mrt, c("stimulus", "frequency"), by = "task")) ``` ```{r, echo=FALSE} cat(aout_2$output, sep = "\n") ``` An alternative in the present situation would be using within-subjects error bars and aggregating the data by-id (i.e., `error = "within"`), as done in the left panel below. This is somewhat appropriate here as the factors within each panel are all within-subject factors. In contrast, using by-item within-subjects error bars, as done in the right panel below, seems not appropriate as the only within-item factor, `task`, is spread across panels. Unfortunately, it is not immediately clear if these error bars allow one to correctly detect which means do not differ from each other. ```{r fig.width=7, fig.height=3.5, eval=FALSE} plot_grid( afex_plot(m9s, x = "stimulus", trace = "frequency", panel = "task", id = "id", error = "within"), afex_plot(m9s, x = "stimulus", trace = "frequency", panel = "task", id = "item", dodge = 0.8, error = "within", data_geom = geom_violin, data_arg = list(width = 0.5)) ) ``` In sum, using error bars for performing "inference by eye" - that is, using overlap or non-overlap of error bars to judge which means differ or do not differ from each other - is highly problematic for mixed models, due to the potentially complex dependency structures between the means. It would be best to avoid comparisons between means altogether. Instead, it is perhaps a good idea to plot the model-based error bars (which is the default) and use them for their intended purpose; judging which values of the estimated means are likely given what we have learned from the model (however, note that one cannot interpret a 95% confidence interval as having a 95% probability of containing the population mean). The help page `?afex_plot` contains further information and references on how to interpret confidence intervals and other error bars. # References * Freeman, E., Heathcote, A., Chalmers, K., & Hockley, W. (2010). Item effects in recognition memory for words. _Journal of Memory and Language_, 62(1), 1-18. https://doi.org/10.1016/j.jml.2009.09.004 * Maxwell, S. E., & Delaney, H. D. (2004). _Designing Experiments and Analyzing Data: A Model-Comparisons Perspective._ Mahwah, N.J.: Lawrence Erlbaum Associates. ```{r, include=FALSE} options(op) ```