Boxplot with respect to two factors using ggplot2 in R

  • I am very new to R and its packages. I looked through the ggplot2 documentation but could not find how to do this. I want a box plot of the variable boxthis with respect to two factors, f1 and f2. That is, suppose f1 and f2 are both factor variables that each take two values, and boxthis is a continuous variable. I want four boxplots on one graph, one for each combination of the values of f1 and f2. I think base R can do this with

    boxplot(boxthis ~ f1 * f2, data = datasetname)
    

    Thanks in advance for any help.

    Please provide sample data in order to get precise answers.

    This question would almost certainly be a better fit for stackoverflow.com, as there is little specifically statistical here.

  • Bernd Weiss

    Correct answer

    9 years ago

    I can think of two ways to accomplish this:

    1. Create all combinations of f1 and f2 outside the ggplot() call

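    library(ggplot2)
    
    # Simulated data: two binary factors and one continuous variable
    set.seed(1)  # added so the example is reproducible
    df <- data.frame(f1=factor(rbinom(100, 1, 0.45), labels=c("m","w")), 
                     f2=factor(rbinom(100, 1, 0.45), labels=c("young","old")),
                     boxthis=rnorm(100))
    
    # One factor with a level for each f1/f2 combination
    df$f1f2 <- interaction(df$f1, df$f2)
    
    ggplot(df, aes(x = f1f2, y = boxthis)) + geom_boxplot()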
    
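    To see what `interaction()` builds, here is a minimal sketch in base R. The data frame `toy` and its values are made up for illustration and are not the simulated data above.

```r
# Hypothetical toy data: two two-level factors, one row per combination
toy <- data.frame(
  f1 = factor(c("m", "w", "m", "w"), levels = c("m", "w")),
  f2 = factor(c("young", "young", "old", "old"), levels = c("young", "old"))
)

# interaction() returns a single factor whose levels are the f1 x f2 combinations
toy$f1f2 <- interaction(toy$f1, toy$f2)
levels(toy$f1f2)
# "m.young" "w.young" "m.old" "w.old"

# sep controls how the level names are joined, which affects the axis labels
alt_levels <- levels(interaction(toy$f1, toy$f2, sep = " / "))
alt_levels
```

    The levels of the first factor vary fastest, which is the order the boxes appear in along the x axis.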


    2. Map one factor to an aesthetic such as colour or fill

    ggplot(aes(y = boxthis, x = f2, fill = f1), data = df) + geom_boxplot()
    


    (+1) I like the use of `interaction()`. Of note, we can specify `geom_boxplot(position = position_dodge(width = .9))` to add extra space between boxplots.
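    A minimal sketch of that suggestion, assuming simulated data like the data in the answer (the seed and the explicit `levels = 0:1` are additions for reproducibility, not part of the original):

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
df <- data.frame(
  f1 = factor(rbinom(100, 1, 0.45), levels = 0:1, labels = c("m", "w")),
  f2 = factor(rbinom(100, 1, 0.45), levels = 0:1, labels = c("young", "old")),
  boxthis = rnorm(100)
)

# A dodge width of 0.9 spreads the boxes within each f2 group further apart
p <- ggplot(df, aes(x = f2, y = boxthis, fill = f1)) +
  geom_boxplot(position = position_dodge(width = 0.9))
p
```

    Try a few values of `width` and compare the spacing by eye; the best value depends on the plot size.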

    You can also pass `dodge` inside `aes()` in the `ggplot` call: `ggplot(aes(y = boxthis, x = f2, fill = f1, dodge=f1), data = df) + geom_boxplot()`
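    A related option is the `group` aesthetic, which ggplot2 documents for controlling how observations are split into groups. The sketch below is an alternative to the comment above, not a reproduction of it, and it assumes the same kind of simulated data as the answer (seed and explicit levels are additions).

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
df <- data.frame(
  f1 = factor(rbinom(100, 1, 0.45), levels = 0:1, labels = c("m", "w")),
  f2 = factor(rbinom(100, 1, 0.45), levels = 0:1, labels = c("young", "old")),
  boxthis = rnorm(100)
)

# Group by the f1/f2 combination: one box per combination, dodged within f2
p_group <- ggplot(df, aes(x = f2, y = boxthis, group = interaction(f1, f2))) +
  geom_boxplot()
p_group
```

    Without a `fill` or `colour` mapping there is no legend, so this form is mainly useful when the boxes are labelled some other way.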

Licensed under CC-BY-SA with attribution


Content dated before 6/26/2020 9:53 AM