
I have generated the following summary of means and SDs for two parameters, SAR and ER, for several multi-column categories:

structure(list(category = c("DIT GROUP\nCR 16.0005\nADB", "DIT GROUP\nCR 16.0005\nUMB", 
"LATE TIMED GROUP\nCR 16.0005\nUMB", "LATE TIMED GROUP\nR -NF (16)\nUMB", 
"LATE TIMED GROUP\nCR 16.0013\nUMB"), SAR_m = c(0.0124685047055857, 
0.0116929321704855, 0.0107502700349996, 0.00237895138055938, 
0.00231425848742098), SAR_sd = c(0.00556156907619075, 0.00515705033822913, 
0.00485214336952352, 0.00121107790805573, 0.00140776861061631
), ER_m = c(0.413555948857483, 0.318170537834018, 0.271963089630801, 
0.389281391815171, 0.368785595691807), ER_sd = c(0.115277368081429, 
0.109433733193877, 0.112935566690964, 0.181796976516952, 0.126749040405446
)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-5L))

I have been able to produce a bar graph with error bars in Excel, but I'm having trouble moving this step to R. I have not found a way to use ggplot2 to create the categories needed from the first 3 columns.


Ideally, I would even like to facet the SAR and ER data, with the previously mentioned categories on the horizontal, and the SAR and ER labels on the facets, but I realize that would require pivoting long once I get past the issue of combining columns to categories. Thanks.

  • Do you want to basically collapse the 3 columns? You could use unite() from {tidyr}: unite(data, category, c(Rel_Subgroup, Release_Site, MarkTag), sep = "\n"). When plotted, this should give you a single column with the values from the 3 columns, each on a new line. If you need more than just collapsing the 3 columns (e.g. you don't want to repeat "DIT GROUP" in the first 2 labels), you can take a look at nested faceting in {ggh4x}. Commented Nov 26 at 7:41
  • Wouldn't it be more appropriate to show means as points and sd as "bars"? Commented Nov 27 at 17:57
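
A minimal sketch of the unite() suggestion, assuming the raw data really has the three columns named in the comment (Rel_Subgroup, Release_Site, MarkTag). The small data frame `raw` is made up here from the labels shown in the question and is not the asker's actual data:

```r
library(tidyr)

# Hypothetical raw data: column names come from the comment above,
# values are copied from the category labels in the question.
raw <- data.frame(
  Rel_Subgroup = c("DIT GROUP", "DIT GROUP", "LATE TIMED GROUP"),
  Release_Site = c("CR 16.0005", "CR 16.0005", "CR 16.0005"),
  MarkTag      = c("ADB", "UMB", "UMB")
)

# Collapse the three columns into a single label column,
# with each original value on its own line.
united <- unite(raw, "category", Rel_Subgroup, Release_Site, MarkTag, sep = "\n")

united$category
```

By default unite() drops the three input columns; passing remove = FALSE keeps them alongside the new category column.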

1 Answer


Assuming we do not need to split category, this might get you started:

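X |>
  # SAR_m, SAR_sd, ER_m, ER_sd -> one row per category and parameter (g), with columns m and sd
  tidyr::pivot_longer(cols=-1, names_to=c('g', '.value'), names_sep='_') |>
  ggplot2::ggplot(ggplot2::aes(x=category, y=m, fill=g)) +
  ggplot2::geom_col(position='dodge') +
  # error bars span mean +/- 1 SD
  ggplot2::geom_errorbar(ggplot2::aes(ymin=m-sd, ymax=m+sd)) +
  # one panel per parameter; free y scales because SAR and ER differ in magnitude
  ggplot2::facet_wrap(~g, scales='free_y') +
  ggplot2::theme(axis.text.x=ggplot2::element_text(size=5))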


# # A tibble: 10 × 4
# category                            g           m      sd
# <chr>                               <chr>   <dbl>   <dbl>
# 1 "DIT GROUP\nCR 16.0005\nADB"        SAR   0.0125  0.00556
# 2 "DIT GROUP\nCR 16.0005\nADB"        ER    0.414   0.115  
# 3 "DIT GROUP\nCR 16.0005\nUMB"        SAR   0.0117  0.00516
# 4 "DIT GROUP\nCR 16.0005\nUMB"        ER    0.318   0.109  
# 5 "LATE TIMED GROUP\nCR 16.0005\nUMB" SAR   0.0108  0.00485
# 6 "LATE TIMED GROUP\nCR 16.0005\nUMB" ER    0.272   0.113  
# 7 "LATE TIMED GROUP\nR -NF (16)\nUMB" SAR   0.00238 0.00121
# 8 "LATE TIMED GROUP\nR -NF (16)\nUMB" ER    0.389   0.182  
# 9 "LATE TIMED GROUP\nCR 16.0013\nUMB" SAR   0.00231 0.00141
# 10 "LATE TIMED GROUP\nCR 16.0013\nUMB" ER    0.369   0.127  

We are not sure we fully grasp your data structure, so you might need to change the pivoting.
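
If the category column does need to be split after all, one possible sketch is tidyr::separate() on the newline. The new column names (group, site, tag) are hypothetical, and the two-row X below is a cut-down, rounded stand-in for the real data:

```r
library(tidyr)

# Cut-down stand-in for X: two categories and one rounded column from the question.
X <- data.frame(
  category = c("DIT GROUP\nCR 16.0005\nADB", "LATE TIMED GROUP\nR -NF (16)\nUMB"),
  SAR_m    = c(0.0125, 0.00238)
)

# Split the label on the newline; remove = FALSE keeps the combined
# label so it can still be used on the x axis.
parts <- separate(
  X, category,
  into   = c("group", "site", "tag"),  # hypothetical column names
  sep    = "\n",
  remove = FALSE
)

parts
```

The separate columns could then be mapped to fill or used as facet variables, while category stays available for the axis labels.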


Edit: We do not know the entire workflow. However, it might be more appropriate to display the means as points with the SDs as error bars:

X |>
  tidyr::pivot_longer(cols=-1, names_to=c('g', '.value'), names_sep='_') |>
  ggplot2::ggplot(ggplot2::aes(x=category, y=m, fill=g)) +
  ggplot2::geom_point() +
  ggplot2::geom_errorbar(ggplot2::aes(ymin=m-sd, ymax=m+sd)) +
  ggplot2::facet_wrap(~g, scales='free_y') +
  # theme_minimal() is a complete theme and resets earlier theme() settings,
  # so it has to come before the axis text tweak
  ggplot2::theme_minimal() +
  ggplot2::theme(axis.text.x=ggplot2::element_text(size=5)) +
  ggplot2::labs(y='means') + ggplot2::guides(fill='none')
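
As a possible variation, the point and the error bar can be drawn by a single layer with geom_pointrange(). This is only a sketch of an alternative, not something the plot above depends on; it uses two rounded rows of the question's data so that it runs on its own:

```r
library(tidyr)
library(ggplot2)

# Two rows of the question's data (rounded), enough to show the layout.
X <- data.frame(
  category = c("DIT GROUP\nCR 16.0005\nADB", "DIT GROUP\nCR 16.0005\nUMB"),
  SAR_m = c(0.0125, 0.0117), SAR_sd = c(0.00556, 0.00516),
  ER_m  = c(0.414, 0.318),   ER_sd  = c(0.115, 0.109)
)

# Same pivot as above: one row per category and parameter, columns m and sd.
long <- pivot_longer(X, cols = -1, names_to = c("g", ".value"), names_sep = "_")

# geom_pointrange() draws the mean as a point and mean +/- 1 SD as a line.
p <- ggplot(long, aes(x = category, y = m)) +
  geom_pointrange(aes(ymin = m - sd, ymax = m + sd)) +
  facet_wrap(~g, scales = "free_y") +
  theme_minimal() +
  labs(y = "mean +/- 1 SD")

print(p)
```

Unlike geom_errorbar(), geom_pointrange() draws no horizontal caps at the ends of the interval, which may or may not suit the figure.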



Data

# Output of dput(X), assigned to X so that the snippets above run as-is
X <-
structure(list(category = c("DIT GROUP\nCR 16.0005\nADB", "DIT GROUP\nCR 16.0005\nUMB", 
"LATE TIMED GROUP\nCR 16.0005\nUMB", "LATE TIMED GROUP\nR -NF (16)\nUMB", 
"LATE TIMED GROUP\nCR 16.0013\nUMB"), SAR_m = c(0.0124685047055857, 
0.0116929321704855, 0.0107502700349996, 0.00237895138055938, 
0.00231425848742098), SAR_sd = c(0.00556156907619075, 0.00515705033822913, 
0.00485214336952352, 0.00121107790805573, 0.00140776861061631
), ER_m = c(0.413555948857483, 0.318170537834018, 0.271963089630801, 
0.389281391815171, 0.368785595691807), ER_sd = c(0.115277368081429, 
0.109433733193877, 0.112935566690964, 0.181796976516952, 0.126749040405446
)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, 
-5L))