Quarto: Part II

Making useful things reproducibly

Rick Gilmore

2026-05-10

Preliminaries

Follow-along

Figure 1: https://penn-state-open-science.github.io/bootcamp-2026-quarto-II/

Agenda

  • Why reproducibility?
  • Why we should care
  • Requirements for reproducibility
  • What useful things?
  • Case studies
  • What’s next?

Why reproducibility?

What R we talking about?

  • Findings should be reproducible
    • Same data + same code -> same results
  • Findings should be robust
    • Same data + new analysis -> comparable results & conclusions
  • Findings should be replicable
    • New data -> comparable results & conclusions
Figure 2: Miske et al. (2026)
Figure 3: Miske et al. (2026) Figure 1
Figure 4: Miske et al. (2026) Figure 2
Figure 5: Miske et al. (2026) Figure 5

Miske et al. (2026)

We assessed 143 out of the 182 available datasets and found that 76.6 (53.6%, 95% CI=45.8–60.7%) papers were rated as precisely reproducible

Miske et al. (2026)

…and 105.0 (73.5%, 95% CI=66.4–80.0%) were rated as at least approximately reproducible

Miske et al. (2026)

…Implementation of measures to verify that research is reproducible is needed to support trustworthiness in the complex enterprise of knowledge production.

Tim Errington, Center for Open Science at the Open Science Bootcamp 2023

…The initial aim of the project was to repeat 193 experiments from 53 high-impact papers…However, the various barriers and challenges we encountered while designing and conducting the experiments meant that we were only able to repeat 50 experiments from 23 papers…

Errington, Denis, Perfito, Iorns, & Nosek (2021)

…the data needed to compute effect sizes and conduct power analyses was publicly accessible for just 4 of 193 experiments…none of the 193 experiments were described in sufficient detail in the original paper to enable us to design protocols to repeat the experiments…

Errington et al. (2021)

…While authors were extremely or very helpful for 41% of experiments, they were minimally helpful for 9% of experiments, and not at all helpful (or did not respond to us) for 32% of experiments

Errington et al. (2021)

…This experience draws attention to a basic and fundamental concern about replication – it is hard to assess whether reported findings are credible.

Errington et al. (2021)

  • Reproducibility in many fields is poor

giphy.com

Why we should care

“The first principle is that you must not fool yourself—and you are the easiest person to fool. So you have to be very careful about that. After you’ve not fooled yourself, it’s easy not to fool other scientists.”

Feynman (1974)

Richard P. Feynman, Wikipedia

Excerpt from “Monty Python & the Holy Grail”

Houses of straw, sticks, or

Three Little Pigs

Stone?

Pantheon

What’s your “bus number”?

  • Could your colleagues pick up where you left off?
  • Could your adviser?

Building for reproducibility

Capture the workflow

Figure 6: Play & Learning Across a Year (PLAY) project workflow, Soska et al. (2021) Figure 3
  • Humans
    • Checklists, Standard Operating Procedures (SOPs), Protocols
    • Quality Assurance (QA) checks
Figure 7: Gawande (2011)

Requirements for reproducibility

  • Humans
    • Checklists, Standard Operating Procedures (SOPs), Protocols
    • Quality Assurance (QA) checks
  • Computers
    • Data + code
    • Package/library version management
    • Testing (unit, regression, etc.)
    • Consistent random number seeds, e.g., set.seed()
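
The seed requirement above can be sketched in a few lines. The talk names R's set.seed(); this sketch uses Python's standard-library analogue, random.Random(seed). The function name bootstrap_mean, the toy scores, and the seed value are illustrative assumptions, not part of the talk.

```python
import random
import statistics


def bootstrap_mean(values, n_resamples=200, seed=None):
    """Return the mean of the means of bootstrap resamples of `values`."""
    # A local generator keeps the seed explicit and leaves global state alone.
    rng = random.Random(seed)
    resampled_means = []
    for _ in range(n_resamples):
        resample = [rng.choice(values) for _ in values]
        resampled_means.append(statistics.fmean(resample))
    return statistics.fmean(resampled_means)


scores = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3]  # hypothetical data

# Same data + same code + same seed -> same result.
first = bootstrap_mean(scores, seed=2026)
second = bootstrap_mean(scores, seed=2026)
print(first == second)  # True
```

Without a fixed seed, two runs of the same code on the same data would usually differ in the trailing digits, which is enough to break a "same results" check.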

Principles for reproducibility

  • DRY WIT
    • Don’t Repeat Yourself
    • Write It Down

Share your workflow

What useful things?

flowchart LR
  A(["Idea"]) --> B["Proposal"]
  B --> C("Project_1")
  B --> D("Project_2")
  C --> E["IRB_protocol"]
  D --> E
  C --> F["Lab_protocol"]
  D --> F
  C --> G["Data_pipeline_1"]
  D --> H["Data_pipeline_2"]
  B --> I["Conference_presentation"]
  F --> I
  G --> I
  C --> I
  I --> J["Journal_manuscript"]
  H --> K["Lab_mtg_report"]
  K --> J
  F --> J
Figure 10: A sample scholarly workflow.
flowchart TD
  A["Grant proposal"] --> B["IRB Protocol"]
  B --> C["Lab Protocol"]
  C -->|"updates"| B
  C --> D["Data pipeline"]
  A --> E["Data management & sharing plan"]
  E --> B & C & D
Figure 11: Dependencies among some standard research components.
flowchart TD
  A["Grant proposal"] --> B["IRB Protocol"]
  B --> C["Lab Protocol"]
  C --> B
  C --> D["Data pipeline"]
  A --> E["Data management & sharing plan"]
  E --> B & C & D
  A & E --> F["*.docx"]
  B --> F
  C --> F
  C --> G["*.pdf"]
  D --> nodeX:::hidden
  
  classDef hidden display:none
Figure 12: Dependencies among some standard research components with typical output formats.

One framework to rule them all…

  • Write text files
  • Render to
    • HTML
    • PDF
    • Word (docx), PowerPoint (pptx)

What Quarto supports

Things I’ve built

  • Blogs
  • Data cleaning, analysis, & visualization workflows
  • Course/workshop sites, one-off talks
  • Posters
  • Interactive dashboards

Why HTML?

  • Easily shared
  • Cheaply shared [1]
  • DRY WIT
  • Meet emerging accessibility standards
  • Machine-readable
    • Might help other humans

Case studies

Data pipeline

  • Gathering
  • Cleaning
  • Visualizing
  • Analyzing
  • Documenting/sharing
flowchart TD
  A["Gathering"] --> B["Cleaning"]
  B --> C["Visualizing"]
  C --> B
  C --> D["Analyzing"]
  D --> B
Figure 18: Steps in a typical data processing pipeline. Note that the process is usually iterative.
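
Because the pipeline is iterative, it helps to know which steps must be rerun after a change. The sketch below is one way to express that idea in Python, using only the forward edges of the figure above (the loops back to Cleaning are left out so the graph has a valid order). The function names run_order and steps_to_rerun are hypothetical; dependency-aware tools such as the {targets} package mentioned later handle this far more completely.

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it depends on.
dependencies = {
    "gathering": set(),
    "cleaning": {"gathering"},
    "visualizing": {"cleaning"},
    "analyzing": {"visualizing"},
}


def run_order(deps):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


def steps_to_rerun(deps, changed):
    """Return `changed` plus every step downstream of it, in run order."""
    stale = {changed}
    for step in run_order(deps):
        if deps[step] & stale:  # depends on something that is out of date
            stale.add(step)
    return [step for step in run_order(deps) if step in stale]


print(steps_to_rerun(dependencies, "cleaning"))
# ['cleaning', 'visualizing', 'analyzing']
```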

Quarto data pipeline

  • One Quarto document (*.qmd)
    • Separate sections for each step.
  • Or separate documents (e.g., Gather/Clean, Visualize).

Gathering: In general

  • Automate!
  • Application Programming Interfaces (APIs)
  • Leave source/raw data intact
  • Clean “downstream”
  • Automate cleaning
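
A minimal sketch of "leave raw data intact, clean downstream", assuming CSV input. The column name participant_id, the cleaning rules, and the function names are hypothetical; the point is only that the raw file is opened read-only and the cleaned copy goes to a different folder.

```python
import csv
from pathlib import Path


def clean_rows(rows):
    """Trim whitespace and drop rows that lack a participant ID."""
    cleaned = []
    for row in rows:
        tidy = {k.strip(): (v or "").strip() for k, v in row.items() if k is not None}
        if tidy.get("participant_id"):
            cleaned.append(tidy)
    return cleaned


def clean_file(raw_path, clean_dir):
    """Read a raw CSV (never modified) and write a cleaned copy to clean_dir."""
    raw_path, clean_dir = Path(raw_path), Path(clean_dir)
    if raw_path.parent.resolve() == clean_dir.resolve():
        raise ValueError("Write cleaned data to a different folder than the raw data.")
    with raw_path.open(newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        fieldnames = [name.strip() for name in (reader.fieldnames or [])]
        cleaned = clean_rows(reader)
    clean_dir.mkdir(parents=True, exist_ok=True)
    out_path = clean_dir / raw_path.name
    with out_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(cleaned)
    return out_path
```

A call such as clean_file("data/raw_data/survey.csv", "data/clean_data") would leave the raw file untouched; the data/clean_data folder name is an assumption.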

Gathering: In general

  • Credentials
    • Store securely
    • Not with data analyses!
    • Quarto projects: Locally in ~/.Renviron
  • Where to save/store data
  • Individual session/person or group
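
One way to keep credentials out of analysis files is to read them from environment variables at run time. In R, values stored in ~/.Renviron are read with Sys.getenv(); the sketch below shows the same pattern in Python with os.environ. The helper name get_credential and the variable name SURVEY_API_TOKEN are hypothetical.

```python
import os


def get_credential(name, environ=None):
    """Return a secret from the environment, or fail loudly if it is missing."""
    environ = os.environ if environ is None else environ
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set. "
            "Define it outside the project folder; never hard-code it in an analysis file."
        )
    return value


# Typical use (assumes the variable was set before the session started):
# token = get_credential("SURVEY_API_TOKEN")
```

Failing early with a clear message is a deliberate choice here: a missing credential should stop the pipeline rather than produce an empty or partial download.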

Gathering: Google Forms

  • Google Form exports to Google Sheet

Gathering: Google Forms

  • Google Sheet linked to Google account(s)

Gathering: Google Forms

  • Google Sheet has ID, file name, maybe multiple sheets

Gathering: Google Forms

  • Recommended practice:
    • Save raw data locally in data/raw_data
    • Save as tab- (.tsv) or comma-separated (.csv) text in ASCII or UTF-8 encoding
    • Use snake case (snake_case.csv) or hyphen-separated file names
      • NO SPACES OR SPECIAL CHARACTERS
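
The naming rule above can be automated. This is a small sketch, assuming that lowercase ASCII letters, digits, and underscores are the only characters wanted in the file name; the function name to_snake_case is hypothetical.

```python
import re
import unicodedata


def to_snake_case(file_name):
    """Convert a file name to lowercase snake_case, keeping its extension."""
    stem, dot, extension = file_name.rpartition(".")
    if not dot:  # no extension present
        stem, extension = file_name, ""
    # Drop accents, then replace every run of other characters with "_".
    ascii_stem = unicodedata.normalize("NFKD", stem).encode("ascii", "ignore").decode("ascii")
    snake = re.sub(r"[^A-Za-z0-9]+", "_", ascii_stem).strip("_").lower()
    return f"{snake}.{extension.lower()}" if extension else snake


print(to_snake_case("Survey Responses (Final).CSV"))  # survey_responses_final.csv
```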

Under the hood

Your turn

Project-level organization

Tips & tricks

  • Divide & conquer
    • Individual files
    • Small code chunks
    • Tell the story (be kind to your future, forgetful self)
    • Functions!

Tips & tricks

  • Use cross-referencing
    • Quarto can link to figures you’ve already generated, like Figure 12.
    • Great for HTML, but also works for Word & LaTeX/PDF docs.
  • Render; fix bugs; save.
  • Automate reference generation

FAQs

  • Do I have to use git or GitHub?
    • No
    • Especially not for identifiable data [2]
    • But free & easy web hosting is attractive
  • Can I use Quarto alongside…
    • Python via Jupyter notebooks: Yes
    • Shell scripts: Yes
    • LaTeX: Yes

FAQs

  • Can I use Quarto alongside…
    • MATLAB: No, but consider Octave
    • SPSS: Run syntax files via command line/terminal: template

Considerations

  • {targets} package
  • Write once, reuse via embedded files
  • Version control
  • Integration with other parts of your project
  • RStudio vs. Positron vs. VS Code

Considerations

  • Alternatives to Quarto
    • Jupyter notebooks [3]
    • Word docs
    • Google docs
    • Qualtrics + REDCap [4]

Wrap up

Principles

  • Don’t fool yourself!
  • DRY WIT
  • Automate
  • Show your work
  • Divide & conquer (bite-sized chunks)
  • Increase your bus number

Toward more reproducible scholarly workflows

  • Protocols
  • Data cleaning, analysis, visualization pipelines
  • Talks
  • Professional websites/blogs

Quarto (Part II): Making useful things reproducibly

Rick Gilmore
rog1 AT-SYMBOL psu PERIOD edu
114 Moore
github.com/gilmore-lab
github.com/psu-psychology
github.com/penn-state-open-science

Resources

About

This talk was produced using Quarto version 1.8.27, using the RStudio Integrated Development Environment (IDE), version 2026.4.0.526.

The source files were written in R and R Markdown, then rendered to HTML using the reveal.js framework. The HTML slides are hosted in a GitHub repo and served by GitHub Pages: https://penn-state-open-science.github.io/bootcamp-2026-quarto-II/

Packages

We used R v. 4.6.0 (R Core Team, 2026) and the following R packages: gt v. 1.3.0 (Iannone et al., 2026), kableExtra v. 1.4.0 (Zhu, 2024), qrcode v. 0.3.0 (Onkelinx & Teh, 2024), qualtRics v. 3.2.2 (Ginn, O’Brien, & Silge, 2025), renv v. 1.2.2 (Ushey & Wickham, 2026), rmarkdown v. 2.31 (Allaire et al., 2026; Xie, Allaire, & Grolemund, 2018; Xie, Dervieux, & Riederer, 2020), tidyverse v. 2.0.0 (Wickham et al., 2019).

References

Allaire, J., Xie, Y., Dervieux, C., McPherson, J., Luraschi, J., Ushey, K., … Iannone, R. (2026). rmarkdown: Dynamic documents for R. Retrieved from https://github.com/rstudio/rmarkdown
Errington, T. M., Denis, A., Perfito, N., Iorns, E., & Nosek, B. A. (2021). Challenges for assessing replicability in preclinical cancer biology. eLife, 10, e67995. https://doi.org/10.7554/eLife.67995
Feynman, R. P. (1974). Cargo cult science. Retrieved from https://calteches.library.caltech.edu/51/2/CargoCult.htm
Gawande, A. (2011). The checklist manifesto: How to get things right. New York, NY: St Martin’s Press.
Ginn, J., O’Brien, J., & Silge, J. (2025). qualtRics: Download Qualtrics survey data. https://doi.org/10.32614/CRAN.package.qualtRics
Iannone, R., Cheng, J., Schloerke, B., Haughton, S., Hughes, E., Lauer, A., … Roy, O. (2026). gt: Easily create presentation-ready display tables. https://doi.org/10.32614/CRAN.package.gt
Miske, O., Abatayo, A. L., Daley, M., Dirzo, M., Fox, N., Haber, N., … Errington, T. M. (2026). Investigating the reproducibility of the social and behavioural sciences. Nature, 652, 126–134. https://doi.org/10.1038/s41586-026-10203-5
Onkelinx, T., & Teh, V. (2024). qrcode: Generate QRcodes with R. Version 0.3.0. https://doi.org/10.5281/zenodo.5040088
R Core Team. (2026). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. https://doi.org/10.32614/R.manuals
Soska, K. C., Xu, M., Gonzalez, S. L., Herzberg, O., Tamis-LeMonda, C. S., Gilmore, R. O., & Adolph, K. E. (2021). (Hyper)active data curation: A video case study from behavioral science. Journal of eScience Librarianship, 10. https://doi.org/10.7191/jeslib.2021.1208
Ushey, K., & Wickham, H. (2026). renv: Project environments. https://doi.org/10.32614/CRAN.package.renv
Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L. D., François, R., … Yutani, H. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686
Xie, Y., Allaire, J. J., & Grolemund, G. (2018). R markdown: The definitive guide. Boca Raton, Florida: Chapman & Hall/CRC. Retrieved from https://yihui.org/rmarkdown/
Xie, Y., Dervieux, C., & Riederer, E. (2020). R markdown cookbook. Boca Raton, Florida: Chapman & Hall/CRC. Retrieved from https://yihui.org/rmarkdown-cookbook
Zhu, H. (2024). kableExtra: Construct complex table with kable and pipe syntax. https://doi.org/10.32614/CRAN.package.kableExtra

Footnotes

  1. GitHub Pages is free for public projects.

  2. Add private files to .gitignore.

  3. Python & Jupyter notebooks workshop Today @ 3 pm: details

  4. Alaina Pearce has implemented this.