RWorkflow

:bookmark_tabs: My approach to an analysis or product produced with R

Table Of Contents

SETUP

Return To Table Of Contents

Create a new package file

In RStudio: File --> New Project --> New/Existing Directory --> R Package

Fill out the DESCRIPTION file

Package: workflow
Title: A Robust workflow for software driven data analysis
Version: 0.0.1
Authors@R:
    person(given = "Bryan",
           family = "Jenks",
           role = c("aut", "cre"),
           email = "bryanjenks@protonmail.com",
           comment = c(ORCID = "0000-0002-9604-3069"))
Description: A robust document for discussing a great way to structure analysis.
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true

The license details are written out in the LICENSE file. For markdown support in the roxygen2 documentation, also add the field Roxygen: list(markdown = TRUE).
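
As a quick sanity check, a DESCRIPTION file can be read back with base R's read.dcf(). The sketch below writes a cut-down copy of the fields above to a temporary file and confirms that a few expected fields are present. The temporary path and the `required` checklist are illustrative assumptions, not part of the original workflow.

```r
# Write a cut-down DESCRIPTION to a temporary file (illustration only)
desc_path <- tempfile("DESCRIPTION")
writeLines(c(
  "Package: workflow",
  "Title: A Robust workflow for software driven data analysis",
  "Version: 0.0.1",
  "Encoding: UTF-8",
  "LazyData: true"
), desc_path)

# read.dcf() parses the "Field: value" format into a character matrix
fields <- read.dcf(desc_path)

# Hypothetical checklist of fields we expect to have filled in
required <- c("Package", "Title", "Version")
missing_fields <- setdiff(required, colnames(fields))

if (length(missing_fields) > 0) {
  stop("Missing DESCRIPTION fields: ", paste(missing_fields, collapse = ", "))
}
fields[, "Version"]
```

In a real project you would point read.dcf() at the DESCRIPTION file in the package root instead of a temporary file.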

Use the package loading script

This script loops over a vector of package names, installs whatever isn't already installed, and loads everything so it is available for the RMarkdown product.

packages <- c("tidyverse", "here", "todor", "lintr", "DT", "kableExtra", "roxygen2", "testthat", "usethis", "devtools", "tidylog")
xfun::pkg_attach2(packages, message = FALSE)
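
For readers who want to see the idea without xfun, here is a minimal base R sketch of an "install what is missing, then attach" loop. The helper name `attach_or_install` is hypothetical, and this is only an approximation of the behaviour described above, not xfun's actual implementation.

```r
# Hypothetical helper: install missing packages, then attach all of them.
attach_or_install <- function(pkgs) {
  # Which packages cannot be found in the current library paths?
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  missing <- pkgs[!installed]

  if (length(missing) > 0) {
    install.packages(missing)
  }

  # Attach each package; the result records whether attaching succeeded
  attached <- vapply(pkgs, function(p) {
    suppressPackageStartupMessages(
      require(p, character.only = TRUE, quietly = TRUE)
    )
  }, logical(1))

  invisible(attached)
}

# Usage with packages that ship with R, so nothing is downloaded
status <- attach_or_install(c("stats", "utils"))
```

The returned logical vector is named by package, which makes it easy to spot anything that failed to attach.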

If performing a reproducible analysis, use packrat to snapshot the packages / libraries you use.

# set up a packrat snapshot in your new package/project
packrat::init(here::here())
# to add a package to your project in this snapshot environment, install it as normal:
install.packages("runes")
# when you're ready to save your snapshot to packrat for your reproducible project:
packrat::snapshot()
# to check the status of your snapshot
packrat::status()
# to remove a package from your snapshot
remove.packages("runes")
# and to restore one
packrat::restore()
# if packages are not used:
# Use packrat::clean() to remove them. Or, if they are actually needed
# by your project, add `library(packagename)` calls to a .R file
# somewhere in your project.

RStudio also offers plenty of GUI options for working with packrat.

TODO management

If you have multiple files or a large RMarkdown document with commented <!-- TODO/BUG/FIXME/HACK --> items and want to see where all of them are, use the todor package with the following snippet:

# Create a vector of the R Markdown document paths in the current directory
# (combine with the here package for project-relative paths)
# This is great for multiple R Markdown documents
# Note: `pattern` is a regular expression, not a glob
docs <- list.files(pattern = "\\.Rmd$")
todor::todor(file = docs)

# A less hacky way of checking a whole PACKAGE for TODOs is the built-in function:
todor::todor_package()
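
To make clear what such a scan boils down to, here is a rough base R sketch that searches files for marker words and reports the file, line number, and text. The function name `find_markers` is hypothetical, and this is not how todor is implemented; todor additionally reports results in the RStudio Markers pane.

```r
# Hypothetical helper: list lines that contain TODO-style markers.
find_markers <- function(files, markers = c("TODO", "BUG", "FIXME", "HACK")) {
  # Match any marker as a whole word
  pattern <- paste0("\\b(", paste(markers, collapse = "|"), ")\\b")

  hits <- lapply(files, function(f) {
    lines <- readLines(f, warn = FALSE)
    idx <- grep(pattern, lines, perl = TRUE)
    data.frame(
      file = rep(f, length(idx)),
      line = idx,
      text = trimws(lines[idx]),
      stringsAsFactors = FALSE
    )
  })

  do.call(rbind, hits)
}

# Usage on a throwaway file
tmp <- tempfile(fileext = ".Rmd")
writeLines(c(
  "# Report",
  "<!-- TODO describe the data source -->",
  "Some text.",
  "x <- 1 # FIXME use the real value"
), tmp)
found <- find_markers(tmp)
```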

Create Data directory

Create the Data/ directory to hold raw data files that will be cleaned and processed by R scripts in the R/ directory for the RMarkdown document when sourced.

To save tibbles or other data from R that have already been tidied without losing their column specifications (e.g. so that a factor column stays a factor rather than coming back as <chr>), use the {feather} package.

# <x> is the tibble to save and <path> is the target file path (placeholders)
feather::write_feather(<x>, <path>)
feather::read_feather(<path>)
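
The motivation is easier to see with a round trip. The sketch below compares a plain-text CSV with a binary format; it deliberately uses base R's saveRDS()/readRDS() as a stand-in for feather so that it runs without extra packages, and the example data frame is made up for illustration.

```r
# Example data with a factor column that has an unused level "c"
df <- data.frame(
  id = 1:3,
  group = factor(c("a", "b", "a"), levels = c("a", "b", "c"))
)

csv_path <- tempfile(fileext = ".csv")
rds_path <- tempfile(fileext = ".rds")

# Plain text: column types have to be guessed again on import
write.csv(df, csv_path, row.names = FALSE)
from_csv <- read.csv(csv_path)

# Binary format (stand-in for feather): column types are stored with the data
saveRDS(df, rds_path)
from_rds <- readRDS(rds_path)

levels(from_csv$group)  # the original level set is not recovered
levels(from_rds$group)  # the original levels, including the unused one
```

The same reasoning applies to feather: the column specification travels with the file, so the cleaning step does not have to be repeated after every read.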

Update .Rbuildignore

When building a package for installation and reproducibility, have the build process ignore files, directories, and anything else it shouldn't touch by listing them in .Rbuildignore.
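
Each line of .Rbuildignore is a regular expression that is matched against file paths relative to the package root. The sketch below approximates that matching in base R so you can preview which paths a set of patterns would exclude. Both the patterns and the file list are illustrative assumptions, not a recommended configuration.

```r
# Illustrative patterns, written the way they would appear in .Rbuildignore
ignore_patterns <- c(
  "^.*\\.Rproj$",
  "^\\.Rproj\\.user$",
  "^packrat$",
  "^README\\.Rmd$"
)

# Hypothetical top-level paths in the project
paths <- c(
  "workflow.Rproj", ".Rproj.user", "packrat",
  "R/clean_data.R", "DESCRIPTION", "README.Rmd"
)

# A path is ignored when at least one pattern matches it
is_ignored <- vapply(paths, function(p) {
  any(vapply(ignore_patterns, grepl, logical(1),
             x = p, perl = TRUE, ignore.case = TRUE))
}, logical(1))

kept <- paths[!is_ignored]
kept
```

If you use usethis, usethis::use_build_ignore() can add entries to the file and escape them for you.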

OPTIONAL

If keeping the package in Git version control, also update the .gitignore.

Ethics

"deon is a command line tool that allows you to easily add an ethics checklist to your data science projects. The conversation about ethics in data science, machine learning, and AI is increasingly important. The goal of deon is to push that conversation forward and provide concrete, actionable reminders to the developers that have influence over how data science gets done."

deon

ANALYSIS

Begin Writing Your Content

In your RMarkdown document you can begin filling in your content with whatever template or style you prefer. There are many ways to convey the results and workflow of your analysis: a package, a single standalone RMarkdown document, a bookdown book, HTML output only, and so on. There are countless ways to perform an analysis; this document covers only some of the more common parts of the workflow, with the nuances left to personalization and preference.

Never use require() or library() in a packaged analysis. Instead, list the packages you depend on in the DESCRIPTION file under Imports or Suggests.
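
In practice this means calling functions with an explicit namespace prefix inside your package code. A minimal sketch, assuming a hypothetical helper `summarise_column` and that stats is listed under Imports in DESCRIPTION:

```r
# DESCRIPTION would contain (assumption for this sketch):
#   Imports: stats

# Hypothetical function in R/summarise_column.R.
# No library(stats) call: each function is referenced as package::function.
summarise_column <- function(x) {
  c(
    mean   = mean(x, na.rm = TRUE),
    median = stats::median(x, na.rm = TRUE),
    sd     = stats::sd(x, na.rm = TRUE)
  )
}

result <- summarise_column(c(1, 2, 3, NA))
```

The explicit prefix documents where each function comes from and avoids changing the search path of whoever loads your package.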

For local file management in the .Rproj project directory, I and many others prefer the here package, which treats the project root directory as the base for relative paths to other files in your package.
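
A short sketch of how this looks in practice. The file names are hypothetical, and the snippet assumes the here package is installed and that the code runs inside the project.

```r
# Paths are resolved from the project root, not the current working directory.
# The file names below are hypothetical.
raw_path <- here::here("Data", "raw_data.csv")
script_path <- here::here("R", "clean_data.R")

# Only read/source the files if they actually exist in this project
if (file.exists(raw_path)) {
  raw_data <- read.csv(raw_path)
}
if (file.exists(script_path)) {
  source(script_path)
}
```

Because the paths are built from the project root, the same code works whether it is run from the console, from a script in R/, or while knitting an RMarkdown document in a subdirectory.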

Visualization

Three great RStudio addins for graphically creating and editing plots and visualizations without having to type all the code from scratch:

  • esquisse --- Initial plot creation to minimize boiler plate writing
  • ggedit --- Editing created plots graphically
  • colourpicker --- Custom color code pickers for themes and general use

Create New R Functions as Needed

Put functions into separate R script files in R/, and if there are a lot of functions, group related files with a shared filename convention, e.g. AAA_Function.R.

Write Unit Tests

To start using unit tests, run usethis::use_testthat() (this function used to live in devtools as devtools::use_testthat()).

To run all current tests, press Ctrl + Shift + T or run devtools::test().
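
A minimal sketch of what a test file can look like, assuming the testthat package is installed. The function `standardize_names` and both file names are hypothetical and only serve as something to test.

```r
# Hypothetical function that would live in R/standardize_names.R
standardize_names <- function(x) {
  tolower(gsub("[^A-Za-z0-9]+", "_", trimws(x)))
}

# Hypothetical test file: tests/testthat/test-standardize_names.R
testthat::test_that("standardize_names() produces lower snake case", {
  testthat::expect_equal(standardize_names(" First Name"), "first_name")
  testthat::expect_equal(standardize_names("Unit-Price"), "unit_price")
  testthat::expect_length(standardize_names(character(0)), 0)
})
```

Inside a real package the function definition stays in R/ and only the test_that() block goes in the test file, since devtools::test() loads the package code before running the tests.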

Test Fix Iterate

Run your tests on your developing functions and fix any ERRORS, WARNINGS, or NOTES that come up.

To find answers to your errors, you can use the tracestack package to look up the last error message on Stack Overflow.

Document Completed R Functions

Use roxygen2 documentation on all function script files in R/:

  • First line: Title
  • Second line: Description
  • Subsequent lines: Details

A link to Cheat Sheet Documentation

Bare Bones Template:

#' @title       # This is the name of your function
#' @description # This is a good explanation of your function
#' @details     # This is each granular detail of your function (there can be multiple of these sections)
#' @param       # This is a parameter of your function
#' @return      # This is what your function returns
#' @export      # This is how your function gets exported to the NAMESPACE and is available for use after library(); otherwise you use :::
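
Filled in, the template might look like the sketch below. The function `rescale01` is a made-up example, not part of the original project; the tags are standard roxygen2 tags.

```r
#' @title Rescale a numeric vector to the unit interval
#' @description Linearly rescales x so that its minimum maps to 0 and its
#'   maximum maps to 1.
#' @details Missing values are ignored when finding the range and are
#'   returned as NA.
#' @param x A numeric vector.
#' @return A numeric vector of the same length as x, with values between
#'   0 and 1.
#' @examples
#' rescale01(c(2, 4, 6))
#' @export
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}
```

Each @param line names the argument first and then describes it, and the roxygen block sits directly above the function it documents.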

Documentation Info

Compile Your Documentation

Run devtools::document() (or press Ctrl + Shift + D in RStudio) to compile your documents into function documentation that appears in the man/ directory and the NAMESPACE that contains all @export functions.

MODELING

Beyond just calling lm(), you can create a model object with model <- lm(var1 ~ var2 + var4, data) and then pass that model object to

performance::check_model(model)

The output is graphical and very useful, though it is a bit slow.
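
A small worked sketch on the built-in mtcars data. The choice of variables is purely illustrative, and the diagnostic call is guarded so the snippet still runs when the performance package is not installed or the session is not interactive.

```r
# Fit a linear model on a built-in data set (variables chosen for illustration)
model <- lm(mpg ~ wt + hp, data = mtcars)

# Numeric summary of the fit
coef(model)
summary(model)$r.squared

# Graphical diagnostics; needs the performance package and can take a while
if (interactive() && requireNamespace("performance", quietly = TRUE)) {
  performance::check_model(model)
}
```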

<!-- TODO This section needs to be fleshed out more with more info on modeling -->

REFERENCES

Writing a bibliography for your R packages

# automatically create a bib database for R packages
knitr::write_bib(c(
  .packages(), packages #this is made in the lib loading section
), 'packages.bib')

In the YAML portion of the RMarkdown document you can use a YAML array to list multiple .bib files: one solely for your R packages, generated by the code chunk above, and another for any other cited sources you compile manually or otherwise. Like so:

bibliography: [cited.bib, packages.bib]

and for packages, you can use this yaml trick to have all non-inline citations i.e. the R packages used, imm
