We can create automated HTML reports using R Markdown. These reports are highly customisable and interactive, and they can be viewed on any device with a browser. They present an opportunity to engage everyday people, staff, and decision makers with data through storytelling.
You’ve probably noticed we don’t write on clay tablets anymore. That’s not to say there aren’t advantages to that medium (longevity being one of them), but we stopped using clay tablets because it was much easier to transfer written information using paper. Nowadays, we are increasingly turning from paper documents to digital documents for similar reasons.
One big advantage of digital documents over clay tablets and paper is that the information within them can be updated easily. Despite advances in technology, many (if not most) organisations are not taking advantage of the opportunity to create living, automated reports. Such reports have the potential to bridge the gap between the people undertaking the analyses and the people who make the decisions. In particular, because these reports can combine a short written narrative with interactive graphics, we can employ a ‘show, don’t tell’ form of storytelling.
We can make such reports today using R Markdown (a lightweight markup format that combines text and R code, available in RStudio). These reports are highly customisable, and can also be rendered in PDF or Word formats. The downside of those formats is that they are static rather than interactive. However, we can maximise interactivity by rendering these documents in HTML format (like the one you are reading now). HTML documents can be read on any device with a browser and can also be inserted directly into a website. In the following sections I demonstrate some of the features of an HTML R Markdown document created in RStudio.
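As a minimal sketch of what rendering looks like in practice, the code below writes a tiny R Markdown file to a temporary folder and renders it to HTML with `rmarkdown::render()`. The file name `report.Rmd` and its contents are placeholders invented for illustration, and the sketch assumes the rmarkdown package and pandoc (bundled with RStudio) are installed.

```r
library(rmarkdown)

# Hypothetical minimal source file, written to a temporary folder
rmd <- file.path(tempdir(), "report.Rmd")
writeLines(c(
  "---",
  "title: \"Example report\"",
  "output: html_document",
  "---",
  "",
  "This report was rendered on `r Sys.Date()`."
), rmd)

# Render to HTML; render() returns the path of the file it created
out <- render(rmd, output_format = "html_document", quiet = TRUE)
```

Swapping `"html_document"` for `"pdf_document"` or `"word_document"` would produce the static formats instead (PDF output also needs a LaTeX installation).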
It is easy to include static images, interactive plots or tables in the report. Widgets (such as buttons and sliders) can also be attached to allow the user to explore or download the data. Given that graphic elements can often overwhelm the look of a report, we can also organise them behind tabs:
Hover over or click top right of graph to activate menu
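The elements described above could be built along the following lines. This is only a sketch: it assumes the plotly and DT packages (the original report may well use other htmlwidgets packages) and borrows the penguin data used later in this post.

```r
library(ggplot2)
library(plotly)
library(DT)
library(palmerpenguins)

peng_complete <- na.omit(penguins)

# A static ggplot converted into an interactive plot with a hover menu
static_plot <- ggplot(peng_complete,
                      aes(x = bill_length_mm, y = body_mass_g, colour = species)) +
  geom_point()
interactive_plot <- ggplotly(static_plot)

# A searchable table with buttons for copying or downloading the data
interactive_table <- datatable(
  peng_complete,
  extensions = "Buttons",
  options = list(dom = "Bfrtip", buttons = c("copy", "csv", "excel"))
)
```

Printing `interactive_plot` or `interactive_table` inside a code chunk places the widget in the HTML report. Tabs themselves need no R code: adding `{.tabset}` to a heading turns each sub-heading beneath it into a tab.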
One of the great advantages of report making in R Markdown is that the analysis is actually embedded in the document. If you are writing a technical report you may want to show the underlying code and its output (such as the dinosaur animation below). Given that many readers may be uninterested in the code, it can be hidden from ordinary view and revealed when required by clicking the Code button:
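library(datasauRus)  # provides the datasaurus_dozen data
library(ggplot2)
library(gganimate)

# Animate the points as they move from one dataset to the next
ggplot(datasaurus_dozen, aes(x = x, y = y)) +
  geom_point(size = 4, alpha = 0.5, colour = "#149414") +
  theme_void() +
  transition_states(dataset, transition_length = 3, state_length = 1) +
  ease_aes("cubic-in-out")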
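The hide-and-reveal behaviour of the code comes from the `code_folding` option of the HTML output format. A minimal sketch, assuming the rmarkdown package is installed and using a hypothetical file name:

```r
library(rmarkdown)

# Output format in which every code chunk is collapsed behind a "Code" button
folded_html <- html_document(code_folding = "hide")

# "report.Rmd" is a hypothetical file name; uncomment to render it
# render("report.Rmd", output_format = folded_html)
```

The same setting can be written in the document's YAML header as `code_folding: hide` under `html_document`, which is usually the more convenient place for it.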
One important part of scientific reporting is the integration of outputs into text. For example, we may have worked out the average weights of penguins (and their variability) but want to present this as text rather than in a table. The advantage of embedding these results using code is that we don’t have to rewrite the passage whenever the data changes, because the values update automatically. It also eliminates the possibility of typographical mistakes.
library(palmerpenguins)
library(tidyverse)

# Mean and standard deviation of body mass per species, rounded to the nearest 10 g
peng <- na.omit(penguins) %>%
  group_by(species) %>%
  summarise(average.weight = round(mean(body_mass_g), -1),
            sd = round(sd(body_mass_g), -1))
Gentoo penguins are the heaviest of the **`r nrow(peng)`** species with an average weight (mean $\pm$ standard deviation) of **`r prettyNum(peng[which(peng$species == "Gentoo"),"average.weight"], big.mark = ",", preserve.width = "none")`** $\pm$ **`r peng[which(peng$species == "Gentoo"),"sd"]` g** compared to an average weight of **`r prettyNum(peng[which(peng$species == "Adelie"),"average.weight"], big.mark = ",", preserve.width = "none")`** $\pm$ **`r peng[which(peng$species == "Adelie"),"sd"]` g** for Adelie penguins and **`r prettyNum(peng[which(peng$species == "Chinstrap"),"average.weight"], big.mark = ",", preserve.width = "none")`** $\pm$ **`r peng[which(peng$species == "Chinstrap"),"sd"]` g** for Chinstrap penguins.
By using R code embedded in the R Markdown text we can automate the values (shown in bold):
“Gentoo penguins are the heaviest of the 3 species with an average weight (mean ± standard deviation) of 5,090 ± 500 g compared to an average weight of 3,710 ± 460 g for Adelie penguins and 3,730 ± 380 g for Chinstrap penguins.”
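Long inline expressions like the ones above can become hard to read. One option, shown here as a sketch rather than as part of the original report, is to wrap the formatting in a small helper function. The name `fmt_mean_sd` and its arguments are made up for this example.

```r
# Hypothetical helper: format a mean and standard deviation as "mean ± sd unit"
fmt_mean_sd <- function(avg, sdev, unit = "g") {
  paste0(
    format(avg, big.mark = ",", trim = TRUE),
    " \u00b1 ",                      # the plus-minus sign
    format(sdev, big.mark = ",", trim = TRUE),
    " ", unit
  )
}

# Using the rounded Gentoo values reported above
fmt_mean_sd(5090, 500)
```

An inline call such as `fmt_mean_sd(gentoo_mean, gentoo_sd)`, where the two arguments are values pulled from the summary table, would then replace the longer `prettyNum()` expressions.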
If you are preparing a technical document for a scientific audience (e.g. a scientific paper) you may have to include some mathematical equations. Typically, these equations are typeset manually, which makes them prone to human error. But thanks to the R package equatiomatic, standard mathematical equations can be generated directly from your analysis. Below is the equation of a logistic regression model that predicts the sex of a penguin from its species and bill length:
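library(equatiomatic)

# Logistic regression: sex as a function of species and bill length
pr <- glm(sex ~ species + bill_length_mm,
          data = penguins,
          family = binomial(link = "logit"))

# Generate the LaTeX equation directly from the fitted model
extract_eq(pr, wrap = TRUE, terms_per_line = 2)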
\[ \begin{aligned} \log\left[ \frac { P( \operatorname{sex} = \operatorname{male} ) }{ 1 - P( \operatorname{sex} = \operatorname{male} ) } \right] &= \alpha + \beta_{1}(\operatorname{species}_{\operatorname{Chinstrap}})\ + \\ &\quad \beta_{2}(\operatorname{species}_{\operatorname{Gentoo}}) + \beta_{3}(\operatorname{bill\_length\_mm}) \end{aligned} \]
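The equation above is written on the log-odds scale. To read it as a probability, the logit can be inverted. Writing \(\eta\) as shorthand for the right-hand side (this symbol is introduced here for convenience and does not come from equatiomatic):

\[ \begin{aligned} \eta &= \alpha + \beta_{1}(\operatorname{species}_{\operatorname{Chinstrap}}) + \beta_{2}(\operatorname{species}_{\operatorname{Gentoo}}) + \beta_{3}(\operatorname{bill\_length\_mm}) \\ P( \operatorname{sex} = \operatorname{male} ) &= \frac{1}{1 + e^{-\eta}} \end{aligned} \]

So a larger value of \(\eta\) corresponds to a higher predicted probability that a penguin is male.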
R Markdown is an easy way to automate the production of shareable, interactive reports. Because the reports are written in code, they are also easily reproducible. Furthermore, because these reports can give readers access to the code and the underlying data sources, they allow for a high level of transparency.