R Markdown is a built-in feature of RStudio. It integrates plain text
with chunks of R code into a single file, which is extremely useful
when constructing class notes or building a website. A .rmd file can be
compiled into a nice-looking .html, .pdf, or .docx file. For example,
this entire guide was created using R Markdown. With RStudio, you can
install R Markdown from the R console using the following code. Note
that this should be done automatically the first time you create and
compile a .rmd file in RStudio.
# Install R Markdown from CRAN
install.packages("rmarkdown")
Again there are many online guides for R Markdown, and these may not be the best ones.
To get started, create an R Markdown template file by clicking
File -> New File -> R Markdown...
You can then Knit the template file and start to explore its features.
Please note that this guide is provided in .html format. However, your
homework report should be in .pdf format. This can be done by selecting
the Knit to PDF option from the Knit button.
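The Knit to PDF button can also be reproduced from the R console. The following is only a sketch: the file name homework_example.Rmd and its contents are made up for illustration, and the render step is skipped unless rmarkdown, pandoc, and a LaTeX installation (detected here by looking for pdflatex) are all available.

```r
# Write a tiny, hypothetical .Rmd file to a temporary directory
fence = strrep("`", 3)  # the three backticks that open and close a chunk
rmd_path = file.path(tempdir(), "homework_example.Rmd")
writeLines(c(
  "---",
  'title: "Homework Example"',
  "output: pdf_document",
  "---",
  "",
  "Some text, followed by a code chunk.",
  "",
  paste0(fence, "{r}"),
  "summary(cars)",
  fence
), rmd_path)

# Render only when the required tools can be found
can_render = requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available() &&
  nzchar(Sys.which("pdflatex"))

if (can_render) {
  # render() returns the path of the generated file
  pdf_path = rmarkdown::render(rmd_path, output_format = "pdf_document", quiet = TRUE)
  print(pdf_path)
} else {
  message("Skipping render: rmarkdown, pandoc, or LaTeX was not found.")
}
```

Replacing "pdf_document" with "html_document" would produce an .html file from the same source.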
You may use the Homework template files from Week 1 as an example to
render a .Rmd file into .html or .pdf files, and modify it along the
way. Alternatively, you may use the .Rmd file that generated this
document. It should be read alongside the rendered .html to best
understand how everything works.
Formatting text is easy. Bold can be done using ** or __ before and
after the text. Italics can be done using * or _ before and after the
text. For example,
**This is bold.** *This is italics.* and ***this is bold italics.***
`This text appears as monospaced.`
We could mix lists and links. Note that a link can be constructed in
the format [display text](http link). If colors are desired, we can
customize it using, for example,
[\textcolor{blue}{display text}](http link). But this only works in
.pdf format. For .html, use <span style="color: red;">text</span>.
Tables are sometimes tricky using Markdown. See the above link for a helpful Markdown table generator.
A | B | C |
---|---|---|
1 | 2 | 3 |
Do | Re | Mi |
R
So far we have only used Markdown to create .html. This is useful by
itself, but the real power of RMarkdown comes when we add R. There are
two ways we can do this. We can use R code chunks, or run R inline. An
R chunk starts with ```{r} and ends with ```. Within each code chunk,
it is the same as writing and executing R code in the R console. Keep
in mind that the underlying environment is shared across different R
chunks, hence if you make changes to an object in one chunk, it will be
reflected in others (later ones, if you run the chunks in sequence).
R Chunks
The following is an example of an R code chunk:
# define function
get_sd = function(x, biased = FALSE) {
  n = length(x) - 1 * !biased
  sqrt((1 / n) * sum((x - mean(x)) ^ 2))
}
# generate random sample data
set.seed(42)
(test_sample = rnorm(n = 10, mean = 2, sd = 5))
## [1] 8.8547922 -0.8234909 3.8156421 5.1643130 4.0213416 1.4693774 9.5576100 1.5267048 12.0921186 1.6864295
# run function on generated data
get_sd(test_sample)
## [1] 4.177244
There is a lot going on here. In the .Rmd file, notice the syntax that
creates and ends the chunk. Everything between the start and end syntax
must be valid R code. In this example, we define a function, generate
some random data in a reproducible manner, display the data, then run
our function.
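As a quick sanity check, the result can be compared with R's built-in sd(), which uses the same n - 1 denominator. The sketch below repeats the function definition so that it runs on its own. The names unbiased_sd and biased_sd are introduced here only for illustration.

```r
# Same function as in the chunk above
get_sd = function(x, biased = FALSE) {
  n = length(x) - 1 * !biased
  sqrt((1 / n) * sum((x - mean(x)) ^ 2))
}

# Same reproducible sample as above
set.seed(42)
test_sample = rnorm(n = 10, mean = 2, sd = 5)

unbiased_sd = get_sd(test_sample)                 # divides by n - 1
biased_sd   = get_sd(test_sample, biased = TRUE)  # divides by n

# The default should agree with the built-in sd()
all.equal(unbiased_sd, sd(test_sample))

# The biased version is the root of the mean squared deviation
all.equal(biased_sd, sqrt(mean((test_sample - mean(test_sample)) ^ 2)))
```

Both comparisons should evaluate to TRUE, and the biased estimate is always the smaller of the two because it divides by the larger number.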
R
R can also be run in the middle of the exposition. For example, the
mean of the data we generated is 4.7364838.
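To make the inline syntax concrete: an inline expression is written as a single backtick, the letter r, a space, the R expression, and a closing backtick. The sketch below shows how the sentence above might look in the .Rmd source (as a comment) and evaluates the same expression directly. The rounded variant is an optional suggestion, not something the original document uses.

```r
# Same reproducible sample as in the chunk example
set.seed(42)
test_sample = rnorm(n = 10, mean = 2, sd = 5)

# In the .Rmd source, the sentence could be written as:
#   ... the mean of the data we generated is `r mean(test_sample)`.
# When knitting, the inline expression is replaced by its value.
sample_mean = mean(test_sample)
sample_mean

# Rounding inside the inline expression keeps the prose tidy:
#   `r round(mean(test_sample), 2)`
round(sample_mean, 2)
```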
Whenever R code is run, there is always a current working directory.
This allows for relative references to external files, in addition to
absolute references. Since the working directory when knitting a file
is always the directory that contains the .Rmd file, it can be helpful
to set the working directory inside RStudio to match while working
interactively.
If you are using the most recent version of RStudio, then the working
directory is automatically set to the folder that contains the .Rmd
file from which you launched RStudio. Hence, you should not need to
worry about setting it manually.
However, if you ever need to change it, select
Session > Set Working Directory > To Source File Location
while editing a .Rmd file. This will set the working directory to the
path that contains the .Rmd. You can also use getwd() and setwd() to
inspect and change your working directory programmatically. These
should only be used interactively. Using them inside an RMarkdown
document would likely reduce reproducibility.
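For interactive use, the relationship between a relative reference and the working directory can be sketched as follows. The path data/skincancer.txt is a hypothetical example and is not assumed to exist on your machine.

```r
# Where R currently resolves relative paths from (interactive use only)
wd = getwd()

# A hypothetical relative reference to an external file
rel_path = file.path("data", "skincancer.txt")

# The absolute path that the relative reference corresponds to
abs_path = file.path(wd, rel_path)
print(abs_path)

# Check whether the file is actually there before trying to read it
file.exists(rel_path)
```

Because knitting always starts from the folder containing the .Rmd file, a relative path like this keeps working when the whole project folder is moved or shared, which an absolute path would not.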
The following generates a boring plot, which displays skin cancer mortality against latitude:
library(readr)
example_data = read_table("https://teazrq.github.io/stat432/data/skincancer.txt")
plot(Mort ~ Lat, data = example_data)
In our R introduction, we used ggplot2 to create a more interesting
plot. You may also polish a plot with basic functions. Notice it is
huge in the resulting document, since we have modified some chunk
options (fig.height = 3.5, fig.width = 3.5) in the RMarkdown file to
manipulate its size.
plot(Mort ~ Lat, data = example_data,
     xlab = "Latitude",
     ylab = "Skin Cancer Mortality Rate",
     main = "Skin Cancer Mortality vs. State Latitude",
     pch  = 19,
     cex  = 1.5,
     col  = "deepskyblue")
But you can also notice that the labels and the plot become
disproportionate when the figure size is set too small. This can be
resolved by enlarging the original figure size and then using a scaling
option such as out.width = '40%'. We also align the figure at the
center using fig.align = 'center'.
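In this guide the options are set in the header of an individual chunk. As an alternative sketch, the same options can be set once as defaults for all chunks with knitr::opts_chunk$set(), typically in a setup chunk near the top of the .Rmd file. The figure size of 6 by 6 inches is an illustrative assumption. Only out.width = '40%' and fig.align = 'center' come from the text above.

```r
library(knitr)

# Defaults applied to every later chunk unless that chunk overrides them
opts_chunk$set(
  fig.height = 6,         # size (inches) at which the plot is drawn
  fig.width  = 6,
  out.width  = "40%",     # size at which it is displayed in the output
  fig.align  = "center"   # center the figure on the page
)

# Inspect the current defaults
opts_chunk$get(c("fig.height", "fig.width", "out.width", "fig.align"))
```

Drawing the figure large and then scaling it down with out.width keeps the labels in proportion to the plotting area.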
You can also write formulas into the file using LaTeX code. For example, \(Y = X \beta + \epsilon\). This requires installing TinyTeX if you don’t have LaTeX already:
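# Install the tinytex R package from CRAN
install.packages("tinytex")
# Then install the TinyTeX LaTeX distribution itself
tinytex::install_tinytex()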
Inline LaTeX should start and end with a $ sign, while an equation
displayed on its own line should start and end with $$. For example:
\[ \widehat\beta = \underset{\beta}{\arg\min} \frac{1}{n} \sum_{i=1}^n (y_i - x_i^T \beta)^2\]
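As a sketch of how the two formulas in this section might be typed in the .Rmd source, using the $ and $$ delimiters just described (the exact source of this guide is not reproduced here, so the layout is an assumption):

```latex
For example, $Y = X \beta + \epsilon$.

$$
\widehat\beta = \underset{\beta}{\arg\min} \frac{1}{n} \sum_{i=1}^n (y_i - x_i^T \beta)^2
$$
```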
At the beginning of the document, there is a header that describes some metadata and settings of the document. For this file, the header is rather complicated. However, the following header would be sufficient for you to start your own:
title: "RMarkdown Template"
author: "Your Name"
date: "`r format(Sys.time(), '%B %d, %Y')`"
output:
  html_document:
    toc: yes
This describes the output format as .html, defines the theme, and the
toc option tells R to automatically create a Table of Contents based on
the headers and sub-headers you have defined using #. You can remove
this line if that’s not what you need.
You can edit this yourself, or click the settings button at the top of
the document and select Output Options... Here you can explore other
themes and syntax highlighting options, as well as many additional
options. Using this method will automatically modify this information
in the document.
Be sure to play with this document! Change it. Break it. Fix it. The best way to learn RMarkdown (or really almost anything) is to try, fail, then find out what you did wrong.
RStudio has provided a number of beginner tutorials which have been greatly improved recently and detail many of the specifics potentially not covered in this document. RMarkdown is continually improving, and this document covers only the very basics.