If you are reading this document as a web page, not a pdf, and prefer to read in Dark Mode, click the slider button in the top-right corner.
Prelude
The “for All” in the title of this document means what it says: the document is for anyone who writes. We discuss not just how to write computer code for data analysis with R and Python but also how to write documents in general. Quarto document processing is revolutionary and accessible, as we will see.
To use quarto within the RStudio environment on your computer instead of the cloud, you will need to download R and RStudio. You do not have to know much of anything about R to write documents with quarto. However, if you wish to learn about R for data analysis with RStudio, begin with these directions. If you are interested in Python and you need Python downloaded to your computer, try the Anaconda distribution.
If you have no interest in Python for data analysis, skip the next section and go straight to Quarto Markdown. Nothing requires you to use Python, R, or any other data analysis system when writing documents with quarto. Sometimes we write documents to display and describe our computer output, and sometimes we just write documents to say stuff.
Python in RStudio
See integrated for more details, but the basics are described below. Python integration in R is accomplished with the reticulate package, which provides the functions for interfacing Python with R and RStudio.
First, download the package from the R servers with the same procedure used for any contributed R package.
install.packages("reticulate")
Once downloaded, retrieve the reticulate package from the R library with library(). Then, identify the location of the Python interpreter with the reticulate function use_python(). The following example lists the default location of the Anaconda download on a Mac.
library(reticulate)
use_python("~/anaconda3/bin")
If you are not an Anaconda Mac user, run the py_config() function to locate your Python language interpreter and replace the file reference in the use_python() function.
py_config()
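If Python already runs from a terminal on your machine, the interpreter can also report its own location, which gives a candidate path for use_python(). The following is a minimal sketch in Python, offered as a complement to py_config() and not as part of reticulate; the function name interpreter_location is invented for illustration.

```python
import sys
from pathlib import Path

def interpreter_location() -> Path:
    """Return the path of the Python interpreter that is running this code."""
    return Path(sys.executable)

if __name__ == "__main__":
    # The printed path is a candidate argument for use_python() in R.
    print(interpreter_location())
```

Run this with the same Python you intend to use from RStudio, since a computer can have several Python installations.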
When you know your Python location, you can save it indefinitely within RStudio.
Tools menu --> Global Options... --> Python
Enter the location. Then, there is no need to begin each Python session with the use_python() function.
Once the Python interpreter has been located, one option is to start an interactive Python session with the following function call:
repl_python()
The corresponding interactive prompt for a Python instruction is >>>. To have the Python instructions available for later use, store them in a file of Python code open in RStudio and copy lines of code into the interactive environment. See the next section for a better strategy.
Quarto Markdown
Get Started
The modern way of writing, including creating documents that contain data analysis output, employs RStudio as a document preparation environment. This approach follows Donald Knuth’s concept of literate programming, in which a computer program becomes literature. This is the strategy used in creating this online reading. Use RStudio to write articles, books, slide shows, web sites, and blogs as text files formatted with markdown, specifically R Markdown, using the quarto document system. Create documents that contain not only static text but, if desired, analysis output from R or Python (and others) in text or slide format, directed to a web page, MS Word, or a pdf document.
Knuth, Donald E. 1984. Literate Programming. Comput. J. 27 (2): 97–111.
The term document in this context is ambiguous because there are two documents when working with markdown. The initial markdown document is a simple text file with embedded markdown codes. For example, to italicize a word in the markdown document, enclose the word within a beginning _ and ending _. Clicking the RStudio Render button transforms this text file into the corresponding rendered document used for presentation, which can be in various formats, such as a web page or MS Word. Get a quick summary of R Markdown tags to embed in your document from the RStudio Help menu, Markdown Quick Reference.
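To make the distinction between the two documents concrete, here is a toy sketch in Python of what rendering does to one markup code, the underscore pair for italics. It is only an illustration of the idea; it is not how quarto converts documents, and the function name render_emphasis is invented here.

```python
import re

# Matches _text_ where the underscores sit at word boundaries, so that
# identifiers such as use_python are left alone.
EMPHASIS = re.compile(r"(?<!\w)_([^_\s](?:[^_]*[^_\s])?)_(?!\w)")

def render_emphasis(text: str) -> str:
    """Replace _text_ with <em>text</em>; everything else is left untouched."""
    return EMPHASIS.sub(r"<em>\1</em>", text)

source = "Rendering turns _markup_ into formatting."
print(render_emphasis(source))
# prints: Rendering turns <em>markup</em> into formatting.
```

The input string plays the role of the markdown document and the printed string plays the role of the rendered web page.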
To begin the preparation of any document, including one that contains R and/or Python code with included descriptive text, go to the Project drop-down menu at the far top-right of an RStudio window. Then, choose an Existing Directory or New Directory as needed.
Project drop-down menu --> New Project... --> New Directory --> Quarto Project
View a video of the document creation process from doing this menu sequence to creating the final document.
To create the correct templates, choose the Quarto Book format if multiple markdown documents (chapters) and a reference section are needed.
This RStudio menu sequence creates a directory for a minimal Quarto Project that renders a single document, complete with three files:
- Markdown document, a text file with file type .qmd for quarto markdown
- RStudio project file with file type .Rproj; open this file to work on the document
- YAML file named _quarto.yml for configuring aspects of document layout, which can be left unmodified for the default configuration
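As a quick check that a project folder contains the three kinds of files listed above, a short Python sketch can look for them. The function name missing_project_files and the descriptive labels are invented for illustration; only the file types come from the list.

```python
from pathlib import Path

# File kinds that the list above says a minimal Quarto project contains.
EXPECTED = {
    "markdown document": "*.qmd",
    "RStudio project file": "*.Rproj",
    "YAML configuration": "_quarto.yml",
}

def missing_project_files(project_dir: str) -> list[str]:
    """Return the labels of expected file kinds not found in project_dir."""
    root = Path(project_dir)
    return [label for label, pattern in EXPECTED.items()
            if not any(root.glob(pattern))]

if __name__ == "__main__":
    # An empty list means all three kinds of files are present.
    print(missing_project_files("."))
```

Run it from inside the project directory, or pass the path to the directory as the argument.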
The directory structure for a quarto project for the document you are now reading, named RStudioPython, is illustrated below. The files listed in the folder are those present after the web output was rendered, including the .html file, which is the primary output. Although not necessary, to help organize your files, add the following to the project folder if relevant.
- a data directory (folder) if one or more data files are referenced in the document
- an images folder if .png images are to be embedded in the document
- a .css file if you wish for custom styling of web output, though the default styling works well without the need for customization.
Use this ready-to-run basic .qmd sample document file created by RStudio as the basis for creating your markdown documents that can also run code. From this automatically created sample document, you can immediately begin revising and adapting to your own writing and analysis, either in Visual mode, similar to working with the usual word processor format, or in Source mode, where you can directly view and enter the markup instructions.
Push the Render button to generate a version of your document in the default web format, an .html file, or as MS Word or pdf output.
If needed, create a new .qmd document within RStudio with File --> New File and then select either Quarto Document... for a standard document or Quarto Presentation... for a slide show.
As indicated, the YAML file, _quarto.yml, dictates much of the configuration of the input and output of the process of generating documents. The file does not need modification: you can immediately start working with it unmodified to render quarto web pages, Word docs, and (with some additional software) pdf documents.
The YAML configuration instructions are typically distributed across the beginning of the .qmd document and the corresponding .yml configuration file. For example, the following three lines of YAML code appear at the top of the .qmd file from which this webpage was generated.
These same three lines appear at the top of the sample .qmd document generated by RStudio when beginning a new project. Of course, the specific title here is different from the sample document.
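The YAML header at the top of a .qmd file is set off from the body by lines that contain only three hyphens. The following Python sketch separates the two parts of a document, which can help when inspecting which settings live in the header. The function name split_front_matter and the sample title are hypothetical, chosen only for illustration.

```python
def split_front_matter(text: str) -> tuple[str, str]:
    """Split a .qmd source into (yaml_header, body).

    The header is the text between a first line of --- and the next line of ---.
    If the document does not start with ---, the header is empty.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            header = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:])
            return header, body
    return "", text  # no closing delimiter: treat everything as body

# Hypothetical document; the title is invented for illustration.
sample = '---\ntitle: "My Report"\n---\n\nSome text.\n'
header, body = split_front_matter(sample)
print(header)
# prints: title: "My Report"
```

The sketch returns the header as plain text and does not interpret the YAML itself.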
By default, an .html file is generated when the Render button is pushed. To instead send the output to MS Word, add the YAML instruction format: docx, such as at the beginning of your document.
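Rendering can also be driven from outside RStudio. The Python sketch below builds a call to the quarto command-line tool, assuming that the tool is installed and on the PATH; the file name report.qmd and both function names are hypothetical. The command is run only if the quarto executable is found.

```python
import shutil
import subprocess

def render_command(qmd_file: str, output_format: str = "html") -> list[str]:
    """Build the command line for rendering one document to one format."""
    return ["quarto", "render", qmd_file, "--to", output_format]

def render(qmd_file: str, output_format: str = "html") -> bool:
    """Run the render if the quarto executable is available; report success."""
    if shutil.which("quarto") is None:
        return False  # quarto is not installed or not on the PATH
    result = subprocess.run(render_command(qmd_file, output_format))
    return result.returncode == 0

print(render_command("report.qmd", "docx"))
# prints: ['quarto', 'render', 'report.qmd', '--to', 'docx']
```

Passing the format on the command line this way is an alternative to writing format: docx in the YAML.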
Use a text editor to prepare your markdown document, such as RStudio’s built-in text editor or any other text editor. Text documents are free of any specific application and can be read and written by many applications. With this strategy, if you want an MS Word document, MS Word is not used to write the document, only to display the output from your .qmd file when rendered. The MS Word document becomes output instead of the usual input. Of course, the MS Word output document can be edited directly, but if you want to make changes, it is best to revise your .qmd document and re-render the output.
Virtually all my writing is embedded within a markdown document, so I primarily use a specialized cross-platform text editor called VIM to write and edit text. I also use RStudio’s editor, set in VIM mode, often with the markdown document open in both apps. However, RStudio’s editor by itself works well. When you begin working with text documents, there is no compelling need to learn other editors, if ever.
R and Python Code
If you wish to include R or Python code in your .qmd document, define the code in chunks. The sample text document created by following the above instructions illustrates document creation and programming. Following is a Python example that generates output each time this document is re-generated after new editing and revision from within RStudio by clicking the Render button. The code reads an Excel data file into the data frame named d. The (37, 9) that follows the Python code is Python output, which indicates the number of rows, 37, and number of columns, 9, of that data frame.
import pandas as pd
d = pd.read_excel('data/employee.xlsx')
d.shape
(37, 9)
Within the .qmd document, the above Python code chunk is written as follows.
Define a chunk of Python code with the line ```{python} and follow the code chunk with the line ```. As you would imagine, replace the python with r to define a chunk of R code immersed within your document.
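Because chunks are marked by these opening and closing lines, they are easy to find by program. The Python sketch below lists the language and code of each chunk in a piece of .qmd text. It is an illustration of the chunk syntax only, not how quarto itself processes documents; the function name code_chunks and the sample text are invented, and the three backticks are built with string repetition so that the example can sit inside its own code block.

```python
import re

FENCE = "`" * 3  # three backticks

# A chunk opens with the fence followed by {language ...} and closes with a bare fence.
CHUNK = re.compile(
    r"^" + FENCE + r"\{(\w+)[^}]*\}[ \t]*\n(.*?)^" + FENCE + r"[ \t]*$",
    re.MULTILINE | re.DOTALL,
)

def code_chunks(qmd_text: str) -> list[tuple[str, str]]:
    """Return (language, code) pairs for each executable chunk in the text."""
    return [(m.group(1), m.group(2)) for m in CHUNK.finditer(qmd_text)]

# Hypothetical document text with one Python chunk and one R chunk.
sample = "\n".join([
    "Some text.",
    FENCE + "{python}",
    "import pandas as pd",
    FENCE,
    "More text.",
    FENCE + "{r}",
    "summary(cars)",
    FENCE,
])

for language, code in code_chunks(sample):
    print(language, repr(code))
```

The text outside the chunks is ignored, which mirrors the idea that the same file is both prose and program.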
Rendering the markup document generates an output such as a web page, complete with text and the data visualization output generated when running your embedded R or Python (or other) code. You can also include your own images, such as in the default .png format. Below is an example of an R code chunk to include your image, stored in the created images folder.
Setting echo to FALSE means that this code chunk will not appear in the output, so only the image appears. The out.width parameter sets the width of the output relative to the page. The fig.asp parameter sets the aspect ratio of the figure, height relative to width. The fig.align parameter here is set to center the image. The value of the fig.cap parameter is the figure caption.
Your document is now simultaneously a word processing document and a computer program. Welcome to the modern world of data analysis.
Writing and data analysis now simultaneously occur within the same document.
There are many available options, such as hiding the code from being displayed, depending on user preferences, and including pre-existing images as well as data visualizations as part of the computer output. Quarto was also recently made available from the Jupyter Notebook environment.
Appendix: Sample YAML File
As indicated, beginning a new quarto project from within RStudio automatically creates a default YAML document configuration file, _quarto.yml. This file works perfectly fine to configure your rendered document. At the same time, many customized configurations can be specified with that file. Below is the YAML file that was used to generate this document, so revise with your name as author. Of course, if you do not have a .css file called style.css in the same project directory as your .qmd document, then eliminate the .css line. Or, just copy my style.css file that was used in the rendering of this document.
As previously indicated, any subset of these YAML statements can also appear in the YAML header at the beginning of the .qmd document. However, having most of the YAML instructions in a separate file facilitates sharing that file with other documents, leading to a consistent format across a variety of documents. The following instructions are for a single article. Creating multiple documents requires the book format, as outlined above.
format:
  html:
    css: style.css
    callout-icon: false
    date: now
    date-format: "MMM D, YYYY hh:mm a"
    toc: true
    toc-depth: 4
    link-external-newwindow: true
    author: "David Gerbing"
    title-block-banner: true
    reference-location: margin
    theme:
      light: cosmo
      dark: slate
editor: visual
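One way to picture how settings in _quarto.yml and settings in a .qmd header combine is as a recursive merge of two nested dictionaries. The Python sketch below assumes that a setting in the document header overrides the same setting in the project file; that precedence is an assumption made here, not something stated above. The function name merge_options and the option values are illustrative, and the YAML parsing step is not shown.

```python
def merge_options(project: dict, document: dict) -> dict:
    """Recursively merge two option trees; values from `document` win on conflict."""
    merged = dict(project)
    for key, value in document.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_options(merged[key], value)
        else:
            merged[key] = value
    return merged

# Options as they might look after the YAML has been parsed (parsing not shown).
project_options = {"format": {"html": {"toc": True, "toc-depth": 4}}, "editor": "visual"}
document_options = {"title": "My Report", "format": {"html": {"toc-depth": 2}}}

print(merge_options(project_options, document_options))
```

In this sketch the shared toc setting survives from the project file while the document's toc-depth replaces the project's value, which is the kind of consistency across documents that a shared configuration file is meant to provide.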