R, RStudio, and the tidyverse

Part 1: The basics of R and dplyr

For this week’s problem set, you need to work through a few of RStudio’s introductory primers. You’ll do these in your browser and type code and see results there.

You’ll learn some of the basics of R, as well as some powerful methods for manipulating data with the {dplyr} package.
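To give you a rough idea of what the primers build toward, here is a minimal sketch of a {dplyr} pipeline. It assumes you have {dplyr} installed (for instance through `install.packages("tidyverse")`) and it uses `mtcars`, a small dataset that ships with R. The object name `efficient_by_cyl` is made up for this illustration.

```r
# Load dplyr (assumes the package is already installed)
library(dplyr)

# mtcars is a built-in example dataset with one row per car
efficient_by_cyl <- mtcars %>%
  filter(mpg > 20) %>%              # keep only cars that get more than 20 mpg
  select(mpg, cyl, wt) %>%          # keep only these three columns
  mutate(wt_lbs = wt * 1000) %>%    # wt is stored in thousands of pounds
  group_by(cyl) %>%                 # make one group per cylinder count
  summarize(
    avg_mpg = mean(mpg),            # average fuel efficiency in each group
    n_cars = n()                    # number of cars in each group
  )

efficient_by_cyl
```

Each line does one small thing to the data and hands the result to the next line, which is the pattern the primers practice over and over.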

Complete these primers. It seems like there are a lot, but they’re short and go fairly quickly (especially as you get the hang of the syntax). Also, I have no way of seeing what you do or what you get wrong or right, and that’s totally fine! If you get stuck and want to skip some (or if it gets too easy), go right ahead and skip them!

The content from these primers comes from the (free and online!) book R for Data Science by Garrett Grolemund and Hadley Wickham. I highly recommend the book as a reference and for continuing to learn and use R in the future.

Part 2: Getting familiar with RStudio

The RStudio primers you just worked through are a great introduction to writing and running R code, but you typically won’t type code in a browser when you work with R. Instead, you’ll use a nicer programming environment like RStudio, which lets you type and save code in scripts, run code from those scripts, and see the output of that code, all in the same program.

To get familiar with RStudio, watch this video (it’s from PMAP 8921, but the content still applies here):

Part 3: RStudio Projects

One of the most powerful and useful aspects of RStudio is its ability to manage projects.

When you first open R, it is “pointed” at some folder on your computer, and anything you do will be relative to that folder. The technical term for this is a “working directory.”

When you first open RStudio, look in the area right at the top of the Console pane to see your current working directory. Most likely you’ll see something cryptic: ~/

That tilde sign (~) is a shortcut that stands for your home directory. On macOS this is /Users/your_user_name/; on Windows, R usually treats your Documents folder (C:\Users\your_user_name\Documents\) as home. With the working directory set to ~/, R is “pointed” at that folder: anything you save will end up there, and R will expect any data that you load to be there too.
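If you want to check where R is pointed without hunting around the RStudio window, you can ask R directly. This is a small sketch using only functions that come with base R; what it prints depends on your own computer, so no output is shown here.

```r
# Show the folder R is currently "pointed" at (the working directory)
current_dir <- getwd()
current_dir

# Expand the ~ shortcut into the full path it stands for on this computer
home_dir <- path.expand("~")
home_dir

# List the files R can see in the working directory
list.files()
```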

It’s always best to point R at some other directory. If you don’t use RStudio, you need to manually set the working directory to where you want it with setwd(), and many R scripts in the wild include something like setwd("C:\\Users\\bill\\Desktop\\Important research project") at the beginning to change the directory. THIS IS BAD THOUGH (see here for an explanation). If you ever move that directory somewhere else, or run the script on a different computer, or share the project with someone, the path will be wrong and nothing will run and you will be sad.

The best way to deal with working directories with RStudio is to use RStudio Projects. These are special files that RStudio creates for you that end in a .Rproj extension. When you open one of these special files, a new RStudio instance will open up and be pointed at the correct directory automatically. If you move the folder later or open it on a different computer, it will work just fine and you will not be sad.

Read this super short chapter on RStudio projects to learn how to create and use them.

In general, you can create a new project by going to File > New Project > New Directory > Empty Project, which will create a new folder on your computer that is empty except for a single .Rproj file. Double-click on that file to open an RStudio instance that is pointed at the correct folder.
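Here is a rough sketch of why this matters for your code. Once R is pointed at the project folder, you can refer to files with short relative paths like "data/cars.csv" instead of long paths that include your user name. The folder name `my-project` and the file `cars.csv` are hypothetical, and the sketch builds them in a temporary location so it can run anywhere. It also calls `setwd()` once, but only to imitate what opening the .Rproj file does for you automatically; you would not put that line in a real script.

```r
# Hypothetical project folder, created in a temporary location for this demo
project_dir <- file.path(tempdir(), "my-project")
dir.create(file.path(project_dir, "data"), recursive = TRUE, showWarnings = FALSE)

# Imitate opening the project: point R at the project folder.
# (RStudio does this for you when you open the .Rproj file.)
old_dir <- setwd(project_dir)

# Relative paths: no drive letter or user name anywhere
write.csv(head(mtcars), "data/cars.csv", row.names = FALSE)
cars <- read.csv("data/cars.csv")
nrow(cars)

# Put the working directory back the way it was
setwd(old_dir)
```

Because the paths are relative to the project folder, the same two lines that write and read the file keep working if you move the folder or send it to someone else.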

Part 4: Getting familiar with Quarto

To ensure that the analysis and graphics you make are reproducible, you’ll do the majority of your work in this class using Quarto files.

Do the following things:

  1. Watch this video. It’s all about R Markdown, which is Quarto’s predecessor. The same general principles apply.

  2. Skim through the content at these pages. Again, lots of these resources mention R Markdown, but the same principles apply to Quarto.

  3. Watch this video (again, it’s from PMAP 8921 and it’s about R Markdown instead of Quarto, but the content works for this class):