3 Meet our toolbox!

You can follow along with the slides here if they do not appear below.

3.1 R

R is both a programming language and software environment for statistical analysis. It differs from other software like SPSS in three key ways.

  • First, R is free, no strings (or warranties) attached. Download it at cran.r-project.org. The popular editor RStudio is also available for free at rstudio.com.

  • Second, R is open-source, with thousands of active contributors sharing add-on packages. See the full list at cran.r-project.org/web/packages (there are currently over 8,000 packages).

  • Third, R is accessed primarily through code, rather than by pointing and clicking through drop-down menus and dialog windows. This third point is a road block to some, but it ends up being a strength in the long run.

Note we’ll be using two pieces of software. R is the software that runs all your analyses. RStudio is an Integrated Development Environment or IDE that simplifies your interaction with R. RStudio isn’t essential, but it gives you nice features for saving your R code, organizing the output of your analyses, and managing your add-on packages.

3.1.1 Install R and RStudio

At this point you should download and install R and RStudio using the links below. The internet abounds with helpful tips on installation and getting started. I made a video walking windows-folks through the process.

embed_url("https://www.youtube.com/watch?v=kVIZGCT5p9U") %>%
  • Install R, a free software environment for statistical computing and graphics from CRAN, the Comprehensive R Archive Network. I highly recommend you install a precompiled binary distribution for your operating system – use the links up at the top of the CRAN page linked above!

  • Install RStudio’s IDE (stands for integrated development environment), a powerful user interface for R. Get the Open Source Edition of RStudio Desktop.

    • You can run either the Preview version or the official releases available here.
    • RStudio comes with a text editor, so there is no immediate need to install a separate stand-alone editor.

If you have a pre-existing installation of R and/or RStudio, I highly recommend that you reinstall both and get as current as possible. It can be considerably harder to run old software than new.

  • If you upgrade R, you will need to update any packages you have installed. The command below should get you started, though you may need to specify more arguments if, e.g., you have been using a non-default library for your packages.
update.packages(ask = FALSE, checkBuilt = TRUE)

Note: this code will only look for updates on CRAN. So if you use a package that lives only on GitHub or if you want a development version from GitHub, you will need to update manually, e.g. via devtools::install_github().

3.1.2 Testing testing

  • Do whatever is appropriate for your OS to launch RStudio. You should get a window similar to the screenshot you see here, but yours will be more boring because you haven’t written any code or made any figures yet!

  • Put your cursor in the pane labeled Console, which is where you interact with the live R process. Create a simple object with code like x <- 3 * 4 (followed by enter or return). Then inspect the x object by typing x followed by enter or return. You should see the value 12 print to screen. If yes, you’ve succeeded in installing R and RStudio.

  • As noted above, R is accessed via code, primarily in the form of commands that you’ll type or paste in the R console. The R console is simply the window where your R code goes, and where output will appear. Note that RStudio will present you with multiple windows, one of which will be the R console. That said, when instructions here say to run code in R, this applies to R via RStudio as well.

3.1.3 Add-on packages

R is an extensible system and many people share useful code they have developed as a package via CRAN and GitHub. To install a package from CRAN, for example the dplyr package for data manipulation, here is one way to do it in the R console (there are others).

install.packages("dplyr", dependencies = TRUE)

By including dependencies = TRUE, we are being explicit and extra-careful to install any additional packages the target package, dplyr in the example above, needs to have around.

You could use the above method to install the following packages, all of which we will use:

3.2 Getting Help with R

You can follow along with the slides here if they do not appear below.

Help files in R are easily accessed by entering a function name preceded by a question mark, for example, ?c, ?mean, or ?devtools::install_github. Parentheses aren’t necessary. The question mark approach is a shortcut for the help() function, where, for example, ?mean is the same as help("mean"). Either approach provides direct access to the R documentation for a function.

The documentation for a function should at least give you a brief description of the function, definitions for the arguments that the function accepts, and examples of how the function can be used.

At the console, you can also perform a search through all of the available R documentation using two question marks. For example, ??"regression" will search the R documentation for the term “regression.” This is a shortcut for the function help.search(). Finally, you can browse through the R documentation with help.start(). This will open a web browser with links to manuals and other resources.

If you’re unsatisfied with the documentation internal to R, an online search can be surprisingly effective for finding function names or instructions on running a certain procedure or analysis in R.

3.2.1 Further resources

The above will get your basic setup ready but here are some links if you are interested in reading a bit further.