2.1 Loading Data
R has many different functions for loading data, depending on the format of the file. Since data often come in comma-separated values (CSV) format, we’ll use that as an example.
We will load some data on the weight, length, and age of great white sharks (Carcharodon carcharias). It is in a file called pups.csv. You can download the file from the labs website on GitHub: pups.csv.
To load a CSV file like this, there are two basic options.
2.1.2 The read_csv() Function (in the readr package)
You can either type the code below directly into the R console (the lower left pane of the RStudio screen) or save it in a script that can be “sourced” (executed) whenever you like. Scripts are written, saved, and sourced in the top left pane of the RStudio screen.
For this method you need to provide the location of the data and the name you want the object to have once it is loaded into R. For this example, our copy of the data is online at https://uw-statistics.github.io/Stat311Tutorial/data/pups.csv, and we wish to store the data in the variable pups. We would execute the following code:
library(readr)
pups <- read_csv("https://uw-statistics.github.io/Stat311Tutorial/data/pups.csv")
## Parsed with column specification:
## cols(
## id = col_double(),
## weight = col_double(),
## length = col_double(),
## age = col_double(),
## clutch = col_double()
## )
Notice that this is very similar to what we see in the Code Preview pane in the Import Dataset method above.
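The message printed above shows that read_csv() guessed the type of every column. As a minimal sketch, the same column types can instead be stated explicitly through the col_types argument. To keep the example self-contained it reads a small temporary file rather than the real data: the path toy_path and the two rows of values are made up for illustration, and only the column names are taken from the output above.

```r
library(readr)

# Hypothetical toy data: these values are made up for illustration
# and are NOT taken from the real pups.csv.
toy_path <- tempfile(fileext = ".csv")
writeLines(c(
  "id,weight,length,age,clutch",
  "1,10.5,1.2,1,1",
  "2,12.0,1.3,2,1"
), toy_path)

# Declare every column as double up front instead of letting readr guess.
toy <- read_csv(
  toy_path,
  col_types = cols(
    id = col_double(),
    weight = col_double(),
    length = col_double(),
    age = col_double(),
    clutch = col_double()
  )
)

# Inspect what was loaded.
dim(toy)     # number of rows and columns
names(toy)   # column names
head(toy)    # first few rows
```

Stating the types this way is optional; it mainly helps when a guessed type would be wrong, for example an id column that should be read as text.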
Finally, if we only want to use read_csv() once or twice, we do not have to load the entire readr package with the library() command; we can just tell R that read_csv() is in readr by using the :: operator. And if we have a copy of pups.csv saved on our computer in our “working directory” (see the next section for how R defines the current working directory), we can just use the name "pups.csv" instead of the URL, like so:
pups <- readr::read_csv("pups.csv")
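Reading by bare file name only works when the file really is in the working directory. One possible way to guard against a missing file is sketched below. The helper load_if_present() is a hypothetical name introduced here, not part of readr, and the sketch assumes the readr package is installed.

```r
# Hypothetical helper (not part of readr): read a CSV only if it exists.
load_if_present <- function(path) {
  if (!file.exists(path)) {
    # Report where R looked so the path or working directory can be fixed.
    message("Could not find ", path, " (working directory: ", getwd(), ")")
    return(NULL)
  }
  # The file exists, so hand it to read_csv() without attaching readr.
  readr::read_csv(path)
}

# Returns the data if pups.csv is in the working directory, NULL otherwise.
pups <- load_if_present("pups.csv")
```

If the message appears, either move pups.csv into the directory it names or pass the full path to the file.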
2.1.3 The Working Directory
The working directory is the location on your computer where R looks for files and data first. You can run the command getwd() to find out what the current working directory is. In RStudio, you can also see the current working directory directly under the console tab in the console pane.
To set the working directory, you can run the command setwd(), passing it the path of the directory you want. In RStudio, you can also set the working directory by navigating somewhere in the Files tab, clicking More, and selecting Set As Working Directory.
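As a rough sketch of how getwd() and setwd() fit together, the code below switches to another directory, shows that a bare file name is then looked up there, and switches back. The temporary directory from tempdir() and the file name example.csv are stand-ins chosen for illustration; in practice you would use a folder of your own, such as the one where you saved pups.csv.

```r
# Remember where we started so we can switch back afterwards.
old_wd <- getwd()

# tempdir() is used purely as a stand-in for a folder of your own.
setwd(tempdir())
getwd()   # now reports the temporary directory

# A bare file name is resolved against the working directory.
writeLines("id,weight", "example.csv")
file.exists("example.csv")

# Clean up and restore the original working directory.
unlink("example.csv")
setwd(old_wd)
```

Saving the old directory first makes it easy to undo the change, which matters in scripts that other code relies on.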