Cleaning SFO Weather Data
Helpful Data Wrangling Notes
- `month.abb` is a built-in object in R with 3-letter month abbreviations.
- You can create your own data frame with the `tibble()` function. Look up the documentation for this function by typing `?tibble::tibble` in the Console.
- You can create regular sequences in R with `:`, e.g., `3:5` generates the sequence `c(3, 4, 5)`.
- You can also create regular sequences in R with `seq()`, e.g., `seq(from = 3, to = 5, by = 1)` generates the sequence `c(3, 4, 5)`. Look up the documentation for this function by typing `?seq` in the Console.
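As a quick illustration of these notes, here is a minimal sketch you could run in the Console. The object name `month_lookup` and its column names `month_num` and `month_abb` are made up for this example and are not part of the exercise data.

```r
library(tibble)

# Built-in vector of 3-letter month abbreviations
month.abb[1:3]
#> [1] "Jan" "Feb" "Mar"

# Two ways to build the sequence 3, 4, 5
3:5
seq(from = 3, to = 5, by = 1)

# A small data frame built by hand with tibble().
# month_lookup, month_num, and month_abb are hypothetical names.
month_lookup <- tibble(
  month_num = 1:12,
  month_abb = month.abb
)
month_lookup
```

A hand-built table like this can be handy when you need to attach labels to numeric codes, though it is only one of several ways to do so.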
Practicing Keyboard Shortcuts
Try out the following as you work on this exercise:
- Tab completion (Try this out when writing your file paths! Typing out a partial path will pull up a mini file-explorer)
- Insert a code chunk
- Run a code chunk
- Navigating around words and lines (selecting and deleting them)
- Run selected lines (not a whole code chunk)
- Insert the assignment operator (`<-`)
- Insert the pipe operator (`|>`)
Exercise
Carry out the following steps to clean and save the San Francisco weather data. Make sure to download the data file and add it to your portfolio repository as instructed.
- Read in the weather data in this file with the correct relative file path after you move it to the instructed location.
- There is a variable that has values that don’t make sense in the data context. Figure out which variable this is and clean it up by making those values missing using `na_if()`.
- Create a variable called `dateInYear` that indicates the day of the year (1-365) for each case. (Jan 1 should be 1, and Dec 31 should be 365.)
- Create a variable called `month_name` that shows the 3-letter month abbreviation for each case.
- Save the wrangled data to the `data/processed/` folder using `write_csv()`. Name this file `weather_clean.csv`. Look up the documentation for this function by typing `?write_csv` in the Console. You’ll need to write an appropriate relative path.
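The cleaning steps above can be sketched on a tiny made-up data frame. This is only an illustration under stated assumptions: the column names `Month`, `Day`, and `reading`, the sentinel value `-999`, and the output file name are all hypothetical, so you will need to inspect the real weather file to find its actual variables and the values that don't make sense. The sketch also writes to a temporary folder rather than `data/processed/`.

```r
library(dplyr)
library(tibble)
library(readr)

# Hypothetical toy data: Month, Day, reading, and the -999 sentinel are
# assumptions for illustration, not taken from the real weather file.
toy <- tibble(
  Month = c(1, 1, 2, 12),
  Day = c(1, 2, 1, 31),
  reading = c(55, -999, 60, 58)
)

# Number of days that come before each month in a non-leap year
days_in_month <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
days_before_month <- cumsum(c(0, days_in_month[-12]))

toy_clean <- toy |>
  mutate(
    # Turn the nonsensical sentinel value into a missing value
    reading = na_if(reading, -999),
    # Day of year: days before this month plus the day within the month
    dateInYear = days_before_month[Month] + Day,
    # Look up the 3-letter abbreviation by month number
    month_name = month.abb[Month]
  )

# Write to a temporary location so the sketch does not touch your repository
out_path <- file.path(tempdir(), "weather_clean_toy.csv")
write_csv(toy_clean, out_path)
```

Indexing `month.abb` by month number is one possible approach; building a small lookup table with `tibble()` and joining it on would work as well. If the real data has exactly one row per day in calendar order, a regular sequence may also be enough for `dateInYear`.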