A course in quantitative research workflow for students in the higher education administration program at the University of Florida
Use the IPEDS data sets, hd2007.csv and ic2007mission.csv, to answer the questions below. You may need to look up and download the data dictionaries for each file. Click the “continue” button on this page to see the data and accompanying dictionary files. You can also use the supplementary lesson on getting higher education data for help.
You do not need to save the final output as a data file: just having the final result print to the console is fine. For each question, I would like you to try to pipe all the commands together. Throughout, you should account for missing values to the best of your ability by dropping them.
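As a rough illustration of what a single piped chain that drops missing values and prints to the console might look like, here is a minimal sketch. The toy data frame and its column names (sector, enroll) are invented for the example and are not taken from the IPEDS files; it also assumes missing values are stored as NA.

```r
library(dplyr)

# Toy data standing in for an IPEDS file (all values are invented)
df <- tibble(
  unitid = c(1, 2, 3, 4),
  sector = c(1, 1, 2, 2),
  enroll = c(100, NA, 300, 500)
)

# One chain: drop rows with missing enrollment, then summarize by sector
result <- df %>%
  filter(!is.na(enroll)) %>%
  group_by(sector) %>%
  summarise(mean_enroll = mean(enroll))

# Printing the object sends the final result to the console; nothing is saved
result
```

Dropping the missing rows with filter() before summarizing keeps the whole answer in one chain. Passing na.rm = TRUE to mean() would be another option for this particular summary.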
For each question, show your data work and then answer the question in a short comment (1-2 sentences).
NB: To answer the questions, you will need to join the two IPEDS data sets using the common unitid key. Note that column names in hd2007.csv are uppercase (UNITID) while those in ic2007mission.csv are lowercase (unitid). There are a few ways to join when the keys don’t exactly match. One is to set all column names to the same case. If you want to use left_join() starting with hd2007.csv, you can first use the dplyr verb rename_all(tolower) in your chain to lower all column names. See the help file for left_join() for other ways to join by different variable names.
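The two join approaches described above can be sketched on small stand-in data frames. This is only an illustration: the rows are invented, and the mission column name is a hypothetical placeholder rather than a confirmed name from ic2007mission.csv. With the real files you would read each CSV first and then apply the same chain.

```r
library(dplyr)

# Stand-ins for the two files (invented rows; `mission` is a hypothetical column)
hd <- tibble(
  UNITID = c(100, 200, 300),
  INSTNM = c("College A", "College B", "College C")
)
ic <- tibble(
  unitid  = c(100, 200),
  mission = c("Teach", "Research")
)

# Option 1: lower all column names first, then join on the shared name
joined_lower <- hd %>%
  rename_all(tolower) %>%
  left_join(ic, by = "unitid")

# Option 2: keep the original names and tell left_join() which keys match
joined_named <- hd %>%
  left_join(ic, by = c("UNITID" = "unitid"))

joined_lower
joined_named
```

Both versions keep every row from the left-hand data frame, so an institution with no match in the second table gets NA in the joined columns. Those rows still need to be handled when accounting for missing values.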
Save your script (<lastname>_assignment_7.R) in your scripts directory.