Part 0: Read data
Download and read in the data. If you ever make use of the code solutions, the data for those are stored in the relative path of “data-raw/”. You can use the here()
function to set a relative path.
Hint 1: Click to reveal!
The data is packaged. You might need to use a tool to extract it.
Hint 2: Click to reveal!
Use ‘unzip()’ to extract the data.
Solution
# Unzip data
unzip(here("data-raw/r-murder-mystery.zip"),
exdir = here("data-raw/"))
Part 1: Investigate the crime scene report
Investigate the crime scene report for valuable information. For this task you will need to read in the crime scene file, and filter and select the column and that you need.
Hint 1: Click to reveal!
You want to look at the crime scene reports and look for a murder that happened on jan. 15, 2018 in R City
Hint 2: Click to reveal!
The data is stored in seperate data files. Can you recall a method to combine them?
Hint 3: Click to reveal!
It might be a good idea to use create a function to read in each data file.
Hint 4: Click to reveal!
You can use ‘dir_ls()’ and ‘map_dfr()’ to read the data
Hint 5: Click to reveal!
Data can be read with ‘vroom()’ or ‘read_csv()’
Hint 6: Click to reveal!
You can use dplyr verbs such as filter and select to find the data you are interested in.
Solution
# Function to read data
read_crime_scene_report <- function(file_path) {
vroom(
here(file_path),
col_types = cols(
date = col_character(),
type = col_character(),
description = col_character(),
city = col_character()
)
)
}
# Get a list of directories
crime_scene_report_files <- dir_ls(here("data-raw/crime_scene_report/"))
# Read data - This can take some time
crime_scene_report_df <- map_dfr(crime_scene_report_files, read_crime_scene_report)
# Find crime scene report
crime_scene_report_df %>%
filter(date == "20180115" &
type == "murder" &
city == "R City") %>%
select(description)
Part 2 - Find witnesses
Find the witness statement.
Hint 1: Click to reveal!
You will not be able to locate the witness statement before finding out which persons gave the statmenents.
Hint 2: Click to reveal!
Investigate the person data file, and look for a person that lives at the last house on Northwestern Dr and a person named Annabel that lives on Franklin Ave.
Hint 3: Click to reveal!
To find the person that lives on Northwestern dr you can use the dplyr function ‘arrange()’ to sort address numbers.
Hint 4: Click to reveal!
To find Annabel you can use a regular expresions and the stringr function ‘str_detect()’.
Solution
# Read person data
person <- vroom(here("data-raw/person.csv"))
# Find person living on the last house on Northwestern Dr
person %>%
filter(address_street_name == "Northwestern Dr") %>%
arrange(desc(address_number))
# Find the person named Annabel living on Franklin Ave
person %>%
filter(address_street_name == "Franklin Ave" &
str_detect(name, "Annabel"))
Part 3 - Look at witness statement
Find the witness statements.
Hint 1: Click to reveal!
Look for a witness statement with a matching ‘id’
Hint 2: Click to reveal!
Vroom might not be able to guess the file delimited. Check the ‘?vroom’ help file to see if there is any arguments you can use to read the file.
Hint 3: Click to reveal!
Annabel Miller has a ‘id’ of ‘16371’
Hint 4: Click to reveal!
Morty Schapiro has a ‘id’ of ‘14887’
Hint 5: Click to reveal!
The ‘id’ variable from the person table is the same as the ‘person_id’ variable in the inteview table
Hint 6: Click to reveal!
To find the statement you can filter on ‘person_id’
Solution
# Read interview file
interview <- vroom(here("data-raw/interview.csv"),
delim = ",")
# Find Annabel Miller's statement
interview %>%
filter(person_id == 16371) %>%
select(transcript)
# Find Morty Schapiro's statement
interview %>%
filter(person_id == 14887) %>%
select(transcript)
Part 4.1 - Who was at the gym? (Annabel)
Find out who Annabel saw
Hint 1: Click to reveal!
You need to find out who visited the gym at the time Annabel noticed someone.
Hint 2: Click to reveal!
Annabel also noticed that the last part of the members id was ‘48Z’.
Hint 3: Click to reveal!
You might need to use two datasets at the same time.
Hint 4: Click to reveal!
You need the ‘get_fit_now_member’ and ‘get_fit_now_check_in’ data sets.
Hint 5: Click to reveal!
You might be able to combine the data somehow.
Hint 6: Click to reveal!
Using a function such as ‘left_join’ could work.
Hint 7: Click to reveal!
The dplyr and stringr package has some great functions that can help you along.
Hint 8: Click to reveal!
Regex :)
Hint 9: Click to reveal!
A combination of ‘filter()’ and ‘str_detet()’ is always great to use when extracting data about gym members.
Solution
# Read data files
get_fit_now_member <- vroom(here("data-raw/get_fit_now_member.csv"))
get_fit_now_check_in <- vroom(here("data-raw/get_fit_now_check_in.csv"))
# Join data
fitness_person_checkin_combined <- left_join(x = get_fit_now_member, y = get_fit_now_check_in, by = c("id" = "membership_id"))
# Find people that were at the gym Annabel specified
fitness_person_checkin_combined %>%
filter(check_in_date == 20180109 &
str_detect(id, "48Z"))
Part 4.2 - Who owns the car that Morty saw?
Figure out who of the two gym rats that own the car that Morty saw.
Hint 1: Click to reveal!
You can perhaps use the ‘licence_id’ to identify the perpetrator.
Hint 2: Click to reveal!
You might need to combine some tables in order to find the person of interest.
Hint 3: Click to reveal!
Morty saw a licence plate number that contained ‘H42W’
Solution
# Read data files
driver_licence <- vroom(here("data-raw/drivers_license.csv"))
# Find licence ID of Joe and Jeremy
person %>%
filter(id %in% c(28819, 67318)) %>%
select(c(name, license_id))
# Find car
driver_licence %>%
filter(id %in% c(173289, 423327))
# Combine data - Easier to read
combined_data <- left_join(x = person, y = driver_licence, by = c("license_id" = "id")) %>%
filter(license_id %in% c(173289, 423327) &
str_detect(plate_number, "H42W"))
# Print data
combined_data