In this tutorial, you will learn
install.packages(c("lubridate", "ggthemes", "forcats"))
If you missed tutorial 1, you will also need to install the packages listed in those instructions.
Say hello to your tutor and to your neighbours!
This question relates to the Tidy Tuesday data on the locations of alternative fuel recharging stations. Read through this site, and also visit the link to the data provider, DOT.
library(tidyverse)
library(ggthemes)
library(lubridate)
stations <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-03-01/stations.csv')
# Filter to continental USA, but this cannot
# be done using states because some records have
# erroneous lat/long
# stations <- stations %>%
# filter(!(STATE %in% c("AK", "HI", "PR", "ON")))
# Map the sites
usa <- map_data("state")
# Filter to continental USA using map boundary
stations <- stations %>%
  filter(between(LONGITUDE, min(usa$long)-1, max(usa$long)+1),
         between(LATITUDE, min(usa$lat)-1, max(usa$lat)+1))
ggplot() +
  geom_path(data=usa, aes(x=long, y=lat, group=group), colour="grey80") +
  geom_point(data=stations, aes(x=LONGITUDE,
                                y=LATITUDE,
                                colour=FUEL_TYPE_CODE),
             alpha=0.8) +
  facet_wrap(~FUEL_TYPE_CODE, ncol=4) +
  coord_map() +
  theme_map() +
  theme(legend.position = "none")
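The comments above note that filtering by state is not enough because some records have erroneous coordinates. The following is a minimal sketch of that point on made-up data: the toy records and the bounding-box values are assumptions for illustration only, not values taken from the stations data or from `map_data("state")`.

```r
library(dplyr)

# Hypothetical toy records; the last one is labelled "CA" but its
# coordinates are far outside the USA.
toy <- tibble(
  STATE     = c("CA", "HI", "NY", "CA"),
  LONGITUDE = c(-118.2, -157.8, -74.0, 118.2),
  LATITUDE  = c(34.1, 21.3, 40.7, -34.1)
)

# Assumed rough bounding box for the contiguous USA (illustrative values only)
lon_range <- c(-125, -66)
lat_range <- c(24, 50)

# Filtering by state drops Hawaii but keeps the mislocated "CA" record
by_state <- toy %>%
  filter(!(STATE %in% c("AK", "HI", "PR")))

# Filtering by coordinates drops both Hawaii and the mislocated record
by_box <- toy %>%
  filter(between(LONGITUDE, lon_range[1], lon_range[2]),
         between(LATITUDE, lat_range[1], lat_range[2]))

nrow(by_state)
nrow(by_box)
```

Note that `between()` is inclusive at both ends, so points exactly on the boundary are kept.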
# Time line of opening
stations %>%
  mutate(OPEN_DATE = as.Date(OPEN_DATE)) %>%
  filter(!is.na(OPEN_DATE)) %>%
  mutate(m = month(OPEN_DATE),
         yr = year(OPEN_DATE)) %>%
  #filter(yr < 2022) %>%
  # Build a first-of-month date; note "%m" is month ("%M" is minutes)
  mutate(open_yrmth = as.Date(paste(yr, m, "01", sep="-"), "%Y-%m-%d")) %>%
  group_by(open_yrmth, FUEL_TYPE_CODE) %>%
  summarise(nopen = n(), .groups = "drop") %>%
  ggplot(aes(x=open_yrmth,
             y=nopen,
             colour=FUEL_TYPE_CODE)) +
  geom_line() +
  facet_wrap(~FUEL_TYPE_CODE, ncol=4, scales="free_y") +
  ylab("# opening") +
  scale_x_date("", date_labels="%y") +
  theme(legend.position = "none")
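The month-level aggregation relies on turning each opening date into the first day of its month. As a small sketch on made-up dates (the `open_date` vector is hypothetical), the same result can be obtained either by pasting the parts together and parsing with the `%m` month code, or with `floor_date()` from lubridate, which may be easier to read.

```r
library(lubridate)

# Hypothetical opening dates, for illustration only
open_date <- as.Date(c("2021-03-15", "2021-03-28", "2021-11-02"))

# Build the first-of-month date from its parts; "%m" is the month code
# ("%M" would mean minutes)
from_parts <- as.Date(
  paste(year(open_date), month(open_date), "01", sep = "-"),
  format = "%Y-%m-%d"
)

# Alternative: round each date down to the start of its month
from_floor <- floor_date(open_date, unit = "month")

# Count openings per month
table(from_floor)
```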
Here we will look at the chocolate bar ratings. Brief details of how the data was collected are provided here, and more about the data itself is here.
chocolate <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-01-18/chocolate.csv')
library(forcats)
keep <- chocolate %>%
  count(country_of_bean_origin, sort = TRUE) %>%
  filter(n>10) %>%
  pull(country_of_bean_origin)
chocolate %>%
  filter(country_of_bean_origin %in% keep) %>%
  ggplot(aes(x=fct_reorder(country_of_bean_origin, rating, mean),
             y=rating)) +
  geom_jitter(width=0.1) +
  stat_summary(fun = mean, fun.min = median, fun.max = median,
               geom = "point", colour = "orange") +
  xlab("") +
  coord_flip()
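The plot orders countries by their mean rating using `fct_reorder()`. To see what that call does in isolation, here is a minimal sketch on made-up data; the `origin` and `rating` vectors are hypothetical and not taken from the chocolate data.

```r
library(forcats)

# Hypothetical ratings for three made-up origins
origin <- c("A", "A", "B", "B", "C", "C")
rating <- c(3.0, 3.5, 2.0, 2.5, 3.75, 4.0)

# Default level order is alphabetical
levels(factor(origin))

# Reorder levels by the mean rating within each origin (lowest first)
by_mean <- fct_reorder(origin, rating, .fun = mean)
levels(by_mean)
```

Because the axis follows the factor levels, the origin with the lowest mean is drawn first and the highest last.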
Read the description of the study titled “Clearing the Fog: Is Hydroxychloroquine Effective in Reducing COVID-19 Progression (COVID-19)”.
hcq <- tibble(trt = c("standard", "standard", "hcq", "hcq"),
              progression = c("all", "yes", "all", "yes"),
              count = c(151, 5, 349, 11))
hcq %>%
  pivot_wider(names_from = "progression", values_from = "count") %>%
  mutate(p = yes/all) %>%
  mutate(se = sqrt(p*(1-p)/all))
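One way to use the proportions and standard errors is to turn them into approximate intervals. The sketch below repeats the calculation in base R with the counts from the `hcq` table and adds normal-approximation 95% intervals, plus an interval for the difference between groups. This is an illustrative extension rather than part of the original exercise; it assumes the two groups are independent, and the normal approximation is rough when the event counts are as small as they are here.

```r
# Counts taken from the hcq table above
n_all <- c(standard = 151, hcq = 349)
n_yes <- c(standard = 5, hcq = 11)

# Proportion progressing and its standard error, per group
p  <- n_yes / n_all
se <- sqrt(p * (1 - p) / n_all)

# Approximate 95% intervals (normal approximation)
z  <- qnorm(0.975)
ci <- cbind(p = p, lower = p - z * se, upper = p + z * se)
round(ci, 3)

# Difference in proportions (hcq minus standard) and its standard error,
# treating the two groups as independent
p_diff  <- unname(p["hcq"] - p["standard"])
se_diff <- sqrt(sum(se^2))
c(diff = p_diff, lower = p_diff - z * se_diff, upper = p_diff + z * se_diff)
```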
This tutorial will prepare you for next week's material. Please complete the following steps:
Enter the address https://olc.worldbank.org/ in your browser.
Click on “Register” and create a profile.
Once you log in, use the following link to enrol in the course Open Data for Data Users (Self-Paced): https://olc.worldbank.org/content/open-data-data-users-self-paced
Allow pop-ups for this site in your browser settings so that the course runs correctly.
Follow the whole course and answer the questions throughout.
Talk to your tutor about what you think you learned today: what was easy, what was fun, and what you found hard.