Everything needed to complete the assignment was provided in tutorial-03.zip. Here are some of the things you can expect to find when running the code.
Explore your data! These tasks should be done using the dplyr interface, so that the tidy wrangling verbs can be used instead of raw SQL functions.
WN = Southwest had the most flights.
LAX = Los Angeles had the most departing flights.
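As a minimal sketch of how these two answers can be reached with dplyr verbs, the example below counts flights per carrier and per origin airport. The toy data and the CARRIER column name are assumptions for illustration; with a database connection the same verbs would be applied to a table reference rather than a local tibble.

```r
library(dplyr)

# Toy stand-in for the flights table; the rows and the CARRIER column
# name are assumptions, not the tutorial's actual data.
flights <- tibble(
  CARRIER = c("WN", "WN", "WN", "DL", "AA", "DL"),
  ORIGIN  = c("LAX", "LAX", "DEN", "ATL", "LAX", "ATL")
)

# Number of flights per carrier, largest first.
by_carrier <- flights %>% count(CARRIER, sort = TRUE)

# Number of departing flights per origin airport, largest first.
by_origin <- flights %>% count(ORIGIN, sort = TRUE)

by_carrier
by_origin
```

The first row of each result is the carrier or airport with the most flights.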
LAX is the busiest airport. The smallest departure delay for the month was well below 0, which means that flight left quite early. The flight with the longest delay left a day later. (This looks like a time zone calculation error.) The median delay was less than 0, which means that at least half of the flights left ahead of schedule.
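The delay figures above can be obtained with a single summarise() call. This is a sketch on made-up numbers, and the DEP_DELAY column name (delay in minutes) is an assumption. The last line computes the share of flights that left early, which is how a negative median translates into "at least half left ahead of schedule".

```r
library(dplyr)

# Made-up delays in minutes; DEP_DELAY is an assumed column name.
flights <- tibble(DEP_DELAY = c(-12, -5, -3, -1, 0, 4, 30, 1440))

delay_summary <- flights %>%
  summarise(
    min_delay    = min(DEP_DELAY, na.rm = TRUE),
    median_delay = median(DEP_DELAY, na.rm = TRUE),
    max_delay    = max(DEP_DELAY, na.rm = TRUE),
    # Proportion of flights that departed before the scheduled time.
    prop_early   = mean(DEP_DELAY < 0, na.rm = TRUE)
  )

delay_summary
```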
Overall, there is not much difference between carriers in either the median delay or the variation in delays. American Airlines (AA) and Southwest (WN) have the highest median delay, and Frontier Airlines (F9) has the lowest. SkyWest (OO) has the flight with the longest delay.
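When comparing carriers it helps to order them by median delay before plotting. One way to do that, sketched here on made-up data with assumed column names, is to reorder the carrier factor inside mutate() using fct_reorder() from the forcats package.

```r
library(dplyr)
library(forcats)

# Made-up delays for three carriers; column names are assumptions.
flights <- tibble(
  CARRIER   = c("AA", "AA", "AA", "F9", "F9", "F9", "OO", "OO", "OO"),
  DEP_DELAY = c(-2, 0, 5, -8, -6, -4, -5, -3, 900)
)

# Reorder the carrier levels by median delay (ascending).
flights_ordered <- flights %>%
  mutate(CARRIER = fct_reorder(CARRIER, DEP_DELAY, .fun = median))

levels(flights_ordered$CARRIER)
```

With the factor ordered this way, a side-by-side boxplot of DEP_DELAY by CARRIER would show the carriers from lowest to highest median delay.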
Here we are going to add a new table with airport information, and use this to make a map of flights.
Read the airport location data into R, and add a table to your database.
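A minimal sketch of this step, assuming a SQLite database accessed through the DBI and RSQLite packages. The in-memory connection, the inline CSV text, the table name and the column names are all stand-ins for illustration; in the tutorial you would point read.csv() at the airport file and use your existing connection.

```r
library(DBI)

# In-memory SQLite database standing in for the tutorial's database (assumption).
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Tiny inline CSV standing in for the airport location file (made-up rows).
airport_csv <- "AIRPORT,LATITUDE,LONGITUDE
AAA,40.2,-100.3
BBB,35.0,-90.0"
airports <- read.csv(text = airport_csv)

# Add the data frame to the database as a new table.
dbWriteTable(con, "airports", airports)

# Check that the table exists and has the expected number of rows.
tables <- dbListTables(con)
n_rows <- dbGetQuery(con, "SELECT COUNT(*) AS n FROM airports")$n

dbDisconnect(con)
```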
Plot the locations on a map. You should filter the airports to only the latest location. Airports sometimes move 🤭. An Open Street Map can be downloaded using the get_map() function in the ggmap package.
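Filtering to the latest location could look something like the sketch below. It assumes each airport can appear on several rows, with a START_DATE column saying when that location took effect. That column name and the rows are hypothetical, so adapt them to whatever the airport file actually provides.

```r
library(dplyr)

# Made-up airport records; airport "AAA" has moved once.
# START_DATE is a hypothetical column marking when a location took effect.
airports <- tibble(
  AIRPORT    = c("AAA", "AAA", "BBB"),
  START_DATE = as.Date(c("1990-01-01", "2015-06-01", "2001-03-01")),
  LATITUDE   = c(40.0, 40.2, 35.0),
  LONGITUDE  = c(-100.0, -100.3, -90.0)
)

# Keep only the most recent record for each airport.
latest <- airports %>%
  group_by(AIRPORT) %>%
  slice_max(START_DATE, n = 1, with_ties = FALSE) %>%
  ungroup()

latest
```

The resulting LONGITUDE and LATITUDE columns are what would be drawn as points on top of the downloaded map.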
Now the fun part: let’s take a day’s worth of flights and plot them all. You will need to join the day of flights data with the airport locations, using both the ORIGIN and the DEST (destination).
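The double join can be sketched as follows: the airport table is joined once on ORIGIN and once on DEST, renaming the coordinates after each join so the two sets do not collide. The toy rows and every column name other than ORIGIN and DEST are assumptions.

```r
library(dplyr)

# Made-up day of flights and airport locations.
flights_day <- tibble(
  CARRIER = c("DL", "WN"),
  ORIGIN  = c("AAA", "BBB"),
  DEST    = c("BBB", "CCC")
)

airports <- tibble(
  AIRPORT   = c("AAA", "BBB", "CCC"),
  LATITUDE  = c(40.2, 35.0, 30.5),
  LONGITUDE = c(-100.3, -90.0, -110.1)
)

flights_geo <- flights_day %>%
  # First join: coordinates of the departure airport.
  left_join(airports, by = c("ORIGIN" = "AIRPORT")) %>%
  rename(ORIGIN_LAT = LATITUDE, ORIGIN_LON = LONGITUDE) %>%
  # Second join: coordinates of the arrival airport.
  left_join(airports, by = c("DEST" = "AIRPORT")) %>%
  rename(DEST_LAT = LATITUDE, DEST_LON = LONGITUDE)

flights_geo
```

Each row now carries both end points of a flight, which is what a line segment per flight on the map needs.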
Choose the two major carriers for your day of data, and make two separate maps of flights, one for each carrier. Compare and contrast the carrier flight patterns.
I chose Delta and Southwest. It looks a little like Delta has more of a hub system. We see this because both airlines are high-volume carriers, yet on the map Southwest appears to have more flights than Delta: Delta flights operate between fewer airports, while Southwest is more distributed, serving many more airports.
There’s not a lot to see in four big groups like this. It’s an exercise in working with time, and also in ordering the four groups appropriately.
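As an illustration of both points, the sketch below cuts a departure hour into four blocks and stores them as an ordered factor so they appear in chronological rather than alphabetical order. The DEP_HOUR column, the break points and the block labels are all assumptions; the tutorial's own four groups may be defined differently.

```r
library(dplyr)

# Made-up departure hours (0-23); DEP_HOUR is an assumed column name.
flights <- tibble(DEP_HOUR = c(1, 7, 13, 19, 23, 6))

flights_blocks <- flights %>%
  mutate(
    TIME_BLOCK = cut(
      DEP_HOUR,
      breaks = c(0, 6, 12, 18, 24),
      labels = c("night", "morning", "afternoon", "evening"),
      right = FALSE,          # intervals are [0, 6), [6, 12), [12, 18), [18, 24)
      ordered_result = TRUE   # keep the chronological order of the labels
    )
  )

flights_blocks
```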
Plane N243WN has 8 flights during the day, but it goes back and forth between three airports.
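Following a single plane through the day amounts to filtering on its tail number and sorting by departure time. The sketch below uses made-up rows with placeholder airports, and the TAIL_NUM and DEP_TIME column names are assumptions, so the counts it produces are not the real figures for this plane.

```r
library(dplyr)

# Made-up flights; airports are placeholders, not the plane's real route.
flights_day <- tibble(
  TAIL_NUM = c("N243WN", "N243WN", "N243WN", "N243WN", "N999ZZ"),
  DEP_TIME = c(1130, 600, 845, 1400, 700),
  ORIGIN   = c("CCC", "AAA", "BBB", "AAA", "BBB"),
  DEST     = c("AAA", "BBB", "CCC", "BBB", "AAA")
)

# One plane's flights in the order they were flown.
plane <- flights_day %>%
  filter(TAIL_NUM == "N243WN") %>%
  arrange(DEP_TIME)

# How many flights, and how many distinct airports were visited.
n_flights  <- nrow(plane)
n_airports <- n_distinct(c(plane$ORIGIN, plane$DEST))

plane
```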