Homework 2
CDS-101 (Spring 2017)
Name: Mike Ninov
Question 1 To produce a vector containing all integers from 1 to 100, we can simply use seq(1, 100, 1). To find which integers are NOT divisible by 2, 3, or 7, we use the ! and %% operators.
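A minimal sketch of this approach in base R, assuming "not divisible by 2, 3, 7" means divisible by none of the three (the variable names are illustrative):

```r
# All integers from 1 to 100
x <- seq(1, 100, 1)

# TRUE where x is divisible by at least one of 2, 3, or 7
divisible <- (x %% 2 == 0) | (x %% 3 == 0) | (x %% 7 == 0)

# Negate with ! to keep only the integers divisible by none of them
not_divisible <- x[!divisible]
not_divisible
```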
Question 2a Find all flights that flew to IAH or HOU
Question 2b Find all flights operated by United, American, or Delta.
Question 2c Find all flights delayed by at least 1 hour, but that made up over 30 minutes in flight.
Question 2d Find all flights that departed between midnight and 6 AM, inclusive.
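The four filters in Question 2 could be sketched as follows. This uses base R on a small made-up data frame (flights_toy is hypothetical, not the real data), and it assumes the columns dest, carrier, dep_delay, arr_delay, and dep_time follow the nycflights13 layout, with carriers coded UA, AA, DL and midnight recorded as 2400.

```r
# Hypothetical toy data, for illustration only
flights_toy <- data.frame(
  dest      = c("IAH", "HOU", "ORD", "LAX", "ATL"),
  carrier   = c("UA",  "WN",  "AA",  "DL",  "B6"),
  dep_delay = c(5,     70,    90,    -2,    60),
  arr_delay = c(0,     35,    80,    -10,   20),
  dep_time  = c(517,   2400,  600,   1230,  45),
  stringsAsFactors = FALSE
)

# 2a: flew to IAH or HOU
q2a <- subset(flights_toy, dest %in% c("IAH", "HOU"))

# 2b: operated by United, American, or Delta
q2b <- subset(flights_toy, carrier %in% c("UA", "AA", "DL"))

# 2c: departed at least 60 minutes late, but made up more than 30 minutes in the air
q2c <- subset(flights_toy, dep_delay >= 60 & dep_delay - arr_delay > 30)

# 2d: departed between midnight and 6 AM inclusive (midnight assumed stored as 2400)
q2d <- subset(flights_toy, dep_time <= 600 | dep_time == 2400)
```

With dplyr, the same logical conditions could be passed to filter() instead of subset().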
Question 3 Convert dep_time and sched_dep_time to a more convenient representation: the number of minutes since midnight.
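One way to do this conversion, sketched in base R. The helper name to_minutes is hypothetical, and the sketch assumes times are stored as HHMM integers (for example 517 for 5:17 AM), with 2400 meaning midnight.

```r
# Convert an HHMM clock time to the number of minutes since midnight
to_minutes <- function(hhmm) {
  hours   <- hhmm %/% 100   # integer division gives the hour part
  minutes <- hhmm %% 100    # remainder gives the minute part
  # The final %% 1440 maps 2400 (midnight) to 0 instead of 1440
  (hours * 60 + minutes) %% 1440
}

to_minutes(c(517, 1230, 2400))  # 317 750 0
```

The same function could be applied to both dep_time and sched_dep_time.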
Question 4 Compare air_time with arr_time - dep_time. What do you expect to see? What do you see? What do you need to do to fix it?
I expect to see an error because air_time is a double, while arr_time and dep_time are integers. In addition, arr_time is in 24-hour clock format, but dep_time is calculated with respect to midnight, so when you attempt arr_time - dep_time, the answer is wrong.
In order to fix this problem, one solution would be to convert arr_time & dep_time into a standardized time format.
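A small sketch of why the raw subtraction misleads and how a common time scale helps. The values and the helper to_minutes are illustrative assumptions, with times taken to be HHMM integers.

```r
# Hypothetical helper: HHMM clock time to minutes since midnight
to_minutes <- function(hhmm) ((hhmm %/% 100) * 60 + hhmm %% 100) %% 1440

# Illustrative values: a morning flight and an overnight flight
dep_time <- c(517, 2350)
arr_time <- c(830, 105)

# Subtracting clock times directly does not give minutes
naive <- arr_time - dep_time

# Convert both to minutes first; %% 1440 handles arrivals after midnight
elapsed <- (to_minutes(arr_time) - to_minutes(dep_time)) %% 1440

naive    # 313 -2245
elapsed  # 193 75
```

Even after this conversion, the elapsed time need not equal air_time exactly, since other factors not handled in this sketch can contribute to the difference.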
Question 5a Consider the number of cancelled flights. Determine the definition of a flight cancellation. As seen above, there are no flights that arrived but did not depart, so we can just use !is.na(dep_delay) to pick out the flights that were not cancelled.
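A minimal sketch of this definition on made-up values, assuming a missing dep_delay marks a cancelled flight:

```r
# Illustrative departure delays; NA means the flight never departed
dep_delay <- c(5, NA, -3, NA, 40)

# Flights that actually departed
not_cancelled <- !is.na(dep_delay)

# Cancelled flights are the complement
cancelled <- is.na(dep_delay)

sum(cancelled)  # 2
```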
Question 5b Find the pattern of cancelled flights in relation to average delay. Comparing canx with avg_delay shows a strong correlation between cancellations and delay; if one is high, then the other is likely to be high as well.
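The comparison could be computed along these lines. This sketch assumes canx is the daily proportion of cancelled flights and avg_delay is the daily mean departure delay. The toy data frame is made up, so it only illustrates the calculation and is not evidence for the pattern.

```r
# Hypothetical toy data: three days, four flights each
toy <- data.frame(
  day       = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3),
  dep_delay = c(0, 5, -2, 1, 30, NA, 45, 20, 90, NA, NA, 120)
)

# Proportion of cancelled flights per day (NA dep_delay = cancelled)
canx <- as.numeric(tapply(is.na(toy$dep_delay), toy$day, mean))

# Mean departure delay per day among flights that departed
avg_delay <- as.numeric(tapply(toy$dep_delay, toy$day, mean, na.rm = TRUE))

# Correlation between the two daily summaries
cor(canx, avg_delay)
```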
Question 6 What time of day should you fly if you want to avoid delays? You would want to avoid flying late at night, as the flight delays of the day accumulate into more delays in the evening.
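One way to check this is to average the departure delay by scheduled departure hour. The sketch below uses a made-up data frame and assumes sched_dep_time is an HHMM integer. The numbers are illustrative only.

```r
# Hypothetical toy data
toy <- data.frame(
  sched_dep_time = c(600, 615, 1200, 1245, 2100, 2130),
  dep_delay      = c(-2,  4,   10,   14,   35,   NA)
)

# Scheduled departure hour from the HHMM value
toy$hour <- toy$sched_dep_time %/% 100

# Mean departure delay per hour, ignoring cancelled flights (NA)
delay_by_hour <- tapply(toy$dep_delay, toy$hour, mean, na.rm = TRUE)

# Hour with the smallest average delay
best_hour <- names(delay_by_hour)[which.min(delay_by_hour)]
best_hour
```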