This function generates a data frame similar to the flights dataset from
nycflights13 for any US airport and time frame. Please note that, even with
a strong internet connection, this function may take several minutes to
download relevant data.
get_flights(station, year, month = 1:12, dir = NULL, ...)
Argument | Description |
---|---|
station | A character vector giving the origin US airports of interest (as FAA LID airport codes). |
year | A numeric giving the year of interest. This argument is currently not vectorized, as the dataset for a single year is already very large. Data for the most recent year are usually available by February or March of the following year. |
month | A numeric giving the month(s) of interest. |
dir | An optional character string giving the directory to save datasets in. By default, datasets will not be saved to file. |
... | Currently only used internally. |
RITA, Bureau of Transportation Statistics, https://www.bts.gov
A data frame with ~1k-500k rows and 19 variables:
year, month, day
Date of departure
dep_time, arr_time
Actual departure and arrival times, UTC.
sched_dep_time, sched_arr_time
Scheduled departure and arrival times, UTC.
dep_delay, arr_delay
Departure and arrival delays, in minutes. Negative times represent early departures/arrivals.
hour, minute
Time of scheduled departure broken into hour and minutes.
carrier
Two-letter carrier abbreviation. See get_airlines to get the full name.
tailnum
Plane tail number
flight
Flight number
origin, dest
Origin and destination. See get_airports for additional metadata.
air_time
Amount of time spent in the air, in minutes
distance
Distance between airports, in miles
time_hour
Scheduled date and hour of the flight as a POSIXct date. Along with origin, can be used to join flights data to weather data.
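The join on origin and time_hour described above can be sketched with base R's merge(). This is only an illustration under stated assumptions: the two small data frames are made-up stand-ins for the output of get_flights() and get_weather(), and the temp column is a hypothetical weather variable, not something this page documents.

```r
# Toy stand-ins for the real datasets; all values are invented for illustration.
flights <- data.frame(
  origin = c("PDX", "PDX", "SEA"),
  flight = c(101L, 202L, 303L),
  time_hour = as.POSIXct(
    c("2018-06-01 08:00:00", "2018-06-01 09:00:00", "2018-06-01 08:00:00"),
    tz = "UTC"
  )
)

# Hypothetical weather table with one observation per origin and hour.
weather <- data.frame(
  origin = c("PDX", "SEA"),
  time_hour = as.POSIXct(
    c("2018-06-01 08:00:00", "2018-06-01 08:00:00"),
    tz = "UTC"
  ),
  temp = c(61.0, 58.5)
)

# Left join: keep every flight and attach weather where both
# origin and time_hour match; unmatched flights get NA for temp.
flights_weather <- merge(
  flights, weather,
  by = c("origin", "time_hour"),
  all.x = TRUE
)
```

Using all.x = TRUE keeps flights that have no matching weather observation, which is usually preferable to silently dropping them.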
This function currently downloads data for all stations for each month
supplied, and then filters out data for relevant stations. Thus,
the recommended approach to download data for many airports is to supply
a vector of airport codes to the station argument rather than
iterating over many calls to get_flights().
If you are repeatedly getting a timeout error when downloading flights,
your download may be taking longer than the default timeout R option allows.
You can change the timeout value for your R session by running
options(timeout = timeout_value_in_seconds) in your console.
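As a minimal sketch of the timeout change described above, the snippet below raises the limit for the duration of a download and then restores the previous value. The figure of 600 seconds is an arbitrary choice for illustration, not a recommendation from the package.

```r
# options() returns the previous values, so they can be restored later.
old_opts <- options(timeout = 600)  # 600 seconds is an illustrative value

# Confirm the new setting for this session.
getOption("timeout")

# ... run the long download here ...

# Restore whatever timeout was in effect before.
options(old_opts)
```

Restoring the old value keeps the change from affecting unrelated downloads later in the same session.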
get_weather for weather data, get_airlines for airlines data,
get_airports for airports data, get_planes for planes data,
or anyflights for a wrapper function.
Use the as_flights_package function to convert this dataset
to a data-only package.
# flights out of Portland International in June 2018
if (FALSE) get_flights("PDX", 2018, 6)
# ...or the original nycflights13 flights dataset
if (FALSE) get_flights(c("JFK", "LGA", "EWR"), 2013)
# use the dir argument to indicate the folder in which
# to save the data as "flights.rda"
if (FALSE) get_flights("PDX", 2018, 6, dir = tempdir())