Doing your own on-time flight time analysis Part III

In the last post, the data from the on-time flight database was loaded into a column-oriented storage engine. Now the numbers can be crunched.

The original goal of this exercise was to find the flight from Los Angeles International Airport, LAX, to Dallas Fort Worth International Airport, DFW, that was the most likely to arrive on-time.
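As a rough illustration of that goal, here is a minimal Python sketch that computes the share of on-time arrivals per flight on one route. It is not the InfiniDB query used in this series. The field names (`origin`, `dest`, `carrier`, `flight_num`, `arr_delay`), the sample rows, and the 15-minute cutoff are all assumptions made for the example, and rows with no arrival delay (for instance cancelled flights) are simply skipped.

```python
from collections import defaultdict


def on_time_rates(records, origin="LAX", dest="DFW", threshold_minutes=15):
    """Return {(carrier, flight_num): fraction of arrivals on time} for one route.

    A flight counts as on time when its arrival delay is below
    threshold_minutes (an assumed cutoff for this sketch).
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [on_time, total]
    for rec in records:
        if rec["origin"] != origin or rec["dest"] != dest:
            continue
        delay = rec["arr_delay"]
        if delay is None:  # no arrival recorded; skipped in this sketch
            continue
        key = (rec["carrier"], rec["flight_num"])
        counts[key][1] += 1
        if delay < threshold_minutes:
            counts[key][0] += 1
    return {key: on_time / total for key, (on_time, total) in counts.items()}


# Made-up rows for illustration only; not real flight data.
sample = [
    {"origin": "LAX", "dest": "DFW", "carrier": "AA", "flight_num": 2400, "arr_delay": -5},
    {"origin": "LAX", "dest": "DFW", "carrier": "AA", "flight_num": 2400, "arr_delay": 30},
    {"origin": "LAX", "dest": "DFW", "carrier": "AA", "flight_num": 2400, "arr_delay": None},
    {"origin": "LAX", "dest": "DFW", "carrier": "AA", "flight_num": 2410, "arr_delay": 3},
    {"origin": "LAX", "dest": "DFW", "carrier": "AA", "flight_num": 2410, "arr_delay": 10},
    {"origin": "LAX", "dest": "SFO", "carrier": "AA", "flight_num": 2410, "arr_delay": 0},
]

rates = on_time_rates(sample)
best = max(rates, key=rates.get)  # flight with the highest on-time share
print(best, rates[best])
```

Whether a cancelled flight should count against a flight's on-time share is a design choice; counting it as late would be an equally reasonable variant.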

The data is 'opportunity rich' in that there is a lot of information in it, and it is easy to start wondering about the various nuggets it holds. Are there certain aircraft (tail numbers) that are routinely bad performers? Are some days of the week better than others? Do national holidays have an effect on on-time performance? If you are delayed, is there a 'regular amount' of delay? Does an early departure make for an early arrival? Can the flight crew make up for a late departure? How much time is usually spent on runways?

But to look for the flight from LAX …

Doing your own on-time flight analysis, Part I

This will be a quick tutorial on on-time flight analysis. This material will be part of a lab for a class on InfiniDB that I am developing. The data comes from the Data.Gov website, and you are free to follow the steps presented.

What I want to know is which flight from a certain airport arrives at my local airport on time most frequently. Traveling from LAX to DFW can often be a combination of cancellations, flight delays, and being the nth plane in line for takeoff. So what is the best flight choice for that route?

The first step is getting the data. It is available for free from Airline On-Time Performance and Causes of Flight Delays. Be sure to select the check box for documentation so that there will be a readme.html to describe the file …

