In the last post, the data from the on-time flight database was loaded in a column-orientated storage engine. Now the numbers can be crunched.
The original goal of this exercise was to find the flight from Los Angeles International Airport, LAX, to Dallas Fort Worth International Airport, DFW, that was the most likely to arrive on-time.
The data is 'opportunity rich' in that there is a lot information in there. It is easy to start wondering about the various nuggets of information in there. Are their certain aircraft (tail numbers) that are routinely bad performers? Are some days of the week better than others? Do national holidays have an effect on the on-time performance? If you are delayed, is there a 'regular amount' of delay? Does early departure make for an early arrival? Can the flight crew make up for a late departure? How much time is [Read more...]