
CS 689-002 2007 Travel-DDDAS Page 1 of 46

Project name : Travel-DDDAS Class: CS689-002

Instructor: Dr. Douglas

Authors: Divya Bansal Soham Chakraborty Jay Hatcher Ray Hyatt , Jr Chun-Lung Lim Mark Maynard Trevor Presgrave


Introduction

Each year approximately twenty percent of flights in the U.S. are delayed, costing large sums of money in terms of lost business opportunities. There is currently no known method to predict the delay or cancellation of flights with any reasonable degree of accuracy. We propose to develop a dynamic data-driven application simulation (DDDAS) to track the results of airline flights over time and use this data to accurately predict the probability of delay or cancellation of a flight. The factors taken into account are weather, terrorism threat level as reported by the FAA, and historical trends (for holidays, seasonal patterns, etc.).

Our approach is outlined in the following sections:

Sensors – This section discusses the design and implementation of the sensors used to collect data for the DDDAS.

Transport – This section describes the software used for transferring data between the sensors and the DDDAS.

Datastore – The datastore is the central repository for collected data. This section discusses the implementation of the datastore and its relationship to the rest of the system.

Predictor – This section describes the software used to predict flight delay and cancellation based on historical data and current conditions.

Corrector – This section discusses the corrector, a program that analyses the accuracy of predictions and makes corrections to the predictor to improve accuracy over time.


Project Assignment Breakdown

Divya Bansal        Datastore, DB back-up, DB Synchronization
Soham Chakraborty   Modifications to Wei Li’s code, Datastore
Jay Hatcher         Corrector Module
Ray Hyatt           Delta Airlines Sensor, Wunderground Sensor, Datastore
Chun-Lung Lim       National Weather Association Sensor
Mark Maynard        Predictor, Yahoo Weather Sensor
Trevor Presgrave    Threat Level Sensor, Flightlookup Weather Sensor


Delta Airlines sensor (Ray Hyatt)

This sensor collects data from the Delta Airlines website and reformats it into a form suitable for use by our various tools.

Background

The Delta Airlines sensor faced several interesting challenges in its evolution to its current state.

The first challenge in the prototype was getting the data from the Delta website [1]. No API was found, so I had to look for a viable alternative. After some experimenting, I discovered a URL [2] that accepts the date and flight number directly as parameters, so the browser fetches the desired information from the website without having to bother with forms or other input. This worked fine for browsers, but it initially caused problems in command line tools like fetch [3] or wget [4]. The cause turned out to be the cookies the website was setting; after isolating the relevant cookie, it was possible to configure wget to fetch pages reliably. Once the page was collected, it was piped through a number of grep statements to pick out the interesting parts and exclude the unwanted data. This proved that it could be done, so I moved on to implementing a production model in Python.

Python was chosen largely due to strong lobbying for it during our in-class discussions. This being my first time coding from the ground up in Python, I spent a large amount of time figuring out what libraries were available to make the job easier. A key library was mechanize [5], which neatly wrapped up the entire URL fetching process in a single command and bypassed all of the manual cookie handling. The pre-April version of the sensor also used libxml2dom [6] to assist in stripping the HTML and XML formatting from the page, but due to problems in date handling and Delta’s changes to the website it was dropped from the final version.

The initial version of the Delta sensor output a comma-separated value (CSV) list for each flight entry in the format:

['Epoch', 'Flight', 'Departing', 'Planned', 'Actual', 'Arriving', 'Planned', 'Actual', 'Status']

[1] http://www.delta.com/home/index.jsp
[2] http://www.delta.com/flifo/servlet/DeltaFlifo?flight_number=442&flight_date=Today&request=main
[3] http://www.freebsd.org/cgi/man.cgi?query=fetch&apropos=0&sektion=0&manpath=FreeBSD+6.2-RELEASE&format=html
[4] wget is a GNU tool available from http://www.gnu.org/software/wget/
[5] http://wwwsearch.sourceforge.net/mechanize/
[6] http://www.boddie.org.uk/python/libxml2dom.html


Once the datastore format became available, I wrote a Perl script, loaddb, to transform the CSV output into a series of insert statements for PostgreSQL. Since the data was in CSV form, it could have been converted in a number of other ways, but I needed to rewrite the flight field and the airport fields to match the final format of the datastore, so I used Perl to do all of this at once.

At the start of April, Delta overhauled their website, changing the format of the page and breaking the first version of the Delta sensor. This, coupled with the finalization of the data format [7] in the datastore, prompted major changes to the sensor. Specifically, I reduced the Python portion of the sensor, by commenting out most of the code, to fetching the URL, generating the timestamp, and writing the result to standard output only. This output could be redirected to a file or piped to another program in typical Unix command line fashion. I modified the Makefile for the project to add several new build stanzas, specifically ones that take a collection of flights from flightlist.txt and generate either flat text output files or Postgres insert statements, which are appended to a master insert file. That file can then be loaded into the database via a simple \i filename at the Postgres command prompt. It would not be difficult to load these directly into the database, but since remote connectivity to the database was difficult (a firewall was blocking access), this seemed like a reasonable workaround. A sensor running on the same box as the database should have no problems with the firewall and could deliver that data directly.

Current Implementation

The basic flow of the final sensor is this:

Delta's website -> Delta fetcher -> Delta scrubber -> Delta to sql

Delta fetcher

The Delta fetcher was a stripped-down version of the original Delta sensor discussed in the background section. It built a valid URL from the input parameters, flight number and date, and used that to fetch the web page detailing the flight. That output was piped directly to the Delta scrubber.
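The URL-building step can be sketched as follows. This is a minimal illustration, not the original fetcher: it assumes the three query parameters visible in the URL of footnote 2 are the only ones required, the helper name build_flifo_url is hypothetical, and the endpoint itself may no longer exist.

```python
from urllib.parse import urlencode

# Base address taken from the URL in footnote 2; illustrative only, since the
# site has been redesigned since this report was written.
FLIFO_BASE = "http://www.delta.com/flifo/servlet/DeltaFlifo"


def build_flifo_url(flight_number, flight_date="Today"):
    """Return a flight-status URL for one flight and date (hypothetical helper)."""
    query = urlencode([
        ("flight_number", flight_number),
        ("flight_date", flight_date),
        ("request", "main"),
    ])
    return FLIFO_BASE + "?" + query
```

Calling build_flifo_url(442) reproduces the URL from footnote 2; the returned string would then be handed to whatever fetch library is in use.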

Delta scrubber

The Delta scrubber was a filter that stripped the HTML and XML formatting from the page and extracted the key information via a regular expression of the form:

'EPOCH|Flight.*on|\([A-Z][A-Z][A-Z]\)|[0-9]am|[0-9]pm|AM|PM|Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec|Time|Complete|Diverted|Delayed|Canceled|In flight'

This was immediately followed by another regular expression to eliminate unwanted data that the first regular expression collected:

'Notifications|Security|Business|<.*>|Travel|Equipment|font|1\-800|=|\+'

[7] See appendix for table formats for datastore

Using egrep [8], these regular expressions were easily implemented in the Makefile pipe stack in the stanzas flighttodb: and flighttofile: (the lines are rather long in the Makefile, so they format poorly here):

    flighttodb: deltaouttosql.pl Makefile
            ./deltaflightworkaround.py -d ${DAY} -f $(FLIGHT) \
            | egrep 'EPOCH|Flight.*on|\([A-Z][A-Z][A-Z]\)|[0-9]am|[0-9]pm|AM|PM|Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec|Time|Complete|Diverted|Delayed|Canceled|In flight' \
            | egrep -v 'Notifications|Security|Business|<.*>|Travel|Equipment|font|1\-800|=|\+' \
            | ./deltaouttosql.pl \
            >> $(FLIGHT).$(DAY).insert

    flighttofile: Makefile
            ./deltaflightworkaround.py -d ${DAY} -f $(FLIGHT) \
            | egrep 'EPOCH|Flight.*on|\([A-Z][A-Z][A-Z]\)|[0-9]am|[0-9]pm|AM|PM|Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec|Time|Complete|Diverted|Delayed|Canceled|In flight' \
            | egrep -v 'Notifications|Security|Business|<.*>|Travel|Equipment|font|1\-800|=|\+' \
            > $(FLIGHT).$(DAY).out
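For readers who prefer to keep the whole pipeline in one language, the two egrep stages could be approximated in Python as sketched below. The patterns are copied from the Makefile stanzas; the function name scrub and the idea of replacing egrep with the re module are assumptions made for illustration and are not part of the original sensor.

```python
import re

# First stage: lines worth keeping (same alternatives as the egrep pattern).
KEEP = re.compile(
    r"EPOCH|Flight.*on|\([A-Z][A-Z][A-Z]\)|[0-9]am|[0-9]pm|AM|PM|"
    r"Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec|"
    r"Time|Complete|Diverted|Delayed|Canceled|In flight"
)
# Second stage: lines to discard again (same as the egrep -v pattern).
DROP = re.compile(
    r"Notifications|Security|Business|<.*>|Travel|Equipment|font|1\-800|=|\+"
)


def scrub(lines):
    """Keep lines matching KEEP, then drop any that also match DROP."""
    return [ln for ln in lines if KEEP.search(ln) and not DROP.search(ln)]
```

Like egrep, this works line by line, so a line such as "Sign up for Flight Notifications" passes the first stage (it matches Flight.*on) and is then removed by the second.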

[8] http://www.freebsd.org/cgi/man.cgi?query=egrep&apropos=0&sektion=0&manpath=FreeBSD+6.2-RELEASE&format=html

This produced a nice, concise output like the following examples:

A flight that has been completed

Flight #1
EPOCH 1177819195
Flight 48 on Sat, 28 Apr 2007
Flight Complete (SLC) 1:34pm 28 Apr 1:30pm 28 Apr (CVG) 7:00pm 28 Apr 6:51pm 28 Apr
Flight Complete (CVG) 7:40pm 28 Apr 7:35pm 28 Apr (FRA) 10:15am 29 Apr 10:22am 29 Apr

Or a flight that hasn’t taken off yet

Flight #2
EPOCH 1178020871
Flight 959 on Tue, 01 May 2007
On Time (FLL) 5:50pm 01 May On Time (ATL) 7:47pm 01 May On Time
On Time (ATL) 9:15pm 01 May On Time (SAN) 11:00pm 01 May On Time


Delta to sql

Note the differences between Flight #2, which has not begun, and Flight #1, which has been completed, specifically the difference in the number of fields. This prevents a naïve mapping of the output to SQL inserts and requires either counting fields ahead of time, which the pre-April version did, or writing a state machine to make choices on the data as it is processed, which the current version does.

The basic flow of the state machine is as follows (diagram generated by dot [9]):

[9] www.graphviz.org; source to diagrams in cvs:/travel-dddas/docs


The state machine was written in Perl and is a very simple implementation using under two dozen states. At each state, a piece of the input is typically consumed and, if needed, a decision is made to branch to another path. This allows it to handle multi-legged flights and flights in all their observed states. It outputs a single SQL insert statement for each leg of the flight. It also captures additional information internally that is not used in our current implementation, and it could easily be modified to output that information as well.
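To illustrate the idea, here is a much-reduced state machine in Python. It is not a port of the Perl program: the seven states, the token patterns, and the field names are assumptions derived only from the two sample outputs shown earlier, and the real machine handles more cases. The one branch it demonstrates is the "actual time" slot, which may hold either a time and date or a status phrase such as "On Time".

```python
import re

# Token patterns assumed from the sample scrubber output.
TOKEN = re.compile(
    r"\((?P<airport>[A-Z]{3})\)"
    r"|(?P<when>\d{1,2}:\d{2}[ap]m\s+\d{2} [A-Z][a-z]{2})"
    r"|(?P<status>Flight Complete|On Time|In flight|Delayed|Canceled|Diverted)"
)

# state -> (token kinds accepted, field to fill, next state)
TRANSITIONS = {
    "STATUS":      (("status",),        "status",         "DEP_AIRPORT"),
    "DEP_AIRPORT": (("airport",),       "depart_airport", "DEP_PLANNED"),
    "DEP_PLANNED": (("when",),          "depart_planned", "DEP_ACTUAL"),
    "DEP_ACTUAL":  (("when", "status"), "depart_actual",  "ARR_AIRPORT"),
    "ARR_AIRPORT": (("airport",),       "arrive_airport", "ARR_PLANNED"),
    "ARR_PLANNED": (("when",),          "arrive_planned", "ARR_ACTUAL"),
    "ARR_ACTUAL":  (("when", "status"), "arrive_actual",  "STATUS"),
}


def parse_legs(text):
    """Walk the token stream and return one dict per flight leg."""
    legs, leg, state = [], {}, "STATUS"
    for match in TOKEN.finditer(text):
        kind = match.lastgroup
        accepted, field, next_state = TRANSITIONS[state]
        if kind not in accepted:
            raise ValueError("unexpected %s token in state %s" % (kind, state))
        leg[field] = match.group(kind)
        state = next_state
        if state == "STATUS":  # a complete leg has been consumed
            legs.append(leg)
            leg = {}
    return legs
```

Each returned dict corresponds to one leg, so generating one insert statement per leg becomes a simple loop over the result.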

Current output produced as applied to the previously mentioned flights:


Flight #1 sql

insert into airline (epoch,sensorid,datatype,flight,departairportcode,departdate,arrivedate,arriveairportcode,status)
values (1177819195,'dddas.deltasensorRay','flight','48','SLC','28Apr 1:34pm','28Apr 7:00pm','CVG','FlightComplete');
insert into airline (epoch,sensorid,datatype,flight,departairportcode,departdate,arrivedate,arriveairportcode,status)
values (1177819195,'dddas.deltasensorRay','flight','48','CVG','28Apr 7:40pm','29Apr 10:15am','FRA','FlightComplete');

Flight #2 sql

insert into airline (epoch,sensorid,datatype,flight,departairportcode,departdate,arrivedate,arriveairportcode,status)
values (1178020871,'dddas.deltasensorRay','flight','959','FLL','01May 5:50pm','01May 7:47pm','ATL','OnTime');
insert into airline (epoch,sensorid,datatype,flight,departairportcode,departdate,arrivedate,arriveairportcode,status)
values (1178020871,'dddas.deltasensorRay','flight','959','ATL','01May 9:15pm','01May 11:00pm','SAN','OnTime');

Delta Sensor Future Direction

The Delta Sensor needs an automated check to see if the website format has changed.

One way this could be done is to take several copies of the same page over a span of a few hours to days and figure out what changes. diff [10] would be handy here. Once the dynamic elements are identified, a regular expression could be used to strip them from the page, and the remaining output becomes a signature for that page. Testing for website changes then becomes as straightforward as collecting the target page, stripping the dynamic elements, and diffing the result against the signature page. If changes have occurred, notify the sensor support team that an update has occurred and investigation is needed.
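A minimal sketch of this signature check is shown below, using Python's difflib in place of the diff command. The DYNAMIC pattern is a hypothetical stand-in (clock times and dates only); a real check would derive its patterns by comparing several snapshots as described above, and the function names are made up for illustration.

```python
import difflib
import re

# Hypothetical pattern for elements that change on every fetch.
DYNAMIC = re.compile(r"\d{1,2}:\d{2}\s*[ap]m|\d{1,2} [A-Z][a-z]{2}( \d{4})?")


def signature(page_text):
    """Replace dynamic elements with a placeholder, leaving the page skeleton."""
    return [DYNAMIC.sub("#", line) for line in page_text.splitlines()]


def format_diff(reference_page, current_page):
    """Return the diff between two page signatures (empty list if unchanged)."""
    diff = difflib.unified_diff(signature(reference_page),
                                signature(current_page), lineterm="")
    return list(diff)
```

An empty result means only the dynamic values moved; any non-empty result is a candidate for notifying whoever maintains the sensor.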

[10] http://www.freebsd.org/cgi/man.cgi?query=diff&apropos=0&sektion=0&manpath=FreeBSD+6.2-RELEASE&format=html


Delta Sensor Conclusion

The Delta Sensor was a nice challenge to implement. Most of the time was spent finding a stable source for the data and combating the changes in the site. Once a source was found, the implementation was relatively straightforward. A minor challenge was the variable number of legs a given flight may have and the various changes a flight could undergo in progress. The sensor eventually suffered somewhat from the changes in the Delta website format. If an API for Delta’s data remains unavailable, then I expect man-hours will have to be allocated to monitor and update the sensor on a periodic basis.

Wunderground sensor (Ray Hyatt)

Background

This is a sensor to collect data from wunderground [11] that I wrote as a quick adaptation of the early version of the Delta sensor. It was not updated to match the latest data requirements, as our team elected to use a different sensor for the production product due to its available API.

Wunderground sensor design

This is a trivial rewrite of the Delta sensor; it simply fetches the page and uses regular expressions and libxml2dom to eliminate the unwanted text and keep only the interesting fields. Data was written to a logfile after it was collected.

Future Direction

It won’t take much work to bring this up to standard: modify the regular expression in

weatherdata2 = re.findall(".*Current Conditions.*|.*Weather Underground RSS.*", p.sub('',sitereturns.toString()))

to collect any additional fields from the output needed for the project.

Wunderground Sensor Conclusion

This turned out to be mostly an exercise in adapting an existing python script as the team chose to use a different sensor.

[11] http://www.wunderground.com/


United Airlines sensor (Mark Maynard)

This sensor takes flight information from the United website and formats it for later use.

Background

Creating a sensor for United flight information was a very challenging task: the data had to be retrieved from the website and then parsed into a usable format. Because the United site presents its information for human-readable web display, many adjustments were needed.

Sensor Components

Data Retrieval

To begin data retrieval from United, I created a Python script to gather the information from the website. To get the HTML for the website I used the mechanize [12] package for web browsing. Each segment of a flight is gathered from a separate URL request with differing origin and destination airport codes.

    import mechanize

    # specify url
    url = "http://www.ua2go.com/flifo/FlightDetail.do?airline=UA&fltNbr=6819&orig=LEX&dest=ORD&date=20070501&stamp=null"
    # load data from united site
    fetch = mechanize.Request(url)
    result = mechanize.urlopen(fetch)
    # turn data into text
    resultstring = result.read()

Data Parsing

Once the web page had been downloaded and placed into a string, I was able to use the re [13] package for regular expressions to retrieve the relevant data. As the pertinent information was placed inside layers of HTML, each parameter was parsed in several steps. Below are the three regular expressions used to find the flight number:

    Flights1 = r"Flight:[\w\W]{100,180}[0-9]{3,4}"
    Flights2 = r'face[\w\W]{1,30}[0-9]{3,4}'
    Flights3 = r'[0-9]{3,4}'

First, the regular expression Flights1 must find a string starting with “Flight:”. The set [\w\W] accepts any character, and the {100,180} that follows specifies that it must match at least 100 and at most 180 times. The set [0-9] is the set of all decimal digits, which must be matched three or four times. The large gap of arbitrary characters allows for large amounts of HTML between the label “Flight:” and the flight number. Flights2 then parses the string further, starting with “face”, which is part of the font tag in HTML. Once the string has been refined in this way, the Flights3 expression looks for the flight number. This iterative approach is needed to ensure that some random three- or four-digit number in the HTML isn't mistaken for the flight number.

[12] See Appendix M1
[13] See Appendix M1

Data output

Once all the data is gathered, it is output in the following format:

['Epoch', 'Flight', 'Departing', 'Planned', 'Actual', 'Arriving', 'Planned', 'Actual', 'Status']

Example:

[1178064104, Flight:6819, Lexington, KY (LEX), 6:00 AM, unknown, Chicago O'Hare, IL (ORD), 6:20 AM, unknown, ARRIVED]
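Returning to the flight-number extraction described under Data Parsing, the three-step narrowing can be sketched as below. The three patterns are the ones given in the text; the helper find_flight_number, the "unknown" fallback at this step, and the SAMPLE_HTML string are assumptions made for illustration (the sample is made up and is not captured from the United site).

```python
import re

# The three expressions from the text, applied one after another.
FLIGHTS1 = r"Flight:[\w\W]{100,180}[0-9]{3,4}"
FLIGHTS2 = r"face[\w\W]{1,30}[0-9]{3,4}"
FLIGHTS3 = r"[0-9]{3,4}"

# Made-up page fragment: a decoy number, the label, filler markup, the number.
FILLER = "<!-- layout -->" * 6
SAMPLE_HTML = ('<p>Page 1234</p><td>Flight:</td>' + FILLER +
               '<font face="Arial" size="2">6819</font>')


def find_flight_number(html):
    """Narrow the search in three steps; return the digits or 'unknown'."""
    text = html
    for pattern in (FLIGHTS1, FLIGHTS2, FLIGHTS3):
        match = re.search(pattern, text)
        if match is None:
            return "unknown"
        text = match.group(0)  # the next pattern only sees this narrower slice
    return text
```

Applying FLIGHTS3 directly to SAMPLE_HTML would pick up the decoy 1234, whereas the stepwise search only looks at digits that follow the "Flight:" label and the font tag.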

Because United does not always display the required information, the value unknown is used to fill in missing data. To keep track of all the data being generated, a database was required. Using the pygresql [14] package to access a Postgres [15] database, the information from United was placed into a database where other DDDAS components can easily access it. First a connection to the database is made, providing the database name, host name, and user name. To add data to the database, an insert query is submitted in the following format:

    insert into airline values ('1178066199','unitedsensor','flight','6819','LEX','Tue, May 1 6:00 AM','Tue, May 1 unknown','ORD','ARRIVED')

The following code snippet connects to the database and inserts the data retrieved from United:

    import pg  # PyGreSQL

    con1 = pg.connect(dbname='travel_dddas_dev', host='localhost', user='dddas')
    # build the value list from the parsed fields
    insert = ("('" + NOW + "','unitedsensor','flight','"
              + "".join(flightNumber) + "','"
              + "".join(codes[0]) + "','"
              + dateparse + " " + "".join(schedDep) + "','"
              + dateparse + " " + "".join(actualArr) + "','"
              + "".join(codes[1]) + "','"
              + "".join(status) + "')")
    con1.query("insert into airline values" + insert)

The pg.connect command makes the connection to the database travel_dddas_dev. A variable insert is created to hold a string containing the value list of parameters. Next, a query is used to insert the new data into the table.
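Building the value list by string concatenation breaks as soon as a field contains a quote (for example "Chicago O'Hare"). A common alternative is a parameterized query. The sketch below shows the idea with Python's built-in sqlite3 module as a stand-in so that it runs without a Postgres server; it is not the sensor's code, the helper insert_leg is hypothetical, and PostgreSQL drivers use their own placeholder syntax rather than the "?" shown here.

```python
import sqlite3
import time

# Column names as they appear in the Delta sensor's insert statements.
COLUMNS = ("epoch", "sensorid", "datatype", "flight", "departairportcode",
           "departdate", "arrivedate", "arriveairportcode", "status")


def insert_leg(con, flight, origin, dest, depart=None, arrive=None, status=None):
    """Insert one flight leg, filling missing fields with 'unknown'."""
    row = (int(time.time()), "unitedsensor", "flight", flight, origin,
           depart or "unknown", arrive or "unknown", dest, status or "unknown")
    placeholders = ", ".join("?" for _ in COLUMNS)
    con.execute("insert into airline (%s) values (%s)"
                % (", ".join(COLUMNS), placeholders), row)


con = sqlite3.connect(":memory:")  # in-memory stand-in for the datastore
con.execute("create table airline (%s)" % ", ".join(COLUMNS))
insert_leg(con, "6819", "LEX", "ORD", depart="Tue, May 1 6:00 AM", status="ARRIVED")
```

The driver handles quoting of each value, and the "unknown" default mirrors the way the sensor fills fields that United does not display.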

[14] See Appendix M1
[15] http://www.postgresql.org/


Current Status and Further Work

The United Sensor can currently gather flight information for individual legs of a flight through URL requests and insert it into the database. It has limited error handling, with some of the fields having a default value of unknown. What is needed is a mechanism to automate queries, such as a cron job [16]. Currently some disabled code is available in the United parser to change the flight from the command line, which could be modified to accept changes in origin and destination. With these modifications this parser could serve its purpose quite well, although given the regular site redesigns by United it is by no means a permanent solution.

Yahoo! Weather (Mark Maynard)

To gain weather information related to flights one of the avenues explored was Yahoo!'s weather service. This service is very developer friendly and provides a plethora of data based on zip code.

Yahoo! RSS

Yahoo! provides RSS [17] feeds for their weather reports and forecasts. On their developer network they even provide a tutorial on accessing this information through Python, along with a namespace that minidom [18] uses for parsing. The sensor uses the urllib [19] library to fetch the RSS feed and minidom to parse it.

    import urllib
    from xml.dom import minidom

    # create RSS URL
    WEATHER_URL = 'http://xml.weather.yahoo.com/forecastrss?p=%s&u=c'
    # location of namespace for RSS feed
    WEATHER_NS = 'http://xml.weather.yahoo.com/ns/rss/1.0'
    url = WEATHER_URL % zip_code
    dom = minidom.parse(urllib.urlopen(url))
    # get atmosphere element
    yatmosphere = dom.getElementsByTagNameNS(WEATHER_NS, 'atmosphere')[0]
    # get humidity from atmosphere tag
    wData.humidity = yatmosphere.getAttribute('humidity')

RSS Information

Yahoo! Weather's RSS system contains a vast amount of information about weather conditions. This information is based on zip code and is accessed by attribute and tag name.

title The title of the feed, which includes the location city. For example "Yahoo! Weather - Sunnyvale, CA"

[16] http://www.unixgeeks.org/security/newbie/unix/cron-1.html
[17] http://en.wikipedia.org/wiki/RSS_(file_format)
[18] See Appendix M1
[19] See Appendix M1


link The URL for the Yahoo! Weather page of the forecast for this location. For example http://us.rd.yahoo.com/dailynews/rss/weather/Sunnyvale__CA/*http://xml.weather.yahoo.com/forecast/94089_f.html

language The language of the weather forecast, for example, en-us for US English.

description The overall description of the feed including the location, for example "Yahoo! Weather for Sunnyvale, CA"

lastBuildDate The last time the feed was updated. The format is in the date format defined by RFC822 Section 5, for example Mon, 25 Sep 17:25:18 -0700.

ttl Time to Live; how long in minutes this feed should be cached.

yweather:location The location of this forecast. Attributes:

• city: city name (string)

• region: state, territory, or region, if given (string)

• country: two-character country code. (string)

yweather:units Units for various aspects of the forecast. Attributes:

• temperature: degree units, f for Fahrenheit or c for Celsius (character)

• distance: units for distance, mi for miles or km for kilometers (string)

• pressure: units of barometric pressure, in for pounds per square inch or mb for millibars (string)

• speed: units of speed, mph for miles per hour or kph for kilometers per hour (string)

Note that the default RSS feed uses Fahrenheit degree units and English units for all other attributes (miles, pounds per square inch, miles per hour). If Celsius has been specified as the degree units for the feed (using the u request parameter), all the units are in metric format (Celsius, kilometers, millibars, kilometers per hour).

yweather:wind Forecast information about wind. Attributes:

• chill: wind chill in degrees (integer)

• direction: wind direction, in degrees (integer)

• speed: wind speed, in the units specified in the speed attribute of the yweather:units element (mph or kph). (integer)

yweather:atmosphere Forecast information about current atmospheric pressure, humidity, and visibility. Attributes:

• humidity: humidity, in percent (integer)

• visibility: visibility, in the units specified by the distance attribute of the yweather:units element (mi or km). Note that the visibility is specified as the actual value * 100. For example, a visibility of 16.5 miles will be specified as 1650. A visibility of 14 kilometers will appear as 1400. (integer)

• pressure: barometric pressure, in the units specified by the pressure attribute of the yweather:units element (in or mb). (float)

• rising: state of the barometric pressure: steady (0), rising (1), or falling (2). (integer: 0, 1, 2)
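Because visibility is scaled by 100 and the pressure trend is an integer code, a consumer of the feed has to convert both. The sketch below does this with minidom on a made-up XML fragment shaped like the description above; the SAMPLE string (including its pressure value), the RISING table, and the function read_atmosphere are illustrative assumptions, not output captured from the feed or code from the sensor.

```python
from xml.dom import minidom

WEATHER_NS = "http://xml.weather.yahoo.com/ns/rss/1.0"
RISING = {0: "steady", 1: "rising", 2: "falling"}

# Made-up fragment in the shape described above.
SAMPLE = (
    '<rss xmlns:yweather="%s"><channel>'
    '<yweather:atmosphere humidity="61" visibility="1609" '
    'pressure="1015.2" rising="1"/>'
    '</channel></rss>' % WEATHER_NS
)


def read_atmosphere(xml_text):
    """Return the atmosphere attributes converted to natural units."""
    dom = minidom.parseString(xml_text)
    node = dom.getElementsByTagNameNS(WEATHER_NS, "atmosphere")[0]
    return {
        "humidity": int(node.getAttribute("humidity")),
        # the feed reports visibility as the actual value * 100
        "visibility": int(node.getAttribute("visibility")) / 100.0,
        "pressure": float(node.getAttribute("pressure")),
        "rising": RISING[int(node.getAttribute("rising"))],
    }
```

With the sample fragment, a reported visibility of 1609 is read back as 16.09 in the feed's distance unit.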

yweather:astronomy Forecast information about current astronomical conditions. Attributes:

• sunrise: today's sunrise time. The time is a string in a local time format of "h:mm am/pm", for example "7:02 am" (string)

• sunset: today's sunset time. The time is a string in a local time format of "h:mm am/pm", for example "4:51 pm" (string)

image The image used to identify this feed. See Image Elements for element descriptions. Child elements: url, title, link, width, height

item The local weather conditions and forecast for a specific location. See Item Elements for element descriptions. Child elements: title, link, description, guid, pubDate, geo:lat, geo:long, yweather:forecast

title The forecast title and time, for example "Conditions for New York, NY at 1:51 pm EST"

link The Yahoo! Weather URL for this forecast.

description A simple summary of the current conditions and two-day forecast, in HTML format, including a link to Yahoo! Weather for the full forecast.

guid Unique identifier for the forecast, made up of the location ID, the date, and the time. The attribute isPermaLink is false.

pubDate The date and time this forecast was posted, in the date format defined by RFC822 Section 5, for example Mon, 25 Sep 17:25:18 -0700.

geo:lat The latitude of the location.

geo:long The longitude of the location.

yweather:condition The current weather conditions. Attributes:

• text: a textual description of conditions, for example, "Partly Cloudy" (string)

• code: the condition code for this forecast. You could use this code to choose a text description or image for the forecast. The possible values for this element are described in Condition Codes (integer)

• temp: the current temperature, in the units specified by the yweather:units element (integer)

• date: the current date and time for which this forecast applies. The date is in RFC822 Section 5 format, for example "Wed, 30 Nov 2005 1:56 pm PST" (string)

yweather:forecast The weather forecast for a specific day. The item element contains multiple forecast elements for today and two days in the future. Attributes:

• day: day of the week to which this forecast applies. Possible values are Mon Tue Wed Thu Fri Sat Sun (string)

• date: the date to which this forecast applies. The date is in "dd Mmm yyyy" format, for example "30 Nov 2005" (string)

• low: the forecasted low temperature for this day, in the units specified by the yweather:units element (integer)

• high: the forecasted high temperature for this day, in the units specified by the yweather:units element (integer)

• text: a textual description of conditions, for example, "Partly Cloudy" (string)

• code: the condition code for this forecast. You could use this code to choose a text description or image for the forecast. The possible values for this element are described in Condition Codes (integer)

The condition codes returned by Yahoo! Weather are integer values that correspond to different weather conditions such as rain and snow. Besides the weather condition, some of them, such as “mostly cloudy (night)”, also encode the time of day. Below is a sample output from the Yahoo! sensor:

Yahoo! Weather - Lexington, KY
humidity: 61
visibility: 1609
precipitation: fair (night)
temperature: 21
wind speed: 10
direction N

Current Status and Further Work

The weather sensor can currently query weather information by zip code and display it as text. The weather information is not yet being stored in the database, as decisions still need to be made as to whether we want to use weather statistics from multiple sites. Another issue to address is the need to map airports to zip codes so that weather data can be correlated with flight data. Yahoo!'s weather service provides an easy and reliable way to gather weather data, and it comes with a namespace and support for maintaining the code.


Datastore (Ray Hyatt, Divya Bansal)

This is the central storage for all the data and predictions generated by our DDDAS.

Background

Originally we planned to make a custom, minimal datastore program, perhaps using flat files with an interface suitable for later conversion to a full database. However, after looking at the evolving requirements, I decided to jump ahead to a full SQL database system and eliminate the need to change later. I chose postgres [20] as the database and installed it on my testbed for this project and on the central class server, ml-dddas. Like many well-behaved open source projects, this one installed from source without significant difficulty.

Datastore design

This design evolved over time; I’ll discuss the current implementation. We currently have 4 major tables: airline, weather, prediction, and corrector.

Airline

Field                     Format
epoch                     INT
sensorid                  CHAR 40
datatype                  CHAR 20
flight                    CHAR 10
Departing airport code    CHAR 3
Depart date               CHAR 20
Arrive date               CHAR 20
Arriving airport code     CHAR 3
Status                    CHAR 20

Corrector

Field                     Format
epoch                     INT
sensorid                  CHAR 40
datatype                  CHAR 20
wtemperature              REAL
wthreat                   REAL
whumidity                 REAL
wwind                     REAL
wdirection                CHAR 3

[20] http://www.postgresql.org/

Prediction

Field                     Format
epoch                     INT
sensorid                  CHAR 40
datatype                  CHAR 20
flight                    CHAR 20
prediction                CHAR 10
Weight-key-epoch          INT
Weight-key-sensorid       CHAR 40
Weight-data-type          CHAR 20

Weather

Field                     Format
epoch                     INT
sensorid                  CHAR 40
datatype                  CHAR 20
airportcode               CHAR 3
temp                      REAL (Celsius)
Precipitation             CHAR 4
Humidity                  REAL
Visibility                REAL
wind-direction            CHAR 3
wind-speed                REAL

Weather is straightforward and needs little further comment.

I suspect the airline table also needs to store the actual departure and arrival times once they are available, to assist in the prediction and correction development.

Prediction stores the current prediction for a flight and the weights used to make that prediction. I suspect that we need to add a column to the table for the departure date, to permit predicting flights days into the future.

Corrector is a table of weights that modify the various data points used by the predictor to determine if a flight is going to be delayed or canceled.


Overall flow of data in the dddas looks like this

The first three fields in each table form the key for that entry. EPOCH is a timestamp in seconds past the epoch. SensorID is a unique string given to each sensor; we suggest the fully qualified hostname plus some other string as a good SensorID. Datatype is one of the following: flight, weather, prediction, weights. This is somewhat redundant and is a holdover from our earlier designs.
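A sensor could assemble this three-part key along the following lines. This is only a sketch: the function make_key, its tag and host parameters, and the decision to trim the hostname so that the result fits the 40-character sensorid column are assumptions, not part of the datastore code.

```python
import socket
import time

DATATYPES = ("flight", "weather", "prediction", "weights")


def make_key(datatype, tag, host=None):
    """Build the (epoch, sensorid, datatype) key shared by every table."""
    if datatype not in DATATYPES:
        raise ValueError("datatype must be one of: " + ", ".join(DATATYPES))
    host = host or socket.getfqdn()      # fully qualified hostname by default
    room = 40 - len(tag) - 1             # sensorid is stored as CHAR 40
    sensorid = "%s.%s" % (host[:room], tag)
    return int(time.time()), sensorid, datatype
```

For example, make_key("weather", "yahoosensor") returns the current epoch time, an identifier ending in ".yahoosensor", and the string "weather".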

Care and feeding of the databases

Two databases were built: travel_dddas_dev and travel_dddas_prd. The dev database is intended as a working area to develop and test sensors, predictors, etc. without affecting the master datastore, which lives in prd. When a particular tool is completed and debugged, it can then start talking to the prd database. The dev database is to be periodically overlaid with the contents of prd to reset it to valid data.

Interfacing

Initially the target was two interfaces, Python and command line. I chose PyGreSQL [21] as it was the highest ranked in the Google search and had decent documentation. The basic command line interface comes with python so nothing further was needed there. I later received a request for a C interface; since postgres has a C interface, this posed no significant additional trouble. I modified examples from the manual to create a quick reference interface for Python for use on the ml-dddas machine.

Excerpt from reference.py

# database access library
import pg

# create connection
con1 = pg.connect(dbname='travel_dddas_dev', host='localhost', user='dddas')

# print results of query
print con1.query("select * from airline")

There were a few problems in interfacing with this database. Some developers had problems linking against the libraries, but these turned out to be unrelated to the database. Remote access to the database is blocked by a firewall, but remote connectivity was verified as functional by running the client on the ml-dddas machine and specifying the IP address, port, and usual parameters.

Database backup and synchronization (Divya Bansal)

The datastore module of the DDDAS Airline project has two databases, travel_dddas_PRD and travel_dddas_DEV. The production database is backed up every hour into a flat file, and it is also copied into the travel_dddas_DEV database to maintain synchronization. This is accomplished by running a shell script containing the backup commands every hour as a cron job. The shell script for backing up and copying is as follows:

/usr/local/pgsql/bin/pg_dump travel_dddas_PRD > BackupFile
/usr/local/pgsql/bin/psql travel_dddas_DEV < BackupFile

The database can be restored to its last saved state using the command:

/usr/local/pgsql/bin/psql travel_dddas_PRD < BackupFile

Table schema correction and updating

21 http://www.pygresql.org/


The datatype of the temperature, humidity, visibility and windspeed columns of the weather table was changed from integer to double precision. The following SQL commands were used to accomplish this:

drop table weather;
create table weather (
    epoch int,
    sensorid character(40),
    datatype character(20),
    airportcode character(3),
    temperature double precision,
    precipitation character(4),
    humidity double precision,
    visibility double precision,
    winddirection character(3),
    windspeed double precision
)\g

Future Directions

This portion will need to evolve with the project. The remote access restriction needs to be overcome so that distributed sensors can feed data into the datastore more easily; this could be done by moving to a dedicated server. Another alternative is to use some other transport method to interface with it remotely.

Database changes needed to implement our solution are documented in CVS under travel-dddas/docs/datastore/* and in the appendices of this paper.

Conclusion

I think jumping directly to the database was a good move; it is far easier to modify a table design than to rewrite a custom data storage application. Given the changes we have seen in this project over a short amount of time and the suspected future needs, a SQL database seems to be the right choice.

U.S. Threat Levels (Trevor Presgrave)

Description

This program reads the current threat level from the Department of Homeland Security website (Appendix T1) and writes the level, along with a timestamp, to a text file. The program converts between the word/color system used by the government and a numerical representation for internal use by the DDDAS.

Required files

1. Threat.java
2. location.txt
3. result.dat

Files basic description

1. Threat.java: This Java file contains the logic to read in and parse the current threat level.
2. location.txt: This file contains the full path to the broker file that holds the path of the file to be sent.
3. result.dat: This data file contains the epoch time at which the data was collected and the current threat level.

The basic working mechanism behind the Java file

1. Threat.java (See Appendix T2 for details)
   i. Connect to the Department of Homeland Security website.
   ii. Read in the source HTML.
   iii. Read in the location of the broker’s send file.
   iv. Open the broker’s send file.
   v. Write threat data to the result file.
   vi. Write the location of the result file to the broker’s send file.
   vii. Close all connections.

Setup working environment

Description

This was tested under the Windows XP operating system. It requires the JRE to be installed.

Steps

1. Edit the location file to hold the location of the file that is used to store what files need to be sent via the broker.
2. Run the program.

Data structure of threat metadata

Threat Metadata Format

Field     | Format
EPOCH     | CHAR (20)
Sensor-ID | Hostname or MAC address
Threat    | INT

Example

EPOCH    | Sensor-ID | Threat
00000001 | WINLAPTOP | 3


Algorithm

The search currently uses a simple linear search, with complexity O(n).

www.flightlookup.com Weather sensor (footnote 22) (Trevor Presgrave)

Description

This program will read in the temperature, precipitation, and visibility (if specified) from the local conditions page on www.flightlookup.com.

Project limitation

1. Currently only gets data for the Lexington Airport.

Required files

1. Weather.java

Files basic description

1. Weather.java: This Java file contains the code needed to read in, and parse, the data from the flightlookup page.

The basic working mechanism behind the Java file

1. Weather.java (See Appendix T4 for details)
   i. Connect to the webpage.
   ii. Read in the HTML source.
   iii. Use the substring function to get the relevant parts.
   iv. Store each relevant substring in its own variable.
   v. Display the results to the screen.

22 See Appendix T3

Threat representation

Low = 0
Guarded = 1
Elevated = 2
High = 3
Severe = 4
Error = -1
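The threat representation above amounts to a small lookup table. The following Python sketch mirrors it; the function names and the "epoch, level" line layout are assumptions made for illustration, loosely modelled on the Java excerpt in Appendix T2 rather than taken from it.

```python
# Word-to-number mapping taken from the threat representation above.
THREAT_LEVELS = {"low": 0, "guarded": 1, "elevated": 2, "high": 3, "severe": 4}
THREAT_ERROR = -1


def threat_to_int(text):
    """Return the numeric threat level named in text, or -1 if none is found."""
    lowered = text.lower()
    for word, level in THREAT_LEVELS.items():
        if word in lowered:
            return level
    return THREAT_ERROR


def format_record(epoch, text):
    """Format one result line as 'epoch, level' (the layout is an assumption)."""
    return "%d, %d" % (epoch, threat_to_int(text))
```

Like the original, this matches on substrings, so it is only as reliable as the text handed to it.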


Setup working environment

Description

This was tested under Windows XP and requires the JRE to be installed. The code requires an internet connection to work.

Steps

1. Run the program to get the data displayed to the screen.

Algorithm

The algorithm currently uses a simple brute force approach to locate certain substrings of text within the HTML.

Future work

Development of this code stopped after its first iteration because web pages with more data were found. If it were needed, adding a prompt to pass in other airports could be useful. Replacing the brute force approach with a regular expression based approach would also make the program more robust.
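A regular expression based approach of the kind suggested here might look like the Python sketch below. The pattern assumes the temperature appears as digits followed by "&deg;" (Appendix T4 only shows that the value sits before an "&"); the actual page markup is not reproduced in this document, so the pattern and function names are hypothetical.

```python
import re

# Assumed markup: the temperature appears as digits followed by "&deg;".
# The real flightlookup.com page layout is not reproduced here.
TEMP_PATTERN = re.compile(r"(-?\d+)\s*&deg;")


def fahrenheit_to_celsius(deg_f):
    """Convert Fahrenheit to the nearest whole degree Celsius."""
    return int(round((deg_f - 32) * 5.0 / 9.0))


def parse_temperature(html):
    """Return the temperature in Celsius, or None if the pattern is absent."""
    match = TEMP_PATTERN.search(html)
    if match is None:
        return None
    return fahrenheit_to_celsius(int(match.group(1)))
```

Returning None when the pattern is missing lets the caller tell "no data" apart from a real reading, which the substring approach cannot do without extra checks.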

Weather data from NWS.NOAA.GOV (Chun Lung Lim)

1. Get weather information, such as the time when the data was collected, airport code, temperature, humidity, visibility, sky conditions, wind speed and wind direction, from the National Weather Service website www.nws.noaa.gov and store that information in our agreed weather metadata format.

2. Track any storm at an airport that would affect flight delays and store it as precipitation.

3. Convert the weather information into a binary format that will be used in predicting Dr. Douglas's flight schedule.

Description

With all the files and data, any user should be able to get airport weather information from around the United States and store it in the metadata format, which can be read into our database for predicting Dr. Douglas's flight schedule.

Extra for this project

1. In this project, I am not required to track the weather information for all the airports within the United States. However, this program has the capability to do so, and the data is collected for future weather history reference.

Project limitation

1. The weather information is currently limited to US airports.

Obstacles in this project

1. Getting weather information from NWS.NOAA.gov is difficult. The GRIB2Decoder function for extracting weather information does not work as desired. I contacted Arthur Taylor (the degrib author at NWS.NOAA.gov), who had no idea what went wrong. He made a few suggestions, but none of them worked. I have forwarded my email conversations with Arthur Taylor to Dr. Douglas.

2. After I decided not to use the degrib function, I spent a lot of time searching this government website for decoded weather information. I eventually found an FTP link buried deep in the website that has the weather information, which is updated automatically every hour.

3. The structure of the weather data file is not consistent. If a specific piece of data is not available, the whole line is removed from the data file. When that data becomes available again, it is stored at an arbitrary location in the data file. This inconsistency forced me to redesign and rewrite my program.

4. I am facing a lot of problems using the PostgreSQL database because of the accessibility of the header files installed on ml-dddas.csr.uky.edu. This bug is reported at http://bugs.archlinux.org/task/6638 .

5. We are still changing the design of our data representation for weather information.


Required files

1. GetNWSData.c
2. NWS_10.c
3. NWStoDatabase.c
4. PrecipitationRep.txt
5. SkyConditionsRep.txt
6. USairport.txt

Files basic description

1. GetNWSData.c: This C file automatically logs in to the NWS FTP server to get airport weather data from around the world, stores that information on the local drive, and triggers the READNWSDATA executable to retrieve all the information needed for our prediction model.

2. NWS_10.c: This C file reads the weather data downloaded from the NWS FTP server for each airport, which is specified in International Civil Aviation Organization (ICAO) format.

3. NWStoDatabase.c: Reads the NWS_Log.txt file, which contains all the weather data in the defined weather metadata format, into the SQL server.

4. PrecipitationRep.txt: This text file contains all the description tokens relating to precipitation.

5. SkyConditionsRep.txt: This text file contains all the description tokens relating to sky condition.

6. USairport.txt: This text file contains predefined airport codes written in International Civil Aviation Organization (ICAO) format.

The basic working mechanism behind C files

1. GetNWSData.c
   i. Make a directory based on local time.
   ii. Enter the folder created according to local time.
   iii. FTP to the NWS.NOAA.GOV website to obtain data files.
   iv. Enter the decode folder.
   v. Copy everything in the READNWSDATA folder to the current directory.
   vi. Read all the weather information from the NWS data and store it in the defined weather metadata format in NWS_LOG.txt.
   vii. Move NWS_LOG.txt to the NWSLOG folder.
   viii. Go back to the root directory.
   ix. Copy NWS_LOG.txt to the READNWSDATA directory.

2. NWS_10.c
   i. Open USairport.txt to pass in the necessary file ready to be read in.
   ii. Search for the ICAO code and convert it into an IATA airport code.
   iii. Get local time in epoch format.
   iv. Get the machine hostname.
   v. Get temperature in Celsius. If no temperature is found, the value -273 is stored.
   vi. Get precipitation information.
   vii. Convert the information into a binary representation if a matching token is found in PrecipitationRep.txt.
   viii. Get humidity. If no humidity is found, the value -273 is stored.
   ix. Get visibility. If no visibility is found, the value -273 is stored.
   x. Get sky condition information.
   xi. Convert the information into a binary representation if a matching token is found in SkyConditionsRep.txt.
   xii. Get wind direction. If no wind direction is found, the string “NULL” is stored instead.
   xiii. Get wind speed. If no wind speed is found, the value -273 is stored.
   xiv. Finally, write all the above information into a file named “NWS_LOG.txt”.

3. NWStoDatabase.c
   i. Open NWS_LOG.txt.
   ii. Read all the information related to each individual airport and then store it in our database.

Setup working environment

Description

This was tested under the Ubuntu operating system. The user is responsible for having the gcc compiler available on the system.

Setup Environment A

1. Create a folder named “NWSdata” in the user home directory to store all the downloaded weather data from NWS.
2. Create a folder named “NWScode” in the user home directory to hold GetNWSData.c.
3. Create a folder named “READNWSDATA” in the user home directory to hold NWS_10.c and its supporting text files.
4. Put GetNWSData.c in the “NWScode” folder. Type “gcc GetNWSData.c -o GetNWSData”.
5. Put NWS_10.c, PrecipitationRep.txt, SkyConditionsRep.txt and USairport.txt into the “READNWSDATA” folder.
6. Type “gcc NWS_10.c -o READNWSDATA”.
7. Use “crontab -e” to set up automatic execution of “GetNWSData” every hour.
8. Now you are set.


Setup Environment B

1. Upload NWS_LOG.txt to ml-dddas.csr.uky.edu.
2. Compile NWStoDatabase.c to an executable file “NWStoDatabase”.
3. Run NWStoDatabase on ml-dddas.csr.uky.edu.

Data structure of weather metadata

Weather Data Metadata Format (with example values)

Field          | Format                  | Example
EPOCH          | CHAR (20)               | 00000001
Sensor-ID      | Hostname or MAC address | MACBOOKPRO
Datatype       | CHAR (20)               | NWS_Weather
Airport code   | CHAR (3)                | LEX
Temperature    | Integer (Celsius)       | 45
Precipitation  | CHAR (20)               | 000001
Humidity       | Integer                 | 80
Visibility     | Integer (Miles)         | 10
Sky Condition  | CHAR (20)               | 00001
Wind Speed     | Integer (Miles)         | 8
Wind Direction | CHAR (10)               | N

Precipitation representation

Clear = 0000001
Rain = 0000010
Thunder = 0000100
Thunderstorm = 0001000
Ice = 0010000
Snow = 0100000
Sand = 1000000

Sky Condition representation

Clear = 000001
Overcast = 000010
Cloudy = 000100

Wind Direction

N = North
S = South
E = East
W = West
NW = North West
EW = East West
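To make the binary representation concrete, here is a minimal Python sketch that encodes precipitation tokens into the seven-bit strings listed above and decodes them again. Combining several tokens with a bitwise OR is an assumption on our part (the one-bit-per-condition layout suggests it, but the text does not state it), and the function names are hypothetical.

```python
# Bit positions follow the precipitation representation listed above.
PRECIPITATION_BITS = {
    "clear": 0b0000001,
    "rain": 0b0000010,
    "thunder": 0b0000100,
    "thunderstorm": 0b0001000,
    "ice": 0b0010000,
    "snow": 0b0100000,
    "sand": 0b1000000,
}


def encode_precipitation(tokens):
    """OR together the bits of every recognised token; unknown tokens are skipped."""
    mask = 0
    for token in tokens:
        mask |= PRECIPITATION_BITS.get(token.lower(), 0)
    return format(mask, "07b")


def decode_precipitation(bits):
    """Return the sorted names whose bit is set in a string such as '0100010'."""
    mask = int(bits, 2)
    return sorted(name for name, bit in PRECIPITATION_BITS.items() if mask & bit)
```

The same pattern would apply to the sky condition codes with a six-bit table.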


Algorithm

The search currently uses a simple linear search with complexity O(n). The comparison between found keys and pre-recorded information is also linear, with complexity O(n).

What is left?

1. Passing the data from NWS_LOG.txt to the SQL data table.

Future work

1. The search should use a hash table if we are going to track weather information around the world. The lookup complexity will then be O(1) on average, provided the hash table is implemented correctly for Precipitation and Sky Condition.

2. Collect information that could delay flights but has nothing to do with weather, such as the arrival of a president from another country.

3. Collect information about disasters such as volcanic eruptions.
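For the hash table idea in item 1, a Python dictionary already gives average O(1) lookups. The sketch below loads token definitions into one; the "token code" line layout is an assumption, since the actual format of PrecipitationRep.txt and SkyConditionsRep.txt is not shown in this document, and the function names are hypothetical.

```python
def load_token_table(lines):
    """Build a token -> code dictionary from 'token code' lines (assumed layout)."""
    table = {}
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:
            table[parts[0].lower()] = parts[1]
    return table


def lookup_token(table, token):
    """Average O(1) dictionary lookup; returns None for an unknown token."""
    return table.get(token.lower())
```

The table is built once per run, so the per-token cost no longer grows with the number of known tokens.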

Conclusion

This shows that it is possible to get weather information from the government website without any assistance. The group responsible for weather needs to meet again to decide where and how to improve the way we collect weather information.


TRANSPORT MODULE (Divya Bansal, Jay Hatcher and Soham Chakraborty)

The transport module serves as a link between the sensors and the simulation unit. It is based on a point-to-point communication tool developed by Wei Li. The original communication tool is not suited for easy deployment on multiple computers, so the software had to be packaged and modified to allow easy installation on different computers.

The communication software suite consists of four tools: data sender, data receiver, broker and broker admin.

[Figure: Sensor 1 through Sensor n each connect through a sensor interface to the transport module, which feeds the simulation.]


Broker Tool

Setup Screen for Broker

Broker UI

The Broker serves as a central repository for the IP addresses of data senders and data receivers. A broker has a set of groups. A sender or receiver can register under any one of these groups. When a sender belonging to a particular group sends a message, it is broadcast to all the receivers belonging to that group.

To use the broker, simply click on the “Broker” link on the application webpage. If it is running for the first time, it will display the Broker setup screen. Enter the absolute path to the location where the user files are to be created and click OK. Next, the Broker UI becomes visible. Enter the administrator password (the password to be used by broker admin) and the port number for the Broker, and click “Run”.


Broker Admin

Broker Admin GUI

The Broker Admin tool is used to create new groups in the Broker. Download the Broker Admin jar file from the application’s webpage. Double-click the jar file to run the application. To create a new group, enter the broker’s IP address, port, administrator password and the new group name. There is also a text area for entering a description of the newly created group.

Data Sender

Data Sender UI

The data sender is responsible for broadcasting sensor data to all receivers in the same group as the sender. To send a file, write the absolute path of the file into the ‘playlist’ file in the sender’s working directory, SenderWorkingDirectory. The data sender checks this file periodically for updates. When it finds a valid entry in the file, it broadcasts the file to the receivers and clears the playlist.

Data Receiver

Data Receiver UI

The data receiver receives the data broadcast by the data senders in the same group as the receiver. The received files are stored in the ‘ReceiverWorkingDirectory’ folder.
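From a sensor's point of view, handing a file to the data sender comes down to writing one path into the playlist file described above. The following Python sketch shows one way a sensor might do that; the function name and the one-path-per-line layout are assumptions, and only the file name 'playlist' and the use of absolute paths come from the description of the data sender.

```python
import os


def queue_file(sender_dir, data_file):
    """Append the absolute path of data_file to the sender's playlist file.

    sender_dir is the data sender's working directory. Writing one path
    per line is an assumed layout, not a documented one.
    """
    playlist = os.path.join(sender_dir, "playlist")
    path = os.path.abspath(data_file)
    with open(playlist, "a") as handle:
        handle.write(path + "\n")
    return path
```

Appending rather than overwriting avoids losing an entry that the sender has not yet picked up.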

Implementation

The following steps are involved in preparing a Java application for deployment using Java Web Start (JWS):

1. Configuring the web server: The web server hosting the application must be configured so that all files with the .jnlp file extension are served with the application/x-java-jnlp-file MIME type [3]. Web browsers use the MIME type to determine how to handle the content sent to them by web servers, so JWS is started only if the server returns this MIME type for JNLP files [3]. The procedure for configuring the web server varies from server to server. For example, for the Apache Web Server the line “application/x-java-jnlp-file JNLP” must be added to the mime.types configuration file [3].


2. Creating a JNLP file for the application: Java Network Launch Protocol and API (JNLP) is the technology behind JWS [3]. JWS is the reference implementation of the JNLP specification [3]. One of the things JNLP defines is the format of the .jnlp file that launches the application. The .jnlp file associated with an application is an XML file containing information such as the URL where the application is located, access permissions and the resources required to run the application. Given below is an example of a JNLP file:

<?xml version='1.0' encoding='UTF-8' ?>
<jnlp spec='1.0' codebase='http://128.163.154.83:10000' href='brokerAdmin.jnlp'>
  <information>
    <title>Broker Admin</title>
    <vendor>Dr.Douglas</vendor>
    <description kind='one-line'> Admin tool </description>
    <shortcut online='false'>
      <desktop/>
    </shortcut>
  </information>
  <resources>
    <j2se version='1.5+' />
    <jar href='brokerAdmin.jar' main='true' />
  </resources>
  <application-desc main-class='broker.BrokerAdminGUI' />
</jnlp>

3. Placing the JNLP and JAR files on the web server: In this step we place the files associated with the application on the web server and check that the files are accessible from the URL specified in the JNLP file [3].

4. Setting up the website: In order for a Java application to be launched from a webpage, the page must contain a link to the JNLP file associated with the application. For example, if the website is www.DDDASTool.com, the web page must have the link www.DDDASTool.com/MyApp.jnlp. In addition, it is necessary to check whether the browser supports JWS. To do this, JavaScript code is run when the page loads to determine the browser type and whether JWS is installed. If JWS is not present, the user is directed to the site from which it can be installed. Please see link 3 in the “Source” section of this document for detailed information about JWS implementation.

Instructions:

1. Browse to the website where the application is located.

2. If JWS is not detected, the user will be directed to a site where it can be downloaded and installed. The user must install JWS and browse back to the application web page before proceeding to step 3.

3. Click on the link to the tool you would like to start. There are four tools, namely Broker Admin, Broker, Data Sender and Data Receiver.

4. Upon clicking on the link JWS will launch the application. At this stage the user will be asked if he/she would like to place a shortcut to the application on his/her desktop. The shortcut will allow the user to start the program without having to browse back to the application web page.

Predictor

(Mark Maynard)

In order to decide whether a flight will be canceled, a prediction is made based on information about the weather conditions and the U.S. threat level.

Calculating Predictions

To calculate whether or not a flight is going to be delayed, the predictor produces a value from 0 to 1; the higher the value, the more likely the flight will be delayed. To obtain this value, weather statistics are used along with the current U.S. terrorist threat level. The weather information currently used is wind speed, wind direction, precipitation type, temperature, and visibility. The corrector also gives each factor a weight from 0 to 1 that reflects its significance in the total outcome. In this way the corrector can adjust the prediction to fit the actual outcome. Below is a diagram of the expected structure of the final system:


Current Status and Further Work

The current predictor is a toy version that has the scaffolding for calculating delay in place but has no reasonable algorithms for calculating the factors. Once sufficient data has been obtained from flight histories the values of the individual pieces of data such as precipitation can be analyzed alongside whether or not a delay occurred. With this data a polynomial fitting of the data can be achieved to be used in the calculation of further predictions.
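As an illustration of the scaffolding described under Calculating Predictions, the Python sketch below combines per-factor scores and corrector weights into a single value between 0 and 1. The weighted average is an assumption chosen because it keeps the result in range; it is not the project's formula, and the function and factor names are hypothetical.

```python
def predict_delay(factors, weights):
    """Combine per-factor scores in [0, 1] into one delay score in [0, 1].

    factors and weights are dicts keyed by factor name. The weighted
    average used here is an assumption, not the project's formula.
    """
    total_weight = sum(weights.get(name, 0.0) for name in factors)
    if total_weight == 0:
        # No weighted factors available: report the lowest delay score.
        return 0.0
    weighted = sum(score * weights.get(name, 0.0) for name, score in factors.items())
    return weighted / total_weight
```

How each raw reading (for example, wind speed) is turned into a score between 0 and 1 is exactly the part that still needs the fitted algorithms mentioned above.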


Corrector (Jay Hatcher)

Overview

The purpose of the corrector is to compare a previous prediction with the actual results and adjust weights accordingly. If the predicted result is within a certain tolerance, no correction is necessary and the corrector finishes. If the actual result is significantly different from the prediction, the corrector must determine what weights to adjust so that future predictions will be more accurate. Adjustments should use previous data to determine what factor(s) contributed most to the predictor’s divergence from the actual result.

Correction Process

When the corrector is run, it finds the last prediction that has corresponding actual results. Once the prediction is found, entries in the airline table are extracted if they match the prediction’s flight number and sensor ID. Any of these entries occurring after the prediction’s epoch have their results compared to the prediction’s result (i.e. the flight was on time, delayed, or canceled). If the prediction was incorrect, we use the prediction’s weight epoch key and weight sensor ID key to find the weather conditions corresponding to the time of the prediction. We then query the weather table for similar weather conditions. The airport codes and epoch times for these weather entries are then used to query the airline table for flight numbers occurring under similar weather conditions. These flight numbers are then used to query the prediction table to find predictions that were made under similar conditions. The predictions that are correct are compared to the prediction being corrected to mark weather factors that are similar to correct results. Incorrect predictions are used to identify

For example, if the prediction was that the flight would be delayed, and it was on time, the corrector looks at the weight epoch key and weight sensor ID key of the prediction. It performs a query on the weather table looking for a match to these two keys. Let us say the result gives airport code LEX, 10 degrees Celsius, Raining, 40% humidity, visibility 10, wind NNE, and wind speed 30 mph. We then query the weather table for any entries with LEX, 5 – 15 degrees C, Raining, 30% – 50% humidity, 5 – 15 visibility, wind N to NE, and wind speed 20 to 40 mph. The airport codes and epoch times in the resulting list are then used to query the airline table for flight numbers. Let’s say we get flights 116, 314, and 211. The prediction table is then queried for these flight numbers, and the results of those predictions are examined. Let’s say that the predictions on 116 and 211 were wrong, but the predictions on 314 were correct. The weather conditions for flight 314 are compared to the weather during our current prediction, and any weather factors that are sufficiently similar are not adjusted. Any weather factors not omitted by the comparison with 314 are examined to see which one changes the most. The weight for this factor from the previous correction table entry for the sensor is then adjusted in the direction of the actual result (on time). The new correction record is then added to the correction table. If no correct past predictions are found for these flights, then a weight is randomly adjusted in the direction of the actual result and a new record is added to the correction table.
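Two pieces of the worked example lend themselves to a short Python sketch: the "similar weather" ranges and the weight adjustment. The tolerance half-widths below are read off the example (10 degrees becomes 5 to 15, 40% humidity becomes 30% to 50%, and so on); the step size, function names, and the use of numeric scores for predicted and actual outcomes are assumptions, and the corrector described here has not been tested against the database.

```python
# Half-widths taken from the worked example above (10 C -> 5 to 15, and so on).
TOLERANCES = {"temperature": 5, "humidity": 10, "visibility": 5, "windspeed": 10}


def similar_range(conditions):
    """Return {factor: (low, high)} bounds for a 'similar weather' query."""
    return {
        name: (conditions[name] - width, conditions[name] + width)
        for name, width in TOLERANCES.items()
        if name in conditions
    }


def adjust_weight(weights, factor, predicted, actual, step=0.1):
    """Return a copy of weights with one factor nudged toward the actual result.

    Intended to be called only for incorrect predictions. predicted and
    actual are scores in [0, 1]; step is a hypothetical adjustment size.
    The new weight is clamped to the 0-to-1 range used by the corrector.
    """
    updated = dict(weights)
    direction = 1.0 if actual > predicted else -1.0
    updated[factor] = min(1.0, max(0.0, updated[factor] + direction * step))
    return updated
```

The bounds returned by `similar_range` would become the limits of a range query on the weather table, while `adjust_weight` produces the values for the new correction record.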

Further Work

The corrector does not currently adjust factors for national threat level or historical flight data. This is planned for future implementations of both the predictor and corrector. The corrector has not been tested, due to issues accessing the database and the time taken adjusting the design of the database as the project progressed. A further improvement to the corrector model would be to allow the corrector to set initial weights by finding a best fit for historical data. This process could also be used to reset the weights in the event that predictions do not seem to be improving over time due to poor initial values for the weights or unusual patterns that will take a long time to self-correct. Using multiple correctors with different initial weights or different thresholds defining “similar” weather and comparing their results might also yield better performance over time.

Conclusion

We have shown that a system to predict flight delays has no technical obstacles. We were successful in collecting data from airline and weather Internet sources. We have a method in place to distribute that data, a database to store the data, and a framework for predicting airline timeliness and correcting predictions over time. We believe that future development of these components would lead to a viable commercial product.


Appendix M1

Python Installation and Modules.

Python

The current versions of United Sensor, Predictor and the Yahoo! Weather sensor all run in Python 2.5 (footnote 23). Installation of Python is fairly simple on all major platforms (footnote 24). Once it is installed, several modules are required to run the database and URL fetching code.

Mechanize

The Mechanize (footnote 25) module is used to fetch websites and handle cookies for the United Flight sensor. This module and all others required by our Python applications can be found in the travel-dddas repository in the pythonlib folder. An INSTALL file is included in the distribution.

Pygresql

In order to access the database for this project, the Pygresql (footnote 26) module was used. The module is located in the pythonlib folder, and installation instructions (footnote 27) can be found online.

Other Modules

In addition to Mechanize, several other libraries that are standard in Python 2.5 are utilized in these applications. Urllib is used to fetch websites for the Yahoo! Weather sensor; the result is then fed into the Minidom module to handle namespace parsing of the RSS feed. The Re module is used with the String module in both the United and Yahoo! applications to parse text with regular expressions. The Time module is used to get the Epoch (footnote 28) for time stamping results. Optparse is used in the currently disabled mechanism that takes input from the command line and uses it in United queries.

23 www.python.org
24 http://www.python.org/download/
25 http://wwwsearch.sourceforge.net/mechanize/
26 http://www.pygresql.org/
27 http://www.pygresql.org/install.html
28 http://en.wikipedia.org/wiki/Unix_time


Appendix R1

Dump of tables from database

Table "public.airline"
      Column       |     Type      | Modifiers
-------------------+---------------+-----------
 epoch             | integer       |
 sensorid          | character(40) |
 datatype          | character(20) |
 flight            | character(10) |
 departairportcode | character(3)  |
 departdate        | character(20) |
 arrivedate        | character(20) |
 arriveairportcode | character(3)  |
 status            | character(20) |

Table "public.corrector"
    Column    |       Type       | Modifiers
--------------+------------------+-----------
 epoch        | integer          |
 sensorid     | character(40)    |
 datatype     | character(20)    |
 wtemperature | double precision |
 wthreat      | double precision |
 whumidity    | double precision |
 wwind        | double precision |
 wdirection   | double precision |

Table "public.prediction"
       Column       |     Type      | Modifiers
--------------------+---------------+-----------
 epoch              | integer       |
 sensorid           | character(40) |
 datatype           | character(20) |
 flight             | character(10) |
 prediction         | character(10) |
 weightkeysepoch    | integer       |
 weightkeyssensorid | character(40) |
 weightdatatype     | character(20) |

Table "public.weather"
    Column     |       Type       | Modifiers
---------------+------------------+-----------
 epoch         | integer          |
 sensorid      | character(40)    |
 datatype      | character(20)    |
 airportcode   | character(3)     |
 temperature   | double precision |
 precipitation | character(4)     |
 humidity      | double precision |
 visibility    | double precision |
 winddirection | character(3)     |
 windspeed     | double precision |

Configuration files for the PostgreSQL database

pg_hba.conf

# TYPE DATABASE USER CIDR-ADDRESS METHOD # "local" is for Unix domain socket connections only local all all trust # IPv4 local connections: host all all 127.0.0.1/32 trust hostssl all all 0.0.0.0/0 md5 # This machine host all all 128.163.154.98/32 md5 # IPv6 local connections: host all all ::1/128 trust

Changes to defaults in postgresql.conf

data_directory = '/usr/local/pgsql/data/'       # use data in another directory
                                                # (change requires restart)
hba_file = '/usr/local/pgsql/data/pg_hba.conf'  # host-based authentication file
                                                # (change requires restart)
listen_addresses = '*'                          # what IP address(es) to listen on;
                                                # comma-separated list of addresses;
                                                # defaults to 'localhost', '*' = all
                                                # (change requires restart)
port = 5432                                     # (change requires restart)

Appendix T1

Appendix T2

// Time stamp: current time in milliseconds since the Unix epoch
Calendar rightNow = Calendar.getInstance();
long ep = rightNow.getTimeInMillis();
buf_writer.write(Long.toString(ep));

// Determine threat level
if (temp.contains("low"))
    buf_writer.write(", 0");
else if (temp.contains("guarded"))
    buf_writer.write(", 1");
else if (temp.contains("elevated"))
    buf_writer.write(", 2");
else if (temp.contains("high"))
    buf_writer.write(", 3");
else if (temp.contains("severe"))
    buf_writer.write(", 4");
else
    buf_writer.write(", -1"); // Error: no known level found (separator added so the code is not fused to the time stamp)

buf_writer.newLine();
buf_writer.close();
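The mapping in the listing above can also be written as a small table-driven helper, which makes it easier to test without a file writer. This is a minimal sketch: the class name ThreatLevel and its methods are hypothetical, and lower-casing the input before matching is an added assumption (the listing matches case-sensitively).

```java
import java.util.Locale;

// Hypothetical helper, not part of the original sensor code.
final class ThreatLevel {
    // Level names in the order of the numeric codes 0..4 used in the listing.
    private static final String[] LEVELS = {"low", "guarded", "elevated", "high", "severe"};

    /** Returns 0..4 for the first level name found in the text, or -1 if none is found. */
    static int code(String text) {
        if (text == null) {
            return -1;
        }
        String lower = text.toLowerCase(Locale.ROOT); // assumption: match case-insensitively
        for (int i = 0; i < LEVELS.length; i++) {
            if (lower.contains(LEVELS[i])) {
                return i;
            }
        }
        return -1;
    }

    /** Formats one record as "<epoch millis>, <code>". */
    static String record(long epochMillis, String text) {
        return epochMillis + ", " + code(text);
    }

    public static void main(String[] args) {
        System.out.println(record(System.currentTimeMillis(), "Threat level: Elevated"));
    }
}
```

Like the listing, this uses substring matching in a fixed order, so input containing a word such as "yellow" would match "low" first; the caller should pass only the portion of the page that holds the level name.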

Appendix T3

Appendix T4

// Get relevant portion of page
String target = "alt=\"weather icon\"/>";
int start = page.indexOf(target) + target.length();
int end = page.indexOf("<a href");
String rel = page.substring(start, end);

// Get degrees (Fahrenheit) and convert to Celsius
start = rel.indexOf(">");
end = rel.indexOf("&");
String deg_s = rel.substring(start + 1, end).trim();
int f = Integer.parseInt(deg_s);
int deg_c = (int) Math.round(5.0 / 9.0 * (f - 32));

// Get cloud cover
target = "Cloud coverage:</b><br/>";
start = rel.indexOf(target) + target.length();
end = rel.length() - 1;
String clouds = rel.substring(start, end);
end = clouds.indexOf("<br");
clouds = clouds.substring(0, end);

// Get conditions
target = "<b>Conditions:</b><br/>";
int idx = rel.indexOf(target);
if (idx >= 0) { // Check to see if conditions are listed (test indexOf before adding the target length)
    String cond = rel.substring(idx + target.length());
    end = cond.indexOf("<br/>");
    cond = cond.substring(0, end);
    System.out.println(cond);
}
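The two recurring steps in this listing, converting Fahrenheit to Celsius and cutting a field out of the page between a label and the next tag, can be isolated into small methods that report a missing label instead of throwing. The sketch below is illustrative only: the class name WeatherParse, the method names, and the sample HTML in main are hypothetical, not the project's actual code or the weather site's real markup.

```java
// Hypothetical helpers, not part of the original sensor code.
final class WeatherParse {
    /** Converts whole degrees Fahrenheit to the nearest whole degree Celsius. */
    static int fahrenheitToCelsius(int f) {
        return (int) Math.round(5.0 / 9.0 * (f - 32));
    }

    /**
     * Returns the trimmed text between the label and the next end marker,
     * or null if either one is missing from the input.
     */
    static String fieldAfter(String html, String label, String endMarker) {
        int idx = html.indexOf(label);
        if (idx < 0) {
            return null; // label not present on this page
        }
        int start = idx + label.length();
        int end = html.indexOf(endMarker, start);
        if (end < 0) {
            return null; // field is not terminated
        }
        return html.substring(start, end).trim();
    }

    public static void main(String[] args) {
        // Made-up fragment shaped like the markers used in the listing.
        String rel = "<b>Cloud coverage:</b><br/>Overcast<br/><b>Conditions:</b><br/>Light Rain<br/>";
        System.out.println(fieldAfter(rel, "Cloud coverage:</b><br/>", "<br"));
        System.out.println(fieldAfter(rel, "<b>Conditions:</b><br/>", "<br/>"));
        System.out.println(fahrenheitToCelsius(50));
    }
}
```

Searching for the end marker from the start position onward avoids the intermediate substring used in the listing, and the null return gives the caller a single place to handle pages where a field is not listed.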

Sources

1. Wei Li, Master’s Thesis
2. Dmitry Leskov, http://www.excelsior-usa.com/articles/java-to-exe.html
3. Java Technotes, http://java.sun.com/javase/6/docs/technotes/guides/javaws/developersguide/contents.html
4. Weather information collected from www.nws.noaa.org
5. C language code reference from www.cplusplus.com
6. Delta flight information from www.delta.com
7. United flight information from www.ua2go.com
8. Fetch documentation from http://www.freebsd.org/cgi/man.cgi?query=fetch&apropos=0&sektion=0&manpath=FreeBSD+6.2-RELEASE&format=html
9. Documentation on wget from http://www.gnu.org/software/wget/
10. Mechanize located at http://wwwsearch.sourceforge.net/mechanize/
11. Libxml2dom located at http://www.boddie.org.uk/python/libxml2dom.html
12. Egrep located at http://www.freebsd.org/cgi/man.cgi?query=egrep&apropos=0&sektion=0&manpath=FreeBSD+6.2-RELEASE&format=html
13. Source to diagrams in cvs:/travel-dddas/docs www.graphviz.org
14. Diff documentation from http://www.freebsd.org/cgi/man.cgi?query=diff&apropos=0&sektion=0&manpath=FreeBSD+6.2-RELEASE&format=html
15. Weather information located at http://www.wunderground.com
16. Python to Postgres interface located at http://www.postgresql.org/
17. Cron documentation located at http://www.unixgeeks.org/security/newbie/unix/cron-1.html
18. RSS documentation located at http://en.wikipedia.org/wiki/RSS_(file_format)
19. Yahoo! Weather RSS documentation located at http://developer.yahoo.com/weather/
20. U.S. Threat Sensor found at http://www.dhs.gov/xinfoshare/
21. Epoch time information at http://en.wikipedia.org/wiki/Unix_time