Statewide Real-time Data Hub Update. Presented by Marullus Williams, April 19, 2012.
Background
• The transit community in Virginia is looking at transit traveler information and is discussing standards, trends and applications.
• ITS Virginia, along with the Virginia Department of Rail and Public Transportation (DRPT), led an effort to create a technology community for transit operators statewide.
• To accomplish the task, a working group was formed to:
  • Discuss, develop and promote the use of transit technology standards
  • Act as a resource for the exchange of ideas and general technology discourse
  • Promote the systems engineering process for the development, procurement and deployment of transit ITS projects
• The working group is an open group for anyone interested in transit technology.
2011 ITS Program Update Survey
*Survey conducted in 2010 and published in the 2011 ITS Program Update. ITS participants rated the needs of transit operators.
Background
• Transit traveler information is primarily in two forms:
  • Static data: most static transit data is now available in electronic form, provided in:
    • Trip planners via the web
    • Standard formats like GTFS
  • Real-time information: some larger transit providers now provide it:
    • In various forms via web tools and applications
    • In various formats
• There are nearly 30 transit operators supported by State funds in Virginia.
• There are over a dozen urban, small urban and rural transit agencies engaged in this traveler information discussion in Virginia.
• Virginia has an active community discussing transit ITS issues through the support of ITSVA and the Commonwealth.
Approach
• The working group is interested in making real-time and historical data available to the public and to 3rd party developers in order to:
  • Improve passenger information
  • Improve government transparency
  • Improve multimodal transportation options
• A reasonable approach is to use the standards working group to define Virginia transit traveler information goals and leverage the work and approach undertaken by WMATA, Blacksburg and other national leaders.
• The potential benefits of this approach include:
  • Strengthening of standards-based sharing
  • Out-of-the-box interoperability
  • Cost efficiencies to agencies by leveraging the existing investment
Progress
• The working group has met numerous times over a number of months to discuss:
  • Agreements and potential public policy
  • Standard formats, including the possible creation of a standard real-time data format for Virginia transit agencies to follow
  • Lead and participating agencies
  • Hosting locations for static and real-time data
• Lead efforts:
  • Recently, WMATA released real-time and historical data to the public through a standards-based application programming interface (API) built on very inexpensive, commercially available cloud computing technologies. This API has been extremely well received by the transit, software, and passenger communities.
  • Blacksburg Transit in the Virginia Tech community has also provided its real-time transit data through an open API.
• In August 2011, DRPT sponsored the development of a ConOps to guide the development of the Statewide Real-time Data Hub. The ConOps was completed in November 2011. Future implementation plans are yet to be determined.
ConOps FINDINGS and RECOMMENDATIONS
ConOps Study Team
• Washington, DC-based SIRC created the ConOps
• Team Members:
  • Jamey Harvey, Subject Matter Expert
  • Marullus Williams, Project Manager
  • Kunmi Ayanbule, Technical Architect
Important Notes
• These recommendations are currently being reviewed by DRPT.
• No decisions have been made on how to proceed with these recommendations.
• This presentation is only an update on the findings presented to DRPT by the ConOps Team.
The Regional API Concept
• The team needed to decide on the approach to the “Regional” API.
• Is the goal to facilitate each transit agency’s ability to publish its own standards-based API, or is the idea to have all of the regional data fed into a regional API?
• If each local agency publishes its own API, the regional responsibility would be to ensure that each agency is truly standards-based and interoperable. There would be a regional directory of agency API feeds, but each agency would be responsible for building and maintaining its own API.
• There could be a hybrid approach in which the Regional API would aggregate data from each local API. The Regional API would in essence become a consumer of each local API’s data and in turn provide that information to the public.
Four Regional Approaches
[Diagram labels: Local Transit Agency A, Local Transit Agency B, API, Regional Directory of API Feeds, Regional Data Store and API, Developer Community / Public Data Consumers / Transit Agencies]
1: Direct from Local API to Developer with Regional Directory
2: Local API to Regional API
3: All data aggregated and served at regional level
4: Hybrid approach of 1-3
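The difference between approach 1 (a directory of local feeds) and approach 2 (a regional API that consumes the local APIs) can be illustrated with a short Python sketch. It is not taken from the ConOps: the agency names, placeholder URLs, record fields, and the two stand-in functions are all hypothetical, and a real deployment would issue HTTP requests to each agency's published feed.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical stand-ins for two local agency APIs (assumed record shape).
def agency_a_api() -> List[dict]:
    return [{"agency": "A", "vehicle": "101", "lat": 37.23, "lon": -80.41}]

def agency_b_api() -> List[dict]:
    return [{"agency": "B", "vehicle": "7", "lat": 38.30, "lon": -77.46}]

# Approach 1: the region only publishes a directory of local feeds;
# developers call each agency directly. URLs are placeholders.
REGIONAL_DIRECTORY: Dict[str, str] = {
    "A": "https://agency-a.example/api",
    "B": "https://agency-b.example/api",
}

# Approach 2: the regional API is itself a consumer of each local API
# and republishes the combined result.
LOCAL_APIS: Dict[str, Callable[[], List[dict]]] = {
    "A": agency_a_api,
    "B": agency_b_api,
}

def regional_api(agencies: Optional[List[str]] = None) -> List[dict]:
    """Aggregate records from the selected local APIs on demand."""
    selected = agencies or sorted(LOCAL_APIS)
    records: List[dict] = []
    for name in selected:
        records.extend(LOCAL_APIS[name]())
    return records

combined = regional_api()
```

Approach 3 would replace the on-demand calls with a regional data store that agencies feed directly, and approach 4 mixes these options per agency.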
Stakeholder Interview Topics
• Describe your agency’s organizational structure.
• Provide details on your transportation system:
  • Modes of travel
  • Types of service schedules
  • Number of riders
  • Number and types of incidents
  • Geographic range
  • Interconnections with other systems
• Describe your current IT environment.
• Describe your current transportation technology.
• What data does your organization currently make available to the public (including paper-based info)?
• Which data elements would be easiest to provide to the public via the API?
• What is your role in providing real-time data for the API?
Agencies Interviewed
• Fredericksburg: Arnold Levine
• PRTC/Omniride: Doris Chism, Ryan Jones and Eric Marx
• Williamsburg/James-City: Kevan Danker
• Blacksburg: Tim Witten and Aneil Samuel
• Arlington: Bee Buergler and Tom Scherer
• VDOT: Scott Cowherd and Noah Goodall
• Loudoun County: Scott Gross
• University of MD (RITIS): Michael Pack
Key Interview Findings
• All local agencies have data available to create static GTFS feeds.
• With the exception of WMATA and Hampton Roads, all agencies have bus service only.
• Most agencies do not have dedicated information technology departments. They are heavily dependent upon city/county IT resources or outside contractors.
• Many agencies have recently procured AVL technology, are currently evaluating it, or will soon issue RFPs for it.
• All agencies are interested in participating in the real-time API.
• The three most critical issues facing local agencies in providing real-time data:
  • Integrating information from the many disparate transit systems that are in place within each agency
  • Encouraging vendors to provide data in an open, standards-based format
  • Obtaining technical help given the lack of IT resources within most transit agencies
• Agencies need guidance from the real-time API team on how to ensure AVL vendors provide data in the proper format.
• Real-time and static data collection regionally is needed as much for transit planning purposes as for creation of public-facing applications. The scope of this project is to develop the real-time API, not the data warehouse specifications.
• RITIS is an important stakeholder in the API development. Most agencies underscored the importance of ensuring that providing data to the API and to RITIS is as similar as possible.
Arlington Example
Blacksburg Example
Concept of Operations OV-1
Rolling Out a Successful API Project
• Agreeing to data sets to be published
• Implementing a standards-based approach
• Connecting all required data elements to the API
• Creating a fast, reliable infrastructure by leveraging cloud services and API-specific solutions like Mashery
• Publicizing the API
• Communicating regularly with the developer community
• Building an API forum / community using tools such as Facebook, Twitter
• Managing updates to the API. Good documentation is key.
• Identifying and managing all legal, policy and security risks
• Monitoring the use of transit data by developers and the public
Local Agency Data Collection
• It is the responsibility of the participating local agencies to integrate the required data and provide a location (or locations) within each agency’s infrastructure for retrieving the data required for the API.
• In order for the data to be collected consistently and uniformly from each local agency, it is important that all local data be formatted as defined in the API specification.
• The data must be made available via CSV files, XML files or Excel spreadsheets. Depending on the type of data contained in each file, the data will be updated by the local transit agency and provided to the VTA at varying frequencies.
• The data retrieval layer will be built within the VTA infrastructure (whether cloud or on-premises).
• In order to support all agencies with varying technology infrastructures, the data retrieval layer will offer a push and pull service.
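A data retrieval layer that accepts the same CSV file by push or by pull could look roughly like the Python sketch below. The column names, the `RetrievalLayer` class, and the `fetch` callable are assumptions made for illustration; the actual columns would come from the API specification, and the pull path would download from the location each agency provides.

```python
import csv
import io
from typing import Callable, Dict, List

# Assumed column names; the real list would come from the API specification.
REQUIRED_FIELDS = ("stop_id", "stop_name", "stop_lat", "stop_lon")

def parse_agency_csv(text: str) -> List[Dict[str, str]]:
    """Validate the header and return rows with whitespace trimmed."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [f for f in REQUIRED_FIELDS if f not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return [{f: row[f].strip() for f in REQUIRED_FIELDS} for row in reader]

class RetrievalLayer:
    """Hypothetical hub-side collector offering both push and pull."""

    def __init__(self) -> None:
        self.records: Dict[str, List[Dict[str, str]]] = {}

    def push(self, agency: str, text: str) -> None:
        # Push: the agency uploads its file to the hub.
        self.records[agency] = parse_agency_csv(text)

    def pull(self, agency: str, fetch: Callable[[], str]) -> None:
        # Pull: the hub retrieves the file from the agency's location.
        # `fetch` stands in for an HTTP or file-share download.
        self.records[agency] = parse_agency_csv(fetch())

sample = "stop_id,stop_name,stop_lat,stop_lon\nS1, Main St ,37.2296,-80.4139\n"
layer = RetrievalLayer()
layer.push("agency_a", sample)
layer.pull("agency_b", lambda: sample)
```

Both paths run the same validation, so an agency can switch from push to pull without changing its file format.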
Concept of Operations OV-1
Database Considerations
• Data from the local transit agencies must be stored in the VTA database.
• The ConOps OV-1 illustrates the need for Data Translation and Integration to accommodate any semantic or syntactic differences in data collected from the regional transit agencies.
• The VTA Database is intended to be temporary storage holding current data and limited historical data. For example, the Database can keep four hours of transit data, after which that data will be pushed to a data warehouse.
• The Database will be based on a real-time database system or an in-memory data store.
• To improve scalability, several traffic management, rate-limiting and smart time-sensitive data caching strategies will be implemented. Caching will reduce the latency between HTTP requests to the application server and the fetching of data feeds.
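The four-hour retention example can be sketched as a small in-memory store that hands older records to a warehouse. This is only an illustration under stated assumptions: `ShortTermStore` is a hypothetical name, the warehouse is a plain list, and records are assumed to arrive in timestamp order.

```python
from collections import deque
from typing import Deque, List, Tuple

RETENTION_SECONDS = 4 * 60 * 60  # the four-hour example from the text

class ShortTermStore:
    """Hypothetical in-memory store with a fixed retention window."""

    def __init__(self, retention: int = RETENTION_SECONDS) -> None:
        self.retention = retention
        self.current: Deque[Tuple[float, dict]] = deque()
        self.warehouse: List[Tuple[float, dict]] = []  # stand-in for a data warehouse

    def add(self, timestamp: float, record: dict) -> None:
        # Assumes records are added in non-decreasing timestamp order.
        self.current.append((timestamp, record))

    def expire(self, now: float) -> int:
        """Move records older than the retention window to the warehouse."""
        moved = 0
        while self.current and now - self.current[0][0] > self.retention:
            self.warehouse.append(self.current.popleft())
            moved += 1
        return moved

store = ShortTermStore()
store.add(0, {"vehicle": "101"})
store.add(10_000, {"vehicle": "102"})
moved = store.expire(now=14_401)
```

With these sample timestamps only the first record is older than four hours at `now=14_401`, so it alone is moved.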
Concept of Operations OV-1
API Assembly Layer
• The Feed Assembly Layer packages data that will be provided to Data Consumers. The interface to this layer will be an HTTP-based REST protocol, which will respond in one of the supported output formats: SIRI and GTFS/GTFS-RT.
• This layer will have specific modules for converting to XML, Protobuf and JSON formats depending on the request.
• Protocol buffers (Protobuf) is a binary format used by GTFS-RT and is a flexible, efficient, automated mechanism for serializing structured data. It is smaller, faster, and simpler than XML.
• JSON is also a small-footprint format that is simpler and less verbose than XML and is widely used by application developers.
• JSON is not natively supported by SIRI and GTFS-RT, but the API Assembly Layer will be able to produce JSON-formatted responses based on the structure of GTFS-RT.
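A JSON response "based on the structure of GTFS-RT" might mirror the GTFS-realtime message layout (a feed message with a header and a list of entities). The Python sketch below shows one possible shape for vehicle positions; the output field names follow the GTFS-realtime message definitions, while the function name and the input dictionary keys are hypothetical.

```python
import json
from typing import List

def vehicle_positions_to_json(vehicles: List[dict], timestamp: int) -> str:
    """Build a JSON document shaped like a GTFS-realtime FeedMessage."""
    message = {
        "header": {
            "gtfs_realtime_version": "1.0",
            "incrementality": "FULL_DATASET",
            "timestamp": timestamp,
        },
        "entity": [
            {
                "id": v["vehicle_id"],
                "vehicle": {
                    "trip": {"trip_id": v["trip_id"], "route_id": v["route_id"]},
                    "vehicle": {"id": v["vehicle_id"]},
                    "position": {"latitude": v["lat"], "longitude": v["lon"]},
                    "timestamp": v["timestamp"],
                },
            }
            for v in vehicles
        ],
    }
    return json.dumps(message, indent=2)

# Made-up sample input; keys on the input side are assumptions.
sample_vehicles = [
    {"vehicle_id": "101", "trip_id": "T1", "route_id": "R5",
     "lat": 37.2296, "lon": -80.4139, "timestamp": 1334836800},
]
payload = vehicle_positions_to_json(sample_vehicles, timestamp=1334836805)
```

The same intermediate dictionary could feed a Protobuf or XML module, which keeps the three output formats consistent with one another.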
API Management
An API management tool like Mashery would provide the following benefits to the API:
• Eliminates the need to internally develop API gatekeeping functionality
• Well-supported and currently employed by WMATA, Best Buy, Netflix, Cnet and others to support publication of APIs for third-party developer use
• Provides API registration, access and self-service provisioning
• Provides key issuance and credential management
• Allows usage control: throttling and limiting tied to key, user, method or group
• Caches frequently used calls
• Supports business rules configuration based on filters, parameters, and methods
• Provides real-time insight into all activity, with data export available for independent analysis
• Provides reports that measure uptime, track errors, and show cache activity
• Provides API usage information including call volumes, top method calls, and top user activity
• Includes content management, versioning and documentation change control
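To make "throttling and limiting tied to key" concrete, here is a minimal fixed-window limiter in Python. It is a generic sketch of the technique, not a description of how Mashery or any other product implements it; the class name, limit, and window length are assumptions.

```python
from collections import defaultdict
from typing import DefaultDict, Tuple

class KeyThrottle:
    """Fixed-window rate limiter: at most `limit` calls per key per window."""

    def __init__(self, limit: int, window_seconds: int) -> None:
        self.limit = limit
        self.window = window_seconds
        # (api_key, window index) -> number of calls seen in that window.
        # Old windows are never purged in this sketch.
        self.counts: DefaultDict[Tuple[str, int], int] = defaultdict(int)

    def allow(self, api_key: str, now: float) -> bool:
        bucket = (api_key, int(now // self.window))
        if self.counts[bucket] >= self.limit:
            return False
        self.counts[bucket] += 1
        return True

throttle = KeyThrottle(limit=2, window_seconds=60)
results = [
    throttle.allow("dev-key", now=0),
    throttle.allow("dev-key", now=1),
    throttle.allow("dev-key", now=2),    # third call in the same window
    throttle.allow("other-key", now=2),  # a different key has its own budget
    throttle.allow("dev-key", now=61),   # a new window resets the count
]
```

A managed service adds what this sketch leaves out, such as per-method and per-group limits, persistence, and reporting.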
Portal
• The Portal must provide information and documentation for Data Consumers.
• The term “Data Consumers” refers to computer applications (and the users of those applications) that retrieve data via the VTA. The most popular applications that will use the API data can generally be divided into the following two types:
  • Traveler Applications built for desktop, web and mobile platforms
  • Transit Agency Operations and Planning Applications that leverage the data to improve the safety, efficiency, and customer satisfaction of transit operations
The User Community
Third Party Developer Portal
• 400+ developers have registered
• 380,000+ successful API calls per week
The User Community – Window Unit
WMATA Signboard Example
http://www.flickr.com/photos/mringlein/4987275977/
Data Set Definition
• Standard data sets foster subsystem and multi-agency communication.
• Proprietary formats can be restrictive or cost-prohibitive to convert to a non-proprietary format.
• The national trend is for transit agencies and others to make static and real-time information openly available to developers at no charge.
• Information clearinghouses like the Regional Integrated Transportation Information System (RITIS) and VA 511 can also be data receivers.
• Google’s transit information data standard, the General Transit Feed Specification (GTFS), has emerged as a national standard for static information and for the most part is the standard in Virginia.
• Real-time data standards have yet to formally emerge.
• The working group reviewed existing local data formats, including:
  • Washington Metropolitan Area Transit Authority (WMATA) real-time data format
  • SIRI: transit-specific, highly extensible
  • Virginia Tech Bus Tracker
GTFS
• The General Transit Feed Specification (GTFS) defines a common format for public transportation schedules and associated geographic information.
• GTFS-RT is a feed specification that allows public transportation agencies to provide real-time updates about their fleet to application developers.
GTFS Advantages:
• Supported by Google. Google provides significant marketing resources for publicizing the availability of agencies’ GTFS data feeds.
• Easy for agencies to adopt the standard and quickly display data via the popular Google Maps service
• Robust online documentation and forums to provide support to transit agencies
• Free to connect to GTFS
• Many transit technology vendors have adopted GTFS
• There is a large community of developers familiar with Google’s API specifications
GTFS Disadvantages:
• Completely dependent upon Google’s support; if Google ceases support for GTFS, the standard would be in jeopardy of obsolescence
• Google does not provide access to raw data that it collects from agencies
• Must agree to Google’s inflexible legal terms regarding indemnification
SIRI
• SIRI is managed by a CEN Working Group - TC278 WG3 SG7.
• SIRI allows pairs of server computers to exchange structured real-time information about schedules, vehicles, and connections, together with general informational messages related to the operation of the services.
• The information can be used for many different purposes, for example:
  • To provide real-time departure-from-stop information for display on stops, internet and mobile delivery systems
  • To provide real-time progress information about individual vehicles
  • To manage the movement of buses roaming between areas covered by different servers
  • To manage the synchronization of guaranteed connections between fetcher and feeder services
  • To exchange planned and real-time timetable updates
  • To distribute status messages about the operation of the services
  • To provide performance information to operational history and other management systems
SIRI Advantages:
• Vendor-neutral standard
• Supports significantly more data elements than GTFS
• Widely used internationally
• Extensible; agencies can create their own custom data fields
SIRI Disadvantages:
• Complex to implement
• Not used as much in the US as in Europe
Proposed API Technical Specification
• The API will provide data access via three interfaces: SIRI, GTFS-RT and GTFS.
• Only data elements that are part of a standard can be delivered via that standard’s interface.
• The goal is to have SIRI provide access to all data elements.
• Mode of Transportation (Bus, Rail)
• Information Type (Static, Real-time, Support)
• Data Category (groups similar information, e.g., Agency Information, Stop Information, Route Information)
• Data Element (defines individual data elements available via the API)
The following information is provided for each VTA Data Element:
• VTA Name: The unique name assigned by VTA for each Data Element. Participating local agencies will provide data to the API using the VTA names.
• Description: Explains the information provided
• VTA Data Type: The data type required by VTA for local agencies to provide the Data Element
• Transmodel/SIRI Equivalent: The SIRI name that Data Consumers will use to access the Data Element
• Transmodel/SIRI Module Source: The SIRI module in which Data Consumers will find the Data Element
• GTFS-RT Equivalent: The GTFS-RT name that Data Consumers will use to access the Data Element
• GTFS-RT Module Source: The GTFS-RT module in which Data Consumers will find the Data Element
• GTFS Equivalent: The GTFS name that Data Consumers will use to access the Data Element
• GTFS Module Source: The GTFS module in which Data Consumers will find the Data Element
Mode: Bus and Rail
Static / RT: Static
Datapoints by category:
• Agency Information: Agency Identifier, Agency Name, Agency URL, Timezone, Fare URL
• Stop Information: Stop identifier, Stop code, Name of stop, Stop description, Latitude, Longitude, Zone identifier, Stop URL, Type of location, Parent station identifier
• Route Information: Route Identifier, Agency Identifier, Short name, Long name, Description, Type of route
• Trip Information: Route Identifier, Service Identifier, Trip identifier, Headsign text, Short name, Direction, Block, Shape
• Stop times: Trip identifier, Arrival time, Departure time, Stop identifier, Stop sequence, Headsign text, Pickup type, Drop off type, Shape distance traveled
Data category: Agency Information

| Datapoint | Data Type | TransModel/SIRI Equivalent | TransModel/SIRI Module Source | GTFS-RT Equivalent | GTFS-RT Module Source | GTFS Equivalent | GTFS Module Source | Description |
| Agency Identifier | uint64 | authorityID | Authority | - | - | agency_id | Agency | This field is an ID that uniquely identifies a transit agency. |
| Agency Name | String | authorityName | Authority | - | - | agency_name | Agency | The full name of the transit agency |
| Agency URL | String | authorityURL | Authority | - | - | agency_url | Agency | This field contains the URL of the transit agency. Example: http://www.agency.gov |
| Timezone | String | authorityTimezone | Authority | - | - | agency_timezone | Agency | The timezone where the transit agency is located. Example: UTC+02 |
| Language | String | authorityLang | Authority | - | - | agency_lang | Agency | This field contains a two-letter ISO 639-1 code for the primary language used by this transit agency. Example: EN |
| Phone Number | String | authorityPhone | Authority | - | - | agency_phone | Agency | The agency's phone number |
| Fare URL | String | - | - | - | - | agency_fare_url | Agency | This specifies the URL of a web page that allows a rider to purchase tickets or other fare instruments for that agency online |
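The rule that only data elements belonging to a standard can be delivered through that standard's interface can be shown with the Agency Information mapping above. In this Python sketch the SIRI and GTFS names are copied from the table, while the function names and the sample agency values are made up for illustration.

```python
import csv
import io
from typing import Dict

# (VTA name, SIRI equivalent, GTFS equivalent) as listed in the table;
# None marks "-" (no equivalent in that standard).
AGENCY_ELEMENTS = [
    ("Agency Identifier", "authorityID", "agency_id"),
    ("Agency Name", "authorityName", "agency_name"),
    ("Agency URL", "authorityURL", "agency_url"),
    ("Timezone", "authorityTimezone", "agency_timezone"),
    ("Language", "authorityLang", "agency_lang"),
    ("Phone Number", "authorityPhone", "agency_phone"),
    ("Fare URL", None, "agency_fare_url"),
]

def project(record: Dict[str, object], interface: str) -> Dict[str, object]:
    """Keep only elements that have an equivalent in the requested interface."""
    column = {"siri": 1, "gtfs": 2}[interface]
    return {
        element[column]: record[element[0]]
        for element in AGENCY_ELEMENTS
        if element[column] is not None and element[0] in record
    }

def to_agency_txt(record: Dict[str, object]) -> str:
    """Render one agency as a header line plus one row of GTFS-style CSV."""
    row = project(record, "gtfs")
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(row), lineterminator="\n")
    writer.writeheader()
    writer.writerow(row)
    return out.getvalue()

# Made-up sample values keyed by VTA name.
agency = {
    "Agency Identifier": 1,
    "Agency Name": "Example Transit",
    "Agency URL": "http://www.agency.gov",
    "Timezone": "America/New_York",
    "Fare URL": "http://www.agency.gov/fares",
}
siri_view = project(agency, "siri")
gtfs_text = to_agency_txt(agency)
```

Because Fare URL has no SIRI equivalent in the table, it appears in the GTFS output but is dropped from the SIRI view.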
Future Considerations
• Finalize phased plan for rollout of the real-time data hub.
• Who will build and manage the infrastructure?
• What type of governance will be implemented?
• How will local agencies obtain the funding and technical support required to connect to the data hub?