Big Data in AC Transit
Challenges, Opportunities and Initiatives
Ahsan Baig, CIO/CTO
Alameda-Contra Costa Transit District
2019 SF Bay Area ITE / ITS CA Joint Transportation Workshop, San Francisco, CA • April 25, 2019
Agenda
• Intro to Alameda-Contra Costa Transit
• Digital Disruption in Transit
• Digital Transformation
• Connected Enterprise Framework
• Case Study
• Big Data – Volume, Velocity, and Variety
• Data is the New Oil
• Data Governance
• Data Democratization
• Data Integration
• Concluding thoughts
Bay Area Transit…27 Agencies
The “Big Seven” (Muni, BART, AC Transit, Caltrain, VTA, SamTrans and Golden Gate Transit) each have annual ridership over 9 million. The remaining 16 agencies carry only 4% of the region’s transit trips.
Bay Area “Big Seven” Transit Agencies Annual Ridership (2017)
Muni 226,261,960
BART 132,802,066
AC Transit 52,310,594
VTA 39,137,607
Caltrain 19,267,022
SamTrans 12,550,962
Golden Gate Transit 5,698,961
actransit.org
AC Transit at a Glance
• Alameda and Contra Costa Counties
• Serve 13 cities and 8 unincorporated areas
• Facilities: 3 – Oakland, 1 – Emeryville, 1 – Hayward, 1 – Richmond
• Service across 3 Bay Area bridges: Dumbarton, SF–Oakland, San Mateo
AC Transit at a Glance
RIDERSHIP
• Daily: 169,000
• Annual: 52,300,000
• Transbay daily: 14,500
• Paratransit: 771,000 (annual)
SERVICE
• Bus lines: 160
• Bus stops: 5,500 (approximately)
• Annual service miles: 20.4 million
• Daily service hours: 5,800 (weekday)
CONNECT WITH
• 16 other bus systems
• 25 BART stations
• 6 Amtrak stations
• 3 ferry terminals
Digital Disruption
• Three Rs – Shared Mobility – Automation – Electrification
• 5G is REAL - High-speed and reliable connectivity is available
• Bi-Directional Data Flow
• Mobile Edge Computing
Data is the new oil!
• Business Case - Media-rich applications are becoming the norm, with real-time video, voice, maps and images
• Data is the most important component of any transit agency
• Public vs Private Data Domains
• Digital Framework to support IoT
• Data Driven Decisions
Digital Framework
[Diagram labels: Connected Vehicles, Riders, Cities and Counties, MPO, CAD/AVL]
Big Data Challenges:
• Complexity due to Volume, Variety and Velocity
• Data from internet-connected sensors, videos, social media and mobile applications
• How to quickly clean, qualify and facilitate?
• Complexity due to numerous Data Sources for Big Data
• Integrating unstructured Big Data with Structured Organizational Data
• Unstructured data shouldn’t become a silo of its own
• Should be cleaned and qualified
• Facilitate Advanced Analytics, Artificial Intelligence, Real Time Analytics and more
• The business needs complete and accurate data, with complete lineage – Big, Large or Small.
Data Governance - The orchestration of people, processes, and technology to manage critical data assets by using roles, responsibilities, policies, and procedures to ensure the data is accurate, consistent, secure, and aligns with overall organizational objectives.
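The "clean and qualify" step and the lineage requirement listed under Big Data Challenges can be illustrated with a minimal Python sketch. It is not AC Transit's implementation: the record fields (`vehicle_id`, `lat`, `lon`), the function name and the validation rules are assumptions chosen only to show a record being cleaned, validated, and tagged with the steps it passed through.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class QualifiedRecord:
    data: dict
    lineage: list = field(default_factory=list)  # ordered processing steps


def clean_and_qualify(raw: dict, source: str) -> Optional[QualifiedRecord]:
    """Return a cleaned record with its lineage, or None if validation fails."""
    lineage = [f"ingested from {source}"]

    # Clean: trim whitespace and normalize the (hypothetical) vehicle id.
    vehicle_id = str(raw.get("vehicle_id", "")).strip().upper()
    if not vehicle_id:
        return None
    lineage.append("normalized vehicle_id")

    # Qualify: coordinates must parse as numbers and fall in valid ranges.
    try:
        lat, lon = float(raw["lat"]), float(raw["lon"])
    except (KeyError, TypeError, ValueError):
        return None
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return None
    lineage.append("validated coordinates")

    return QualifiedRecord({"vehicle_id": vehicle_id, "lat": lat, "lon": lon}, lineage)


# Illustrative input only; the values are made up.
record = clean_and_qualify({"vehicle_id": " bus42 ", "lat": "37.8", "lon": "-122.27"}, "CAD/AVL")
```

Keeping the lineage list on the record itself means any downstream report can show where a value came from and which checks it passed.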
• More than 30 applications, large and small, have accumulated over time across ACT (HASTUS, ELLIPSE, CAD/AVL, PeopleSoft, etc.).
• Complex infrastructure with applications hosted on premises, in data centers and in the cloud.
• Several sources-of-truth.
• Additional new applications bringing very large volumes of data (Big Data).
• Transit organizations generate highly useful data for both the public and private domains. It is important to share meaningful information with the public and external agencies.
• Very high internal demand for quality, reliable data from various ACT departments for better Planning & Scheduling, effective Maintenance, Operations, etc.
Data Integration Complexity
• Advanced Integration Architecture leveraging some of the latest tools and technologies. A few under review: Azure SQL Services and API Management, MuleSoft, Kafka, etc.
• Better Master Data Management and Data Quality Management to ensure the core data is universally defined, qualified and validated
• Define and establish Data Lakes, with Global Security and common access methods.
• Finally, adopt comprehensive Corporate Data Governance across the organization.
High Level Approach
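One way to picture Master Data Management reconciling several sources-of-truth is a priority-based merge into a single "golden" record. The Python sketch below is an illustration under assumptions, not the approach AC Transit adopted: the function name, the field names and the sample values are hypothetical, and only the system names HASTUS and CAD/AVL come from the slides.

```python
def build_golden_record(records: dict, priority: list) -> dict:
    """Merge per-source records into one; earlier sources in `priority` win.

    Returns {field: (value, source)} so every value keeps its origin.
    """
    golden = {}
    for source in priority:
        for field_name, value in records.get(source, {}).items():
            # Skip empty values and fields already set by a higher-priority source.
            if value in (None, "") or field_name in golden:
                continue
            golden[field_name] = (value, source)
    return golden


# Made-up sample data for one bus stop as seen by two systems.
records = {
    "HASTUS": {"stop_id": "1234", "stop_name": "Broadway & 14th St", "shelter": None},
    "CAD/AVL": {"stop_id": "1234", "stop_name": "BROADWAY/14TH", "lat": 37.8044},
}
golden = build_golden_record(records, priority=["HASTUS", "CAD/AVL"])
```

Here the scheduling system is treated as authoritative for names, while the lower-priority source only fills fields the first one leaves empty. Which source should win for which field is a governance decision rather than a technical one.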
Layer descriptions:
• Source Applications and Transactional Data
• Unified Integration and API Management
• DQM as team effort between Data Stewards, Departments and IT
• Integrated, Qualified and Aggregated Data Warehouse defined by Business
• Logical Interface for Integration of all Data Sources, providing Secure and Simplified view of Enterprise Data
• Rich set of Analytics and Visualization tools, governed by Global Security Model
Diagram labels:
• Data Access Layer - Secure (Reporting Tools)
• Data Warehouse
• Logical Layer - Secure Logical Data Marts
• Data Marts
• Unstructured Data
• Master Data Management
• Data Governance
• Source applications: Clever CAD/AVL, PS, HASTUS, Ellipse, other
• Controlled, Direct Connections
• Data Quality Management (validate and qualify)
• Data Ingestion / Integration Layer (API Management)
Conceptual Data Architecture
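The idea of a secure data access layer in front of the data marts, governed by one security model, can be sketched in a few lines of Python. Everything named here is an assumption made for illustration: the `DataAccessLayer` class, the role names, the mart names and the sample rows do not come from the presentation.

```python
class DataAccessLayer:
    """Single entry point to data marts, checked against one security model."""

    def __init__(self, marts: dict, grants: dict):
        self._marts = marts    # mart name -> list of rows
        self._grants = grants  # role -> set of mart names the role may read

    def query(self, role: str, mart: str) -> list:
        if mart not in self._grants.get(role, set()):
            raise PermissionError(f"role {role!r} may not read mart {mart!r}")
        # Shallow copy: callers cannot add or remove rows in the mart itself.
        return list(self._marts[mart])


# Made-up marts and grants.
marts = {
    "ridership": [{"line": "A", "boardings": 120}],
    "payroll": [{"employee": "E1"}],
}
grants = {"planner": {"ridership"}, "hr": {"payroll"}}
dal = DataAccessLayer(marts, grants)
rows = dal.query("planner", "ridership")
```

Because every reporting tool goes through the same `query` call, access rules live in one place instead of being repeated per tool.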
• Activating People, Aligning Processes and Enabling with Technology
• Defining and Developing Data Catalogs & Data Stewards
• Federating Governance across Organization
• Communication and Data Management
• A comprehensive and Global Data Security Model
• Policy Formulation and Standards setting
Data Governance Approach
Concluding thoughts…
• Digital Disruption needs to be embraced by Transit Operators
• Media rich applications are normal
• Slice and Dice of Data
• Cyber Security requires attention
• Data Governance and Framework
• Data Science – New Skills are required
• Enabling AI and ML
• Data is eating the world
Ahsan Baig
Chief Information and Technology Officer
Alameda-Contra Costa Transit [email protected]
(510) 891-5490
Thank You!