Briefing on US EPA Open Data Strategy using a Linked Data Approach
Transcript of Briefing on US EPA Open Data Strategy using a Linked Data Approach
Bernadette Hyland CEO & co-founder
David Wood CTO & co-founder
1400 Key Blvd, Ste 100
Arlington VA 22209
Tel. +1-877-290-2127
@BernHyland
@prototypo
@3RoundStones
Extend Your Reach.
Better Data. Smarter Decisions
This presentation was delivered 18-Nov-2014 & is available at http://slideshare.net/3RoundStones
With everything else happening in the world, why does
Open Data matter anyway??
Credits:
WV chemical spill: http://www.nytimes.com/2014/01/11/us/west-virginia-chemical-spill.html
Hurricane Sandy: http://www.nytimes.com/2012/10/28/us/hurricane-sandy-on-collision-course-with-winter-storm.html
Ebola: http://www.nytimes.com/interactive/2014/07/31/world/africa/ebola-virus-outbreak-qa.html
Discovery, access & re-use go beyond government transparency &
accountability …
Access to timely, accurate data is vital to
first responders, legislators, scientists, policy makers,
journalists & the general public
Taxpayers spend billions of dollars for our government to
collect data.
We, the people, expect government authorities to treat information
as an asset that complies with regulations (Information Quality
Act, Section 508, protection of PII) & is:
public, accessible, described, reusable, complete, timely & sustainable over
election cycles
The US Federal Government is listening … “Open Data” per OMB Memorandum M-13-13:
• Public
• Accessible
• Described
• Reusable
• Complete
• Timely
• Managed Post-Release
• Project Open Data
• OMB & OSTP online tools, best practices & schema to help agencies implement M13-13. See Project Open Data
• On May 9, 2014, the Digital Accountability & Transparency Act (DATA Act) became Public Law 113-101
The goal of treating Information as an asset is
not new …
“Linked Data was part of my initial vision for the Web and is an important part of the Web’s future. The Web
took off as a web of hyperlinked documents which were exciting to read, but which could not be
effectively used as data.”
- Tim Berners-Lee
We all know the ground truth of data on the Web
Lots of [government] open data without labels or context
What is needed is … data that describes itself
Linked Open Data is called “self-describing” data
Linked Data is “A method of publishing structured data so that it can be interlinked &
become more useful. … Extends Web pages to share information in a way that can be
read automatically by computers.”
- Sir Tim Berners-Lee
Linked Data on the Web
[Diagram: “my data” holds a measurement (date 2011-01-01, value 0, units of measure degrees Centigrade) that was collected at Galway Airport and collected by a collector, who is a Person with first name Michael and last name Hausenblas]
• Linked Data has one amazing property: it can easily be combined with other Linked Data to form new knowledge.
• Based on 20+ year old idea
• A system of linked information systems
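The "easily combined" property can be sketched in a few lines of Python. This is a minimal illustration rather than how any particular Linked Data tool works: the example.org identifiers, the property names and the two tiny datasets are all hypothetical, and plain sets of (subject, predicate, object) tuples stand in for a real RDF store. Because both datasets name Galway Airport with the same identifier, a simple union is enough to answer a question neither dataset answers alone.

```python
# Hypothetical identifiers, invented for illustration only.
EX = "http://example.org/"

# Dataset A: a measurement, loosely modelled on the Galway Airport diagram.
measurements = {
    (EX + "m1", EX + "collectedAt", EX + "GalwayAirport"),
    (EX + "m1", EX + "date", "2011-01-01"),
    (EX + "m1", EX + "value", "0"),
    (EX + "m1", EX + "unit", "degrees Centigrade"),
}

# Dataset B: published separately, but it reuses the same airport identifier.
places = {
    (EX + "GalwayAirport", EX + "locatedIn", EX + "Ireland"),
}


def merge(*graphs):
    """Combine triple sets by union; shared identifiers line up automatically."""
    combined = set()
    for graph in graphs:
        combined |= graph
    return combined


def objects(graph, subject, predicate):
    """Return every object of the triples matching subject and predicate."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def countries_of_measurement(graph, measurement):
    """Follow collectedAt, then locatedIn, to find where a value was taken."""
    found = set()
    for place in objects(graph, measurement, EX + "collectedAt"):
        found |= objects(graph, place, EX + "locatedIn")
    return found


combined = merge(measurements, places)
print(countries_of_measurement(combined, EX + "m1"))
```

Run against the measurement data alone, the same query returns an empty set; the new knowledge appears only after the merge.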
[Book cover: Linked Data: Structured data on the Web (Manning), by David Wood, Marsha Zaidman & Luke Ruth, with Michael Hausenblas. Foreword by Tim Berners-Lee]
We’re making progress with US Government
Open Data
• On our third iteration of a catalog of datasets, using CKAN
• >500k datasets from 200+ USG authorities
• Sustained executive support for data.gov via OMB & OSTP - Project Open Data
• GSA team who are engaging with the Open Data / OSS / standards community
• Health, Energy, Law, Education & Public Safety specific communities in place
• Agencies are [beginning] to name Chief Data Officers
But we still have a lot to do …
RCRA = Resource
Conservation and Recovery
Act
A search for “EPA RCRA” displays
the first RCRA dataset in 6th position :-(
This dataset is just one piece of a complex set
of data in understanding solid
waste reporting
First 5 results are for
Facilities Registry
Service …
For example, the Right-to-Know (RTK) Network is a
consumer of EPA open data from
data.gov
They’ve built some nice
visualizations!
But the Toxics Release Inventory (TRI) is
complicated data. The RTK Network
would have benefited from more context
had it been available from the EPA…
RTK Network provides access
to machine-readable content (as XML) but … it lacks context
This data does not use shared vocabularies! :-(
No units of measure, no definition of codes
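To make the missing context concrete, here is a small Python sketch of the difference between a bare record and one that carries its own labels. The field names, codes and code list below are invented for illustration; they are not taken from the TRI data or the RTK Network XML.

```python
# Hypothetical code list and field names, invented for illustration only.
CODEBOOK = {
    "unit": {"LB": "pounds", "KG": "kilograms"},
    "medium": {"A": "air", "W": "water", "L": "land"},
}

# A bare record: a consumer has to guess what "LB" and "A" mean.
bare_record = {"facility": "F-001", "amount": 120, "unit": "LB", "medium": "A"}


def annotate(record, codebook):
    """Return a copy of record in which every coded field also carries its label."""
    annotated = {}
    for field, value in record.items():
        labels = codebook.get(field)
        if labels is None:
            # Not a coded field; keep the value unchanged.
            annotated[field] = value
        else:
            # Flag codes with no published definition instead of hiding them.
            annotated[field] = {"code": value, "label": labels.get(value, "UNDEFINED")}
    return annotated


print(annotate(bare_record, CODEBOOK))
```

The point of the sketch is only that units and code definitions travel with the data; a Linked Data publisher would typically do this by linking to shared vocabulary terms rather than to a local lookup table.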
Linked Data Management System For government open data publishing
Funded by
Landing page for new EPA Open Data site
Search for facilities in your neighborhood…
Click through to an individual facility
View by map or by table layout …
Another sample linked data app shows nuclear power plants regulated by the US EPA
The power of Open … A useful app developed in 2 days using Open
Government Data + Open Source + Open Web Standards …
Deployed on the cloud.
Key to data sources:
1 Wikidata Project (description)
2 Open Street Maps (OSS)
3 Wikimedia Commons (photo)
4 Raw data available for developers (RDF/XML)
5 EPA Resource Conservation and Recovery Act (RCRA)
6 EPA Facilities (FRS)
7 EPA Toxic Release Inventory (TRI)
8 ABT Enviro Consultants (from a spreadsheet)
Pollution reports using linked data … Developed in < 1 week using open data & OSS
List of input reports for this pollution report … Possible because of a linked data approach
Shared vocabularies, e.g. Places, Geographis, Dublin Core, Geo, FOAF, ORG & vCard, are the “lingua franca” of data interoperability
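As a rough sketch of what using a shared vocabulary means in practice, the Python below expands short prefixed names into the published namespace IRIs of Dublin Core Terms, FOAF, ORG, vCard and W3C Basic Geo, then writes simple N-Triples lines. The facility IRI and the values are hypothetical, and the helper functions are illustrative rather than part of any library.

```python
# Published namespace IRIs for several of the vocabularies named above.
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "org": "http://www.w3.org/ns/org#",
    "vcard": "http://www.w3.org/2006/vcard/ns#",
    "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
}


def expand(curie):
    """Turn a prefixed name such as 'foaf:name' into a full IRI."""
    prefix, _, local = curie.partition(":")
    return PREFIXES[prefix] + local


def ntriple(subject_iri, curie, literal):
    """Serialize one triple with a plain literal object as an N-Triples line.

    Only backslashes and double quotes are escaped here; a full serializer
    would also handle newlines and other control characters.
    """
    escaped = literal.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{subject_iri}> <{expand(curie)}> "{escaped}" .'


# Hypothetical facility IRI and values, for illustration only.
facility = "http://example.org/facility/1"
print(ntriple(facility, "dcterms:title", "Example Facility"))
print(ntriple(facility, "geo:lat", "38.89"))
```

Any consumer that already understands `dcterms:title` or `geo:lat` can interpret these statements without a custom data dictionary, which is the interoperability benefit the slide describes.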
WeatherHealth A mobile app for chronic asthma/COPD
patients with weather alerts
Funded by
[Diagram: the User, with data sources NOAA, US EPA AirNow, DBpedia, National Library of Medicine & US EPA SunWise]
Orgpedia An open organizational data project
on public & private companies
Funded by
Callimachus apps allow for crowdsourcing
How did we handle data publishing & application
development for US EPA, Sentara Healthcare &
Orgpedia?
The leading Web application server for Linked Data
Fanatically standards compliant **
Used to create data-driven applications that combine data across silos
** http://www.w3.org/2013/data/
[Diagram labels: <HTML>; Enterprise Data; Documents; Read/Write; Point to, include]
Our customers use Callimachus to:
Create responsive apps with many different data sources &
types of data
[Diagram: CONTENT MANAGEMENT SYSTEM & LINKED DATA MANAGEMENT SYSTEM, spanning unstructured text, text, structured data & data]
Customers are creating data-driven applications with data in leading graph databases:
Callimachus conveniently
supports in-browser
development for faster iteration
Do not reinvent the wheel!
Summary
• Billions of dollars are spent by taxpayers for government to collect useful information - e.g., geospatial data, population, healthcare, medicine & clinical trials, environment, energy, law, education …
• Data consumers must help government to fulfill its goal to treat “information as an asset” by participating & giving feedback
• Steady forward progress has been made; however, take care not to reinvent the wheel!
• Use Open Data, Open Source, Web standards & published best practices whenever possible
• More work to be done …
Additional Resources
• “Open by Default” presentation by Dr. David Wood to Virginia Commonwealth officials 10/7/2014, see http://www.slideshare.net/3roundstones/open-by-default-39976290
– Open Data is the idea that “certain data should be freely available to everyone to use and republish as they wish, without restrictions from copyright, patents or other mechanisms of control”. Open Data follows similar “open” concepts that have proven to be valuable in the information economy such as Open Standards, Open Source Software, Open Content and has been followed more recently by variations on the theme such as Open Science and Open Government.
– Linked Data Developer website, see http://linkeddatadeveloper.com/
– Linked Data: Structured Data on the Web, see http://books.google.com/books/about/Linked_Data.html?id=rA8-mQEACAAJ
– Add Linked Data to HTML with RDFa.info, see http://semanticweb.com/new-resource-for-web-developers-announced-add-linked-data-to-html_b28813
– See also RDFa website on GitHub, see https://github.com/rdfa/rdfa-website