Open Source, Open Data
Kirrily Robert
Florida Linux Show, 2009
From Open Source to Open Data
1993
Me in 1993
My Linux desktop looked like this
1993
• I started using Linux in 1993
• I was very excited by it, even though it was quite primitive at the time
• Other people thought I was a little crazy
Image: Wikipedia
Image: Engadget
1999
Google’s servers in 1999
Jar Jar in 1999
1999
• By 1999 Linux + open source was starting to take off
• Companies using and building services on Linux etc.
• We were calling it “Open Source” - a more marketable term for Free Software
Four Software Freedoms
http://www.gnu.org/philosophy/free-sw.html
• Freedom to run the program
• Freedom to study the program and modify it for your own use
• Freedom to redistribute verbatim copies
• Freedom to improve the program, and release your improvements
Free Culture
• A similar movement
• Make cultural works freely available
• Mostly over the Internet
Free Culture
http://wiki.freeculture.org/Free_Culture_Definition
• Freedom to use the work
• Freedom to study the work and to apply knowledge acquired from it
• Freedom to make and redistribute copies
• Freedom to make changes and improvements, and to distribute derivative works
Image: masternewmedia.org
What is Open Data?
Data
Image: himmelskratzer @ Flickr
What is data?
• Ones and zeroes (obviously)
• But also filing cabinets, research archives, and other offline resources
• It’s not OPEN data unless you can get at it
Open Data Freedoms
• Freedom to use the data
• Freedom to study the data and modify it for your own use
• Freedom to make and share verbatim copies
• Freedom to improve the data and redistribute the results
Data availability
• Digital
• Online
• Well formatted
Open Data Projects
public.resource.org
• Created 2007 by Carl Malamud
• “Making Government Information More Accessible”
public.resource.org
• SEC EDGAR records
• Patents database
• Copyright database
• Congressional records
• Legal decisions
• Fedflix
Data.gov
• Founded 2008
• “Increase public access to high value, machine readable datasets generated by the Executive Branch of the Federal Government.”
OpenStreetMap
Compare...
OpenStreetMap
Open Library Project
• CD data
• Tracks, artists, releases...
• CC license
Flickr
• Images
• Metadata
  • tags, timestamps, geolocations, etc.
• Range of CC licenses and permissive TOS
Infochimps
• Large data sets
• Various licenses
• Tools for transformation
Freebase
• Open data about “everything”
• 8.5m concepts
• CC-BY license
• API and data dumps
2,416,683 books
16,608 ships
488 cheeses
Structured data
{
  "name": "Asiago cheese",
  "id": "/en/asiago_cheese",
  "region": [{
    "id": "/en/asiago",
    "name": "Asiago",
    "type": "/location/location"
  }],
  "source_of_milk": [{
    "id": "/en/cattle",
    "name": "Cow",
    "type": "/biology/organism_classification"
  }]
}
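Because the record is structured, a program can follow its links rather than guess at free text. A minimal sketch using only Python's standard `json` module; the helper name `linked_names` is made up for illustration and is not part of any Freebase API:

```python
import json

# The Asiago record from the slide, written as valid JSON.
record_text = """
{
  "name": "Asiago cheese",
  "id": "/en/asiago_cheese",
  "region": [
    {"id": "/en/asiago", "name": "Asiago", "type": "/location/location"}
  ],
  "source_of_milk": [
    {"id": "/en/cattle", "name": "Cow", "type": "/biology/organism_classification"}
  ]
}
"""

def linked_names(record, prop):
    """Return the names of the topics linked from `record` via `prop`.

    Hypothetical helper for this sketch only.
    """
    return [topic["name"] for topic in record.get(prop, [])]

cheese = json.loads(record_text)
regions = linked_names(cheese, "region")
milk = linked_names(cheese, "source_of_milk")
print(cheese["name"], "| region:", regions, "| milk from:", milk)
```

Each linked topic carries its own id and type, so the same few lines would work for any property that points at other topics.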
Open Data Apps
• Apps for America competition
• Open source and open data
• Round 1: various data sources
• Round 2: Data.gov
Legistalker
Filibusted
Where the money goes
Open Source for Open Data
What can open source do?
Input
Processing
Output
Scrape
Munge
Visualise
Scraping data
• APIs
  • XML, RSS, JSON...
• Downloadable data sets
  • XML, Excel, CSV, triple dumps...
• Beautiful Soup (Python)
  • http://www.crummy.com/software/
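When there is no API or download, scraping means pulling values out of HTML. A rough sketch of the idea, using Python's built-in `html.parser` as a stand-in for Beautiful Soup so it runs without installing anything. The sample page is made up (it reuses the counts quoted earlier in the talk), and `TableScraper` and `scrape_counts` are illustrative names:

```python
from html.parser import HTMLParser

# Made-up sample page; a real scraper would fetch this over HTTP.
SAMPLE_HTML = """
<table>
  <tr><td>books</td><td>2,416,683</td></tr>
  <tr><td>ships</td><td>16,608</td></tr>
  <tr><td>cheeses</td><td>488</td></tr>
</table>
"""

class TableScraper(HTMLParser):
    """Collect the text of every <td>, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])        # start a new row
        elif tag == "td":
            self._in_cell = True
            self.rows[-1].append("")    # start a new cell in the current row

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data   # text may arrive in several chunks

def scrape_counts(html):
    """Return {label: count} from a two-column HTML table."""
    scraper = TableScraper()
    scraper.feed(html)
    scraper.close()
    return {label: int(count.replace(",", "")) for label, count in scraper.rows}

counts = scrape_counts(SAMPLE_HTML)
print(counts)
```

Real pages are messier than this sample, which is where a forgiving parser such as Beautiful Soup earns its keep.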
Munging data
• Perl
  • http://perl.org/
• R (statistical analysis)
  • http://r-project.org/
• Hadoop (parallel data processing)
  • http://hadoop.apache.org/
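Munging is mostly cleaning and reshaping: trimming whitespace, fixing case, dropping blanks, then aggregating. A small sketch in Python (used here only to keep the examples in one language; Perl or R would do the same job). The CSV data and the function name `total_by_agency` are made up for illustration:

```python
import csv
import io
from collections import defaultdict

# Made-up sample data: inconsistent case, stray spaces, a blank amount.
RAW_CSV = """agency,amount
 Defense ,"1,200"
defense,300
Education,450
EDUCATION,
Energy,75
"""

def total_by_agency(raw):
    """Normalise agency names and sum the amounts, skipping blank ones."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(raw)):
        agency = row["agency"].strip().title()           # " Defense " -> "Defense"
        amount = row["amount"].strip().replace(",", "")  # "1,200" -> "1200"
        if amount:
            totals[agency] += int(amount)
    return dict(totals)

totals = total_by_agency(RAW_CSV)
print(totals)
```

The same normalise-then-aggregate shape scales up to tools like Hadoop when the data no longer fits on one machine.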
Visualisations
• MIT Simile
  • http://simile.mit.edu/
• Processing
  • http://processing.org/
Semantic Web
• Describe meaning, not markup
• Triples: subject, predicate, object
• Expression: RDF
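The triple model is simple enough to sketch with plain Python tuples. This is not RDF syntax and does not use a real triple store; the predicate names are illustrative, and `match` and `milk_source_names` are hypothetical helpers. The data comes from the Asiago record shown earlier in the talk:

```python
# Each fact is one (subject, predicate, object) triple.
TRIPLES = [
    ("/en/asiago_cheese", "name", "Asiago cheese"),
    ("/en/asiago_cheese", "region", "/en/asiago"),
    ("/en/asiago_cheese", "source_of_milk", "/en/cattle"),
    ("/en/asiago", "name", "Asiago"),
    ("/en/cattle", "name", "Cow"),
]

def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for s, p, o in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]

def milk_source_names(triples, cheese_id):
    """Follow cheese -> source_of_milk -> name, joining two patterns."""
    names = []
    for _, _, animal in match(triples, cheese_id, "source_of_milk"):
        for _, _, name in match(triples, animal, "name"):
            names.append(name)
    return names

print(milk_source_names(TRIPLES, "/en/asiago_cheese"))
```

Because every fact has the same three-part shape, data from different sources can be merged by simply concatenating the lists, which is the idea behind linking open data sets together.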
Linked Open Data
Semantic web tools
• Triple stores
  • Sesame, BigData, Virtuoso...
• Libraries
  • RDFLib (Python), Redland RDF (librdf)...
Freebase Acre
Open source for open data
• Low barrier to entry
• Hooks into Freebase data
• Share and clone apps
• Apps are BSD licensed
FMDB
Gendered names app
Query editor
Clone!
Where next?
Open Data: Issues
• License clarity
• Govt + Corporate acceptance
• Developer literacy
• What do we DO with it?
What do we do with it?
• 10 years ago we were asking the same questions of Open Source
• With Open Data, we are just starting to realise its potential
• Please join us!
Keep in touch
• Email
  • [email protected]
• Freebase blog
  • http://blog.freebase.com/
• Twitter
  • @fbase