Democratizing Data

Democratizing Data to transform government, workplaces & our lives IgniteBoston Feb. 12, 2009 W. David Stephenson Stephenson Strategies Hello. Tonight I’d like to give you a preview of a book I’m writing for O’Reilly Media, Democratizing Data, to transform government, workplaces, and our lives, which is scheduled for July publication. Originally I was to co-author the book with Vivek Kundra, Chief Technology Officer of the District of Columbia, and a true trailblazer in this field. However, fortunately for the US, unfortunately for me, President Obama has chosen Vivek to become the Deputy Director of the Office of Management and Budget, in charge of all federal e-government initiatives, so what you see tonight is what you get!

description

My talk at the 2/12/09 O'Reilly IgniteBoston, emphasizing that passage of the economic stimulus package, combined with the current economy, makes this the perfect time to introduce a data-centric "democratizing data" approach, giving workers, regulators, the public, and watchdogs real-time access to critical information! Video version: http://tinyurl.com/c9vkjy

Transcript of Democratizing Data

Page 1: Democratizing Data

Democratizing Data to transform government, workplaces & our lives

IgniteBoston, Feb. 12, 2009

W. David Stephenson, Stephenson Strategies

Hello. Tonight I’d like to give you a preview of a book I’m writing for O’Reilly Media, Democratizing Data, to transform government, workplaces, and our lives, which is scheduled for July publication. Originally I was to co-author the book with Vivek Kundra, Chief Technology Officer of the District of Columbia, and a true trailblazer in this field. However, fortunately for the US, unfortunately for me, President Obama has chosen Vivek to become the Deputy Director of the Office of Management and Budget, in charge of all federal e-government initiatives, so what you see tonight is what you get!

Page 2: Democratizing Data

Now, I don’t know about you, but for me, data used to be good for one thing, and one thing only: figuring the Sox’ batting averages. I’m a right-brained, creative type, and row upon row of numbers left me absolutely cold.

Page 3: Democratizing Data

But increasingly, numbers started to intrude on my life, and I couldn’t ignore them anymore. Numbers such as how much the local aid my town gets was going to be cut...

Page 4: Democratizing Data

… how much the damn war was costing, and how much it was diverting from things we should be doing, such as providing quality health care…

Page 5: Democratizing Data

… and, recently, documenting exactly how dire our situation had become.

Page 6: Democratizing Data

Democratizing Data: How free access will transform our lives

Ignite Boston, Feb. 12, 2009

W. David Stephenson, Stephenson Strategies

But when I got interested in data, I found it was pretty hard to get at. Remember the end of “Raiders of the Lost Ark,” when the Ark of the Covenant was moved to a government warehouse?

You knew it would never be seen again. That’s what seems to happen with a lot of data. We pay taxes so government can collect data, and you can bet companies know all about our shopping habits. Our activities and lives are their raw material. But once the data are collected, most citizens -- and a lot of employees, for that matter -- don’t have a clue where they are stored or how they’re used. Even worse, that robs us of important tools that could improve organizations’ performance and cut their operating costs.

Page 7: Democratizing Data

Fast forward, and lo and behold, in the latest Indiana Jones sequel, Indy retrieved the Ark! In my book, that’s an omen that you can’t keep things hidden forever! Similarly, closely controlled and long-lost data are being liberated by the growing demand for transparency from watchdog groups, the media -- and us -- driven by outrage about how TARP money was or was not spent and concern that the stimulus package be as effective as possible. The time has come to democratize data -- to make it available, when and where people need it to do their jobs or to improve their lives. The result will be change and benefits in every aspect of our lives.

Page 8: Democratizing Data

Beyond shedding light on how government operates, far-reaching and unprecedented change can result when we make reams of data available, plus tools to portray them visually.

Generally acknowledged as the leading thinker on data graphics, Edward Tufte says that even the most skilled statisticians often find representing data visually is the most insightful way of making sense of them:

“Modern data graphics can do much more than simply substitute for small statistical tables. At their best, graphics are instruments for reasoning about quantitative information. Often the most effective way to describe, explore and summarize a set of numbers -- even a very large set -- is to look at pictures of those numbers. Furthermore, of all methods for analyzing and communicating statistical information, well-designed data graphics are usually the simplest and at the same time the most powerful.”

This example is a Google mashup Jon Udell whipped up quickly to highlight pothole complaints to the DC Department of Public Works, and track -- on a real-time basis (because the city releases that data automatically) -- the repairs’ status.

Sure, you might find that information in a chart, but who’d sift through pages of records in hopes of possibly finding the one or two that applied to their neighborhood? By contrast, if you saw this map, and lived near one of the pointers, wouldn’t curiosity compel you to click on it? Wouldn’t the fact that it includes not only information about where the pothole is and when the complaint was made, but also the repair status TODAY, both fascinate you -- and provoke you to call the DPW if it’s now 3 months later and the map shows the repair still hasn’t been made?

Thus, a simple map can be the impetus for citizen awareness -- and greater agency accountability. Incidentally, this example also illustrates an important aspect of data visualizations, reflected in the democratizing data concept: while many are official organization projects, many more are done by individuals or groups with a passion for a specific issue, such as…

Page 9: Democratizing Data

… Rami Tabello’s illegalsigns.ca, documenting illegal billboards in Toronto ….

Page 10: Democratizing Data

…. Adrian Holovaty & Dan O’Neill’s EveryBlock …

Page 11: Democratizing Data

…. and Jacqueline DuPree’s documentation of neighborhood issues in Southeast D.C.

Page 12: Democratizing Data

Some visualizations combine various databases to illustrate convergence, contrasts or possible causality.

This example is Neighborhood Knowledge Los Angeles, a collaboration between UCLA and community activists. Their motto: “neighborhood improvement and recovery is not just for the experts.” This is a great example of democratizing data’s impact, because it combines and maps data on 7 “problem indicators” (including code violations, property tax delinquencies, and fire records) that might otherwise have remained isolated in various databases in various agencies within city government. However, when the data are brought together and displayed on a map of a single block, that’s a red flag to city officials to intervene NOW with coordinated services to halt the decline. For the first time, it’s really possible to break down old barriers and work smart!
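To make the idea of combining isolated indicators concrete, here is a minimal sketch in Python. It is not how Neighborhood Knowledge Los Angeles is actually built: the block IDs, the three sample lists, and the function names `combine_indicators` and `flag_blocks` are all hypothetical, chosen only to show how records from separate agency databases can be grouped by block so that a cluster of problems stands out.

```python
from collections import defaultdict

# Hypothetical extracts from three separate agency databases.
# Each entry is the ID of a block where that problem was recorded.
code_violations = ["block-12", "block-12", "block-40"]
tax_delinquencies = ["block-12", "block-7"]
fire_records = ["block-12", "block-40"]


def combine_indicators(sources):
    """Map each block to the set of indicator names recorded against it."""
    by_block = defaultdict(set)
    for indicator, blocks in sources.items():
        for block in blocks:
            by_block[block].add(indicator)
    return dict(by_block)


def flag_blocks(by_block, threshold):
    """Return, in sorted order, blocks with at least `threshold` distinct indicators."""
    return sorted(
        block for block, indicators in by_block.items()
        if len(indicators) >= threshold
    )


combined = combine_indicators({
    "code_violation": code_violations,
    "tax_delinquency": tax_delinquencies,
    "fire_record": fire_records,
})
flagged = flag_blocks(combined, threshold=2)
print(flagged)  # ['block-12', 'block-40']
```

No single list singles out block-12 on its own; only the combined view shows that all three kinds of trouble converge there.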

Page 13: Democratizing Data

“… put together big enough and diverse enough groups of people & ask them to make decisions affecting [the] general interest, [and] that group's decisions will, over time, be intellectually superior to the isolated individual, no matter how smart or well-informed he is.” -- The Wisdom of Crowds

Equally important, web-based data visualization sites often include a variety of community-building Web 2.0 tools such as topic hubs, tags, and discussion areas. They make it easy to focus many individuals’ and groups’ attention on a policy issue, increasing the chance that new insights will emerge precisely because of the interplay of so many perspectives. What could be more democratic?

As James Surowiecki wrote in “The Wisdom of Crowds,” “… put together big enough and diverse enough groups of people & ask them to make decisions affecting matters of general interest, [and] that group's decisions will, over time, be intellectually superior to the isolated individual, no matter how smart or well-informed he is.”

Page 14: Democratizing Data

<breakfast_menu>
  <food>
    <name>Belgian Waffles</name>
    <price>$5.95</price>
    <description>two of our famous Belgian Waffles with plenty of real maple syrup</description>
    <calories>650</calories>
  </food>
  <food>
    <name>Homestyle Breakfast</name>
    <price>$6.95</price>
    <description>two eggs, bacon or sausage, toast, and our ever-popular hash browns</description>
    <calories>950</calories>
  </food>
</breakfast_menu>

1st: tag & syndicate data

But the devil’s in the details. As of now, far too little data is available, in a timely and usable fashion, to workers, regulators, and/or the public.

It’s time to switch to a data-centric approach, in which usable data is accessible to all sorts of applications and devices, automatically.

The first, and most important, step is to structure data, in formats such as XML or KML, that will allow the data to be identified and read by both programs and devices.
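Why do tags matter? Because once each value is labeled, a program can pick out exactly the fields it needs without a human reading the page. The sketch below is only an illustration: it uses Python's standard `xml.etree.ElementTree` module on the breakfast-menu sample from the slide, and the function name `read_menu` and the 700-calorie cutoff are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# The breakfast-menu sample from the slide, as a well-formed XML string.
MENU_XML = """
<breakfast_menu>
  <food>
    <name>Belgian Waffles</name>
    <price>$5.95</price>
    <description>two of our famous Belgian Waffles with plenty of real maple syrup</description>
    <calories>650</calories>
  </food>
  <food>
    <name>Homestyle Breakfast</name>
    <price>$6.95</price>
    <description>two eggs, bacon or sausage, toast, and our ever-popular hash browns</description>
    <calories>950</calories>
  </food>
</breakfast_menu>
"""


def read_menu(xml_text):
    """Return one dict per <food> element, with numeric price and calories."""
    root = ET.fromstring(xml_text)
    items = []
    for food in root.findall("food"):
        items.append({
            "name": food.findtext("name"),
            # Strip the leading dollar sign so the price can be compared as a number.
            "price": float(food.findtext("price").lstrip("$")),
            "calories": int(food.findtext("calories")),
        })
    return items


menu = read_menu(MENU_XML)
# Because the data is tagged, filtering on a field takes one line.
light = [item["name"] for item in menu if item["calories"] < 700]
print(light)  # ['Belgian Waffles']
```

The same few lines would work just as well on crime reports or building permits, provided the publisher tags each field consistently.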

Equally important, the data must be syndicated, in streams such as RSS or Atom, where it will be automatically delivered without any additional effort on users’ part.

In fact, Princeton researchers last year released a paper making a startling assertion. They said the single most important step government can take to make web sites that really serve the public is to concentrate its attention on data streams: “Rather than struggling, as it currently does, to design sites that meet each end-user’s need, we argue that the executive branch should focus on creating a simple, reliable and publicly accessible infrastructure that exposes the underlying data.”
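What might "exposing the underlying data" as a stream look like? Here is a minimal sketch, assuming an agency holds its records as simple rows and wants to publish them as an RSS 2.0 feed. The records, the feed title, the example.org link, and the function name `build_rss` are all hypothetical; a production feed would also carry dates and links for each item.

```python
import xml.etree.ElementTree as ET


def build_rss(title, link, description, records):
    """Build a minimal RSS 2.0 document (as a string) from a list of record dicts."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for record in records:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = record["title"]
        ET.SubElement(item, "description").text = record["description"]
        # The guid lets feed readers recognize an item they have already seen.
        ET.SubElement(item, "guid", isPermaLink="false").text = record["id"]
    return ET.tostring(rss, encoding="unicode")


# Hypothetical service-request records, for illustration only.
records = [
    {"id": "req-001", "title": "Pothole reported", "description": "Status: open"},
    {"id": "req-002", "title": "Pothole repaired", "description": "Status: closed"},
]
feed = build_rss(
    "Example service requests",
    "https://example.org/feed",
    "Illustrative feed of service requests",
    records,
)
print(feed)
```

Once a feed like this exists, any feed reader, mapping tool, or watchdog's script can subscribe to it, and the agency does not have to anticipate each of those uses.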

Page 15: Democratizing Data

Transparency begins at home

2nd: give workers data they need

Curiously, although a growing range of government agencies release public data streams, almost none provide them to their own workforces, to give workers actionable data precisely when and where they need it, to do their work more efficiently. Agencies -- and corporations -- need to follow the District of Columbia's lead, and apply the same strategy behind the firewall first.

After all, agencies’ employees may be struggling with incompatible databases, may need to reach across agency “silos” to see if there might be synergies between programs, and may gain new insights from employees of another agency simply because of their differing life experiences and expertise.

Also, as more young workers, who have never known life without the Web, join governmental workforces, they’ll naturally ask why tools they’ve used can’t be used in government. A data graphics project can empower them and tap their expertise.

Page 16: Democratizing Data

3rd: release the data


Several federal and state agencies now publish a variety of data feeds.

The most exciting model in the US is the District of Columbia’s Citywide Data Warehouse. It provides real-time numerical and geospatial feeds, drawn from more than 250 data sets, ranging from crime reports to building permits to all purchase orders over $2,000.

Anyone may access the feeds. In fact, a major reason why they are issued is to invite the media, community groups and watchdog organizations to examine -- closely -- the District’s internal operations, and to hold them accountable. After a long legacy of corruption, the DC government is earning public confidence, not through patronizing platitudes, but a transparent “don’t trust us, track us” invitation to check the facts.

Given the loss of confidence in the federal government and industry in the wake of the financial collapse, it is urgent that they follow the District of Columbia’s lead.

Page 17: Democratizing Data

4th: make public co-creators

• $50,000

• 30 days

• 47 apps

• 4,000% ROI!

Finally, the cutting edge of democratizing data is using it to invite your customers or citizens to become co-creators. That’s what my co-author, Vivek Kundra, did as Chief Technology Officer of the District of Columbia. His Apps for Democracy contest was open to any developer, anywhere. Developers were invited to use one or more of DC’s data feeds to create an open source app that would benefit the public. In one month, they created 47 different usable apps, at a total cost to DC of $50,000 -- $20,000 of that for prizes -- for an estimated ROI of 4,000%. Now that Vivek has been named director of e-gov for the Obama Administration, look for this same sort of innovative public partnership to be replicated nationwide.
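For anyone who wants to check the arithmetic, here is a rough sketch. It assumes the common definition ROI = (value - cost) / cost; the talk does not say how DC arrived at its estimate, so the implied value below is a back-of-the-envelope figure derived from the two numbers on the slide, not a number DC reported. The function name `implied_value` is made up for the example.

```python
def implied_value(cost, roi_percent):
    """Value implied by a cost and an ROI, assuming ROI = (value - cost) / cost."""
    return cost * (1 + roi_percent / 100)


cost = 50_000        # total cost to DC, from the slide
roi_percent = 4_000  # estimated ROI, from the slide
apps = 47            # apps created, from the slide

value = implied_value(cost, roi_percent)
cost_per_app = cost / apps

print(f"Implied value: ${value:,.0f}")            # Implied value: $2,050,000
print(f"Cost per app:  ${cost_per_app:,.0f}")     # Cost per app:  $1,064
```

Under that assumption, $50,000 bought roughly $2 million worth of software, at a little over $1,000 per app.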

Page 18: Democratizing Data

Benefits:

• More informed policy debate

• Consensus building

• Better legislation

• Transparency

• Less corruption

• Efficiency

• Lower costs

• Co-creating

The potential benefits of democratizing data are many, and varied:

• more informed policy debate, grounded in fact, rather than rhetoric

• consensus building

• better legislation

• greater transparency and less corruption: greater accountability

• optimizing program efficiency and reducing costs

• new perspectives, especially when “the wisdom of crowds” emerges.

Who would have believed that dry data -- with a healthy dose of Web 2.0 magic -- could become the engine to involve the public in governmental transformation!

Page 19: Democratizing Data

To learn more about democratizing data, contact:

W. David Stephenson, Stephenson Strategies

335 Main Street, Medfield, MA 02052

(508) 740-8918

[email protected]

… and watch for “Democratizing Data,” coming in July from O’Reilly Media!

To learn more about transparent government and how to create the processes and policies to make it a reality, contact:

Stephenson Strategies 335 Main Street, Medfield, MA 02052 (617) 314-7858 [email protected]