PyData DC 2016: A DOC Conundrum

Post on 09-Jan-2017

a DOC Conundrum

Star Ying, Data Scientist at Department of Commerce

first, some background

grow the economy

⅓ of all federal public data

new data released constantly

$

a DOC conundrum

how to impart better understanding of our data

of any data

so really a data conundrum

a different perspective

def wdtd()……

#wdtd: what did this do?

is inherited

a quick quiz

american community survey

how many errata have been issued for 2016?

http://www.census.gov/programs-surveys/acs/technical-documentation/errata.html

viirs nighttime lights

which pixels are really blank?

http://ngdc.noaa.gov/eog/viirs/download_monthly.html
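
A minimal sketch of one way to approach the question, assuming each radiance grid comes with a companion grid that counts cloud-free observations per pixel. Under that assumption, a zero radiance backed by zero observations is missing, not dark. The function name and the tiny sample grids are hypothetical, and reading the actual raster files is not shown.

```python
def mask_unobserved(radiance, cloud_free_count):
    """Return a copy of radiance with None wherever a pixel was never observed.

    Both arguments are lists of rows with the same shape. A pixel counts as
    observed only if its cloud-free observation count is greater than zero.
    """
    masked = []
    for rad_row, count_row in zip(radiance, cloud_free_count):
        masked.append([
            rad if count > 0 else None
            for rad, count in zip(rad_row, count_row)
        ])
    return masked


# Hypothetical 2x3 grids for illustration only, not real VIIRS values.
radiance = [[0.0, 1.5, 0.0], [2.25, 0.0, 0.5]]
cloud_free_count = [[4, 6, 0], [3, 0, 5]]

masked = mask_unobserved(radiance, cloud_free_count)
```

Here the first pixel stays at 0.0 because it was observed four times and really is dark, while the two pixels with a count of zero become None.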

survey of income and program participation

which weights do I use?

http://www.census.gov/programs-surveys/sipp/methodology/weighting.html
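
Why the choice matters, as a minimal sketch: the same responses give different estimates under different weights. The values and both weight columns below are made up for illustration and are not SIPP variables; the right weight for a given analysis is described at the link above.

```python
def weighted_mean(values, weights):
    """Weighted mean of values; raises if the weights sum to zero."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical responses and two hypothetical weight columns.
incomes = [1000.0, 2000.0, 3000.0]
weights_a = [1.0, 1.0, 2.0]
weights_b = [2.0, 1.0, 1.0]

estimate_a = weighted_mean(incomes, weights_a)  # (1000 + 2000 + 6000) / 4
estimate_b = weighted_mean(incomes, weights_b)  # (2000 + 2000 + 3000) / 4
```

The two estimates differ by 500 on identical responses, which is the kind of gap a user cannot explain without documentation of the weights.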

how do we convey the necessary information to use our product

now, an anecdote

can we tie satellite images to economic activity?

accounted for cloud coverage, population, etc...

forgot the earth is a sphere
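
One reading of this slide, offered as an assumption and not as a description of the actual analysis: on an equal-angle latitude/longitude grid, a pixel covers less ground the farther it is from the equator, so per-pixel totals are not comparable across latitudes without an area correction. The sketch below uses a spherical approximation; the 15 arc-second spacing and the function name are assumptions for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical approximation


def pixel_area_km2(lat_deg, cell_deg):
    """Ground area of a cell_deg x cell_deg cell centered at lat_deg.

    Uses the exact area of a latitude/longitude rectangle on a sphere:
    R^2 * dlon * (sin(lat_top) - sin(lat_bottom)).
    """
    half = math.radians(cell_deg) / 2
    lat = math.radians(lat_deg)
    dlon = math.radians(cell_deg)
    return EARTH_RADIUS_KM ** 2 * dlon * (
        math.sin(lat + half) - math.sin(lat - half)
    )


cell = 15 / 3600  # 15 arc-seconds in degrees (assumed grid spacing)

area_equator = pixel_area_km2(0.0, cell)
area_60n = pixel_area_km2(60.0, cell)
ratio = area_60n / area_equator  # close to cos(60 degrees)
```

At 60 degrees north a pixel covers about half the ground of a pixel at the equator, so summing raw pixel values treats the two as equal when they are not.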

tendency to silo ourselves

real insights and outcomes can only be derived from a true synthesis of knowledge of the data and the mechanics of processing it

so what are we doing about it?

it is a communication problem

Data Usability: commerce.gov/datausability

handcrafted tutorials with working open code

I’d love to hear your ideas

no really

sying@doc.gov