BigClean, Nov 2012 Filip Hráček · Google Confidential and Proprietary Big data is hard. 1. Data...

Google Confidential and Proprietary

Tools for open data
Filip Hráček
BigClean, Nov 2012

Big data is hard.

1. Data gathering – hard
2. Data refinement – hard
3. Data analysis – super hard
4. Data sharing – hard

Data gathering

Public Data Explorer
http://www.google.com/publicdata/directory

Fusion Tables (Public)
http://research.google.com/tables

Google Trends
http://www.google.cz/trends

Google Trends Correlate
http://www.google.com/trends/correlate

Google Ngram Viewer
http://books.google.com/ngrams

Data refinement

Open Refine
http://code.google.com/p/google-refine/

Data analysis

Big data.

1. Text editor
2. Excel
3. Local database

Google BigQuery
https://bigquery.cloud.google.com/

Google Fusion Tables
http://www.google.com/fusiontables
http://research.google.com/tables

Data sharing

Google Fusion Tables
http://www.google.com/fusiontables

Google Fusion Tables – in the wild
http://www.guardian.co.uk/news/datablog/interactive/2011/aug/16/riots-poverty-map

Google Docs – in the wild
http://data.blog.ihned.cz/c1-57386250-hledejte-s-nami-fakta-v-projevu-davida-ratha

Big thank you! (for your attention)

Filip Hráček, Google Czech Republic