Text analytics for Google Spreadsheets using dataTXT add-on

description

This add-on lets Google Spreadsheets users enrich the text in their spreadsheets by automatically extracting named entities (such as places, people, events, or concepts) and linking them to Wikipedia through the dataTXT semantic API.

Transcript of Text analytics for Google Spreadsheets using dataTXT add-on

Page 1: Text analytics for Google Spreadsheets using dataTXT add-on

Doing text analysis inside Google Spreadsheets, using the dataTXT add-on

food for thought

Page 2: Text analytics for Google Spreadsheets using dataTXT add-on

wait, wait: what’s text analysis?

Page 3: Text analytics for Google Spreadsheets using dataTXT add-on

what’s text analysis?

turn text into data for analysis

Page 4: Text analytics for Google Spreadsheets using dataTXT add-on

turn text into data for analysis

why it’s useful

• Enterprise Business Intelligence/Data Mining, Competitive Intelligence
• E-Discovery, Records Management
• National Security/Intelligence
• Scientific discovery, especially Life Sciences
• Sentiment Analysis Tools, Listening Platforms
• Natural Language/Semantic Toolkit or Service
• Publishing
• Automated ad placement
• Search/Information Access
• Social media monitoring

Page 5: Text analytics for Google Spreadsheets using dataTXT add-on

#textanalysis #dataTXT #gdrive

turn text into data for analysis

usually you have to be a developer, but now you can do a lot of things directly inside Google Spreadsheets, thanks to the dataTXT add-on

http://bit.ly/dataTXT-googleSheets

Page 6: Text analytics for Google Spreadsheets using dataTXT add-on

but why is it useful?

Page 7: Text analytics for Google Spreadsheets using dataTXT add-on

turn text into data for analysis

infographics, tag clouds, mind maps, graphs, charts…

Page 8: Text analytics for Google Spreadsheets using dataTXT add-on

let’s start from an example…

Page 9: Text analytics for Google Spreadsheets using dataTXT add-on

extract useful information from a news article published on CNN:

http://edition.cnn.com/2014/09/10/world/rosetta-philae-landing-site/index.html?hpt=hp_t3

Page 10: Text analytics for Google Spreadsheets using dataTXT add-on

copy & paste this text into a Google Sheet…

Page 11: Text analytics for Google Spreadsheets using dataTXT add-on

this is just text: we call it “unstructured data”

Page 12: Text analytics for Google Spreadsheets using dataTXT add-on

if we select the cell, launch the dataTXT add-on, and click “Analyze text”…

Page 13: Text analytics for Google Spreadsheets using dataTXT add-on

… we are performing named entity extraction with the dataTXT-NEX API, inside the Google Sheet
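
for the curious, here is a rough idea of what a single entity-extraction request might look like outside the spreadsheet. This is only a sketch: the endpoint URL and the parameter names (`text`, `include`, `min_confidence`, `token`) are assumptions that should be checked against the dandelion.eu documentation, and the token is a placeholder. The code only builds the request URL and sends nothing.

```python
from urllib.parse import urlencode

# Assumed endpoint of the entity-extraction service (verify on dandelion.eu).
NEX_ENDPOINT = "https://api.dandelion.eu/datatxt/nex/v1"


def build_nex_request(text, token, min_confidence=0.6):
    """Return the GET URL for one entity-extraction call (nothing is sent)."""
    params = {
        "text": text,                     # the unstructured text from the cell
        "include": "types,categories",    # assumed flag asking for DBpedia types and categories
        "min_confidence": min_confidence, # assumed threshold on the quality score
        "token": token,                   # placeholder credential
    }
    return NEX_ENDPOINT + "?" + urlencode(params)


url = build_nex_request("Rosetta will land on comet 67P", token="YOUR_TOKEN")
print(url)
```

the add-on hides all of this behind the “Analyze text” button, so you never have to build a request yourself.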

Page 14: Text analytics for Google Spreadsheets using dataTXT add-on

now we find something new: a sheet titled “Analysis” with a lot of useful stuff…

Page 15: Text analytics for Google Spreadsheets using dataTXT add-on

TEXT -> the original content

SPOT -> the label of an “entity”, taken from the original text

CONFIDENCE -> a quality score for the match

WIKIPEDIA URL -> the URL of the entity on Wikipedia

Page 16: Text analytics for Google Spreadsheets using dataTXT add-on

TYPES -> the type of the entity, taken from DBpedia

CATEGORIES -> taken from DBpedia, useful as tags
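
to make the column layout concrete, here is a minimal sketch of how a list of extracted entities could be flattened into rows with these six columns. The field names of the input (`spot`, `confidence`, `uri`, `types`, `categories`) are assumptions about what the API returns, and the sample entity carries made-up values for illustration only; it is not real output of the add-on.

```python
def annotations_to_rows(text, annotations):
    """Build one row per entity, mirroring the columns of the "Analysis" sheet."""
    header = ["TEXT", "SPOT", "CONFIDENCE", "WIKIPEDIA URL", "TYPES", "CATEGORIES"]
    rows = [header]
    for ann in annotations:
        rows.append([
            text,                                  # the original content
            ann["spot"],                           # label as it appears in the text
            ann["confidence"],                     # quality score of the match
            ann["uri"],                            # Wikipedia page of the entity
            ", ".join(ann.get("types", [])),       # DBpedia types, if any
            ", ".join(ann.get("categories", [])),  # DBpedia categories, if any
        ])
    return rows


# Hypothetical annotation with illustrative values.
sample = [{
    "spot": "67P/Churyumov-Gerasimenko",
    "confidence": 0.9,
    "uri": "http://en.wikipedia.org/wiki/67P/Churyumov-Gerasimenko",
    "types": [],
    "categories": ["Comets"],
}]

rows = annotations_to_rows("(article text)", sample)
for row in rows:
    print(row)
```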

Page 17: Text analytics for Google Spreadsheets using dataTXT add-on

so why is it useful?

before: it’s only text

now it’s contextual data

Page 18: Text analytics for Google Spreadsheets using dataTXT add-on

in other words:

the text “67P/Churyumov-Gerasimenko” now has some structured details, like “categories”, a very useful sort of tag set

Page 19: Text analytics for Google Spreadsheets using dataTXT add-on

you can do a lot of things with the dataTXT add-on for Google Sheets

make a tag cloud using concept labels (typed concepts)

extract the people cited across a lot of content

build graphs and charts using the types found inside the content

extract some data from a lot of tweets (useful for social media consultants working with not-so-big data)

find useful keywords to enrich your content (better SEO?)

enrich your content with useful links to contextual Wikipedia pages
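
a couple of these ideas can be sketched in a few lines once the “Analysis” sheet is exported. The example below assumes each row has already been read into a dictionary with `SPOT`, `TYPES` and `CATEGORIES` keys; the rows themselves are made up for illustration, and the type names (`Person`, `CelestialBody`) are placeholders rather than guaranteed DBpedia values.

```python
from collections import Counter

# Made-up rows in the shape of the "Analysis" sheet (illustrative values only).
rows = [
    {"SPOT": "67P/Churyumov-Gerasimenko", "TYPES": ["CelestialBody"], "CATEGORIES": ["Comets"]},
    {"SPOT": "Jane Doe", "TYPES": ["Person", "Scientist"], "CATEGORIES": ["Astronomers"]},
    {"SPOT": "comet", "TYPES": [], "CATEGORIES": ["Comets"]},
]


def tag_cloud_weights(rows):
    """Count how often each category occurs: the weight of each tag in a cloud."""
    return Counter(cat for row in rows for cat in row["CATEGORIES"])


def people(rows):
    """Return the distinct spots whose types include the placeholder 'Person'."""
    return sorted({row["SPOT"] for row in rows if "Person" in row["TYPES"]})


def type_counts(rows):
    """Count entities per type, ready to feed a bar or pie chart."""
    return Counter(t for row in rows for t in row["TYPES"])


weights = tag_cloud_weights(rows)
print(weights.most_common())
print(people(rows))
print(type_counts(rows))
```

inside the spreadsheet itself, a pivot table over the same columns gives similar counts without any code.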

Page 20: Text analytics for Google Spreadsheets using dataTXT add-on

and all of this without programming :) and inside your own Google Spreadsheet!

democratizing text analytics!

Page 21: Text analytics for Google Spreadsheets using dataTXT add-on

and if you are a smart guy, or a data journalist for example,

you can do something better…

Page 22: Text analytics for Google Spreadsheets using dataTXT add-on

use your Google Spreadsheet as a little database to build smart interactive web pages

Google Spreadsheet (unstructured data) + dataTXT -> Google Spreadsheet (structured data)
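
one way to picture the “little database” idea: assuming the structured sheet has been exported as CSV, a few lines turn it into JSON records that a web page could load. This is a sketch under that assumption; the column names and the sample row are made up for illustration.

```python
import csv
import io
import json


def sheet_csv_to_records(csv_text):
    """Parse CSV text exported from the sheet into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


# Hypothetical export of the "Analysis" sheet (illustrative values only).
sample_csv = (
    "SPOT,CONFIDENCE,CATEGORIES\n"
    "67P/Churyumov-Gerasimenko,0.9,Comets\n"
)

records = sheet_csv_to_records(sample_csv)
page_data = json.dumps(records, indent=2)  # JSON string a web page could consume
print(page_data)
```

note that CSV values come back as strings, so a numeric column such as CONFIDENCE needs converting before it is charted.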

Page 23: Text analytics for Google Spreadsheets using dataTXT add-on

and don’t forget: you are using data taken from the Linked Open Data Cloud without needing to know anything about it!

Page 24: Text analytics for Google Spreadsheets using dataTXT add-on

How to install the dataTXT add-on for Google Sheets

Page 25: Text analytics for Google Spreadsheets using dataTXT add-on

inside a Google Sheet, look for “dataTXT” in the add-ons store…

Page 26: Text analytics for Google Spreadsheets using dataTXT add-on

or use this link:

http://bit.ly/dataTXT-googleSheets

Page 27: Text analytics for Google Spreadsheets using dataTXT add-on

there is a tutorial on dandelion.eu that shows how to set it up

http://bit.ly/howto-dataTXT-on-google-sheet

Page 28: Text analytics for Google Spreadsheets using dataTXT add-on

Unleash your creativity, give it a try!

http://bit.ly/dataTXT-googleSheets

@SpazioDati