Optimising Google's Knowledge Graph - #SMX Munich

International Freelance SEO

SEO Consultant Internet Advantage

Brand Ambassador Majestic

Cycling & Skating

Science: Physics in particular

Search engines have to jump on the bandwagon:

[Chart: percentage per region (North America, South America, Europe, Asia, Africa, Oceania, Global) on a 0% to 50% axis, February 2014 versus February 2015. Source: Statcounter]

Even more awesome!

Artificial Intelligence and Machine Learning

Algorithms and Theory

Human-Computer Interaction and Visualization

Natural Language Processing

Machine Perception

Information Retrieval and the Web

Security, Cryptography, and Privacy

Data Mining

Software Systems

What happened during the past 8 years?

2007 2010 2015

From a database to search engine result pages

Now… Let’s be honest

And to be really honest…

Basic information retrieval

http://arxiv.org/pdf/1503.00759.pdf

http://research.google.com/pubs/pub41894.html

Four different methods to extract triples from web content

Natural Language Processing tools

Entity recognition

Entity linkage

Entity verification against Freebase

Source: https://www.cs.cmu.edu/~nlao/publication/2014.kdd.pdf

Document Object Model

Either text or database driven “deep web” sources

Think of querying HTML forms

570M tables on the web

Relations are difficult to extract

Schema matching methods

Entity verification against Freebase

Schema.org

Mostly people related

Products & Events are not stored

Mapping Schema.org to Freebase for predicates

Researchers treat “duplicate content” as just one source

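A minimal sketch of what "one source" could mean in practice: pages carrying identical copy are collapsed into one fingerprint before counting how many sources support a triple. The function names, the normalisation rule and the sample sentences are assumptions for illustration only, not the method used in the papers.

```python
import hashlib
from collections import defaultdict


def content_fingerprint(text: str) -> str:
    """Hash of case- and whitespace-normalised text, so exact copies collide."""
    normalised = " ".join(text.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def count_independent_sources(observations):
    """Count distinct content fingerprints per triple.

    observations: iterable of (triple, page_text) pairs, where triple is a
    (subject, predicate, object) tuple.
    """
    fingerprints = defaultdict(set)
    for triple, page_text in observations:
        fingerprints[triple].add(content_fingerprint(page_text))
    return {triple: len(prints) for triple, prints in fingerprints.items()}


# Illustrative data: the second page is a copy of the first.
everest = ("Mount Everest", "height", "8848 m")
observations = [
    (everest, "Mount Everest is 8848 m high."),
    (everest, "mount everest is  8848 m high."),
    (everest, "At 8848 m, Everest is the highest peak."),
]
support = count_independent_sources(observations)
```

Three pages mention the fact, but only two count as independent sources because the copied page adds no new evidence.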

Exploring the power of tables on the Web

https://research.google.com/tables

The papers share some insights about the factors relevant to Google Tables results

Sources of data Google uses according to the paper

Surrounding content: Optimise the surrounding content with relevant captions and texts.

Table headings: Use <th> table headings to add labels to specific columns.

Attributes: Add relevant attributes to your table headings, focusing on the queries used.

Table contents: Only add useful content to the table. Boilerplate content is filtered out.
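The table advice above can be checked with a small audit script. This is only a sketch using Python's standard html.parser; the class name TableAudit and the sample table are hypothetical, and it tests for nothing more than the presence of a caption and <th> headings.

```python
from html.parser import HTMLParser


class TableAudit(HTMLParser):
    """Records whether a caption exists and collects the text of <th> cells."""

    def __init__(self):
        super().__init__()
        self.has_caption = False
        self.headings = []
        self._in_th = False

    def handle_starttag(self, tag, attrs):
        if tag == "caption":
            self.has_caption = True
        elif tag == "th":
            self._in_th = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag == "th":
            self._in_th = False

    def handle_data(self, data):
        # Only text inside a <th> cell is collected as a column label.
        if self._in_th:
            self.headings[-1] += data.strip()


# Illustrative table with a caption and labelled columns.
html = """
<table>
  <caption>HDMI cable versions compared</caption>
  <tr><th>Version</th><th>Max bandwidth</th></tr>
  <tr><td>1.4</td><td>10.2 Gbit/s</td></tr>
</table>
"""
audit = TableAudit()
audit.feed(html)
```

After feeding the sample, audit.has_caption and audit.headings show whether the table carries a caption and which column labels a crawler could pick up.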

http://www.cidrdb.org/cidr2015/Papers/CIDR15_Paper3.pdf

“Extraction errors are far more prevalent than source errors. Ignoring this distinction can cause us to incorrectly distrust a website”

Back to the basics for Google (and probably the other search engines too)

Links still tell something about relationships between pages, but also between entities.

Simply search in the indices you already have. In the case of Google, they already have “everything”.

Simply gather user feedback from within the search results.

Source: https://twitter.com/brentnau

One in 20 searches is health related according to Google.

Use web-based fact extraction, like DOM, tables and annotated data (Schema.org)

Text-based extractors adding more triples to the datasets

Systems like those described in the Biperpedia paper. Data is enriched and quality control takes place. Use partnerships for trusted resources.

Use existing datasets like Freebase / Wikidata to verify extracted data and calculate probability
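A rough sketch of the "verify and calculate probability" step, under loud assumptions: extractor confidences are combined with a simple noisy-OR, and a triple found in the reference dataset is treated as confirmed. This is not the model from the Google papers; the function names, the confidence values and the tiny reference set are all hypothetical.

```python
def fuse_confidences(confidences):
    """Noisy-OR: probability that at least one extractor is right,
    assuming (naively) that the extractors err independently."""
    p_all_wrong = 1.0
    for confidence in confidences:
        p_all_wrong *= 1.0 - confidence
    return 1.0 - p_all_wrong


def score_triple(triple, extractor_confidences, reference_kb):
    """Return 1.0 for triples already in the reference dataset,
    otherwise the fused confidence of the extractors that found it."""
    if triple in reference_kb:
        return 1.0
    return fuse_confidences(extractor_confidences.values())


# Hypothetical reference data standing in for Freebase / Wikidata.
reference_kb = {("Mount Everest", "located in", "Himalayas")}

known = score_triple(
    ("Mount Everest", "located in", "Himalayas"),
    {"text": 0.4},
    reference_kb,
)
unknown = score_triple(
    ("Mount Everest", "first ascent", "1953"),
    {"text": 0.5, "tables": 0.5},
    reference_kb,
)
```

Two extractors at 0.5 each give a fused score of 0.75, which illustrates why agreement between the text, DOM, table and annotation extractors raises trust in a triple.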

No, Google also gives a fair chance to competitors

At that moment, notprovided.eu/jan-willem-bobbink.jpg was used as a source

Source: https://twitter.com/CyrusShepard/status/575555722529894400

This Knowledge card is triggered for the query “HDMI cable”.

Google shows a number of related topics, hence:

Source: https://twitter.com/Andrew_Isidoro/status/576363205120905216

Make sure you understand

A few possibilities to influence the content of brand cards

Main source still is Wikipedia, always backup your edits with sources

You are able to give Google hints about your logo, corporate contacts and social profiles

Add schema.org Organization markup to your official website

Find example JSON-LD at https://developers.google.com/structured-data/customize/overview
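As a sketch of what such markup might look like, the snippet below builds schema.org Organization JSON-LD covering the three hints mentioned (logo, corporate contact, social profiles) and wraps it in a script tag. All URLs, the phone number and the helper name to_jsonld_script are placeholders, not values from the talk.

```python
import json

# Placeholder organisation data; replace every value with your own.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "url": "https://www.example.com",
    "logo": "https://www.example.com/logo.png",
    "contactPoint": [
        {
            "@type": "ContactPoint",
            "telephone": "+1-555-0100",
            "contactType": "customer service",
        }
    ],
    "sameAs": [
        "https://twitter.com/example",
        "https://www.facebook.com/example",
    ],
}


def to_jsonld_script(data: dict) -> str:
    """Serialise the markup into a script tag for the page's HTML."""
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


snippet = to_jsonld_script(organization)
```

The resulting snippet can be pasted into the HTML of the official website; validating it with a structured data testing tool before publishing is advisable.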

What about the localised Google search indices?

[Chart: America: 19%; the five other values are shown as “?”]

Source: https://www.stonetemple.com/rich-answers-in-search/

Contains the main subject of the required answer

Within the content, the question is answered in a single sentence

No, Euro NCAP is more authoritative in the EU for car safety levels; NHTSA is the authority for the US.

Two indices, two truths?

So how can we make use of this for our brand?

Since not many are focusing on getting into the Direct Answers yet, grab the positions first!

95% of the cases had increased traffic, after filtering out movements within the top 10 regular blue links.

Less than expected, probably because of the quality of the answer: results varied between -5% and +6% traffic.

Results varied between -3% and +11% depending on previous position in the SERPs

These performed best, with increases between 6% and 14%

Depends on the topic: complicated topics tend to get more clicks. Average results ranged between -2% and +16%.

Be ready for the future, before tomorrow starts

Make sure you monitor everything search engines are researching. Papers, patents and other publications give clues about future developments

There is still a lot of research being done by Google; it will take time & testing to develop the “next Knowledge Graph”

Look at the websites that are already getting results and reverse engineer them.

For all publicly available information, be realistic, since Google will take the SERP real estate in those niches for itself

http://bit.ly/kg-smx



