Post on 22-Jan-2018
GraphAware®
RELEVANT SEARCH LEVERAGING KNOWLEDGE GRAPHS WITH NEO4J
Alessandro Negro, Chief Scientist @ GraphAware
graphaware.com | @graph_aware, @AlessandroNegro
‣ The rise of Knowledge Graphs
‣ Relevant Search
‣ Knowledge Graphs for e-Commerce
‣ Infrastructure
‣ Conclusions
OUTLINE
“Knowledge graphs provide contextual windows
into master data domains and the links between
domains”
KNOWLEDGE GRAPH: CONNECTING THE DOTS
The Forrester Wave, Master Data Management
THE RISE OF KNOWLEDGE GRAPHS
E-Commerce
‣ Many data sources
‣ Marketing strategies
‣ Business goals
‣ Category hierarchies
‣ Searches
Enterprise Networks
‣ Uncover new opportunities, hidden leads
Finance
‣ Textual corpora such as financial documents contain a wealth of knowledge
‣ Structured knowledge of entities and relationships
Medicine & Health
‣ Dynamic ontologies where data is categorized and organised around people, places, things and events
‣ Patterns in disease progression, causal relations involving disease and symptoms, new relationships previously unrecognised
Criminal Investigation & Intelligence
‣ Obfuscated information
‣ Traceability to sources of information
THE RISE OF KNOWLEDGE GRAPHS
DATA SPARSITY PROBLEM
Collaborative Filtering
‣ Cold Start
Content Based Recommendation
‣ Missing Data
‣ Wrong Data
Text Search
‣ User agnostic
‣ Relevant Search
KNOWLEDGE GRAPH: DATA CONVERGENCE
RELEVANT SEARCH
“Relevance is the practice of improving search
results for users by satisfying their information
needs in the context of a particular user
experience, while balancing how ranking
impacts business’s needs.”
RELEVANT SEARCH DIMENSIONS
KNOWLEDGE GRAPHS: THE MODEL
Search architecture must be able to handle highly heterogeneous data
Knowledge Graphs represent the information structure for relevant search
Graphs are the right representation for:
‣ Information Extraction
‣ Recommendation Engines
‣ Context Representation
‣ Rule Engine
Critical aspects and peculiarities:
‣ Defined and controlled set of searchable Items
‣ Multiple category hierarchies
‣ Marketing strategy
‣ User feedback and interactions
‣ Supplier information
‣ Business constraints
THE USE CASE: E-COMMERCE
→ Text search and catalog navigation act as the salespeople
KNOWLEDGE GRAPH FOR E-COMMERCE
INFRASTRUCTURE: A 10K-FOOT VIEW
A graph-centric approach
THE DATA FLOW
‣ Async data ingestion
‣ Data Pipeline
‣ Single Neo4j Writer
‣ Microservice approach for isolation and scalability
‣ Event notification
‣ Multiple views exported into Elasticsearch
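The data flow listed above can be sketched in a few lines. This is a minimal illustration only, not the architecture from the talk: the `graph` dictionary stands in for Neo4j, `search_views` stands in for the Elasticsearch views, and every name (`ingest_queue`, `event_queue`, `single_writer`, `exporter`, `run_pipeline`) is hypothetical. It shows producers that only enqueue data, one writer that owns all graph writes, and an exporter that reacts to change events.

```python
import queue
import threading

# Hypothetical in-memory stand-ins: `graph` plays the role of Neo4j,
# `search_views` the role of the views exported into Elasticsearch.
graph = {}
search_views = {"catalog": {}, "autocomplete": {}}

ingest_queue = queue.Queue()  # async ingestion: producers only enqueue
event_queue = queue.Queue()   # change notifications emitted by the writer


def single_writer():
    """The only component that writes to the graph."""
    while True:
        item = ingest_queue.get()
        if item is None:             # sentinel: shut down and tell the exporter
            event_queue.put(None)
            break
        graph[item["id"]] = item
        event_queue.put(item["id"])  # notify downstream consumers


def exporter():
    """Reacts to change events and refreshes each search view."""
    while True:
        node_id = event_queue.get()
        if node_id is None:
            break
        node = graph[node_id]
        search_views["catalog"][node_id] = {
            "name": node["name"],
            "category": node["category"],
        }
        search_views["autocomplete"][node_id] = node["name"].lower()


def run_pipeline(items):
    writer = threading.Thread(target=single_writer)
    export = threading.Thread(target=exporter)
    writer.start()
    export.start()
    for item in items:
        ingest_queue.put(item)
    ingest_queue.put(None)
    writer.join()
    export.join()


run_pipeline([
    {"id": "p1", "name": "Trail Shoes", "category": "footwear"},
    {"id": "p2", "name": "Rain Jacket", "category": "outerwear"},
])
```

Funnelling every write through one consumer keeps the graph consistent without locking in the producers, and the event queue decouples the search views from the ingestion rate.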
THE NEO4J ROLES
‣ Single source of truth
‣ Cleansing
‣ Fast access to connected data
‣ Query
‣ Knowledge Graph store
‣ Merging External Data
‣ Existing Data Augmentation
Natural Language Processing
‣ Unsupervised Topic Identification
‣ Word2Vec
‣ Clustering (Label Propagation)
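As an illustration of the clustering item, here is a minimal label propagation sketch in plain Python. It is not the implementation used in the talk: the function name, the keyword graph, and the deterministic tie-break (smallest label wins) are assumptions made so that the example is reproducible. Each node starts with its own label and repeatedly adopts the most frequent label among its neighbours.

```python
def label_propagation(adjacency, max_rounds=20):
    """Cluster nodes by repeatedly adopting the most common neighbour label."""
    labels = {node: node for node in adjacency}  # every node starts alone
    for _ in range(max_rounds):
        changed = False
        for node in sorted(adjacency):           # fixed order for reproducibility
            counts = {}
            for neighbour in adjacency[node]:
                label = labels[neighbour]
                counts[label] = counts.get(label, 0) + 1
            if not counts:
                continue                         # isolated node keeps its label
            best = max(counts.values())
            # Assumed tie-break: the smallest label among the most frequent ones.
            new_label = min(l for l, c in counts.items() if c == best)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break
    return labels


# Hypothetical keyword graph: two groups of co-occurring terms.
keyword_graph = {
    "boots": ["hiking", "trail"],
    "hiking": ["boots", "trail"],
    "trail": ["boots", "hiking"],
    "laptop": ["mouse", "screen"],
    "mouse": ["laptop", "screen"],
    "screen": ["laptop", "mouse"],
}
clusters = label_propagation(keyword_graph)
```

On this toy graph the two disconnected groups end up with two distinct labels, one per group.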
EXTERNALISE INTENSE PROCESSES
Recommendation model building
‣ Content-Based
‣ Collaborative Filtering (internal and external)
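To make "model building" concrete for collaborative filtering, the sketch below builds an item-to-item similarity table from user interactions and uses it to recommend. It is one common variant (item-based, cosine similarity over binary interactions) chosen for illustration, not necessarily the one used in the talk; the data and the function names are hypothetical.

```python
from math import sqrt

# Hypothetical purchase history: user -> set of items.
interactions = {
    "alice": {"tent", "stove"},
    "bob": {"tent", "stove", "lantern"},
    "carol": {"tent", "lantern"},
}


def item_similarity(interactions):
    """Build the model: cosine similarity between items over shared users."""
    users_by_item = {}
    for user, items in interactions.items():
        for item in items:
            users_by_item.setdefault(item, set()).add(user)
    sims = {}
    for a, users_a in users_by_item.items():
        for b, users_b in users_by_item.items():
            overlap = len(users_a & users_b)
            if a != b and overlap:
                sims[(a, b)] = overlap / sqrt(len(users_a) * len(users_b))
    return sims


def recommend(user, interactions, sims):
    """Score items the user does not own by similarity to items they own."""
    owned = interactions[user]
    scores = {}
    for (a, b), similarity in sims.items():
        if a in owned and b not in owned:
            scores[b] = scores.get(b, 0.0) + similarity
    return sorted(scores, key=lambda item: (-scores[item], item))


model = item_similarity(interactions)
suggestions = recommend("alice", interactions, model)
```

The similarity table is the expensive part, which is why it suits being computed outside the request path and then stored back alongside the data it was derived from.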
Fast, Reliable and Easy-to-tune textual searches
‣ Multiple views for multiple scopes:
‣ Catalog Navigation and Search
‣ Faceting
‣ Product details page
‣ Product variants aggregation
‣ Autocomplete
‣ Suggestion
THE ELASTICSEARCH ROLES
→ It is not used as a database
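The idea of "multiple views for multiple scopes" can be sketched as several projections of the same product record, each shaped for one search feature. The record layout, field names, and the three functions below are hypothetical and no Elasticsearch client is involved; the dictionaries only show what such documents might contain before they are indexed.

```python
# Hypothetical product record, as it might be read from the graph.
product = {
    "id": "p1",
    "name": "Trail Shoes",
    "brand": "Acme",
    "categories": ["Footwear", "Outdoor"],
    "variants": [
        {"sku": "p1-42", "size": 42, "price": 89.0},
        {"sku": "p1-43", "size": 43, "price": 95.0},
    ],
}


def catalog_view(p):
    """Navigation and faceting: one document per product, variants aggregated."""
    return {
        "id": p["id"],
        "name": p["name"],
        "brand": p["brand"],            # facet
        "categories": p["categories"],  # facet
        "sizes": sorted(v["size"] for v in p["variants"]),
        "min_price": min(v["price"] for v in p["variants"]),
    }


def autocomplete_view(p):
    """Autocomplete: lower-cased prefixes of the product name."""
    name = p["name"].lower()
    return {"id": p["id"], "prefixes": [name[:i] for i in range(1, len(name) + 1)]}


def details_view(p):
    """Product details page: the full record, one entry per variant."""
    return {"id": p["id"], "name": p["name"], "variants": list(p["variants"])}


views = {
    "catalog": catalog_view(product),
    "autocomplete": autocomplete_view(product),
    "details": details_view(product),
}
```

Because every view is derived from the same record, each can be rebuilt at any time, which fits the point that the search engine is not the system of record.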
A signal is any component of the relevance-scoring calculation that corresponds to meaningful, measurable information
Two techniques to control relevancy:
‣ Signal Modeling
‣ Ranking Function
Note: balance precision and recall
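A small sketch of the two techniques and of the precision and recall note, under simplifying assumptions: signals are precomputed numbers between 0 and 1, and the ranking function is a plain weighted sum. The signal names (`text_match`, `popularity`, `margin`), the weights, and the documents are all hypothetical.

```python
def rank(documents, weights):
    """Ranking function: order documents by a weighted sum of their signals."""
    def score(doc):
        return sum(w * doc["signals"].get(name, 0.0) for name, w in weights.items())
    return sorted(documents, key=score, reverse=True)


def precision_recall(returned, relevant):
    """Precision: share of returned items that are relevant.
    Recall: share of relevant items that were returned."""
    hits = len(set(returned) & set(relevant))
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Signal modeling: each document carries measurable, meaningful values.
documents = [
    {"id": "d1", "signals": {"text_match": 0.9, "popularity": 0.2, "margin": 0.1}},
    {"id": "d2", "signals": {"text_match": 0.6, "popularity": 0.9, "margin": 0.5}},
    {"id": "d3", "signals": {"text_match": 0.1, "popularity": 0.3, "margin": 0.9}},
]
weights = {"text_match": 0.6, "popularity": 0.3, "margin": 0.1}

ranked_ids = [doc["id"] for doc in rank(documents, weights)]
precision, recall = precision_recall(ranked_ids[:2], relevant=["d1", "d3"])
```

Changing the weights changes the order without touching the signals, so the two techniques can be tuned separately.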
Multiple sources
CRAFTING SIGNALS
→ Users as a new source of information
Profile-based personalisation:
‣ Explicit: Users provide profile information
‣ Implicit: Profile created from user interactions
Behaviour-based personalisation:
‣ Focus on User-Item Interaction
‣ Make explicit the relationships among users and items
PERSONALISING SEARCH
Tying personalisation back to search
‣ Query-time personalisation
‣ Index-time personalisation
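Query-time personalisation with an implicit profile can be sketched as follows. This is an assumption-laden illustration, not the method from the talk: the profile is just the share of a user's interactions per category, and the boost formula, function names, and data are hypothetical.

```python
def build_profile(interacted_categories):
    """Implicit profile: category -> share of the user's interactions."""
    counts = {}
    for category in interacted_categories:
        counts[category] = counts.get(category, 0) + 1
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def personalise(results, profile, boost=1.0):
    """Query-time step: rescale each base score by the user's affinity."""
    rescored = [
        (score * (1.0 + boost * profile.get(category, 0.0)), item)
        for item, category, score in results
    ]
    return [item for _, item in sorted(rescored, reverse=True)]


# Hypothetical data: categories the user clicked, and base search results
# as (item, category, base relevance score).
profile = build_profile(["running", "running", "hiking"])
results = [
    ("dress shoes", "formal", 1.0),
    ("running shoes", "running", 0.8),
    ("hiking boots", "hiking", 0.7),
]
personalised = personalise(results, profile)
```

An index-time variant would instead store user-derived information on the indexed documents themselves, moving the cost from query time to indexing.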
→ Search for things, not for strings
CONCEPT SEARCH
Basic Approaches:
‣ Concept field (Manual Tagging)
‣ Synonyms
Content Augmentation (ML based)
‣ Co-occurrence
‣ Latent Semantic Analysis
‣ Latent Dirichlet Allocation
‣ Word2Vec
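Of the content augmentation techniques listed, co-occurrence is the simplest to sketch. The example below counts how often two terms appear in the same document and adds frequently co-occurring terms to a token list. The function names, the `min_count` threshold, and the documents are hypothetical; the other listed techniques (LSA, LDA, Word2Vec) would replace the counting step with a learned model.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(docs):
    """Count how many documents contain each unordered pair of terms."""
    counts = Counter()
    for tokens in docs:
        for a, b in combinations(sorted(set(tokens)), 2):
            counts[(a, b)] += 1
    return counts


def related_terms(term, counts, min_count=2):
    """Terms that co-occur with `term` in at least `min_count` documents."""
    related = set()
    for (a, b), n in counts.items():
        if n >= min_count:
            if a == term:
                related.add(b)
            elif b == term:
                related.add(a)
    return sorted(related)


def augment(tokens, counts, min_count=2):
    """Extend a token list with the terms related to any of its tokens."""
    extra = set()
    for token in tokens:
        extra.update(related_terms(token, counts, min_count))
    return sorted(set(tokens) | extra)


# Hypothetical tokenised product descriptions.
docs = [
    ["laptop", "notebook", "charger"],
    ["laptop", "notebook", "sleeve"],
    ["laptop", "mouse"],
]
counts = cooccurrence(docs)
augmented = augment(["notebook", "bag"], counts)
```

Here a document mentioning only "notebook" also becomes findable through "laptop", which is the sense of searching for things rather than strings.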
COMBINED SEARCH APPROACHES
Knowledge Graphs can
‣ store an easy-to-query model
‣ gather data from multiple sources
‣ be easily extended
Search Engines can
‣ provide fast, reliable and easy-to-tune textual search
‣ provide features like faceting, autocomplete
CONCLUSION
→ By combining them, it is possible to offer an unlimited set of services to the end users
www.graphaware.com | @graph_aware
GraphAware
world’s #1 Neo4j consultancy