The Power of Machine Learning and Graphs
March 2017, Webinar
Jans Aasman
allegrograph.com
Contents
• Cognitive computing overview
• Why we put the results of analytics and machine learning
back in the graph
• Some examples first
• The main loop for data science with graphs
• Two AG open source packages for data science
• AllegRo: an R interface to AllegroGraph
• Python-Agraph: a Python interface to AllegroGraph (installable with
Anaconda)
• Future work
10 years ago
• Structured data
7 years ago
• Structured data
• Unstructured data: NLP, key/value stores, NoSQL, Big Data, Hadoop, IoT
4 to 5 years ago
• Structured data
• Unstructured data and IoT
• Knowledge: domain knowledge, Linked Open Data, vocabularies, taxonomies/ontologies
New #1: Learning. Feed the output of data science back into the data infrastructure
• Structured data
• Unstructured data and IoT
• Knowledge: domain knowledge, Linked Open Data, vocabularies, taxonomies/ontologies
• Probabilistic inferences
New #2: Everything in one (distributed) Semantic Graph
• Structured data
• Unstructured data and IoT
• Knowledge: domain knowledge, Linked Open Data, vocabularies, taxonomies/ontologies
• Probabilistic inferences
AKA: Cognitive Computing
• Structured data, unstructured data and IoT, knowledge, and probabilistic inferences, all in one semantic graph
Current state of analytics
The output of data science usually ends up in reports and publications, but:
• No formal trace of where the data came from
• No formal link to the actual methods you used,
or who did it, or when you did it
• Cannot be compared to earlier results
• Cannot be used as building blocks for further research
• In general: the output is not queryable or discoverable
Enriching the graph with analytics
True Machine Learning
• Results become data
• Build layers of analytics
• Formal provenance for all results. Links to the actual data and methods
you used, or who did it, or when you did it, or even why you did it.
• Important for compliance and auditability
• Important for explaining why you took certain actions
• Historical analysis
• Results become queryable and discoverable
• The analytics fit into the total infrastructure of structured data, unstructured data, and
knowledge.
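As a rough illustration of "results become data" with formal provenance, the sketch below describes one analytic result as plain subject/predicate/object triples. It is not the webinar's code and does not use the AllegroGraph client. The PROV-O terms (wasGeneratedBy, used, wasAssociatedWith, startedAtTime) are from the W3C vocabulary; the example.org namespace, the value and method predicates, and all identifiers are assumptions made up for this example.

```python
from datetime import datetime, timezone

# Real W3C namespaces.
PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Hypothetical namespace, invented for this sketch.
EX = "http://example.org/analytics/"


def result_with_provenance(result_id, value, dataset, method, analyst, when):
    """Return triples that describe one result and how it was produced."""
    result = EX + "result/" + result_id
    run = EX + "run/" + result_id
    return [
        (result, RDF_TYPE, PROV + "Entity"),
        (result, EX + "value", value),             # the result itself
        (result, PROV + "wasGeneratedBy", run),    # link result -> analysis run
        (run, RDF_TYPE, PROV + "Activity"),
        (run, PROV + "used", dataset),             # where the data came from
        (run, EX + "method", method),              # which method was used
        (run, PROV + "wasAssociatedWith", analyst),  # who did it
        (run, PROV + "startedAtTime", when.isoformat()),  # when it was done
    ]


def objects(triples, subject, predicate):
    """Tiny lookup helper: all objects for a given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


triples = result_with_provenance(
    "kmeans-001", 0.82, EX + "dataset/orders", "k-means",
    EX + "person/analyst1", datetime(2017, 3, 1, tzinfo=timezone.utc),
)
run = objects(triples, EX + "result/kmeans-001", PROV + "wasGeneratedBy")[0]
who = objects(triples, run, PROV + "wasAssociatedWith")
```

Once triples like these are loaded into a graph store, "who produced this result, from which data, and when" becomes an ordinary query instead of a note in a report.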
Odds ratio
Association rules
K-means clustering
In the e-commerce world: find similar objects based on more than 10 criteria, including description, product codes, pictures, etc.
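To make the first of the examples named above concrete, here is a minimal sketch of an odds ratio computed from a 2x2 table of counts. The function name, argument names, and the counts in the usage line are invented for illustration; they are not data from the webinar.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds ratio for a 2x2 table: (a * d) / (b * c).

    a = exposed_cases,   b = exposed_noncases,
    c = unexposed_cases, d = unexposed_noncases.
    """
    if exposed_noncases == 0 or unexposed_cases == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)


# Made-up counts: 20 of 100 exposed and 10 of 100 unexposed subjects are cases.
example = odds_ratio(20, 80, 10, 90)
```

A value above 1 suggests the outcome is more likely in the exposed group. Stored in the graph alongside the two concepts it relates, such a number can later be queried like any other fact.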
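The main loop for data science with graphs
• Query AllegroGraph with SPARQL
• Load the query results into a dataframe
• Analyze the dataframe in R, Python, or SPARK
• Store the results back in AllegroGraph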
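The loop can be sketched in plain Python without any client library. The example below parses a response in the standard SPARQL 1.1 Query Results JSON format into rows, runs a trivial analysis, and turns the outcome back into a triple. The sample response, the example.org IRIs, and the meanPrice predicate are assumptions invented for this sketch; a real session would fetch the response from a SPARQL endpoint and write the triple back through a client.

```python
import json
from statistics import mean

EX = "http://example.org/"  # hypothetical namespace

# Made-up response in SPARQL 1.1 Query Results JSON format.
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["product", "price"]},
    "results": {"bindings": [
        {"product": {"type": "uri", "value": EX + "p1"},
         "price": {"type": "literal", "value": "10.0"}},
        {"product": {"type": "uri", "value": EX + "p2"},
         "price": {"type": "literal", "value": "20.0"}},
        {"product": {"type": "uri", "value": EX + "p3"},
         "price": {"type": "literal", "value": "30.0"}},
    ]},
})


def bindings_to_rows(response_text):
    """Flatten SPARQL JSON bindings into a list of {variable: value} dicts."""
    doc = json.loads(response_text)
    columns = doc["head"]["vars"]
    return [
        {col: binding[col]["value"] if col in binding else None for col in columns}
        for binding in doc["results"]["bindings"]
    ]


# Step 1 and 2: query results become rows (the "dataframe" stage).
rows = bindings_to_rows(SAMPLE_RESPONSE)

# Step 3: a deliberately trivial analysis.
mean_price = mean(float(row["price"]) for row in rows)

# Step 4: the result becomes a triple that could be added back to the graph.
result_triple = (EX + "summary/prices", EX + "meanPrice", mean_price)
```

If pandas is installed, `pandas.DataFrame(rows)` turns the same rows into a data frame for heavier analysis.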
AllegRo: work with AG directly from R
• The entire AllegroGraph API is directly available from R
• Create/open databases, add/delete/query, SPARQL 1.1
• Create data frames directly from getStatement or SPARQL queries
• Works with the free AllegroGraph trial version
Quick tutorial demo
Agraph-python (available on GitHub, but please use Anaconda to install)
Quick tutorial demo: Python & Anaconda
Create an environment in Anaconda2:
• conda create --name testenv -c franzinc agraph-python numpy pandas matplotlib
Activate the environment:
• source activate testenv
If you want to install a particular version of agraph-python:
• conda install -c franzinc agraph-python=6.2.0
Or, if an older version is already installed:
• conda update -c franzinc agraph-python
Iris example
• Classify three types of irises: Setosa, Virginica, Versicolour
• based on petal length, petal width,
sepal length, and sepal width.
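The classification task can be illustrated with a nearest-centroid classifier written in plain Python. This is a stand-in technique chosen for brevity, not the method shown in the webinar demo. The handful of training rows are meant to match entries of the classic Iris dataset and should be treated as illustrative; a real run would use the full 150 samples.

```python
from math import dist  # Euclidean distance, Python 3.8+

# Feature order: sepal length, sepal width, petal length, petal width (cm).
# A few illustrative rows per class; not the full dataset.
TRAINING = {
    "Setosa": [(5.1, 3.5, 1.4, 0.2), (4.9, 3.0, 1.4, 0.2), (4.7, 3.2, 1.3, 0.2)],
    "Versicolour": [(7.0, 3.2, 4.7, 1.4), (6.4, 3.2, 4.5, 1.5), (6.9, 3.1, 4.9, 1.5)],
    "Virginica": [(6.3, 3.3, 6.0, 2.5), (5.8, 2.7, 5.1, 1.9), (7.1, 3.0, 5.9, 2.1)],
}


def centroid(samples):
    """Column-wise mean of a list of equal-length tuples."""
    return tuple(sum(column) / len(samples) for column in zip(*samples))


# One mean vector per class.
CENTROIDS = {label: centroid(samples) for label, samples in TRAINING.items()}


def classify(measurement):
    """Return the class whose centroid is closest to the measurement."""
    return min(CENTROIDS, key=lambda label: dist(measurement, CENTROIDS[label]))


prediction = classify((5.0, 3.4, 1.5, 0.2))
```

In the spirit of the webinar, the predicted label for each flower would then be written back to the graph as a new statement about that flower, together with a link to the model that produced it.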
Future Webcasts
• Formal ontologies to represent analytic output
• Using Knime as a data science framework
• Distributed AllegroGraph & SPARK
SDL Super-Learner
Conclusion
• Adding data science results back to the graph is a valuable new
paradigm
• We make it really straightforward to do data science with AllegroGraph
• Try it