End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the...

31
End-to-End quantum language model with Application to Question Answering Peng Zhang [1] , JiaBin Niu [1] , Zhan Su [1] , Bengyou Wang [2] , Liqun Ma [1] , Dawei Song [1] Tianjin University [1] Tencent [2]

Transcript of End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the...

Page 1: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

End-to-End quantum language modelwith Application to Question Answering

Peng Zhang[1], JiaBin Niu[1], Zhan Su[1], Bengyou Wang[2], Liqun Ma[1], Dawei Song[1]

Tianjin University[1]

Tencent[2]

Page 2: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

Contents

➢QA System

➢Statistical Language Model

➢Quantum Language Model

➢NN-based Quantum Language Model

➢Quantum AI

Page 3: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

QA System

富途

Page 4: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

QA system in Tencent

➢Community QA FAQs

➢KBQA Knowledge Base

➢Passage QA Only unstructured documents

Page 5: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

Two-step Architecture in Community QA

Prerank Rerank

Page 6: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

Textual Matching

score

Match

✓Unsupervised Models✓ TFIDF/BM25✓ language model

✓Neural Network Models✓DSSM✓CNN/RNN variants

Page 7: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

Contents

➢QA System

➢Statistical Language Model

➢Quantum Language Model

➢NN-based Quantum Language Model

➢Quantum AI

Page 8: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

Statistical Language Model

• For a sequence of terms in the document d=w1w2…wn, SLM calculates the probability P(w1w2…wn). Based on Beyes’ rule, we have:

Page 9: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

SLM-based IR model (SLMIR)

Page 10: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

Contents

➢QA System

➢Statistical Language Model

➢Quantum Language Model

➢NN-based Quantum Language Model

➢Quantum AI

Page 11: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

Quantum Concept

• Simple Example:

✓ A unit vector 𝑢ϵℝ𝑛, 𝑢 2 = 1 is written as | ۧ𝑢 (ket) ✓ The transpose 𝑢⊺ is written as ۦ𝑢| (bra)✓ The projector onto the direction 𝑢 writes as | ۧ𝑢 |𝑢ۦ (dyad), corresponding to the pure state✓ The inner product between two vectors writes as ۦ𝑢| ۧ𝑢✓ The elements of the standard basis in ℝ𝑛 are denoted as | ۧ𝑒𝑖 = 𝛿1𝑖 , … , 𝛿𝑛𝑖

⊺, 𝑤ℎ𝑒𝑟𝑒 𝛿𝑖𝑗 = 1, iff 𝑖 = 𝑗

✓ Generally, any ket | ۧ𝑣 = σ𝑖 𝑣𝑖 | ۧ𝑢𝑖 is called a superposition of the | ۧ𝑢𝑖 , where {| ۧ𝑢1 , … , | ۧ𝑢𝑛 } form an orthonormal basis

projection (1,0)

(0,1)

(0,0)

(1

2,1

2)

(1

2, 0)

| ۧ𝑒1 = 1,0 ⊺, | ۧ𝑒2 = 0,1 ⊺

𝑠1 = | ۧ𝑒1 |𝑒1ۦ =1 00 0

𝑠2 = | ۧ𝑒2 |𝑒2ۦ =0 00 1

𝑠3 =1

2(𝑠1+𝑠2)

റ𝑣 = (1

2,1

2) T

Projection_1 =𝑠1 ⋅ റ𝑣 = (1

2, 0)T

Projection_2 =𝑠2 ⋅ റ𝑣 = (0,1

2)T

Projection_3 =𝑠3 ⋅ റ𝑣 = (1

2,1

2)T

Page 12: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

Density Matrix

• Density MatrixA density matrix corresponds to the discrete probability distribution in classical probability theory.

It assigns a quantum probability to each one of the infinite dyads (an elementary event in quantum probability). For a density matrix 𝜌.

𝜌 =0.5 0.50.5 0.5

𝜇𝜌 | ۧ𝑒 |𝑒ۦ = 𝑡𝑟 𝜌 ۧ𝑒 |𝑒ۦ = 0.5, 𝜇𝜌 | ۧ𝑓 |𝑓ۦ = 𝑡𝑟 𝜌 ۧ𝑓 |𝑓ۦ = 1

where: | ۧ𝑒 = 1,0 ⊺ | ۧ𝑒 |𝑒ۦ =

1 00 0

| ۧ𝑓 = (1

2,1

2) T | ۧ𝑓 |𝑓ۦ =

0.5 0.50.5 0.5

Gleason’s Theorem : A. Gleason. Measures on the closed subspaces of a hilbert space. Journ. Math. Mech., 6:885–893, 1957.

Page 13: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

Quantum Language Models (QLM)

• For example:

𝑉 = {computer, architecture, system}, 𝑊𝑑={computer, architecture} If we only observe single words:

If we observe the dependency of “computer” and “architecture”

𝒫𝑑 = ℰ𝑐𝑜𝑚𝑝𝑢𝑡𝑒𝑟 , ℰ𝑎𝑟𝑐ℎ𝑖𝑡𝑒𝑐𝑡𝑢𝑟𝑒

ℰ𝑐𝑜𝑚𝑝𝑢𝑡𝑒𝑟 =1 0 00 0 00 0 0

, ℰ𝑐𝑜𝑚𝑝𝑢𝑡𝑒𝑟 =0 0 00 1 00 0 0

𝒦𝑐𝑎=

2

3

2

30

2

3

2

30

0 0 0

𝑘𝑐𝑎 = 𝜎𝑐 ۧ𝑒𝑐 + 𝜎𝑎 ۧ𝑒𝑎 , Set 𝜎𝑐 = 2/3, 𝜎a = 1/3

Page 14: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

N-gram in extended Vector Space

Word/Term Dependency

Computer Architecture System Computer Architecture

ComputerSystem

Architecture System

ComputerSystemArchitecture

Count 10 6 5 4 3 2 0

Frequency 0.33 0.2 0.166 0.133 0.1 0.066 0

|𝑉|= 𝐶𝑛1 + 𝐶𝑛

2 +⋯+ 𝐶𝑛𝑛 = σ𝑖=0

𝑛 𝐶𝑛𝑖

P(W|𝜃𝑑 ) = [0.33, 0.2, 0.166, 0.133, 0.1, 0.066, 0]

The dimension of parameter in extended Vector Space : o(n!)

Page 15: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

Term Dependency (N-gram) in QLM

Word/Term Dependenc

y

Computer Architecture System Computer Architecture

ComputerSystem

Architecture System

ComputerSystemArchitecture

Projection 𝑒𝑐=[1,0,0] 𝑒𝑎=[0,1,0] 𝑒𝑠=[0,0,1] 𝑘𝑎𝑐=[2

2,

2

2,0] 𝑘𝑐𝑠=[

2

2, 0,

2

2] 𝑘𝑎𝑠=[0,

2

2,

2

2] 𝑘𝑐𝑎𝑠=[

3

3,

3

3,

3

3]

Frequency Tr(𝜌 ۧ|𝑒𝑐 |𝑒𝑐ۦ ) Tr(𝜌 ۧ|𝑒𝑎 (|𝑒𝑎ۦ Tr(𝜌 ۧ|𝑒𝑠 (|𝑒𝑠ۦ Tr(𝜌 ۧ|𝑘𝑎𝑐 (|𝑘𝑎𝑐ۦ Tr(𝜌 ۧ|𝑘𝑐𝑠 (|𝑘𝑐𝑠ۦ Tr(𝜌 ۧ|𝑘𝑎𝑠 (|𝑘𝑎𝑠ۦ Tr(𝜌 ۧ|𝑘𝑐𝑎𝑠 |𝑘𝑐𝑎𝑠ۦ )

𝜌 =

2

3

2

30

2

3

2

30

0 0 0

P(W|𝜃𝑑) = [Tr(𝜌 ۧ|𝑒𝑐 𝑒𝑐|), Tr(𝜌ۦ ۧ|𝑒𝑎 𝑒𝑎|), Tr(𝜌ۦ ۧ|𝑒𝑠 𝑒𝑠|), Tr(𝜌ۦ ۧ|𝑘𝑎𝑐 𝑘𝑎𝑐|), Tr(𝜌ۦ ۧ|𝑘𝑐𝑠 𝑘𝑐𝑠|), Tr(𝜌ۦ ۧ|𝑘𝑎𝑠 𝑘𝑎𝑠|), Tr(𝜌ۦ ۧ|𝑘𝑐𝑎𝑠 [(|𝑘𝑐𝑎𝑠ۦ

The dimension of parameter in QLM: o(n^2)

Page 16: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

Term Dependency in Quantum Entanglement

• Modeling quantum entanglements in quantum language models. IJCAI 2015

Page 17: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

• LM: a document d is represented by a sequence of terms• QLM: d is represented by a sequence of quantum events

(with dyads for a term or a dependency)[Sordoni, Nie, Bengio , 2013]

Quantum Language Model (QLM)

Page 18: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

Where 𝞺 is a density matrix

Computing probabilities

Page 19: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

• A density matrix 𝜌 to represent sentence

• Given the observed projectors 𝒫𝑑 = {Π1, … , Π𝑀} for sentence S, the quantum language model 𝜌 is estimated through Maximum Likelihood Estimation, and the likelihood is represented as:

• ℒ𝒫𝑑𝜌 = ς𝑖=1

𝑀 𝑡𝑟(𝜌Π𝑖)

Measurement in QLM

Page 20: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

•Likelihood:

•Estimation/Training of Density Matrix:

•Matching:

Maximum likelihood estimation for QLM

Page 21: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

Limitation in QLM

If the two documents do not share any words, especially in short textUse embedding as a basic vector

It is independent with the label。

Training in a end-2-end network

Neural Network based Quantum Language Model

Page 22: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

Contents

➢QA System

➢Statistical Language Model

➢Quantum Language Model

➢NN-based Quantum Language Model

➢Quantum AI

Page 23: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

Simple version: NNQLM1

• Density matrix representation for sentences (q or a)

𝜌 = σ𝑝𝑖S𝑖 = σ𝑝𝑖|𝑆𝑖⟩⟨𝑆𝑖|

𝑆𝑖 =𝑆𝑖

𝑆𝑖

Page 24: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

Architecture of NNQLM1

Using the product of the density matrixes as their joint representation, . The combined representations show the similarity of their density matrices.

Page 25: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

Inter-sentence Similarities

• Since the density matrix is semi-positive, it can be decomposed

the similarity between 𝜌𝑞 and 𝜌𝑎

Page 26: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

Architecture in NNQLM2

Page 27: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

Future work in Quantum-inspired NN

Complex embedding

Richer input, higher performance

Interference in NN,

Cross-modal fusion

Entanglement in NNConnection and memory in NN

More works try to bridge the gap between Quantum Concept and Deep learning [1], It may open a new door to reveal the black-box inner mechanism of Neural Network

• [1] Deep Learning and Quantum Entanglement: Fundamental Connections with Implications to Network Design. ICLR 2018

Page 28: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

Contents

➢QA System

➢Statistical Language Model

➢Quantum Language Model

➢NN-based Quantum Language Model

➢Quantum AI

Page 29: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

Exploration in Quantum AI

Machine learning algorithm in Quantum computer

✓Quantum-inspired models and ideas, but not depends on Quantum Computer

Page 30: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

Quantum on general AI

• Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017• Deep Learning and Quantum Entanglement: Fundamental Connections with Implications to Network Design.

ICLR 2018• Deep complex Network. ICLR 2018• Efficient representation of quantum many-body states with deep neural networks. Nature Communications.

2017• SchNet: A continuous-filter convolutional neural network for modeling quantum interactions, NIPS 2017

Page 31: End-to-End quantum language model with Application to ... · Quantum on general AI • Solving the quantum many-body problem with artificial neural networks[J]. Science, 2017 •

Quantum AI on Language

• End-to-End Quantum-like Language Models with Application to Question Answering. AAAI 2018.• Modeling multi-query retrieval tasks using density matrix transformation. SIGIR 2015• Modeling quantum entanglements in quantum language models. IJCAI 2015• Learning Concept Embeddings for Query Expansion by Quantum Entropy Minimization. AAAI 2014• Modeling latent topic interactions using quantum interference for information retrieval. CIKM 2013• Modeling term dependencies with quantum language models for IR. SIGIR 2013• Pure high-order word dependence mining via information geometry, ICTIR 2011 best paper.• A novel re-ranking approach inspired by quantum measurement. ECIR 2011 best paper.