
MHDNE: Network Embedding Based on Multivariate Hawkes process

Ying Yin1, Jianpeng Zhang1, Yulong Pei2, Xiaotao Cheng1, and Lixin Ji1

1 Information Engineering University, China. 2 Eindhoven University of Technology, the Netherlands.

Abstract. As networks evolve, the interactions among nodes make networks exhibit dynamic properties. Mining the rich information behind dynamic networks is of great importance for network analysis. However, most of the existing network embedding methods focus on static networks and ignore the dynamic properties of networks. In this paper, we propose a novel approach, MHDNE (Multivariate Hawkes process network embedding), to learn the representations of nodes in dynamic networks. The key idea of our approach is to integrate the historical edge information as well as network evolution properties into the formation process of edges based on the Hawkes process. By integrating the multivariate Hawkes process into network embedding, MHDNE resolves the issue that existing methods cannot effectively capture both the historical information and the evolution process of dynamic networks. Extensive experiments demonstrate that the embeddings learned from the proposed MHDNE model achieve better performance than the state-of-the-art methods in downstream tasks such as node classification and network visualization.

Keywords: network embedding; dynamic network; Hawkes process; ternary closure theory

1 Introduction

In the era of big data, how to analyze contemporary networks is an urgent problem. Research on complex networks can help us deal with applications such as node classification, link prediction, and community discovery. With the development of machine learning, network embedding, also named network representation learning, serves as a bridge connecting networked data analysis and traditional machine learning. It maps the nodes of a network to a low-dimensional space, forming low-dimensional dense vectors that can be used as the input of traditional machine learning models for downstream tasks.

Most existing network embedding methods model static networks without dynamic attributes; that is, they assume that the nodes and edges of a network do not change over time. Early static network embedding methods were mainly based on matrix decomposition [18] [2], which has high computational complexity and cannot scale to growing large-scale networks.


With the development of artificial intelligence, PEROZZI et al. proposed the classic Deepwalk algorithm [16], which applies neural networks to network embedding and builds on the Word2vec [12] model from natural language processing. GROVER et al. proposed the Node2vec algorithm [8], which modifies the random walk process of Deepwalk and preserves both the homogeneity and the isomorphism of networks by combining breadth-first and depth-first search. Deepwalk and Node2vec are based on the Word2vec framework, which has a three-layer shallow neural network at its core. With the development of deep learning, WANG et al. proposed the SDNE algorithm [21], which applies a deep neural network to network embedding. This semi-supervised deep learning model preserves the local as well as the global information of networks through a first-order similarity module and a second-order similarity module. The GraphGAN model [22] proposed by WANG et al. and the ANE model [4] proposed by DAI et al. adopt generative adversarial nets [6] for network embedding, which greatly improves the robustness of static network embeddings.

In recent years, representation learning for static networks has gradually matured. However, as a network evolves over time, new edges may appear and expired edges may disappear. The evolving interactions among the nodes make networks exhibit dynamic properties. The addition of time information makes networks more complex, and dynamic networks often have large scales. Research on dynamic networks is of great significance for practical applications such as community detection [1] [17] and link prediction [11]. Traditional dynamic network representation methods take snapshots of a dynamic network at specific times, which is equivalent to splitting the dynamic network into a sequence of static networks. Thus, static network embedding models can be extended to handle dynamic networks, and most existing dynamic network embedding methods are derived from static models. Inspired by static network embedding based on matrix eigenvalue decomposition, LI et al. proposed the DANE algorithm [10] to capture the evolving patterns of network structures and attribute information; DANE updates the current node embeddings based on the node embeddings obtained at the previous time step. The DHPE algorithm [24] preserves high-order proximity based on generalized singular value decomposition and updates the node embeddings dynamically based on matrix perturbation theory. In addition to dynamic network embedding methods based on matrix decomposition, some dynamic network embedding methods extend classic static methods, for instance, the DNE algorithm [5], which extends the LINE model [19] to dynamic network embedding, and the DynGEM algorithm [7], which is based on the SDNE model. These snapshot-based embedding methods often ignore network evolution patterns. Besides, embedding methods that only process dynamic information on snapshots are relatively coarse-grained. The HTNE algorithm [25] proposed by ZUO et al. leverages the Hawkes process to model the formation process of neighbor nodes, which provides a novel idea for dynamic network embedding. However, HTNE only takes the influence of historical neighbor nodes on current node embeddings into account, while ignoring the impact of network evolution properties.

Therefore, to overcome the drawbacks of the existing methods, this paper proposes the MHDNE (Multivariate Hawkes process dynamic network embedding) model to learn dynamic network embeddings. MHDNE integrates the historical edge information as well as network evolution properties to model the formation process of edges based on the Hawkes process. The main contributions of this paper are summarized as follows:

1) We propose a novel approach for dynamic network embedding which preserves the impacts of historical information on the current network based on the Hawkes process.

2) The MHDNE model proposed in this paper considers the historical edge information as well as the network evolution properties, which captures the influence of historical information on the formation of current edges comprehensively.

3) Experimental results on real-world networks demonstrate that the embedding vectors learned from the proposed MHDNE model achieve better performance than the state-of-the-art methods in node classification and network visualization.

2 Problem Definition

In this section, we formulate the problem of dynamic network embedding and give the necessary definitions used throughout this paper:

Definition 1. (Dynamic network). A dynamic network within time T can be defined as a collection G = {G1, G2, ..., GT} containing a series of network snapshots. The snapshot at time t (0 < t < T) can be denoted as Gt = (Vt, Et), where Vt and Et denote the sets of nodes and edges at time t, respectively.

Definition 2. (Dynamic network embedding). Given a dynamic network G = {G1, G2, ..., GT}, we map the nodes in the snapshots to a low-dimensional space so that each node is represented as a vector, and the temporal and structural information is preserved in the low-dimensional vector space.

Definition 3. (Ternary closure). Ternary closure refers to the observation that two people who have common friends in a social circle are more likely to become good friends in the future. That is, if nodes a and b both connect to the same node c in a network, an edge between a and b is likely to form. The ternary closure theory affects the formation of networks and is an important characteristic reflecting the network evolution mechanism.

Definition 4. (Hawkes process). The Hawkes process, a special linear self-exciting point process, is widely used in economic analysis, social analysis, and geographic prediction. In a Hawkes process, the occurrence of a new event is affected not only by the internal properties of the event, but also by the historical events occurring at previous moments. The generation intensity function of a new event can be defined as follows:

λ(t) = γ_t + Σ_{t_s < t} φ(t − t_s)    (1)

where γ_t indicates the base intensity of a new event, i.e., the intensity of spontaneous event occurrence at time t, and φ(t − t_s) indicates the influence of historical events on the occurrence of the new event, which decays continuously over time. φ(t − t_s) can be expressed as φ(t − t_s) = α δ(t, t_s), where α denotes the excitation intensity of the historical events on the current event and δ(t, t_s) denotes the time decay coefficient of the historical event.
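To make Equation (1) concrete, the following minimal Python sketch (our own illustration, not from the paper) evaluates the intensity with the exponential decay kernel δ(t, t_s) = exp(−θ(t − t_s)) that is used later in this paper; the function name and toy values are hypothetical.

```python
import numpy as np

def hawkes_intensity(t, event_times, gamma, alpha, theta):
    """Conditional intensity of Eq. (1) with an exponential kernel:
    lambda(t) = gamma + sum over past events of alpha * exp(-theta * (t - t_s))."""
    past = np.array([t_s for t_s in event_times if t_s < t])
    return gamma + np.sum(alpha * np.exp(-theta * (t - past)))

# Toy example: three past events; the most recent one excites lambda the most.
print(hawkes_intensity(5.0, [1.0, 3.0, 4.5], gamma=0.2, alpha=0.8, theta=1.0))
```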

3 The Proposed Framework: MHDNE

In this section, we first present the MHDNE framework in general, then describe the core components of MHDNE in detail, and finally introduce the model optimization.

3.1 MHDNE Framework

The framework of the proposed MHDNE model is shown in Figure 1. First, we model the edge formation process in the dynamic network as two temporal sequences, L1 and L2, which contain historical edge information and network evolution information, respectively. Then, based on these temporal sequences, we apply the Hawkes process to model the formation of new edges, integrating the historical information of the dynamic network as well as its evolution properties into the node embeddings.

3.2 Component Description

The formation of a dynamic network is a process of continuous emergence and disappearance of edges, so the formation of edges can be regarded as a temporal point process in which the generation of edges constitutes the events. Edges form in two ways: either an edge exists at a historical moment and is preserved at the current moment, or an edge never appeared in the history but forms at the current moment during the evolution process. These two ways of generating edges are respectively related to the following two edge formation sequences.

– Historical edge sequence: in a dynamic network, if there is an edge between nodes m and n at time t_i, the edge can be denoted as a time-stamped tuple (e_m^n, t_i). The edges between nodes m and n within time T can then be modeled as a temporal sequence L1 = (e_m^n, t_1) → (e_m^n, t_2) → ... → (e_m^n, t_T). Intuitively, nodes that have interacted more in the history tend to form an edge at the current moment. Therefore, the generation intensity of the edge connecting nodes m and n at the current time is affected by the historical edges (e_m^n, t_i).


[Figure 1 sketches the MHDNE framework: the historical edge sequence L1 = (e_m^n, t_1) → (e_m^n, t_2) → ... → (e_m^n, t_T) and the open triangle sequence L2 = (T_{m,n}^{S_1}, t_1) → (T_{m,n}^{S_2}, t_2) → ... → (T_{m,n}^{S_T}, t_T), extracted from the snapshots G_1, ..., G_T, feed a Hawkes point process that outputs the |V| × d node embeddings.]

Fig. 1. The framework of MHDNE.

– Open triangle sequence: if there is a common neighbor k between nodes m and n at time t, the formed open triangle can be denoted as a triple (e_m^k, e_k^n, t). All of the open triangles composed of nodes m, n, and all of their common neighbors S at time t can be represented as a set (T_{m,n}^S, t). An open triangle sequence within time T can then be modeled as a temporal sequence L2 = (T_{m,n}^{S_1}, t_1) → (T_{m,n}^{S_2}, t_2) → ... → (T_{m,n}^{S_T}, t_T). From the ternary closure theory, even if there is no edge between nodes m and n in the history, if the two nodes have common neighbors, they tend to connect in the process of network evolution.

Since the Hawkes process [9] captures well the exciting effects of historical information on current events, we adapt it to model the edge formation process of nodes m and n based on the historical edge sequence L1. The generation intensity function for the arrival event (e_m^n, t) can be formulated as:

λ^1_{e_m^n}(t) = γ_{m,n} + Σ_{t_s < t} α_{e_x^y} δ(t, t_s)    (2)

where λ^1_{e_m^n}(t) indicates the probability of forming an edge between nodes m and n at time t, and γ_{m,n} indicates the base intensity of forming such an edge. The base intensity γ_{m,n} reflects the essential relationship between nodes m and n, which can be denoted as the negative Euclidean distance between their embeddings, i.e., γ_{m,n} = −‖v_m − v_n‖, where v_m and v_n are the embeddings of nodes m and n, respectively. x and y denote the historical counterparts of nodes m and n, respectively. α_{e_x^y} denotes the influence of the historical edge (e_x^y, t_s) on the new edge (e_m^n, t). δ(t, t_s) is a time decay function, usually expressed in the exponential form δ(t, t_s) = exp(−θ(t − t_s)).

The more stable the local structure containing historical nodes x and y, the more likely the current nodes m and n are to be connected at the current time. We can leverage the clustering coefficients of x and y to characterize the stability of the local structure of the network over time. The local clustering coefficient is the ratio between the number of closed triangles containing node r and the number of triples containing node r in the network, which can be formulated as:

C_r = 2 E_r / (m_r (m_r − 1))    (3)

where m_r denotes the number of edges associated with node r, and E_r denotes the number of edges among the nodes connected to r. The larger the clustering coefficients of the historical nodes x and y, the higher the probability of connecting an edge between the current nodes m and n. Thus, we can set α_{e_x^y} = C_x C_y g(v_x, v_y), and Equation (2) can be updated as follows:

λ^1_{e_m^n}(t) = g(v_m, v_n) + Σ_{t_s < t} C_x C_y g(v_x, v_y) δ(t, t_s)    (4)

where g(·, ·) denotes the negative Euclidean distance between two embeddings, so that g(v_m, v_n) = γ_{m,n}.
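As a concrete illustration of Equation (3), the sketch below computes the local clustering coefficient over an adjacency-set representation; the helper name and the toy graph are our own, not from the paper.

```python
from itertools import combinations

def local_clustering(adj, r):
    """Local clustering coefficient of node r (Eq. 3):
    C_r = 2 * E_r / (m_r * (m_r - 1)), where m_r is the degree of r and
    E_r is the number of edges among r's neighbors.
    `adj` maps each node to the set of its neighbors."""
    neighbors = adj[r]
    m_r = len(neighbors)
    if m_r < 2:
        return 0.0  # no triples pass through r
    e_r = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2.0 * e_r / (m_r * (m_r - 1))

# Toy graph: a triangle {a, b, c} plus a pendant node d attached to a.
adj = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
print(local_clustering(adj, "a"))  # one edge among three neighbors -> 1/3
```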

In addition to the fact that edges that appeared in the history reappear at the current time with a certain probability, the ternary closure theory [23] tells us that, in the process of network evolution, nodes with common neighbors in the history tend to be connected at the current time. That is, when there is a historical event (e_m^k, e_k^n, t_s), the probability of the new event (e_m^n, t) occurring increases. Similarly, we can use the local clustering coefficients of nodes to measure the intensity of this influence: the greater the clustering coefficients of the nodes, the stronger the ternary closure process nearby, and the greater the probability of generating the new event (e_m^n, t). Meanwhile, the more common neighbors two nodes have, the more likely they are to connect. And the closer the historical event (e_m^k, e_k^n, t_s) is to the current time, the greater the probability of generating the edge (e_m^n, t). Therefore, Equation (4) can be updated to:

λ_{e_m^n}(t) = g(v_m, v_n) + Σ_{t_s < t} ( C_x C_y g(v_x, v_y) δ(t, t_s) + Σ_{(e_m^k, e_k^n, t_s) ∈ L2} C_k g(v_{k′}, v_k) δ(t, t_s) )    (5)

where g(v_{k′}, v_k) is the negative Euclidean distance between the embedding of the current common neighbor k′ and that of the historical common neighbor k in the mapping space. As can be seen from Equation (5), λ_{e_m^n}(t) may be negative, whereas the probability of generating new edges should be positive. Therefore, we take the exponential of λ_{e_m^n}(t) as the final intensity, namely λ̂_{e_m^n}(t) = exp(λ_{e_m^n}(t)).
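For illustration, here is a sketch of how Equation (5) and the exponential rescaling could be evaluated. The function names and the tuple layouts of the two sequences are our own assumptions, not the paper's implementation.

```python
import numpy as np

def g(u, v):
    """Similarity defined in the paper as negative Euclidean distance."""
    return -np.linalg.norm(u - v)

def edge_intensity(v_m, v_n, hist_edges, open_triangles, t, theta):
    """Evaluate lambda-hat = exp(lambda) for edge (m, n) at time t, Eq. (5).
    hist_edges: tuples (v_x, v_y, C_x, C_y, t_s) for historical edges of (m, n);
    open_triangles: tuples (v_k_now, v_k_hist, C_k, t_s) for open triangles
    (m, k, n) observed at time t_s. Both layouts are illustrative."""
    lam = g(v_m, v_n)  # base intensity gamma_{m,n}
    for v_x, v_y, c_x, c_y, t_s in hist_edges:
        lam += c_x * c_y * g(v_x, v_y) * np.exp(-theta * (t - t_s))
    for v_k_now, v_k_hist, c_k, t_s in open_triangles:
        lam += c_k * g(v_k_now, v_k_hist) * np.exp(-theta * (t - t_s))
    return np.exp(lam)  # exponentiate so the final intensity is positive
```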

Based on the Hawkes process, given the relevant historical edge sequence L1 and the open triangle sequence L2, we can obtain the probability of generating the edge (e_m^n, t) between nodes m and n at time t as follows:

P(n | m, H(L1, L2)) = λ̂_{e_m^n}(t) / Σ_{n* ∈ V, e_m^{n*} ∈ E} λ̂_{e_m^{n*}}(t)    (6)

3.3 Model Optimization

For all nodes in the network, the log-likelihood function of the model can be denoted as follows:

log L = Σ_{m ∈ V} Σ_{n ∈ V, e_m^n ∈ E} log P(n | m, H(L1, L2))    (7)

In order to reduce the computational complexity of the algorithm, we use the negative sampling method [13]. The probability that a node is selected as a negative sample is related to the frequency with which it appears in the sequences, so we draw samples according to the degree distribution of the nodes. Following [13], the sampling probability of node v_i can be denoted as follows:

P(v_i) = f(v_i)^{3/4} / Σ_{j=1}^{K} f(v_j)^{3/4}    (8)
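A minimal sketch of the sampling distribution in Equation (8); the helper name and toy frequencies are our own.

```python
import numpy as np

def negative_sampling_probs(freqs):
    """Sampling distribution of Eq. (8): probability proportional to node
    frequency (e.g., degree) raised to the 3/4 power."""
    weights = np.asarray(freqs, dtype=float) ** 0.75
    return weights / weights.sum()

# Toy degrees for four nodes; draw K = 5 negative samples.
probs = negative_sampling_probs([100, 10, 5, 1])
negatives = np.random.choice(4, size=5, p=probs)
```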

Based on the historical edge information and the ternary closure property, the objective function for generating the new edge (e_m^n, t) can be denoted as follows:

O(X) = log σ(λ̂_{e_m^n}(t)) + Σ_{i=1}^{K} E_{v_i ∼ P(v)}[ −log σ(λ̂_{e_m^{v_i}}(t)) ]    (9)

where K is the number of negative samples and σ(x) is the sigmoid function. We adopt the stochastic gradient descent method [3] to optimize the objective function in Equation (9). Algorithm 1 shows the core of our method.

Algorithm 1 MHDNE

Input: The dynamic information network G = {G1, G2, ..., GT}, embedding dimension d, parameter θ, time step h.
Output: The latent node embeddings X ∈ R^{|V|×d}.
1: Initialize X
2: for each epoch do
3:   for each batch (L1, L2) do
4:     Calculate the influence of historical information and network evolution information on the current edge according to Equation (5).
5:     Select K negative samples Neg_K according to Equation (8).
6:     Compute the objective function O(X) according to Equation (9).
7:     Update the embeddings: X = X − η ∂O(X)/∂X
8:   end for
9: end for
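The following is a minimal runnable sketch of one update step of Algorithm 1, under the simplifying assumption that the intensity keeps only its base term λ̂ = exp(−‖v_m − v_n‖) and the history sums of Equation (5) are dropped; it performs gradient ascent on Equation (9), and all names are ours rather than the paper's.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sgd_step(X, m, n, negatives, eta=0.01):
    """One gradient-ascent step on Eq. (9) for a single observed edge (m, n),
    with the intensity simplified to lambda-hat = exp(-||v_m - v_n||).
    X is the (num_nodes, dim) embedding matrix, updated in place."""
    def grad(a, b, sign):
        # d/dX[a] of sign * log(sigmoid(exp(-||X[a] - X[b]||)))
        diff = X[a] - X[b]
        dist = np.linalg.norm(diff) + 1e-12
        lam = np.exp(-dist)
        return sign * (1.0 - sigmoid(lam)) * (-lam) * diff / dist

    X[m] += eta * grad(m, n, +1.0)   # positive edge: pull m toward n
    X[n] += eta * grad(n, m, +1.0)
    for v in negatives:              # negative samples: push m and v apart
        X[m] += eta * grad(m, v, -1.0)
        X[v] += eta * grad(v, m, -1.0)

# Toy usage: 6 nodes, 16-dimensional embeddings.
X = 0.01 * np.random.randn(6, 16)
sgd_step(X, m=0, n=1, negatives=[3, 4, 5])
```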


4 Experiments

In this section, we validate the effectiveness of our model on two real-world datasets, summarized in Table 1. First, we introduce the datasets and baseline methods used in our experiments in detail; then we conduct the downstream tasks of node classification and visualization; finally, we analyze the sensitivity of the parameter θ.

4.1 Datasets

• DBLP [14]: the DBLP dataset contains a large amount of information about computer science publications. We built the dynamic network in our experiment from the co-author relationships of 28,085 authors in ten research fields over ten years. The category of an author is the research field in which they published the most papers.
• Epinions [20]: the Epinions dataset consists of comment information, user IDs, product IDs, and timestamp information. In our experiment, 21,575 users belonging to five categories over ten years were extracted from a subset of the Epinions dataset, and we build edges between users who comment on the same product. The category of a user is determined by the category of the products they comment on the most.

The detailed information of the two datasets is shown in Table 1.

Table 1. Datasets in experiments.

Dataset    Nodes   Edges      Classes  Type        Avg. clustering coefficient
DBLP       28,085  236,894    10       Undirected  0.715648
Epinions   21,575  2,590,798  5        Undirected  0.153612

4.2 Baseline Methods

In this paper, the MHDNE algorithm is compared with the following three baseline algorithms:

• Avg Deepwalk: we run the Deepwalk algorithm on the different snapshots to obtain node embeddings at different times.
• STWalk [15]: STWalk performs space-walks and time-walks on the constructed graphs, which captures the spatio-temporal behavior of nodes.
• HTNE [25]: HTNE performs dynamic network embedding based on the Hawkes process, capturing both historical and current information from the perspective of neighbor node sequences.


4.3 Downstream Tasks

In this section, we carry out the downstream tasks of node classification and visualization to verify the feasibility and effectiveness of the proposed dynamic network representation method MHDNE. The default experimental parameters are set as follows: the vector dimension is 128, the number of negative samples is 5, and the gradient descent learning rate is 0.01. For the algorithms using random walks and the skip-gram model, the random walk length is 50, the number of walks is 100, and the window size is 10.

A. Classification

We conduct the node classification task on the DBLP and Epinions datasets; the embeddings learned by the different methods are classified with a linear SVM classifier. We repeat the classification experiment ten times and take the averages of the Micro-F1 and Macro-F1 scores as the final classification results, shown in Tables 2 and 3.
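A sketch of this evaluation protocol using scikit-learn; the variable names and the single-split helper are our own, and the paper's final numbers average ten repetitions.

```python
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

def evaluate(X, y, train_ratio, seed):
    """Train a linear SVM on a fraction of the node embeddings X with
    labels y, and return (Micro-F1, Macro-F1) on the held-out nodes."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, train_size=train_ratio, random_state=seed)
    pred = LinearSVC().fit(X_tr, y_tr).predict(X_te)
    return (f1_score(y_te, pred, average="micro"),
            f1_score(y_te, pred, average="macro"))

# Average over ten repetitions, as in the paper's protocol:
# scores = [evaluate(X, y, 0.75, seed) for seed in range(10)]
```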

Table 2. Multi-class node classification results on the DBLP dataset (columns are training-set ratios).

Metric     Method         15%     30%     45%     60%     75%     90%
Micro-F1   Avg Deepwalk   0.6180  0.6253  0.6285  0.6310  0.6334  0.6371
           STWalk         0.6270  0.6336  0.6413  0.6540  0.6594  0.6586
           HTNE           0.6402  0.6521  0.6559  0.6594  0.6603  0.6559
           MHDNE          0.6685  0.6724  0.6792  0.6890  0.6975  0.6856
Macro-F1   Avg Deepwalk   0.6193  0.6297  0.6302  0.6390  0.6389  0.6323
           STWalk         0.6302  0.6382  0.6476  0.6550  0.6598  0.6612
           HTNE           0.6427  0.6584  0.6627  0.6583  0.6613  0.6594
           MHDNE          0.6711  0.6793  0.6801  0.6850  0.6983  0.6926

Table 3. Multi-class node classification results on the Epinions dataset (columns are training-set ratios).

Metric     Method         15%     30%     45%     60%     75%     90%
Micro-F1   Avg Deepwalk   0.5279  0.5302  0.5386  0.5321  0.5391  0.5302
           STWalk         0.5271  0.5327  0.5367  0.5409  0.5465  0.5497
           HTNE           0.5689  0.5786  0.5796  0.5803  0.5851  0.5867
           MHDNE          0.5964  0.5970  0.6054  0.6127  0.6089  0.6103
Macro-F1   Avg Deepwalk   0.5321  0.5343  0.5392  0.5441  0.5486  0.5467
           STWalk         0.5386  0.5401  0.5427  0.5489  0.5504  0.5526
           HTNE           0.5703  0.5794  0.5828  0.5864  0.5893  0.5907
           MHDNE          0.5989  0.6054  0.6086  0.6154  0.6121  0.6128

We set the training set size varying from 15% to 90%. As Tables 2 and 3 show, the MHDNE algorithm proposed in this paper performs better than the baseline methods in node classification on the DBLP and Epinions datasets. On the DBLP dataset, when the training set is 75%, MHDNE has the highest Macro-F1 and Micro-F1 scores, which are 3.72%~6.41% and 3.70%~5.94% higher than those of the comparison algorithms, respectively. On the Epinions dataset, when the training set is 60%, MHDNE has the highest Macro-F1 and Micro-F1 scores, which are 3.24%~8.06% and 2.90%~7.13% higher than those of the comparison algorithms. The experimental results show that integrating network historical information into current node embeddings improves the quality of the embeddings. Especially when we integrate both the historical edge information and the network evolution properties into network embedding, the obtained node embeddings perform better in classification.

B. Network visualization

We leverage the t-SNE algorithm to project the representation vectors of 2,500 authors from four fields (Data Mining, Artificial Intelligence, Information Retrieval, and Computer Vision) in the DBLP dataset into a 2-dimensional space. We use different colors to indicate the different research areas: purple dots represent "Data Mining", blue dots "Artificial Intelligence", orange dots "Information Retrieval", and green dots "Computer Vision". Figure 2 shows the visualization results of the node embeddings obtained by the different algorithms.

[Figure 2 contains four 2-D t-SNE scatter plots of the author embeddings: (a) Avg Deepwalk, (b) STWalk, (c) HTNE, (d) MHDNE.]

Fig. 2. Visualization of authors from four research areas.
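A sketch of how such a panel could be produced with scikit-learn and matplotlib; the input names `embeddings` and `labels` and the color mapping echo the description above but are otherwise our own assumptions.

```python
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def visualize(embeddings, labels, title):
    """Project node embeddings to 2-D with t-SNE and color the points by
    research area, mirroring the visualization protocol described above."""
    points = TSNE(n_components=2).fit_transform(embeddings)
    colors = {"Data Mining": "purple", "Artificial Intelligence": "blue",
              "Information Retrieval": "orange", "Computer Vision": "green"}
    for area, color in colors.items():
        idx = [i for i, label in enumerate(labels) if label == area]
        plt.scatter(points[idx, 0], points[idx, 1], s=3, c=color, label=area)
    plt.title(title)
    plt.legend()
    plt.show()
```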

As can be seen from Figure 2, the Avg Deepwalk algorithm maps only the authors in the "Artificial Intelligence" field to an independent community, and the authors in the other three domains are mixed together; the STWalk algorithm maps authors in the "Data Mining" and "Computer Vision" domains to relatively scattered locations, failing to preserve the properties of such nodes; the HTNE algorithm maps part of the authors in the "Artificial Intelligence", "Information Retrieval" and "Computer Vision" fields to different communities, but maps some authors in these three fields into the "Data Mining" region; compared with the other algorithms, the proposed MHDNE algorithm maps authors into distinct communities with clear margins among different areas.

The visualization results indicate that combining historical information with network evolution information helps community detection, since the formation of a community is often related to historical information. Because the embeddings generated by MHDNE integrate both kinds of information, they preserve community structure better, and therefore perform better in visualization than those of the baseline methods.

5 Parameter Sensitivity

The time decay function is expressed in the exponential form δ(t, t_s) = exp(−θ(t − t_s)), where θ is the decay coefficient of the excitation by historical events. To analyze the parameter sensitivity, we observe how the classification accuracy changes as θ varies from 0.01 to 1. We conduct node classification on DBLP and Epinions with the training set size set to 75%. From Equation (2) it can be seen that the larger the value of θ, the smaller the influence of historical events on current events. The experimental results are shown in Figure 3.

[Figure 3 plots the node classification accuracy as a function of θ on (a) the DBLP dataset and (b) the Epinions dataset.]

Fig. 3. Sensitivity of θ on the DBLP and Epinions datasets.

From Figure 3, it can be seen that the optimal value of θ differs between the two datasets. On the DBLP dataset, the classification accuracy is highest when θ = 0.2; for 0.2 < θ < 0.4, the accuracy remains basically unchanged; for θ > 0.4, the accuracy decreases slightly as θ increases. On the Epinions dataset, the classification accuracy is highest when θ = 0.3; for θ > 0.4, the accuracy decreases sharply as θ increases.

The main reason for the different results on the two datasets is that the DBLP dataset is built from authors and their co-author relationships, and co-author relationships do not change much in a short time. Therefore, the historical information in the DBLP dataset decays only slightly over time, and the value of θ should be small. The Epinions dataset, in contrast, constitutes a social network of users and their comment behaviors, and its historical information decays strongly over time, so θ should be larger.

6 Conclusion

To incorporate the dynamic properties of contemporary networks into network embedding, we proposed MHDNE, a dynamic network embedding method based on the Hawkes process. Since the node embedding vectors learned by our model capture both the historical structure information and the evolution mechanism, they perform well in downstream tasks such as node classification and network visualization. At present, research on dynamic network properties is still in its infancy, and our method only takes the dynamic properties of homogeneous networks into account. However, real-life networks may be both dynamic and heterogeneous. How to incorporate rich heterogeneous information into dynamic network embedding is our next research focus.

References

1. Ana Paula Appel, Renato L. F. Cunha, Charu C. Aggarwal, and Marcela Megumi Terakado. Temporally evolving community detection and prediction in content-centric networks. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pages 3–18. Springer, 2018.

2. Mikhail Belkin and Partha Niyogi. Laplacian eigenmaps and spectral techniques for embedding and clustering. In Advances in Neural Information Processing Systems, pages 585–591, 2002.

3. Léon Bottou. Stochastic gradient learning in neural networks. Proceedings of Neuro-Nîmes, 91(8):12, 1991.

4. Quanyu Dai, Qiang Li, Jian Tang, and Dan Wang. Adversarial network embedding. In Thirty-Second AAAI Conference on Artificial Intelligence, pages 2167–2174, 2018.

5. Lun Du, Yun Wang, Guojie Song, Zhicong Lu, and Junshan Wang. Dynamic network embedding: An extended approach for skip-gram based network embedding. In International Joint Conferences on Artificial Intelligence Organization, pages 2086–2092, 2018.

6. Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2672–2680, 2014.

7. Palash Goyal, Nitin Kamra, Xinran He, and Yan Liu. DynGEM: Deep embedding method for dynamic graphs. In International Joint Conference on Artificial Intelligence (International Workshop on Representation Learning for Graphs), 2018.

8. Aditya Grover and Jure Leskovec. node2vec: Scalable feature learning for networks. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 855–864. ACM, 2016.

9. Alan G. Hawkes. Spectra of some self-exciting and mutually exciting point processes. Biometrika, 58(1):83–90, 1971.

10. Jundong Li, Harsh Dani, Xia Hu, Jiliang Tang, Yi Chang, and Huan Liu. Attributed network embedding for learning in a dynamic environment. In Proceedings of the 2017 ACM Conference on Information and Knowledge Management, pages 387–396. ACM, 2017.

11. Taisong Li, Jiawei Zhang, Philip S. Yu, Yan Zhang, and Yonghong Yan. Deep dynamic network embedding for link prediction. IEEE Access, 6:29219–29230, 2018.

12. Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient estimation of word representations in vector space. In International Conference on Learning Representations (Workshop), 2013.

13. Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems, pages 3111–3119, 2013.

14. Catarina Moreira, Pavel Calado, and Bruno Martins. Learning to rank academic experts in the DBLP dataset. Expert Systems, 32(4):477–493, 2015.

15. Supriya Pandhre, Himangi Mittal, Manish Gupta, and Vineeth N Balasubramanian. STWalk: Learning trajectory representations in temporal graphs. In Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, pages 210–219. ACM, 2018.

16. Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. Deepwalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 701–710. ACM, 2014.

17. Giulio Rossetti and Rémy Cazabet. Community discovery in dynamic networks: A survey. ACM Computing Surveys, 51(2):1–35, 2018.

18. Sam T. Roweis and Lawrence K. Saul. Nonlinear dimensionality reduction by locally linear embedding. Science, 290(5500):2323–2326, 2000.

19. Jian Tang, Meng Qu, Mingzhe Wang, Ming Zhang, Jun Yan, and Qiaozhu Mei. LINE: Large-scale information network embedding. In Proceedings of the 24th International Conference on World Wide Web, pages 1067–1077, 2015.

20. Jiliang Tang, Huiji Gao, and Huan Liu. mTrust: Discerning multi-faceted trust in a connected world. In Proceedings of the Fifth ACM International Conference on Web Search and Data Mining, pages 93–102. ACM, 2012.

21. Daixin Wang, Peng Cui, and Wenwu Zhu. Structural deep network embedding. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1225–1234. ACM, 2016.

22. Hongwei Wang, Jia Wang, Jialin Wang, Miao Zhao, Weinan Zhang, Fuzheng Zhang, Xing Xie, and Minyi Guo. GraphGAN: Graph representation learning with generative adversarial nets. In Thirty-Second AAAI Conference on Artificial Intelligence, pages 2508–2515, 2018.

23. Lekui Zhou, Yang Yang, Xiang Ren, Fei Wu, and Yueting Zhuang. Dynamic network embedding by modeling triadic closure process. In Thirty-Second AAAI Conference on Artificial Intelligence, 2018.

24. Dingyuan Zhu, Peng Cui, Ziwei Zhang, Jian Pei, and Wenwu Zhu. High-order proximity preserved embedding for dynamic networks. IEEE Transactions on Knowledge and Data Engineering, 30(11):2134–2144, 2018.

25. Yuan Zuo, Guannan Liu, Hao Lin, Jia Guo, Xiaoqian Hu, and Junjie Wu. Embedding temporal network via neighborhood formation. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 2857–2866. ACM, 2018.