검색 상세

링크와 노드의 내용을 반영한 그래프 노드의 임베딩

Content- and Link-Aware Node Embedding in Graph

초록/요약

Node embedding is a method to map each node in a graph into a low dimensional vector space. With the embeddings, many downstream machine learning methods can be applied, such as classification or clustering. Most of the node embedding method only considers the graph's structure which means linkage relationship among the nodes in a graph. Nowadays, there are many graph data containing nodes' attributes in the real world. The node attributes contain information as much as the linkage relationships do. If the nodes’ own attributes as well as the graph’s structure can be incorporated into the embeddings, it can help to improve the quality of the embeddings. Some of these existing embedding methods rely on the sampling of random walks. The samples of random work are to be fed into a language model to get the embeddings of the nodes. Most of random walk based node embedding methods rely on the random walk sampling process. The sampling strategy, such as how many random walks will be sampled or how long a random walk is, affects the quality of the embeddings. In this thesis, we propose a node embedding method based on both the nodes' attributes and its local structure. Our model can also output the probability that an edge exists between given nodes. The directionality of the edge can be considered if necessary. We can use these probabilities as similarity among the nodes. Our experimental results show that the proposed model outperforms several previous methods for node classification, link prediction, and graph reconstruction tasks on citation datasets and social network datasets.

more

목차

I. Introduction 1
II. Related Works 4
A. Random walk based methods 4
B. Non-Random walk based methods 5
III. Preliminaries 9
A. Attribute Embedding 9
B. Adjacent nodes in a Directed Graph 9
C. Message 10
D. Content- and Link-aware node embedding 10
E. Random walk on a Graph 12
IV. Our Framework 15
A. Embedding Nodes 15
B. Convergence of distributions of node embeddings 17
C. Transition Probability 19
D. Overall Structure 22
E. Approximation 23
V. Experiment 27
A. Dataset 27
B. Baseline methods 28
C. Node Classification 31
D. Directed Graph Reconstruction 34
VI. Conclusion 38
References 39

more