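This post takes a brief look at graph pattern mining, a branch of pattern mining: what it studies and which methods it uses. Pattern mining generally looks for sub-patterns that occur frequently in a database, so graph pattern mining naturally looks for subgraphs that occur frequently in a graph database. What does that look like in practice? There are two settings: finding subgraphs shared by many graphs in a graph database, or finding subgraphs that recur frequently within a single graph. These two settings correspond to the gSpan algorithm (2002) and ...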
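To make the first setting concrete (subgraphs shared by many graphs in a database), here is a minimal sketch of counting how many graphs contain a given labeled pattern. It is a brute-force illustration, not gSpan itself. The graph representation, the helper names `make_graph`, `contains` and `support`, and the toy data are all assumptions made for this example.

```python
from itertools import permutations

def make_graph(labels, edges):
    """Hypothetical representation: (node -> label dict, set of undirected edges)."""
    return labels, {frozenset(e) for e in edges}

def contains(graph, pattern):
    """Brute-force check: does `graph` contain `pattern` as a labeled subgraph?

    Tries every injective mapping of pattern nodes onto graph nodes.
    Exponential in the pattern size, so only suitable for tiny examples.
    Assumes undirected graphs without self-loops.
    """
    g_labels, g_edges = graph
    p_labels, p_edges = pattern
    p_nodes = list(p_labels)
    for image in permutations(g_labels, len(p_nodes)):
        mapping = dict(zip(p_nodes, image))
        # Node labels must match under the mapping.
        if any(p_labels[u] != g_labels[mapping[u]] for u in p_nodes):
            continue
        # Every pattern edge must map onto an edge of the graph.
        if all(frozenset(mapping[u] for u in e) in g_edges for e in p_edges):
            return True
    return False

def support(database, pattern):
    """Number of graphs in the database that contain the pattern."""
    return sum(contains(g, pattern) for g in database)

# Toy database of three small labeled graphs (made-up data).
database = [
    make_graph({0: "A", 1: "B", 2: "C"}, [(0, 1), (1, 2)]),
    make_graph({0: "A", 1: "B"}, [(0, 1)]),
    make_graph({0: "A", 1: "C", 2: "B"}, [(0, 1)]),
]

# Pattern: a single edge between a node labeled A and a node labeled B.
pattern = make_graph({"x": "A", "y": "B"}, [("x", "y")])

min_support = 2  # hypothetical frequency threshold
count = support(database, pattern)
print(count, count >= min_support)
```

In this toy database the A-B edge appears in the first two graphs but not in the third, where A and B are not adjacent. A real miner avoids testing every candidate this way; the sketch only shows what "frequent across many graphs" means.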
Knowledge Discovery and Data Mining
Zhu, F., Yan, X., Han, J., Yu, P.S.: gPrune: A constraint pushing framework for graph pattern mining. In: Zhou, Z.-H., Li, H., Yang, Q. (eds.) PAKDD 2007. LNCS (LNAI), vol. 4426, pp. 388–400. Springer, Heidelberg (2007)...
pattern discovery; graph mining; graph-structured pattern; inductive inference; outerplanar graph
Recently, due to the rapid growth of electronic data having graph structures such as HTML and XML texts and chemical compounds, many researchers have been interested in data mining and machine learning techniques for ...
Recent years have witnessed a surge of interest in learning representations of graph-structured data, with applications from social networks to drug discovery. However, graph neural networks, the machine learning models for handling graph-structured data
Editor-in-Chief of the journal Pattern Recognition, etc.) was originally a PhD in physics, with early research focused on ...
Machine learning plays an increasingly important role in many areas of chemistry and materials science, being used to predict materials properties, accelerate simulations, design new structures, and predict synthesis routes of new materials. Graph neural networks (GNNs) are one of the fastest growing ...
Let C denote the set of medical codes, and let ‖Hie‖ denote the number of diseases in the hierarchical structure Hie. Hie follows a pattern where higher-level diseases contain more general information, while lower-level diseases provide more specific ...
PREF/WD tends to generate many trees to maximize the data locality; the same table might occur in multiple trees and therefore be duplicated many times. Although PREF is the state-of-the-art partitioning method, it still has three major drawbacks. First, PREF/SD tends to cause a large ...
Frequent substructure pattern mining is an emerging data mining problem with many scientific and commercial applications. As a general data structure, a labeled graph can be used to model complicated substructure patterns in data. Given a graph dataset, D={G0, G1, ..., Gn}, support...
Existing methods for fine-scale air quality assessment have significant gaps in their reliability. Purely data-driven methods lack any physically-based mechanisms to simulate the interactive process of air pollution, potentially leading to physically inc