CMU-CS-24-139
Computer Science Department
School of Computer Science, Carnegie Mellon University
Deep Learning on Graphs:
Minji Yoon
Ph.D. Thesis
July 2024
Graphs are everywhere, from e-commerce to knowledge graphs, abstracting interactions among individual data entities. Various real-world applications running on graph-structured data require effective representations for each part of the graph – nodes, edges, subgraphs, and the entire graph – that encode its essential characteristics. In recent years, Deep Learning on Graphs (DLG) has broken ground across diverse domains by learning graph representations that successfully capture the underlying inductive bias in graphs. However, these groundbreaking DLG algorithms sometimes face limitations when applied to real-world scenarios. First, because graphs can be built on any domain that has interactions among entities, real-world graphs are diverse. Thus, every new application requires domain expertise and tedious hyperparameter tuning to find an optimal DLG algorithm. Second, real-world graphs keep growing, reaching the scale of billions, and contain unfiltered noise. Applying DLG to them therefore requires additional preprocessing, such as graph sampling and noise filtering, in advance. Third, real-world graphs are mostly proprietary, while many DLG algorithms assume full access to external graphs in order to learn their distributions or extract knowledge to transfer to other graphs. Finally, the advent of single-modal foundation models in the language and vision fields has catalyzed the assembly of diverse modalities, resulting in multimodal graphs with diverse modalities on nodes and edges. However, learning on multimodal graphs while exploiting the generative capabilities of each modality's foundation model remains an open question in DLG. In this thesis, I propose to make DLG more practical across four dimensions: 1) automation, 2) scalability, 3) privacy, and 4) multimodality. First, we automate algorithm search and hyperparameter tuning under the message-passing framework. 
Then, to handle scalability, we propose to sample each node's neighborhood to regulate the computation cost while adaptively filtering out noisy neighbors for the target task. For privacy, we redefine conventional problem definitions, including graph generation and transfer learning, to be aware of the proprietary and privacy-restricted nature of real-world graphs. Finally, I propose a new multimodal graph learning algorithm that is built on unimodal foundation models and generates content based on multimodal neighbor information. As the data collected by humanity increases in scale and diversity, the relationships among individual elements increase quadratically in scale and complexity. By making DLG more scalable, privacy-certified, and multimodal, we hope to enable better processing of these relationships and positively impact a wide array of domains.
170 pages
Thesis Committee:
Srinivasan Seshan, Head, Computer Science Department