RI: Small: Algorithms for Sampling Similar Graphs Using Subgraph Signatures
Purdue University, West Lafayette IN
Investigators
Abstract
Abstract below: Graphs and networks are a natural representation across a wide range of disciplines and domains. Statistical tools have recently been brought to bear on the analysis of graphs, yielding rich dividends in various application areas. The aim of this project is to use tools from statistics and graph theory to develop algorithms that generate similar graphs efficiently. Since graph data is often expensive to collect, it is desirable to synthetically generate graphs. To be widely applicable however, the generated graphs need to both preserve the semantics of the original data (i.e., be drawn from the same distribution) and be efficient to compute. Two key questions form the core emphasis of the current project. First, how does one measure similarity between two graphs? Second, how can this notion of similarity be used to generate new graphs? On the topic of similarity, the project will investigate representations to preserve global properties, propose new, efficient, representations for signatures, and explore sampling techniques and their convergence behavior. On the topic of generation of new graphs, the project will develop an exponential random graph model using signatures, investigate feature selection via regularization, propose novel methods to sample from the exponential random graph model and novel techniques to produce proposal graphs, and provide rigorous empirical validation across a range of application areas. The project will facilitate the study of large complex structures of the kind frequently encountered in domains like theoretical ecology, social networks, and chemo-informatics, allowing researchers in these domains to leverage statistical network analysis tools to identify significant patterns and understand algorithm performance. For further information see the project web page: URL: http://www.stat.purdue.edu/~vishy/graphs.html
View original record on NSF Award Search →