Publication: A Non-Parametric Perspective on the Analysis of Massive Networks
No Thumbnail Available
Date
2015-05-15
Authors
Published Version
Published Version
Journal Title
Journal ISSN
Volume Title
Publisher
The Harvard community has made this article openly available. Please share how this access benefits you.
Citation
Costa, Thiago. 2015. A Non-Parametric Perspective on the Analysis of Massive Networks. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.
Research Data
Abstract
This dissertation develops an inferential framework for a highly non-parametric class of network models called graphons, which are the limit objects of converging sequences in the theory of dense graph limits. The theory, introduced by Lovász and co-authors, uses structural properties of very large networks to describe a notion of convergence for sequences of dense graphs. Converging sequences define a limit which can be represented by a class of random graphs that preserve many properties of the networks in the sequence. These random graphs are intuitive and have a relatively simple mathematical representation, but they are very difficult to estimate due to their non-parametric nature. Our work, which develops scalable and consistent methods for estimating graphons, offers an algorithmic framework that can be used to unlock the potential of applications of this powerful theory.
To estimate graphons we use a stochastic blockmodel approximation approach that defines a notion of similarity between vertices to cluster vertices and find the blocks. We show how to compute these similarity distances from a given graph and how to properly cluster the vertices of the graph in order to form the blocks. The method is non-parametric, i.e., it uses the data to choose a convenient number of clusters. Our approach requires a careful balance between the number of blocks created, which is associated with stochastic blockmodel approximation of the graphon, and the size of the clusters, which is associated with the estimation of the stochastic blockmodel parameters. We prove insightful properties regarding the clustering mechanism and the similarity distance, and we also work with important variations of the graphon model, including a sparser type of graphon.
As an application of our framework, we use the stochastic blockmodel nature of our method to improve identification of treatment response with social interaction. We show how the graph structure provided by our algorithm can be explored to design optimal experiments for assessing social effect on treatment response.
Description
Other Available Sources
Keywords
Mathematics, Statistics, Artificial Intelligence
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service