A Non-Parametric Perspective on the Analysis of Massive Networks
MetadataShow full item record
CitationCosta, Thiago. 2015. A Non-Parametric Perspective on the Analysis of Massive Networks. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.
AbstractThis dissertation develops an inferential framework for a highly non-parametric class of network models called graphons, which are the limit objects of converging sequences in the theory of dense graph limits. The theory, introduced by Lovász and co-authors, uses structural properties of very large networks to describe a notion of convergence for sequences of dense graphs. Converging sequences define a limit which can be represented by a class of random graphs that preserve many properties of the networks in the sequence. These random graphs are intuitive and have a relatively simple mathematical representation, but they are very difficult to estimate due to their non-parametric nature. Our work, which develops scalable and consistent methods for estimating graphons, offers an algorithmic framework that can be used to unlock the potential of applications of this powerful theory.
To estimate graphons we use a stochastic blockmodel approximation approach that defines a notion of similarity between vertices to cluster vertices and find the blocks. We show how to compute these similarity distances from a given graph and how to properly cluster the vertices of the graph in order to form the blocks. The method is non-parametric, i.e., it uses the data to choose a convenient number of clusters. Our approach requires a careful balance between the number of blocks created, which is associated with stochastic blockmodel approximation of the graphon, and the size of the clusters, which is associated with the estimation of the stochastic blockmodel parameters. We prove insightful properties regarding the clustering mechanism and the similarity distance, and we also work with important variations of the graphon model, including a sparser type of graphon.
As an application of our framework, we use the stochastic blockmodel nature of our method to improve identification of treatment response with social interaction. We show how the graph structure provided by our algorithm can be explored to design optimal experiments for assessing social effect on treatment response.
Citable link to this pagehttp://nrs.harvard.edu/urn-3:HUL.InstRepos:17467341
- FAS Theses and Dissertations