Scientific Track

Scalable Bayesian Non-Negative Tensor Factorization for Massive Count Data

We present a Bayesian non-negative tensor factorization model for count-valued tensor data, and develop scalable inference algorithms (both batch and online) for dealing with massive tensors. Our generative model can handle overdispersed counts as well as infer the rank of the decomposition. Moreover, leveraging a reparameterization of the Poisson distribution as a multinomial facilitates conjugacy in the model and enables simple Gibbs sampling and variational Bayes (VB) inference updates. We also develop a set of online inference algorithms that allow scaling up the model to massive tensors.

Robust Classification of Information Networks by Consistent Graph Learning

Graph regularization-based methods have achieved great success for network classification by making the label-link consistency assumption, i.e., if two nodes are linked together, they are likely to belong to the same class. However, in a real-world network, there exist links that connect nodes of different classes. These inconsistent links raise a big challenge for graph regularization and deteriorate the classification performance significantly. To address this problem, we propose a novel algorithm, namely Consistent Graph Learning, which is robust to the inconsistent links of a network.

Response-Guided Community Detection: Application to Climate Index Discovery

Discovering climate indices--time series that summarize spatiotemporal climate patterns--is a key task in the climate science domain. In this work, we approach this task as a problem of response-guided community detection; that is, identifying communities in a graph associated with a response variable of interest.

Maintaining sliding-window neighborhood profiles in interaction networks

Large networks are being generated by applications that keep track of relationships between different data entities. Examples include online social networks recording interactions between individuals, sensor networks logging information exchanges between sensors, and more. There is a large body of literature on computing exact or approximate properties on large networks, although most methods assume static net- works. On the other hand, in most modern real-world applications, net- works are highly dynamic and continuously interactions along existing connections are generated.

Latent Factors Meet Homophily in Diffusion Modelling

Diffusion is an important dynamics that helps spreading informationwithin an online social network. While there are already numerous models for single item diffusion, few have studied diffusion of multiple items, especially when items can interact with one another due to their inter-similarity. Moreover, the well-known homophily effect is rarely considered explicitly in the existing diffusion models.

Hierarchical Sparse Dictionary Learning

Sparse coding plays a key role in high dimensional data analysis. One critical challenge of sparse coding is to design a dictionary that is both adaptive to the training data and generalizable to data of same type. In this paper, we propose a novel dictionary learning algorithm to build an adaptive dictionary regularized by a priori over-completed dictionary. This hence leads to a hierarchical sparse structure on the a priori dictionary over the learned dictionary, and a hierarchical sparse structure on the learned dictionary over the data, respectively.

Handling oversampling in dynamic networks using link prediction

Oversampling is a common characteristic of data representing dynamic networks. It introduces noise into representations of dynamic networks, but there has been little work so far to compensate for it. Oversampling can affect the quality of many important algorithmic problems on dynamic networks, including link prediction. Link prediction asks to predict edges that will be added to the network given previous snapshots of the network.

Generalized Modularity for Community Detection

Detecting the underlying community structure of networks is an important problem in complex network analysis. Modularity is a well-known quality function introduced by Newman, that measures how vertices in a community share more edges than what would be expected in a randomized network.

Finding large dense subgraphs in relational graphs

This paper considers the problem of finding large dense subgraphs in relational graphs, i.e., a set of graphs which share a common vertex set. We present an approximation algorithm for finding the densest common subgraph in a relational graph setbased on an extension of Charikar's linear programming basedapproach for finding the densest subgraph in a single graph. We also present a simple greedyheuristic which can be implemented efficiently foranalysis of larger graphs. We give graph dependent bounds on thequality of the solutions returned by our methods.

Finding Community Topics and Membership in Graphs

Community detection in networks is a broad problem with many proposed solutions. Existing methods frequently make use of edge density and node attributes; however, the methods ultimately have different definitions of community and build strong assumptions about community features into their models. We propose a new method for community detection, which estimates both per-community feature distributions (topics) and per-node community membership. Communities are modeled as connected subgraphs with nodes sharing similar attributes.