Conrad Lee, PhD Student in Computational Social Science, University College Dublin, gives a talk on: 'Are network communities good for nothing? Benchmarking algorithms with inference tasks.'
Title: Are network communities good for nothing? Benchmarking algorithms with inference tasks
Speaker: Conrad Lee, PhD Student in Computational Social Science, University College Dublin
Abstract: While community detection algorithms proliferate like rabbits in the spring, relatively little work has gone into determining which methods work best. In many cases, we know only that a given method can partition Zachary's karate club, a problem that was solved over thirty years ago. Furthermore, the small literature on benchmarking these algorithms focuses on synthetic data, leaving us with little evidence to support the claim that we can find meaningful communities in non-trivial, real-world social network data. We know so little about the performance of these algorithms because, on the one hand, we have poor a priori intuition about how network communities are actually structured and, on the other hand, we lack datasets that have a "ground truth" set of communities.
In this presentation, I argue that the quality of network communities can be evaluated by measuring how well they allow inference of missing information, such as certain node attributes and missing links. More concretely, good network communities should provide a machine learning model with informative features. I will discuss some conceptual and practical difficulties that came up when implementing a benchmark based on this premise using the Facebook100 dataset. Early results indicate that all tested methods are biased toward a particular scale, a finding which suggests that a scaling parameter is necessary. For example, modularity maximization and the Map Equation perform poorly, even in their hierarchical versions. Their performance improved only with their generalized formulations, which include a scaling parameter that alters the underlying objective function.
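The premise of the benchmark can be illustrated with a minimal sketch. It is not the speaker's implementation and does not use the Facebook100 data: the toy attribute ("dorm"), the two hand-written partitions, and the function name attribute_inference_accuracy are all assumptions made for illustration, and a simple majority vote within each community stands in for the machine learning model mentioned in the abstract. The idea is to hide an attribute for some nodes, predict it from community membership alone, and score the partition by its accuracy.

```python
import random
from collections import Counter, defaultdict


def attribute_inference_accuracy(communities, attributes,
                                 holdout_fraction=0.3, seed=0):
    """Score a partition by how well it predicts a hidden node attribute.

    communities: dict mapping node -> community label (output of any method)
    attributes:  dict mapping node -> attribute value (the inference target)
    Assumes at least two nodes, so that some labels always stay visible.
    """
    rng = random.Random(seed)
    nodes = sorted(attributes)
    n_hidden = int(len(nodes) * holdout_fraction)
    n_hidden = min(max(1, n_hidden), len(nodes) - 1)
    hidden = set(rng.sample(nodes, n_hidden))

    # Tally the visible attribute values per community and overall.
    per_community = defaultdict(Counter)
    overall = Counter()
    for node in nodes:
        if node not in hidden:
            per_community[communities[node]][attributes[node]] += 1
            overall[attributes[node]] += 1

    # Predict each hidden value by majority vote within the node's community,
    # falling back to the global majority if the community has no visible nodes.
    correct = 0
    for node in sorted(hidden):
        counts = per_community.get(communities[node]) or overall
        prediction = counts.most_common(1)[0][0]
        correct += prediction == attributes[node]
    return correct / len(hidden)


# Hypothetical toy data: 40 nodes, the first 20 in dorm "A", the rest in "B".
attributes = {node: ("A" if node < 20 else "B") for node in range(40)}

# Two hypothetical partitions, standing in for the output of two algorithms.
aligned_partition = {node: node // 20 for node in range(40)}   # matches dorms
unaligned_partition = {node: node % 2 for node in range(40)}   # ignores dorms

aligned_score = attribute_inference_accuracy(aligned_partition, attributes)
unaligned_score = attribute_inference_accuracy(unaligned_partition, attributes)
print("aligned partition accuracy:  ", aligned_score)
print("unaligned partition accuracy:", unaligned_score)
```

In this toy setting the partition that matches the dorms recovers every hidden value, while the partition that ignores them can do no better than guessing one label per community. A real benchmark would replace the majority vote with a trained classifier and evaluate several attributes, as well as missing links.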
Time & place:
Monday 1 October at 10:30-11:30
Building 321, room 119
Everyone is welcome!