# PyCABeM
Python Benchmarking Framework for the Clustering Algorithms Evaluation: networks generation and shuffling; failover execution and resource consumption tracing (peak RAM RSS, CPU, ...); evaluation of Modularity, conductance, NMI and F1 Score for overlapping communities.
:heavy_exclamation_mark: ATTENTION: the new version of PyCABeM, based on PyExPool v.3+ with a WebUI, has been rebranded to Clubmark (demonstrated and published at ICDM18) :heavy_exclamation_mark:
PyCABeM (former HiCBeM) - Python Benchmarking Framework for the Clustering Algorithms Evaluation
\brief Uses extrinsic (NMIs) and intrinsic (Q) measures for the clusters quality evaluation considering overlaps (node membership in multiple clusters)
\author: (c) Artem Lutov artem@exascale.info
\organizations: eXascale Infolab, Lumais, ScienceWise
\keywords: overlapping clustering benchmarking, community detection benchmarking, algorithms benchmarking framework.
## Content
- Functionality
- Dependencies
- Prerequisites
- Usage
- Usage Examples
- Benchmark Structure
- Extension
- Related Projects
## Functionality
### Generic Benchmarking Framework
- optionally generates or preprocesses datasets using the specified executable(s) (by default uses the LFR framework for overlapping weighted networks)
- optionally executes the specified apps (clustering algorithms; each can be a binary, any script or Java executable) with the specified parameters on the specified datasets (networks)
- optionally evaluates the results of the execution using the specified executable(s) (by default performs NMIs and Q evaluation) and performs unified aggregation of the results from multiple apps on multiple datasets into a single file by the specified measure
- per-task and global timeouts (for an app execution on a single dataset) and the specified number of CPU cores (workers) are set for the batch apps execution / evaluation using the multi-process task execution pool (mpepool)
- per-task and accumulative execution tracing and results logging are performed even in case of internal / external interruptions and crashes:
  - all stdout/stderr output is logged
  - resource consumption is traced, i.e. time: execution (wall-clock) and CPU consumption (user, kernel, total), and memory (RAM RSS)
- automatic extension / backup of the previously existing results to .gzip with a timestamp on the benchmarking re-execution
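The backup step can be sketched as follows; the timestamp format and the `.gz` naming here are illustrative assumptions, not necessarily the framework's exact convention:

```python
import gzip
import os
import shutil
import time


def backup_results(path):
    """Compress an existing results file to <path>.<timestamp>.gz
    before a benchmark re-execution overwrites it.

    Returns the backup file name, or None if there was nothing to back up.
    """
    if not os.path.exists(path):
        return None
    # A timestamp suffix avoids clobbering earlier backups
    stamp = time.strftime('%Y%m%d_%H%M%S')
    backup = '{}.{}.gz'.format(path, stamp)
    with open(path, 'rb') as fsrc, gzip.open(backup, 'wb') as fdst:
        shutil.copyfileobj(fsrc, fdst)
    os.remove(path)  # the original will be recreated by the new run
    return backup
```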
It is possible to have multiple input directories with similarly named files inside, which represent different instances / snapshots of the datasets. In that case, the output results are provided per snapshot, plus an aggregated weighted average over all snapshots. This is useful to avoid an occasional bias towards a specific instance or to analyze evolving networks.
If a measured application crashes, the crash is logged and has no impact on the execution of the remaining applications.
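A minimal Python 3 sketch of such failover execution with a per-task timeout and wall-clock / CPU time tracing (RAM RSS tracing is omitted, since it requires a third-party library such as psutil; the function name and result layout are illustrative, not the framework's API):

```python
import os
import subprocess
import time


def run_traced(args, timeout):
    """Run one app on one dataset with a per-task timeout, tracing
    wall-clock and CPU time; a crash or timeout of the app is reported
    in the result and does not affect the remaining tasks.
    """
    wall0 = time.time()
    cpu0 = os.times()
    proc = subprocess.Popen(args, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    try:
        out, err = proc.communicate(timeout=timeout)
        rcode = proc.returncode
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.communicate()  # reap the killed process
        out, err, rcode = b'', b'', None  # None marks a timeout
    cpu1 = os.times()
    return {
        'rcode': rcode,
        'wall': time.time() - wall0,
        # children_user + children_system covers the traced subprocess
        'cpu': (cpu1.children_user - cpu0.children_user)
             + (cpu1.children_system - cpu0.children_system),
        'stdout': out,
        'stderr': err,
    }
```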
### Benchmark of the Hierarchical Overlapping Clustering Algorithms
The benchmark is implemented as a customization of the Generic Benchmarking Framework to evaluate hierarchical overlapping clustering algorithms:
- produces synthetic networks with the specified number of instances for each set of parameters, generating them by the extended LFR Framework ("Benchmarks for testing community detection algorithms on directed and weighted graphs with overlapping communities" by Andrea Lancichinetti and Santo Fortunato)
- shuffles (reorders the nodes of) the specified networks the specified number of times, which is required to evaluate stability / determinism of the clustering algorithms
- executes the following clustering algorithms on the generated synthetic networks (or on any specified directories and files):
  - HiReCS (www.lumais.com/hirecs)
  - SCP (Sequential algorithm for fast Clique Percolation)
  - Louvain (original and igraph implementations)
  - Oslom2
  - GANXiS/SLPA (this algorithm is not uploaded into the repository, because it was provided by the author Jerry Xie for "academic use only"; the deterministic algorithm LabelRankT is a modification of GANXiS, but LabelRankT is not publicly available)
  - Randcommuns - generation of random communities (clusters) with a structure similar to the ground truth: the same number of clusters as in the ground truth, each consisting of the same number of randomly chosen connected nodes

  The output results of the clustering algorithms (clusters / communities structure, hierarchy, modularity, NMI, etc.) are stored in the corresponding files.
Features \ Algs | HiReCS | SCP | Louvain | Oslom2 | GANXiS
--- | :-: | :-: | :-: | :-: | :-:
Hierarchical | + | | + | + |
Multi-scale | + | + | + | + | +
Deterministic | + | + | | |
With Overlaps | + | + | | + | +
Parameter-Free | + | | + | |
- evaluates the results using:
  - extrinsic measures - NMIs for overlapping communities, extended to have uniform input / output formats:
    - NMI - [gecmi](https://bitbucket.org/dsign/gecmi/wiki/Home) ("Comparing network covers using mutual information" by Alcides Viamontes Esquivel, Martin Rosvall)
    - NMI_s - [onmi](https://github.com/aaronmcdaid/Overlapping-NMI) ("Normalized Mutual Information to evaluate overlapping community finding algorithms" by Aaron F. McDaid, Derek Greene, Neil Hurley)
  - intrinsic measure - Q (the standard modularity value, but applicable to overlapping communities), evaluated by [HiReCS](http://www.lumais.com/hirecs)
- evaluates resource consumption using the [exectime](https://bitbucket.org/lumais/exectime/) profiler
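For intuition, below is a minimal sketch of the standard NMI for hard (non-overlapping) partitions; gecmi and onmi generalize this measure to overlapping covers, which this sketch does not attempt:

```python
from collections import Counter
from math import log


def nmi(labels_a, labels_b):
    """Standard NMI between two hard partitions given as per-node label
    sequences. Returns a value in [0, 1]; 1 means the clusterings are
    identical up to a relabeling of the clusters.
    """
    n = len(labels_a)
    assert n == len(labels_b) and n > 0
    ca = Counter(labels_a)                   # cluster sizes in partition A
    cb = Counter(labels_b)                   # cluster sizes in partition B
    cab = Counter(zip(labels_a, labels_b))   # joint cluster-pair counts
    # Mutual information I(A; B)
    mi = sum(nij / n * log(n * nij / (ca[i] * cb[j]))
             for (i, j), nij in cab.items())
    # Entropies H(A) and H(B)
    ha = -sum(c / n * log(c / n) for c in ca.values())
    hb = -sum(c / n * log(c / n) for c in cb.values())
    if ha == 0 and hb == 0:
        return 1.0  # both clusterings are a single cluster
    # Normalize by the average entropy (one common convention)
    return 2 * mi / (ha + hb)
```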
All results and traces are stored into the corresponding files even in case of internal (crash) / external termination of the benchmarked applications or the whole framework.
Note: valuable extensions of the employed external applications are uploaded into `./contrib/`
Basically, the framework executes a set of applications on the specified datasets in interactive or daemon mode, logging resource consumption, output and exceptions, providing workflow management (termination by timeout, resilience to exceptions, etc.) and results aggregation.
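The unified aggregation of per-snapshot results can be illustrated with a toy sketch (the tuple layout and function name are assumptions for illustration; the framework's actual result format differs):

```python
from collections import defaultdict


def aggregate(results):
    """Aggregate per-snapshot measure values into a single value per
    (app, network), averaging over all snapshots of that network so no
    single instance biases the result.

    results: iterable of (app, network, snapshot_id, value) tuples.
    Returns {(app, network): mean value over its snapshots}.
    """
    acc = defaultdict(list)
    for app, net, _snap, val in results:
        acc[(app, net)].append(val)
    return {key: sum(vals) / len(vals) for key, vals in acc.items()}
```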
## Dependencies
### Prerequisites
Make sure the operating system allows working with lots of open files:
- Max number of open files in the system: `$ sysctl fs.file-max` should be at least `1048576`
- Max number of open files per process: `$ ulimit -n` should be at least `4096`, possibly higher depending on the evaluated datasets and algorithms.

To set `fs.file-max` permanently in the system, add the following line to `/etc/sysctl.conf`:
```
fs.file-max = 1048576
```
and then reload it by `# sysctl -p`.
To set the `ulimit` permanently, add the following lines to `/etc/security/limits.conf`:
```
* hard nofile 524288
* soft nofile 4096
```
Then execute `ulimit -n 65536` to set this value for the current process.
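From Python, the per-process soft limit can be checked and, if needed, raised up to the hard limit at runtime via the stdlib `resource` module (a sketch; the function name is illustrative, and the hard limit itself can only be raised by root via `limits.conf`):

```python
import resource


def ensure_nofile(required=4096):
    """Verify the soft limit on open file descriptors for the current
    process and raise it up to the hard limit if it is too low.
    Returns the effective soft limit.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft >= required:
        return soft
    # An unprivileged process may raise its soft limit up to `hard`
    new_soft = min(required, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return new_soft
```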
### Fundamental
- Python 2.7+ (or PyPy JIT for fast execution).
Note: It is recommended to run the benchmark itself under PyPy. The measured algorithms can be run either with the same Python or under a dedicated interpreter / script / executable.
### Libraries
- hirecs for modularity evaluation of the overlapping community structure with results compatible with the standard modularity value. It depends on:
  - `libstdc++.so.6`: version GLIBCXX_3.4.20 (precompiled version for modularity evaluation). To install it on Ubuntu use `sudo apt-get install libstdc++6` or:
    ```
    $ sudo add-apt-repository ppa:ubuntu-toolchain-r/test
    $ sudo apt-get update
    $ sudo apt-get install libstdc++6
    ```
  Note: This functionality is available in the dev version of HiReCS 2 and has not been pushed to the public hirecs repository yet. Please write me if you need it.
- python-igraph for the Louvain algorithm evaluation by NMIs (because the original implementation does not provide a convenient output of the communities to evaluate NMIs): `$ pip install python-igraph`. It depends on `libz` and `libxml2`, which are installed on Linux Ubuntu by executing:
  `$ sudo apt-get install lib32z1-dev libxml2-dev`
- gecmi for the NMI_ovp evaluation depends on:
  - `libboost_program_options`; to install, execute: `$ sudo apt-get install libboost-program-options`. The older version of gecmi compiled under Ubuntu 14 depends on `libboost_program_options.so.1.54.0`, the newer one compiled under Ubuntu 16 depends on `libboost_program_options.so.1.58.0`.
  - `libtbb.so.2`; to install, execute: `sudo aptitude download libtbb2; sudo aptitude install libtbb2`
  Note: gecmi dependencies are uploaded to `./algorithms/gecmi_deps/`.
- PyExPool for asynchronous jobs execution and results aggregation via tasks of jobs.
  Note: it is uploaded to `./contrib/`.
### External tools that are used as executables
- Extended LFR Benchmark for the undirected weighted networks with overlaps (origins: https://sites.google.com/site/santofortunato/inthepress2, https://sites.google.com/site/andrealancichinetti/files)
- Tiny execution profiler to evaluate resource consumption: https://bitbucket.org/lumais/exectime/
- Clustering algorithms used in the benchmarking: HiReCS, SCP, Louvain (original and [igraph](http://
