ParallelSubmodularMaximization
Parallel Double Greedy for Unconstrained Non-monotone Submodular Maximization
Install / Use
/learn @pxinghao/ParallelSubmodularMaximizationREADME
This repository contains the code for the parallel algorithms presented in our <a href="paper/pan-etal-nips2015.pdf">NIPS 2014 paper</a> for unconstrained maximization of non-monotone submodular functions. We adopt a database-transactional view of the serial double greedy algorithm which provides a provably optimal 1/2 approximation. Each set of operations on an element is treated as a transaction to be executed in parallel with other transactions.
Our first parallel algorithm, CF-2g (coordination-free double greedy) simply executes the transactions in parallel without coordination between processors / transactions. CF-2g has better scalability but suffers from possibly worse approximation.
On the other hand, CC-2g (concurrency control double greedy) ensures coordinates transactions by maintaining upper and lower bounds on the running output of the algorithm. This ensures serializability, that is, the output of CC-2g is equivalent to some order of processing elements by serial double greedy, and hence maintains the optimal 1/2 approximation. However, the greater need for coordination results in potentially poorer scalability.
<H3>Running examples without compilation</H3> As a demonstration, <a href=src/main/scala/Examples.scala>Examples.scala</a> implements both max graph cut and set cover. Data for the examples are provided in the <a href=data/>data</a> directory, sourced from <a href="http://law.di.unimi.it/datasets.php">Laboratory for Web Algorithms</a>.To run max graph cut, use the command line arguments to Examples
<pre>java -cp bin/submodmax.jar Examples inputfile=<filename> maxNThreads=<maximum number of threads></pre>For example,
<pre>java -cp bin/submodmax.jar Examples inputfile=data/eswiki-2013_rand_symm maxNThreads=4</pre> <H3>Preprocessing graph data for examples</H3> Graph data downloaded from <a href="http://law.di.unimi.it/datasets.php">Laboratory for Web Algorithms</a> can be pre-processed (randomizing order and symmetrizing graph) for use by running <a href=src/main/scala/Preprocess.scala>Preprocess.scala</a> with the command line argument <pre>java -cp bin/submodmax.jar PreprocessGraph inputfile=<filename></pre> For example, <pre>java -cp bin/submodmax.jar PreprocessGraph inputfile=data/eswiki-2013</pre> <H3>Running examples using SBT</H3>If you want to modify the code and run the examples again you can either recompile using bin/update_bin or using the sbt launch scripts as below:
To run max graph cut, use the command line arguments to Examples (this will recompile the code if necessary)
<pre>sbt/sbt "run-main Examples inputfile=<filename> maxNThreads=<maximum number of threads>"</pre>For example,
<pre>sbt/sbt "run-main Examples inputfile=data/eswiki-2013_rand_symm maxNThreads=4"</pre> <H3>Preprocessing graph data for examples</H3> Graph data downloaded from <a href="http://law.di.unimi.it/datasets.php">Laboratory for Web Algorithms</a> can be pre-processed (randomizing order and symmetrizing graph) for use by running <a href=src/main/scala/Preprocess.scala>Preprocess.scala</a> with the command line argument <pre>sbt/sbt "run-main PreprocessGraph inputfile=<filename>"</pre> For example, <pre>sbt/sbt "run-main PreprocessGraph inputfile=data/eswiki-2013"</pre>