Beehive

Beehive is a distributed programming framework that comes with built-in transactions, replication, fault-tolerance, runtime instrumentation, and optimized placement.

Installation
Hello World
Deep Dive
Projects using Beehive
Mailing List
Publications

Installation

Option 1. Install Beehive using goget:

curl -sL https://git.io/goget | bash -s -- github.com/kandoo/beehive

Option 2. Install go 1.4+, set up your GOPATH, and install Beehive using go get:

go get github.com/kandoo/beehive

Test Your Setup. Enter Beehive's root directory ($GOPATH/src/github.com/kandoo/beehive) and run:

go test -v

Hello World

Let's write a simple example that counts the number of times we have said hello to each person. You can find the complete example in the GoDoc.

Message

Beehive is based on asynchronous message passing. Naturally, the first step is to define a Hello message:

// Hello represents a message in our hello world example.
type Hello struct {
	Name string // Name is the name of the person saying hello.
}

Message Handler

To handle Hello messages, we need to write an application that has a message handler for Hello. Pretty analogous to HTTP handlers, except we are processing messages not HTTP requests.

A message handler in Beehive consists of two functions: (i) Rcv and (ii) Map. Rcv is the function that actually processes a message. Since Beehive provides a generic runtime Map function that works for all applications, let's skip the Map function for now, and we will explain it in next section.

This is a simple Rcv function that handles Hello messages (don't panic it's all comments ;) ):

// Rcvf receives the message and the context.
func Rcvf(msg beehive.Msg, ctx beehive.RcvContext) error {
    // msg is an envelope around the Hello message.
    // You can retrieve the Hello, using msg.Data() and then
    // you need to assert that its a Hello.
    hello := msg.Data().(Hello)
    // Using ctx.Dict you can get (or create) a dictionary.
    dict := ctx.Dict("hello_dict")
    // Using Get(), you can get the value associated with
    // a key in the dictionary. Keys are always string
    // and values are generic interface{}'s.
    v, err := dict.Get(hello.Name)
    // If there is an error, the entry is not in the
    // dictionary. Otherwise, we set cnt based on
    // the value we already have in the dictionary
    // for that name.
    cnt := 0
    if err == nil {
        cnt = v.(int)
    }
    // Now we increment the count.
    cnt++
    // And then we print the hello message.
    ctx.Printf("hello %s (%d)!\n", hello.Name, cnt)
    // Finally we update the count stored in the dictionary.
    return dict.Put(hello.Name, cnt)
}

To register a message handler, we first create an application and then we register the Hello handler for our application:

// Create the hello world application and make sure .
app := beehive.NewApp("hello-world", beehive.Persistent(1))
// Register the handler for Hello messages.
app.HandleFunc(Hello{}, beehive.RuntimeMap(Rcvf), Rcvf)

Note that our application is persistent and will save its state on 1 node (i.e., persistent but not replicated).

Emit Hello

Now, to send a Hello message, you can emit it:

// Emit simply emits a message, here a
// string of your name.
go beehive.Emit(Hello{Name: "your name"})
// Emit another message with the same name
// to test the counting feature.
go beehive.Emit(Hello{Name: "your name"})

Whenever you emit a Hello message, it will be processed by all applications that have a handler for Hello. Here, we have only one application, but you could create different applications with different handlers for Hello. All of them would receive the Hello message.

Start

Finally, we need to start Beehive:

beehive.Start()

When you run the application (say go run helloworld.go), you will have the following output:

bee 1/HelloWorld/0000000000000402> hello your name (1)!
bee 1/HelloWorld/0000000000000402> hello your name (2)!

When you run the application one more time, you will see the following output:

bee 1/HelloWorld/0000000000000402> hello your name (3)!
bee 1/HelloWorld/0000000000000402> hello your name (4)!

Note that the counter is saved on disk, so you can safely restart your application.

Run a Cluster

This simple hello world application is actually a distributed application. The message handler is automatically sharded by Hello.Name. Later, we will explain how that happens. For now, let's just try to run our hello world application in a cluster.

Run the first node as you have done previously (go run helloworld.go). Wait until you see the hello messages:

bee 1/HelloWorld/0000000000000402> hello your name (5)!
bee 1/HelloWorld/0000000000000402> hello your name (6)!

Then, run a new node using the following command:

go run helloworld.go -addr localhost:7678 -paddrs localhost:7677 -statepath /tmp/beehive2

After you connect the second node, the first node should generate the following output:

bee 1/HelloWorld/0000000000000402> hello your name (7)!
bee 1/HelloWorld/0000000000000402> hello your name (8)!

Note that in the last command, -addr sets the listening address of the beehive server, -paddrs sets the address of the peers (the first node is listening on the default port, 7677), and -statepath sets where beehive should store its state and the dictionaries.

Note: You can reinitializing the cluster by removing both /tmp/beehive and /tmp/beehive2, and re-running the commands.

Deep Dive

Hives

A Hive is basically a Beehive server, representing one logical unit of computing (say, a physical or a virtual machine). Hives can form, join to, and leave a cluster. Beehive clusters are homogeneous, meaning that all hives in the same cluster are running the same set of applications.

Applications, Dictionaries, and Message Handlers

A Beehive application is a set of asynchronous message handlers. Message handlers simply process async messages and store their state in dictionaries. A dictionary is basically a hash map. Behind the scenes, these dictionaries are saved to disk and are replicated. A message handler is composed of a Rcv function that actually processes the message and a Map function declaring how messages should be sharded/partitioned. Beehive provides a generic Map functions (as you saw in the Hello World example) and also has a compiler that can generate Map functions based on your Rcv functions. Having said that, Map functions are almost always one-liners and are pretty easy to implement.

`Map` and Consistent Concurrency

To make the distributed and concurrent version of of message handlers, we want to balance the load of message processing among multiple go-routines across multiple hives. We need to do this in a way that the application's behavior remains identical to when we use only a single, centralized go-routine. To do so, we need to preserve the consistency of application dictionaries.

In other words, we want to make sure that each entry (or as we call them, cell) in an application dictionary is always accessed on the same logical go-routine. Otherwise, we can't guarantee that the application behaves consistently when distributed over multiple hives. For example, what would happen to our hello world application if two different go-routines could read and modify the same entry concurrently?

To that end, for each message, we need to know what are the keys used to process the message in the Rcv function of a message handler. We call this the mapped cells of that message. Each message handler, in addition to its Rcv function, needs to provide a Map function that maps an incoming message to cells or simply keys in application dictionaries. Map functions are usually very simple to implement, but you can also use Beehive's generic RuntimeMap function or the Beehive compiler to generate the Map function.

Bees

Applications and their message handlers are passive in Beehive. Internally, each hive has a set of go-routines called bees that run the message handlers for each application. Each bee exclusively owns a set of cells. These cells are the cells that must be accessed by the same go-routine to preserve consistency. Cells are locked by bees using an internal distributed consensus mechanism implemented using Raft. Bees persist their cells if needed and, when a hive restarts, we reload all the bees.

Moreover, for replicated applications, bees will form a colony of bees (itself and some other bees on other hives) and will consistently replicate its cells using raft. When a bee fails, we hand its cells and workload to other bees in the colony. The size of a colony is equal to the a

Beehive

Install / Use

README

Beehive

Installation

Hello World

Message

Message Handler

Emit Hello

Start

Run a Cluster

Deep Dive

Hives

Applications, Dictionaries, and Message Handlers

`Map` and Consistent Concurrency

Bees

Beehive

Install / Use

README

Beehive

Installation

Hello World

Message

Message Handler

Emit Hello

Start

Run a Cluster

Deep Dive

Hives

Applications, Dictionaries, and Message Handlers

Map and Consistent Concurrency

Bees

`Map` and Consistent Concurrency