Beehive
Distributed Programming Framework in GoLang
Install / Use
/learn @kandoo/BeehiveREADME
Beehive

Beehive is a distributed programming framework that comes with built-in transactions, replication, fault-tolerance, runtime instrumentation, and optimized placement.
Installation
Option 1. Install Beehive using goget:
curl -sL https://git.io/goget | bash -s -- github.com/kandoo/beehive
Option 2. Install go 1.4+, set up your GOPATH, and install Beehive using
go get:
go get github.com/kandoo/beehive
Test Your Setup. Enter Beehive's root directory
($GOPATH/src/github.com/kandoo/beehive) and run:
go test -v
Hello World
Let's write a simple example that counts the number of times we have said hello to each person. You can find the complete example in the GoDoc.
Message
Beehive is based on asynchronous message passing. Naturally, the first step is
to define a Hello message:
// Hello represents a message in our hello world example.
type Hello struct {
Name string // Name is the name of the person saying hello.
}
Message Handler
To handle Hello messages, we need to write an application that
has a message handler for Hello. Pretty analogous to HTTP handlers,
except we are processing messages not HTTP requests.
A message handler in Beehive consists of two functions: (i) Rcv and
(ii) Map. Rcv is the function that actually processes a message.
Since Beehive provides a generic runtime Map function that works for
all applications, let's skip the Map function for now, and we will
explain it in next section.
This is a simple Rcv function that handles Hello messages (don't panic
it's all comments ;) ):
// Rcvf receives the message and the context.
func Rcvf(msg beehive.Msg, ctx beehive.RcvContext) error {
// msg is an envelope around the Hello message.
// You can retrieve the Hello, using msg.Data() and then
// you need to assert that its a Hello.
hello := msg.Data().(Hello)
// Using ctx.Dict you can get (or create) a dictionary.
dict := ctx.Dict("hello_dict")
// Using Get(), you can get the value associated with
// a key in the dictionary. Keys are always string
// and values are generic interface{}'s.
v, err := dict.Get(hello.Name)
// If there is an error, the entry is not in the
// dictionary. Otherwise, we set cnt based on
// the value we already have in the dictionary
// for that name.
cnt := 0
if err == nil {
cnt = v.(int)
}
// Now we increment the count.
cnt++
// And then we print the hello message.
ctx.Printf("hello %s (%d)!\n", hello.Name, cnt)
// Finally we update the count stored in the dictionary.
return dict.Put(hello.Name, cnt)
}
To register a message handler, we first create an application and then we
register the Hello handler for our application:
// Create the hello world application and make sure .
app := beehive.NewApp("hello-world", beehive.Persistent(1))
// Register the handler for Hello messages.
app.HandleFunc(Hello{}, beehive.RuntimeMap(Rcvf), Rcvf)
Note that our application is persistent and will save its state on 1 node (i.e., persistent but not replicated).
Emit Hello
Now, to send a Hello message, you can emit it:
// Emit simply emits a message, here a
// string of your name.
go beehive.Emit(Hello{Name: "your name"})
// Emit another message with the same name
// to test the counting feature.
go beehive.Emit(Hello{Name: "your name"})
Whenever you emit a Hello message, it will be processed by all applications
that have a handler for Hello. Here, we have only one application, but
you could create different applications with different handlers for Hello.
All of them would receive the Hello message.
Start
Finally, we need to start Beehive:
beehive.Start()
When you run the application (say go run helloworld.go),
you will have the following output:
bee 1/HelloWorld/0000000000000402> hello your name (1)!
bee 1/HelloWorld/0000000000000402> hello your name (2)!
When you run the application one more time, you will see the following output:
bee 1/HelloWorld/0000000000000402> hello your name (3)!
bee 1/HelloWorld/0000000000000402> hello your name (4)!
Note that the counter is saved on disk, so you can safely restart your application.
Run a Cluster
This simple hello world application is actually a distributed application.
The message handler is automatically sharded by Hello.Name. Later,
we will explain how that happens.
For now, let's just try to run our hello world application in a cluster.
Run the first node as you have done previously (go run helloworld.go).
Wait until you see the hello messages:
bee 1/HelloWorld/0000000000000402> hello your name (5)!
bee 1/HelloWorld/0000000000000402> hello your name (6)!
Then, run a new node using the following command:
go run helloworld.go -addr localhost:7678 -paddrs localhost:7677 -statepath /tmp/beehive2
After you connect the second node, the first node should generate the following output:
bee 1/HelloWorld/0000000000000402> hello your name (7)!
bee 1/HelloWorld/0000000000000402> hello your name (8)!
Note that in the last command,
-addr sets the listening address of the beehive server,
-paddrs sets the address of the peers (the first node is listening on the
default port, 7677), and
-statepath sets where beehive should store its state and the dictionaries.
Note: You can reinitializing the cluster by removing both
/tmp/beehive and /tmp/beehive2, and re-running the commands.
Deep Dive
Hives
A Hive is basically a Beehive server, representing one logical unit of computing (say, a physical or a virtual machine). Hives can form, join to, and leave a cluster. Beehive clusters are homogeneous, meaning that all hives in the same cluster are running the same set of applications.
Applications, Dictionaries, and Message Handlers
A Beehive application is a set of asynchronous message handlers.
Message handlers simply process async messages and store their state in
dictionaries. A dictionary is basically a hash map.
Behind the scenes, these dictionaries are saved to disk and are replicated.
A message handler is composed of a Rcv function that actually processes
the message and a Map function declaring how messages should be
sharded/partitioned. Beehive provides a generic Map functions (as
you saw in the Hello World example) and also has a
compiler that
can generate Map functions based on your Rcv functions.
Having said that, Map functions are almost always one-liners and
are pretty easy to implement.
Map and Consistent Concurrency
To make the distributed and concurrent version of of message handlers, we want to balance the load of message processing among multiple go-routines across multiple hives. We need to do this in a way that the application's behavior remains identical to when we use only a single, centralized go-routine. To do so, we need to preserve the consistency of application dictionaries.
In other words, we want to make sure that each entry (or as we call them, cell) in an application dictionary is always accessed on the same logical go-routine. Otherwise, we can't guarantee that the application behaves consistently when distributed over multiple hives. For example, what would happen to our hello world application if two different go-routines could read and modify the same entry concurrently?
To that end, for each message, we need to know what are the keys used
to process the message in the Rcv function of a message handler.
We call this the mapped cells of that message. Each message
handler, in addition to its Rcv function, needs to provide a
Map function that maps an incoming message to cells or simply
keys in application dictionaries. Map functions are usually
very simple to implement, but you can also use Beehive's generic
RuntimeMap function or the Beehive compiler to generate the
Map function.
Bees
Applications and their message handlers are passive in Beehive. Internally, each hive has a set of go-routines called bees that run the message handlers for each application. Each bee exclusively owns a set of cells. These cells are the cells that must be accessed by the same go-routine to preserve consistency. Cells are locked by bees using an internal distributed consensus mechanism implemented using Raft. Bees persist their cells if needed and, when a hive restarts, we reload all the bees.
Moreover, for replicated applications, bees will form a colony of bees (itself and some other bees on other hives) and will consistently replicate its cells using raft. When a bee fails, we hand its cells and workload to other bees in the colony. The size of a colony is equal to the a
