Captain
distributed, light-weight Java workflow engine for a microservice architecture
O Captain! my Captain! our fearful trip is done! -- Walt Whitman, "O Captain! My Captain!"
Overview
Captain is a distributed, light-weight Java workflow engine designed for use in a microservice architecture. Its primary purpose is to make it easy to compose microservices into a workflow. It is heavily opinionated towards building simple workflows. For example, it heavily encourages linear workflows, though one can dynamically build DAGs as necessary.
At LiveRamp, Captain supports multiple pipelines that regularly have on the order of tens of thousands of active requests at a time. We hypothesize that it's at least scalable up to another order of magnitude.
To this end, here are the goals of Captain:
- abstract away coordination logic from individual services
- reduce boilerplate around coordination, making it faster to compose existing tools to build new products
- increase visibility into the steps in a pipeline
- encourage simplicity in system design
How is Captain different from other workflow engines?
- Captain relies on you to handle the persistence of a request and its metadata. You don't have to set up a special DB for Captain. For some this will be really valuable, and for others it will be a deal breaker.
- Captain is really easy to set up. We've provided an ExampleCaptainWorkflow.
- Like daemon_lib, the core library has no infrastructure dependencies beyond access to a working directory on disk.
- It's opinionated towards linear workflows.
Adding the dependency
In Maven, make your project section look like this:
<project>
  <!-- All the other stuff -->
  <dependencies>
    <!-- All your other dependencies -->
    <dependency>
      <groupId>com.liveramp</groupId>
      <artifactId>captain</artifactId>
      <version>1.0-SNAPSHOT</version>
    </dependency>
  </dependencies>
  <repositories>
    <repository>
      <id>maven-snapshots</id>
      <url>http://oss.sonatype.org/content/repositories/snapshots</url>
      <layout>default</layout>
      <releases>
        <enabled>false</enabled>
      </releases>
      <snapshots>
        <enabled>true</enabled>
        <updatePolicy>always</updatePolicy>
      </snapshots>
    </repository>
  </repositories>
</project>
The repository section is necessary because this project has not been published to Maven Central yet.
Getting Started
Captain requires only 3 inputs from the developer (4 if you are running it in a distributed fashion). In this section we're going to focus on the bare-bones approach to see how we can run Captain on a single node. In later sections we cover all of the other features of Captain and discuss more complicated use cases.
- manifest: a list of the steps a request should follow in your workflow
- config producer: a way for Captain to find requests to process
- request updater: a way for Captain to interact with your persistence layer
All it takes to get Captain running is the following:
CaptainBuilder.of("example-captain-daemon", configProducer, manifestManager, requestUpdater)
    .setNotifier(daemonNotifier)
    .setConfigWaitTime(1, TimeUnit.SECONDS)
    .setNextConfigWaitTime(1, TimeUnit.SECONDS)
    .build()
    .run();
You can find the rest of this example in ExampleCaptainWorkflow.
Let's walk through each of the components we're passing into this builder.
Config Producer
Create a class that implements CaptainConfigProducer. This is your opportunity to tell Captain how to find requests to process. Commonly this will just be a DB query against wherever you store requests, or a read from some sort of event queue. Regardless of the implementation, you're trying to pull the requests that are (or may be) ready to progress in your pipeline.
e.g. If I store my requests in a db table, I'll be looking for requests that have a status of ready, pending, in_progress or completed that are not in a step of done or cancelled. Usually you're going to ignore requests that are quarantined or failed or requests that you've already completed processing (e.g. step: done or cancelled). If you've implemented a retry policy with FailedRequestPolicy then you should still pick up requests in a failed status.
Check out ExampleConfigProducer for a simple example.
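To make the selection rule above concrete, here is a minimal sketch that applies it to an in-memory list of requests. The `Request` class, the `Status` enum, and the step names are assumptions made up for this illustration; they are not Captain's API, and a real config producer would usually push the same filter into a query against your own store.

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical request model; Captain leaves request storage up to you.
class ReadyRequestFilter {
  enum Status { READY, PENDING, IN_PROGRESS, COMPLETED, FAILED, QUARANTINED }

  static final class Request {
    final long id;
    final Status status;
    final String step;

    Request(long id, Status status, String step) {
      this.id = id;
      this.status = status;
      this.step = step;
    }
  }

  // Statuses that are (or may be) ready to progress.
  static final Set<Status> ACTIVE =
      EnumSet.of(Status.READY, Status.PENDING, Status.IN_PROGRESS, Status.COMPLETED);

  // Steps that mean the request has already left the pipeline.
  static final Set<String> TERMINAL_STEPS = Set.of("done", "cancelled");

  // Returns the ids of requests worth handing to the workflow engine.
  static List<Long> selectIds(List<Request> all) {
    return all.stream()
        .filter(r -> ACTIVE.contains(r.status))
        .filter(r -> !TERMINAL_STEPS.contains(r.step))
        .map(r -> r.id)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Request> stored = List.of(
        new Request(1, Status.READY, "ingest"),
        new Request(2, Status.COMPLETED, "done"),
        new Request(3, Status.QUARANTINED, "analyze"),
        new Request(4, Status.IN_PROGRESS, "analyze"));
    System.out.println(selectIds(stored)); // prints [1, 4]
  }
}
```

If you add a retry policy, you would likely add `FAILED` to the active set in this sketch so that failed requests are picked up again.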
Request Updater
Create a class that implements the RequestUpdater interface. The Request Updater is your opportunity to tell Captain how it can change the step and status on your request.
In the case where you're interacting with a DB or CRUD service, you're implementing pretty orthodox state changes like setStatus, setStepAndStatus, quarantine, etc.
Check out ExampleRequestUpdater for an example.
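For a feel of how small these state changes can be, here is a rough sketch backed by an in-memory map instead of a real DB. The method names come from the list above, but the parameter types, the string statuses, and the `create` and `describe` helpers are assumptions for this example; the actual RequestUpdater interface may look different.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for your persistence layer.
class InMemoryRequestUpdater {
  static final class State {
    String step;
    String status;

    State(String step, String status) {
      this.step = step;
      this.status = status;
    }
  }

  private final Map<Long, State> store = new HashMap<>();

  // Helper for the sketch: registers a new request at its first step.
  void create(long id, String firstStep) {
    store.put(id, new State(firstStep, "ready"));
  }

  void setStatus(long id, String status) {
    find(id).status = status;
  }

  void setStepAndStatus(long id, String step, String status) {
    State s = find(id);
    s.step = step;
    s.status = status;
  }

  // Quarantining keeps the step so you can see where the request stopped.
  void quarantine(long id) {
    find(id).status = "quarantined";
  }

  // Helper for the sketch: renders the current state as "step/status".
  String describe(long id) {
    State s = find(id);
    return s.step + "/" + s.status;
  }

  private State find(long id) {
    State s = store.get(id);
    if (s == null) {
      throw new IllegalArgumentException("unknown request " + id);
    }
    return s;
  }

  public static void main(String[] args) {
    InMemoryRequestUpdater updater = new InMemoryRequestUpdater();
    updater.create(7, "ingest");
    updater.setStepAndStatus(7, "analyze", "pending");
    System.out.println(updater.describe(7)); // prints analyze/pending
  }
}
```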
Manifest
A manifest enumerates the steps (Waypoints) of a given Captain App.
e.g.
waypoint 1: ingest and parse data
waypoint 2: run analysis on data (let's say this calls some service to kick off a map-reduce job)
waypoint 3: report on output of analysis
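Conceptually, a linear manifest is just an ordered list of steps plus a rule for finding the next one. The sketch below models only that idea; `LinearManifestSketch` and the step names are hypothetical and do not reflect Captain's own manifest classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical model of a linear manifest; not Captain's manifest classes.
class LinearManifestSketch {
  private final List<String> steps;

  LinearManifestSketch(List<String> steps) {
    this.steps = new ArrayList<>(steps);
  }

  // Returns the step after `current`, or empty when `current` is the last step.
  Optional<String> nextStep(String current) {
    int i = steps.indexOf(current);
    if (i < 0) {
      throw new IllegalArgumentException("step not in manifest: " + current);
    }
    return i + 1 < steps.size() ? Optional.of(steps.get(i + 1)) : Optional.empty();
  }

  public static void main(String[] args) {
    LinearManifestSketch manifest =
        new LinearManifestSketch(List.of("ingest", "analyze", "report"));
    System.out.println(manifest.nextStep("ingest")); // prints Optional[analyze]
    System.out.println(manifest.nextStep("report")); // prints Optional.empty
  }
}
```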
Waypoint Components
For each Waypoint you provide Captain with an implementation of how it should execute the step. This is accomplished via three components:
Request Submitter
Implementing CaptainRequestSubmitter allows you to tell a Captain Waypoint how to build a config and then submit it. It takes in the id of your request, and it optionally returns a request handle (which will be explained in the next section).
An example can be found here in MapReduceJobSubmitter.
A pretty common pattern for a Submitter is that it will build a config by pulling information out of the db or talking to previously used services before submitting it to some new service.
Handle Persistor
Implementing CaptainHandlePersistor allows you to instruct Captain on how to save the id (or request handle) for the work that it triggered in the request submitter. It takes in a request handle and does not return anything.
e.g. When one submits a request to the analytics service for it to kick off a Spark job to do some analysis, the service returns a job id, so that the progress of that request may be tracked. The handle persistor allows the user to save that handle as they see fit. An example can be found in MapReduceJobHandlePersistor.
This class is not required in any Captain waypoint. If you do not need to track the request id of your work, you don't need to implement this class. If this is the case, you can just have your Request Submitter return null as it won't be read anywhere else.
Status Retriever
Implementing CaptainStatusRetriever allows you to tell a Captain Waypoint how to check the status of a request. It takes in an id and returns a CaptainStatus.
The most common use case is to poll a service to which you've submitted work for the status of your request. Based on the status returned by the service, you tell Captain what status it should set for your request.
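The heart of a status retriever is usually a small translation from the remote service's vocabulary into the statuses your workflow understands. Here is a sketch of that translation. The `WorkflowStatus` enum stands in for CaptainStatus, whose actual values are not shown here, and the service state names (`QUEUED`, `RUNNING`, and so on) are invented for the example.

```java
import java.util.Locale;

// Hypothetical mapping from a remote job state to a workflow status.
class StatusMappingSketch {
  // Stand-in for CaptainStatus; the real enum may have different values.
  enum WorkflowStatus { PENDING, IN_PROGRESS, COMPLETED, FAILED }

  static WorkflowStatus fromServiceState(String serviceState) {
    switch (serviceState.toUpperCase(Locale.ROOT)) {
      case "QUEUED":
        return WorkflowStatus.PENDING;
      case "RUNNING":
        return WorkflowStatus.IN_PROGRESS;
      case "SUCCEEDED":
        return WorkflowStatus.COMPLETED;
      case "FAILED":
      case "KILLED":
        return WorkflowStatus.FAILED;
      default:
        // Surface unknown states loudly rather than guessing a status.
        throw new IllegalArgumentException("unrecognized service state: " + serviceState);
    }
  }

  public static void main(String[] args) {
    System.out.println(fromServiceState("running")); // prints IN_PROGRESS
  }
}
```

Throwing on an unrecognized state is a design choice for this sketch; depending on your pipeline, you might prefer to treat it as still in progress and poll again.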
Waypoint Types
There are 3 types of waypoints. Each uses some or all of the Waypoint Components described above (Request Submitter, Handle Persistor, Status Retriever).
Asynchronous Waypoint
This Waypoint submits work to a service and saves the request id it receives from the service. It then uses the status retriever to check that the service fulfilled its request. It accepts a Request Submitter, Handle Persistor, and Status Retriever. The Handle Persistor is optional and can be skipped in the cases described in the Handle Persistor section.
e.g. I submit a request to the analytics service that kicks off some long running job. Save the job id. I poll until my job is done.
note: regarding "submits work to a service", technically your waypoint can just do the work in its own process, on whatever local hardware it's running on. Doing so subverts the point of Captain. The goal is to make coordination between components cheap so that we make good choices in separating our concerns. As a rule of thumb, the less code you're writing in the Captain classes, the more you're adhering to the intended goals of the project. Nothing is stopping you from doing horrible things; Captain is just trying to give you every opportunity to make a good choice.
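The submit, save, then poll flow of an asynchronous waypoint can be sketched with plain functional interfaces. Everything here is an assumption made for illustration: `AsyncWaypointSketch`, its `Status` enum, and the use of `LongFunction` and `Consumer` stand in for Captain's own submitter, persistor, and retriever types, whose real signatures may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.LongFunction;

// Hypothetical model of an asynchronous waypoint; not Captain's classes.
class AsyncWaypointSketch {
  enum Status { IN_PROGRESS, COMPLETED }

  private final LongFunction<String> submitter;       // request id -> handle (may be null)
  private final Consumer<String> handlePersistor;     // optional, may be null
  private final LongFunction<Status> statusRetriever; // request id -> status

  AsyncWaypointSketch(LongFunction<String> submitter,
                      Consumer<String> handlePersistor,
                      LongFunction<Status> statusRetriever) {
    this.submitter = submitter;
    this.handlePersistor = handlePersistor;
    this.statusRetriever = statusRetriever;
  }

  // Submits the work and saves the handle when there is one to save.
  void submit(long requestId) {
    String handle = submitter.apply(requestId);
    if (handlePersistor != null && handle != null) {
      handlePersistor.accept(handle);
    }
  }

  // Asks the status retriever whether the submitted work has finished.
  boolean isDone(long requestId) {
    return statusRetriever.apply(requestId) == Status.COMPLETED;
  }

  public static void main(String[] args) {
    List<String> savedHandles = new ArrayList<>();
    int[] pollsUntilDone = {2}; // fake service: reports done on the third poll
    AsyncWaypointSketch waypoint = new AsyncWaypointSketch(
        id -> "job-" + id,
        savedHandles::add,
        id -> pollsUntilDone[0]-- > 0 ? Status.IN_PROGRESS : Status.COMPLETED);

    waypoint.submit(42);
    int polls = 1;
    while (!waypoint.isDone(42)) {
      polls++;
    }
    System.out.println(savedHandles + " finished after " + polls + " polls");
  }
}
```

Note how little logic lives in the waypoint itself: the lambdas only hand work to something else and report back, which is in the spirit of the rule of thumb above.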
Synchronous Waypoint
This Waypoint submits work to a service, and then, as long as the submission of that work does not fail, moves the request on to the next step in your manifest.
It accepts a Request Submitter.
It optionally accepts a Handle Persistor, in the case where you want to save a request id, but don't want to wait for that request to be processed.
e.g. I submit a request to a service to report that new stats have been generated. As long as the service returns no error, I'm good to go.
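A synchronous waypoint boils down to "submit, and advance unless the submission failed". The sketch below captures only that rule. `SyncWaypointSketch` and its `process` method are hypothetical, and keeping the request on its current step after a failed submission is an assumption of this example rather than documented Captain behavior.

```java
import java.util.function.LongConsumer;

// Hypothetical model of a synchronous waypoint; not Captain's classes.
class SyncWaypointSketch {
  // Returns the step the request should be on after this waypoint runs.
  static String process(long requestId, LongConsumer submitter,
                        String currentStep, String nextStep) {
    try {
      submitter.accept(requestId);
    } catch (RuntimeException e) {
      // Assumption: a failed submission leaves the request where it is.
      return currentStep;
    }
    return nextStep;
  }

  public static void main(String[] args) {
    String ok = process(1, id -> { }, "report", "done");
    String failed = process(2, id -> {
      throw new IllegalStateException("service unavailable");
    }, "report", "done");
    System.out.println(ok + " " + failed); // prints done report
  }
}
```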
Control Flow Waypoint
This Waypoint is designed for:
- forcing a request to wait for pre-conditions to be met before progressing in your pipeline.
- making it easy to add validations to your Captain workflow.
It accepts a Status Retriever only. Here are a couple examples of how it could be used:
e.g. Hold the re
