Groupon’s Workflow Service, “Backbeat”, Now Open Source

April 14th, 2016

Groupon operates in 28 countries. To stay financially nimble at this scale, the company needs automated processes for just about everything. Groupon’s Financial Engineering Developers (FED) team developed and maintains our system for automating merchant payments. The system has paid millions of merchants billions of dollars. One of the tools that helped us reach this scale is a workflow service we created called Backbeat. We recently open-sourced this service and want to share how it helped us and how it can help you.

Let’s see how Backbeat helps with a simplified example of Groupon’s merchant payment process. For this example, we will say there are three distinct steps for making a payment:

1. Get Sold Vouchers – We query our inventory systems to discover Groupon vouchers that were sold for a particular contract.

2. Calculate Payment – We get the relevant payment term on the contract from the contract service. The payment term describes how we pay the merchant for that particular contract. We multiply the number of unpaid vouchers by the amount we owe the merchant per voucher, then create a payment object for this total in our database.

3. Send to Bank – Here we take any payment objects that have not been sent to the bank and call the bank’s API, telling them where we want to send the money and how much. If successful, we mark the payments as “sent_to_bank” so that we do not accidentally send them again.
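
The three steps can be sketched in a few lines of Ruby. This is only an illustration: the `Voucher` and `Payment` structs, the method names, the in-memory arrays, and the `fake_bank` lambda are all hypothetical stand-ins for the inventory systems, contract service, database, and bank API, not Groupon's actual code.

```ruby
# Hypothetical stand-ins: these structs and method names are illustrative
# and do not come from Groupon's payment system.
Voucher = Struct.new(:id, :contract_id, :paid)
Payment = Struct.new(:contract_id, :amount_cents, :status)

# Step 1: Get Sold Vouchers. `inventory` is an in-memory array standing in
# for the inventory systems.
def get_sold_vouchers(inventory, contract_id)
  inventory.select { |voucher| voucher.contract_id == contract_id }
end

# Step 2: Calculate Payment. Multiplies the number of unpaid vouchers by the
# amount owed per voucher and records a payment object.
def calculate_payment(payments, contract_id, vouchers, owed_per_voucher_cents)
  unpaid_count = vouchers.count { |voucher| !voucher.paid }
  payment = Payment.new(contract_id, unpaid_count * owed_per_voucher_cents, :created)
  payments << payment
  payment
end

# Step 3: Send to Bank. Sends only payments not yet marked :sent_to_bank,
# then marks them so a second run does not send them again.
def send_to_bank(payments, bank)
  payments.reject { |payment| payment.status == :sent_to_bank }.each do |payment|
    bank.call(payment.contract_id, payment.amount_cents)
    payment.status = :sent_to_bank
  end
end

inventory = [
  Voucher.new(1, "contract-1", false),
  Voucher.new(2, "contract-1", false),
  Voucher.new(3, "contract-1", true),  # already paid, so not counted
  Voucher.new(4, "contract-2", false), # belongs to a different contract
]
payments = []
bank_calls = []
fake_bank = ->(contract_id, amount_cents) { bank_calls << [contract_id, amount_cents] }

vouchers = get_sold_vouchers(inventory, "contract-1")
calculate_payment(payments, "contract-1", vouchers, 1_250)
send_to_bank(payments, fake_bank)
send_to_bank(payments, fake_bank) # no-op: the payment is already marked
p bank_calls # prints [["contract-1", 2500]]
```

Two unpaid vouchers at 1,250 cents each give a single payment of 2,500 cents, and the second `send_to_bank` call sends nothing because the payment is already marked.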

These separate steps, or activities as we call them, need to happen in a specific order, so they are best organized into a simple workflow. A workflow is the series of activities necessary to complete a task. Our workflow for this example would look like this:

[Figure: the “Make Payment” workflow, with Get Sold Vouchers, Calculate Payment, and Send to Bank running in that order]

We decided to break the payment process into separate steps because it is easier to retry an individual activity when it fails. For example, if the bank’s API cannot be reached for whatever reason, we will want to retry the “Send to Bank” activity, but not the whole “Make Payment” process. It also allows for resource-level throttling. For example, as a client of the bank API you may only be allowed X connections to their banking service. Now that “Send to Bank” is its own step, we can have it run asynchronously and limit the number of processes that can run that type of activity.
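
Both ideas, retrying a single activity and capping how many run at once, can be sketched in plain Ruby. This is a minimal sketch under assumptions: `BankUnavailable`, `with_retries`, and `run_throttled` are hypothetical names, and the thread pool here only stands in for however your workers are actually limited.

```ruby
# Hypothetical error raised when the (fake) bank API cannot be reached.
class BankUnavailable < StandardError; end

# Retries only the block it is given. Earlier activities are never re-run.
def with_retries(max_attempts)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue BankUnavailable
    retry if attempts < max_attempts
    raise
  end
end

# Runs every job, but with at most `limit` worker threads, so no more than
# `limit` jobs (for example, bank connections) are in flight at once.
def run_throttled(jobs, limit)
  queue = Queue.new
  jobs.each { |job| queue << job }
  queue.close # once drained, pop returns nil instead of blocking
  workers = Array.new(limit) do
    Thread.new do
      while (job = queue.pop)
        job.call
      end
    end
  end
  workers.each(&:join)
end

# A fake "Send to Bank" call that fails twice before succeeding.
result = with_retries(5) do |attempt|
  raise BankUnavailable, "bank unreachable" if attempt < 3
  "sent on attempt #{attempt}"
end
puts result # prints "sent on attempt 3"

# Ten fake bank calls, never more than three at a time.
lock = Mutex.new
in_flight = 0
peak = 0
jobs = Array.new(10) do
  lambda do
    lock.synchronize do
      in_flight += 1
      peak = [peak, in_flight].max
    end
    sleep 0.01
    lock.synchronize { in_flight -= 1 }
  end
end
run_throttled(jobs, 3)
puts "peak concurrency: #{peak}"
```

Because the retry wraps only the bank call, the voucher lookup and payment calculation are not repeated when the bank is down.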

An alternative approach may be to maintain the workflow implicitly in our application code. Various activities would check the status of previous activities before executing, and we could use ZooKeeper or database locks to ensure activities don’t race. That is a lot of bookkeeping every time someone authors a new activity. Additional bookkeeping increases the risk of introducing a bug, and in payments, that might mean making a double payment or none at all! Splitting the business logic from the workflow state provides more visibility into the state of the system, simplifies authoring new workflows, and makes the system less brittle when changes are required.

Introducing Backbeat

Backbeat is a free-to-use, open-source workflow service. It allows applications to process activities across N processes/machines while maintaining the order and time in which activities will run. Activities can spawn new activities as they run and will be automatically retried if they fail. Activities can be blocking or non-blocking, so you can run things in parallel or sequentially at any level in a workflow.

It is Backbeat’s responsibility to tell the application when it is time to run an activity. The application responds with the status of the activity (completed or errored) along with any child activities it may want to run. When Backbeat tells the client application to run an activity, the application can put it onto a queue and process it later. Here is a sequence diagram of a client interacting with Backbeat for our “Make Payment” example above:

[Figure: sequence diagram of a client application interacting with Backbeat for the “Make Payment” workflow]
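
To make the exchange concrete, here is a toy, in-process model of it in Ruby. It is not Backbeat's API or its Ruby client: `ToyWorkflowServer`, the response hash shape, the activity names, and the retry limit are all assumptions made for illustration, and the toy calls the client synchronously rather than letting it queue the work for later.

```ruby
# Toy, in-process model of the server/client exchange. It is NOT Backbeat's API.
class ToyWorkflowServer
  attr_reader :log

  def initialize(client, max_attempts: 3)
    @client = client
    @max_attempts = max_attempts
    @log = []
  end

  # Runs the root activity, then any children it reports, in order.
  def run(root)
    pending = [root]
    until pending.empty?
      activity = pending.shift
      response = run_with_retries(activity)
      # Children of the finished activity run before its remaining siblings.
      pending = response.fetch(:children, []) + pending
    end
    @log
  end

  private

  # Tells the client to run the activity, retrying while it reports :errored.
  def run_with_retries(activity)
    @max_attempts.times do
      response = @client.call(activity) # "it is time to run this activity"
      @log << [activity, response[:status]]
      return response if response[:status] == :completed
    end
    raise "#{activity} still failing after #{@max_attempts} attempts"
  end
end

# The client application: runs an activity and reports its status, plus any
# child activities it wants to run next.
bank_attempts = 0
client = lambda do |activity|
  case activity
  when :make_payment
    { status: :completed,
      children: [:get_sold_vouchers, :calculate_payment, :send_to_bank] }
  when :send_to_bank
    bank_attempts += 1
    bank_attempts == 1 ? { status: :errored } : { status: :completed }
  else
    { status: :completed }
  end
end

server = ToyWorkflowServer.new(client)
server.run(:make_payment).each { |name, status| puts "#{name}: #{status}" }
```

In this run, “send_to_bank” reports an error once and is retried on its own; the earlier activities each run exactly once.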

In our example we were running things sequentially, one activity after another. There may be cases where you want to run some activities in parallel and, when all of those activities finish, perform some other business logic:

[Figure: a workflow in which several activities run in parallel, followed by an activity that runs once they have all finished]

Backbeat can do this by allowing activities to have different modes like blocking, non-blocking, or fire-and-forget. It has other features like scheduling activities to run in the future and linking activities across workflows.
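
The fork-and-join pattern can be sketched with Ruby threads. The semantics below are an assumption made for this sketch, not a statement of how Backbeat defines its modes: a `:non_blocking` activity starts right away, while a `:blocking` activity waits for every earlier sibling and finishes before later siblings start. Fire-and-forget is not modeled, and `run_siblings` and the activity names are hypothetical.

```ruby
# Toy scheduler for one level of sibling activities.
# Assumed semantics for this sketch only:
#   :non_blocking - starts right away; later siblings do not wait for it
#   :blocking     - waits for all earlier siblings, and later siblings wait for it
# Returns the activity names in the order they finished.
def run_siblings(siblings)
  finished = Queue.new # thread-safe record of completion order
  in_flight = []
  siblings.each do |sibling|
    if sibling[:mode] == :blocking
      in_flight.each(&:join) # wait for everything started so far
      in_flight.clear
      sibling[:work].call
      finished << sibling[:name]
    else
      in_flight << Thread.new do
        sibling[:work].call
        finished << sibling[:name]
      end
    end
  end
  in_flight.each(&:join)
  Array.new(finished.size) { finished.pop }
end

order = run_siblings([
  { name: :pay_region_a, mode: :non_blocking, work: -> { sleep 0.02 } },
  { name: :pay_region_b, mode: :non_blocking, work: -> { sleep 0.01 } },
  { name: :send_summary, mode: :blocking,     work: -> {} },
])
puts order.last # prints "send_summary"
```

The two non-blocking activities overlap in time, and the blocking one acts as the join point that runs only after both have finished.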

Backbeat is written entirely in Ruby and uses PostgreSQL for its data store. It uses Redis along with Sidekiq to do its async processing and scheduling. Several engineering teams at Groupon are using Backbeat in production to solve their asynchronous workflow problems. We invite you to start using it for your applications today. The software is free to use and customize, and it comes with directions on how to set it up on your own server. We have a Ruby client for your Ruby applications that allows for easy interaction with Backbeat, and we have other language clients in development. We invite the open-source community to contribute and work with us to continually improve it.

Here is the GitHub repo and wiki page for getting started.

