Maven and GitHub: Forking a GitHub repository and hosting its Maven artifacts

at July 2nd, 2014

Two tools that the Android team frequently uses here are GitHub and Maven (and soon to be Gradle, but that’s another story). Groupon has one of the most widely deployed mobile apps in the world, and with more than 80 million downloads and 54% of our global transactions coming from mobile, it’s important that we have the right tools to keep our operation running smoothly all the time.

The Groupon Android team recently tried to move our continuous integration builds from a Mac Mini on one of our desks to the companywide CI cluster. We got held up during the process because the build simply wouldn’t run on the new VMs. We discovered the problem was a bug in the Maven Android plugin, which neglected to escape the filesystem path when executing ProGuard, causing our build to fail for paths that contained unusual characters. Instead of changing the paths of all of our builds companywide, we focused on what it would take to fix the bug in the plugin. Read on for details on how we managed this process.

Fixing the bug itself was easy. The harder part was figuring out how to distribute the fix while we waited for the plugin maintainer to review and merge it in. One of the great advantages of GitHub is that it makes it easy to fork a project and fix bugs. However, if you’re using Maven, you also need a way to host your fork’s artifacts in a Maven repo, and there isn’t a way to automatically do that with GitHub.

You could host your own Nexus server, but that’s a lot of overhead for just a simple fork if you don’t already have one. You could set up a cloud storage solution to hold your Maven repo, but the internet is already littered with defunct and broken Maven repo links – why add one more that you’ll have to maintain forever?

A better solution to this problem is to use GitHub to host your Maven repositories. A Maven repository is, at its heart, just a structured set of files and directories that are publicly available via HTTP, and GitHub allows you to do this easily with its raw download support. The same technique is used by GitHub itself to serve GitHub Pages websites.

The basic solution involves three steps:

  1. Create a branch called mvn-repo to host your Maven artifacts.
  2. Use the GitHub site-maven-plugin to push your artifacts to GitHub.
  3. Configure Maven to use your remote mvn-repo as a Maven repository.

There are several benefits to using this approach:

  • It ties in naturally with the deploy target so there are no new Maven commands to learn. Just use mvn deploy as you normally would.
  • Maven artifacts are kept separate from your source in a branch called mvn-repo, much like GitHub Pages sites are kept in a separate branch called gh-pages (if you use GitHub Pages).
  • There’s no overhead of hosting a separate Maven Nexus or cloud storage server, and your Maven artifacts are kept close to your GitHub repo so it’s easy for people to find one if they know where the other is.

The typical way you deploy artifacts to a remote Maven repo is to use mvn deploy, so let’s plug into that mechanism for this solution.

First, tell Maven to deploy artifacts to a temporary staging location inside your target directory. Add this to your pom.xml:

<distributionManagement>
    <repository>
        <id>internal.repo</id>
        <name>Temporary Staging Repository</name>
        <url>file://${project.build.directory}/mvn-repo</url>
    </repository>
</distributionManagement>

<plugins>
    <plugin>
        <artifactId>maven-deploy-plugin</artifactId>
        <version>2.8.1</version>
        <configuration>
            <altDeploymentRepository>internal.repo::default::file://${project.build.directory}/mvn-repo</altDeploymentRepository>
        </configuration>
    </plugin>
</plugins>

Now try running mvn clean deploy. You’ll see that it deployed your Maven repository to target/mvn-repo. The next step is to get it to upload that directory to GitHub.
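For reference, the staged repository under target/mvn-repo is laid out like any Maven repository. Using the greendao artifact from the deploy log shown later in this post as an example, the tree looks roughly like this:

target/mvn-repo/
    com/greendao-orm/greendao/
        maven-metadata.xml
        1.3-SNAPSHOT/
            maven-metadata.xml
            greendao-1.3-20121223.182256-3.jar
            greendao-1.3-20121223.182256-3.pom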

Add your authentication information to ~/.m2/settings.xml so that the GitHub site-maven-plugin can push to GitHub:

<!-- NOTE: make sure this file is not world readable! -->
<settings>
    <servers>
        <server>
            <id>github</id>
            <username>YOUR-GITHUB-USERNAME</username>
            <password>YOUR-GITHUB-PASSWORD</password>
        </server>
    </servers>
</settings>

(As noted, please make sure to chmod 700 settings.xml to ensure no one can read your password in the file.)

Then tell the GitHub site-maven-plugin about the new server you just configured by adding the following to your pom:

<properties>
    <!-- github server corresponds to entry in ~/.m2/settings.xml -->
    <github.global.server>github</github.global.server>
</properties>

Finally, configure the site-maven-plugin to upload from your temporary staging repo to the mvn-repo branch on GitHub:

<build>
    <plugins>
        <plugin>
            <groupId>com.github.github</groupId>
            <artifactId>site-maven-plugin</artifactId>
            <version>0.9</version>
            <configuration>
                <message>Maven artifacts for ${project.version}</message>  <!-- git commit message -->
                <noJekyll>true</noJekyll>                                  <!-- disable webpage processing -->
                <outputDirectory>${project.build.directory}/mvn-repo</outputDirectory> <!-- matches distribution management repository url above -->
                <branch>refs/heads/mvn-repo</branch>                       <!-- remote branch name -->
                <includes><include>**/*</include></includes>
                <merge>true</merge>                                        <!-- don't delete old artifacts -->
                <repositoryName>YOUR-REPOSITORY-NAME</repositoryName>      <!-- github repo name -->
                <repositoryOwner>YOUR-GITHUB-USERNAME</repositoryOwner>    <!-- github username  -->
            </configuration>
            <executions>
              <!-- run site-maven-plugin's 'site' target as part of the build's normal 'deploy' phase -->
              <execution>
                <goals>
                  <goal>site</goal>
                </goals>
                <phase>deploy</phase>
              </execution>
            </executions>
        </plugin>
    </plugins>
</build>

The mvn-repo branch does not need to exist; it will be created for you.

Now run mvn clean deploy again. You should see maven-deploy-plugin “upload” the files to your local staging repository in the target directory, then site-maven-plugin commit those files and push them to the server.

[INFO] Scanning for projects...
[INFO]                                                                         
[INFO] ------------------------------------------------------------------------
[INFO] Building DaoCore 1.3-SNAPSHOT
[INFO] ------------------------------------------------------------------------
...
[INFO] --- maven-deploy-plugin:2.5:deploy (default-deploy) @ greendao ---
Uploaded: file:///Users/mike/Projects/greendao-emmby/DaoCore/target/mvn-repo/com/greendao-orm/greendao/1.3-SNAPSHOT/greendao-1.3-20121223.182256-3.jar (77 KB at 2936.9 KB/sec)
Uploaded: file:///Users/mike/Projects/greendao-emmby/DaoCore/target/mvn-repo/com/greendao-orm/greendao/1.3-SNAPSHOT/greendao-1.3-20121223.182256-3.pom (3 KB at 1402.3 KB/sec)
Uploaded: file:///Users/mike/Projects/greendao-emmby/DaoCore/target/mvn-repo/com/greendao-orm/greendao/1.3-SNAPSHOT/maven-metadata.xml (768 B at 150.0 KB/sec)
Uploaded: file:///Users/mike/Projects/greendao-emmby/DaoCore/target/mvn-repo/com/greendao-orm/greendao/maven-metadata.xml (282 B at 91.8 KB/sec)
[INFO] 
[INFO] --- site-maven-plugin:0.7:site (default) @ greendao ---
[INFO] Creating 24 blobs
[INFO] Creating tree with 25 blob entries
[INFO] Creating commit with SHA-1: 0b8444e487a8acf9caabe7ec18a4e9cff4964809
[INFO] Updating reference refs/heads/mvn-repo from ab7afb9a228bf33d9e04db39d178f96a7a225593 to 0b8444e487a8acf9caabe7ec18a4e9cff4964809
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 8.595s
[INFO] Finished at: Sun Dec 23 11:23:03 MST 2012
[INFO] Final Memory: 9M/81M
[INFO] ------------------------------------------------------------------------

Visit github.com in your browser, select the mvn-repo branch, and verify that all your binaries are now there.


Congratulations!

You can now deploy your Maven artifacts to a poor man’s public repo simply by running mvn clean deploy.

There’s one more step you’ll want to take: configure any projects that depend on your artifacts to know where your repository is. Add the following snippet to the pom of any project that depends on your project:

<repositories>
    <repository>
        <id>YOUR-PROJECT-NAME-mvn-repo</id>
        <url>https://raw.github.com/YOUR-USERNAME/YOUR-PROJECT-NAME/mvn-repo/</url>
        <snapshots>
            <enabled>true</enabled>
            <updatePolicy>always</updatePolicy>
        </snapshots>
    </repository>
</repositories>

Now any project that requires your jar files will automatically download them from your GitHub Maven repository.
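For example, a dependent project would then declare your fork like any other dependency; the coordinates below are hypothetical placeholders for your fork’s actual groupId, artifactId and version:

<dependency>
    <groupId>YOUR-GROUP-ID</groupId>
    <artifactId>YOUR-ARTIFACT-ID</artifactId>
    <version>1.0-SNAPSHOT</version>
</dependency>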

Here at Groupon, we used this technique to fix our bug in the Maven Android plugin, and then easily shared our fork with the wider Android community until the fix was incorporated upstream. This works particularly well for projects that are no longer being maintained, since those projects may never get around to merging in your pull requests.

We hope this makes forking Maven repos on GitHub as easy for you as it has been for us!


Introducing Geekon Talk – Seattle

at July 2nd, 2014


Groupon Engineering Seattle is an important hub for us, and it has continued to expand and grow. Groupon Engineering thrives on the exchange and cross-connection of ideas, which leads to new technologies, approaches and breakthroughs.

Accordingly, we’re excited to be launching our Geekon Talk series in our Seattle office.

The Geekon Talk series provides a platform for outside speakers to come into Groupon to present ideas and information on everything related to tech, entrepreneurship, tools, product design and start-ups.

Join us for our kick-off talk Tuesday, July 8. David Blevins will provide a code-first tour of TomEE, including quickly bootstrapping REST projects, doing proper testing with Arquillian, and setting up environments.

Apache TomEE is the Java EE version of Apache Tomcat and offers out-of-the-box integration with JAX-RS, JAX-WS, JMS, JPA, CDI, EJB and more. Finally, Tomcat goes beyond just Servlets and JSP. Save time chasing down blog posts, eject libs from your webapps, and start development with a more complete, extremely well-tested stack.

This will be a fun and lively talk with many engineering cross-connections. Sign up here.

When: Tuesday, July 8, 12:00pm – 1:00pm Pacific.
Where: Groupon Seattle, 505 Fifth Avenue South, Suite 310, Seattle, WA 98104.
What: Lively presentation and conversation (food will also be provided).


Mobile Test Engineering – Odo

at June 26th, 2014

Earlier this year we told you about Odo, an innovative mobile test engineering tool we developed here at Groupon to overcome some of the challenges involved in testing our mobile app, which more than 80 million people have downloaded worldwide. We’re excited to tell you that Odo is now available at our Groupon GitHub home!

The struggle is real

Our engineering teams had to come up with a way to build engaging mobile experiences quickly for our end users. Along the way, our Mobile Test Engineering team started to run into common problems in the space, and we could not find an existing solution that fit our needs. Specifically, we attempted to tackle these challenges:

  1. Development and testing of new features without dependent service APIs being available yet.
  2. We gravitated towards using realistic data, but needed more than a traditional mock server. While a mock may work, you will need to update the stub data whenever the service changes. In an SOA world, that maintenance can come at a huge cost.
  3. We needed a mock server with an HTTP-based interface so we could integrate with our end-to-end test suites, as well as one that could easily integrate into development builds. We wanted to be able to share the manipulated data corpus across dev and test. What’s the point in reinventing the wheel, right?
  4. We had to simulate complex scenarios that may not be possible using static data. Example: three requests are made to an API; the first two are successful, but the third fails.

Oh boy! It’s Odo!

Odo is a man-in-the-middle for client and server communication. It can easily be used as a mock/stub server or as a proxy server. Odo is used to modify the data as it passes through the proxy server to simulate a condition needed for test. For example, we can request a list of Groupon deals and set the “sold out” condition to true to verify our app is displaying a sold out deal correctly. This allows us to still preserve the state of the original deal data that other developers and testers may be relying on.

The behaviors within Odo are configurable through a REST API and additional overrides can be added through plugins, so Odo is dynamic and flexible. A tester can manually change the override behaviors for a request, or configurations can even be changed at runtime from within an automated test.

Odo In A Nutshell

As a request passes through Odo, the request’s destination host and URI are matched against the hosts and paths defined in the configuration. If the request matches a defined source host, the request is updated with the destination host. If the host name doesn’t match, the request is sent along to its original destination. If the URI does not match a defined path, the request is executed against its new host. If the host and path match an enabled path, the enabled overrides are applied during its execution phase. There are two different types of overrides in Odo.
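To make that flow concrete, here is a minimal sketch of the decision tree in plain Java. The class and its maps are our own hypothetical simplifications, not Odo’s actual configuration objects:

import java.util.HashMap;
import java.util.Map;

// Stand-in illustration of the routing decision described above.
class RoutingSketch {
    private final Map<String, String> hostMappings;   // source host -> destination host
    private final Map<String, Boolean> enabledPaths;  // "host|uri" -> overrides enabled?

    RoutingSketch(Map<String, String> hostMappings, Map<String, Boolean> enabledPaths) {
        this.hostMappings = hostMappings;
        this.enabledPaths = enabledPaths;
    }

    String route(String host, String uri) {
        String destination = hostMappings.get(host);
        if (destination == null) {
            // Host doesn't match a defined source host: pass through untouched.
            return "send to original destination: " + host;
        }
        Boolean enabled = enabledPaths.get(destination + "|" + uri);
        if (enabled == null || !enabled) {
            // URI doesn't match an enabled path: execute against the new host as-is.
            return "execute against new host: " + destination;
        }
        // Host and path match an enabled path: overrides apply around execution.
        return "apply request overrides, execute against " + destination
                + ", then apply response overrides";
    }

    public static void main(String[] args) {
        Map<String, String> hosts = new HashMap<>();
        hosts.put("api.example.com", "localhost:8080");
        Map<String, Boolean> paths = new HashMap<>();
        paths.put("localhost:8080|/deals", true);
        RoutingSketch sketch = new RoutingSketch(hosts, paths);
        System.out.println(sketch.route("api.example.com", "/deals"));
        System.out.println(sketch.route("other.example.com", "/deals"));
    }
}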

A Request Override is an override applied to a request coming from the mobile client on its way to the server. Generally, an override here will be for adding/editing a parameter or modifying request headers.

A Response Override executes on the data received from the server before passing data back to the client. This is where we can mock an API endpoint, change the HTTP response code, modify the response data to simulate a “sold out” Groupon deal, change the price, etc.

Automation Example

Client client = new Client("API Profile", false);
client.setCustomResponse("Global", "response text");
client.addMethodToResponseOverride("Global", "com.groupon.proxy.internal.Common.delay");
client.setMethodArguments("Global", "com.groupon.proxy.internal.Common.delay", 1, 100);

In this example, we are applying a custom response (stub) with value “response text” to a path defined with a friendly name “Global”. Then we add a 100ms delay to that same path. Go nuts! Try adding a 10 second delay to simulate the effects of network latency on your app.

Benefits

Odo brings benefits to several different areas of development:

  • Test automation owners can avoid stub data maintenance, and tests gain the ability to dynamically configure Odo to simulate complex or edge-case scenarios.

  • Manual Testers gain the same ability to configure Odo through the UI. No coding knowledge is required.

  • Developers can use Odo to avoid dependency blocks. If a feature depends on an API currently in development, Odo can be used to simulate the missing component and unblock development.

  • Multiple client support. A single Odo instance can serve multiple connected clients, with different behaviors for each. This allows a team to run a central Odo instance, and team members can modify Odo’s behavior without affecting other members’ configurations.

Broader Odo Adoption

With Odo’s flexibility, our Test Engineering teams have adopted it for testing our interaction-tier applications built on Node.js (you know, the ones powering the frontend of groupon.com). We even have an internal fork that scales so we can use it as part of our capacity testing efforts. We’ll push that update in the near future as well.

Simple as 1-2-3

Our GitHub page provides some getting-started instructions to get you up and running quickly. All you need is Java (7+) and odo.war from our release. There is a sample to give you an idea of how to apply everything Odo has to offer to your projects.

Links

GitHub – https://github.com/groupon/odo

Readme – https://github.com/groupon/odo/blob/master/README.md

Download – https://github.com/groupon/odo/releases

Community Collaboration

We have a roadmap of new features we’ll be adding to Odo over time. We’d love your feedback and contribution towards making Odo a great project for everyone.


Kratos – Automated Performance Regression Analyzer

at June 26th, 2014

Groupon’s architecture comprises a number of services that are either real-time or offline computation pipelines. Like any large, data-centric technology company, we have scaling challenges both in terms of growing data (more bytes) and increasing traffic (more pageviews). At this scale, understanding the performance of our services is critical.

We constantly track performance deltas between code running in production and release candidates running in test environments. Release candidates may include both infrastructure and algorithmic changes. Manual inspection of hundreds of key metrics, like end-to-end latencies to serve a user request, is cumbersome and prone to human error, yet we must ensure new release candidates meet our SLA.

Introducing Kratos

Kratos is a platform that does differential analysis of performance metrics that originate from two different environments: production and release candidate. We had three clear goals in mind for it:

  • Provide a way to automatically measure, track and identify regressions in key performance metrics as changes are made to services in production.
  • Provide an early (pre-commit/pre-code-push) feedback mechanism that tells developers the impact their changes have on service performance.
  • Build a streamlined way of revisiting performance baselines for release candidates so that we are not in for a surprise when launching to production.

Architecture

At Groupon, all of our services have their performance-related metrics written in real time to a super cool platform that converts them into graphs. Before new code goes live, it is hosted on either a developer VM or a dedicated staging environment. We bombard this service with a replay of a few hours of production traffic while the performance graphs are drawn in real time. Typical metrics include latency metrics (e.g. service latency, caching latency), counter metrics (e.g. requests, failures), and system stats (e.g. CPU, disk, and memory utilization).

[Figure: Kratos architecture]

Let’s look at all the data stores that are needed by Kratos.

Service Metadata

Contains metadata about every service that plugs into the platform, such as the name of the service and the URL where the service’s graphs reside.

Metrics Metadata

Contains metadata about the metrics which need to be compared: the name of each metric, its value, the corresponding graph endpoints, the service it is associated with, and the acceptable delta for the metric.

Performance Analyzer Run Metadata

Stores all data regarding the current performance test, i.e. the values for all metrics, the differentials, and whether the differentials are within the threshold for every metric.

Metrics Pipeline

Metrics Filter
Applies a set of filters to select all metrics related to a service by querying the service and metrics metadata stores.
Metrics Crawler
Grabs and extracts the selected metrics from the graphs, then pushes them forward to the Metrics Cleanser.
Metrics Cleanser
Looks at the time series of metrics, rejects noise (inconsistencies in the graph or lack of data), and accounts for it in the computation.

Differential Analyzer

The final piece in the pipeline is the Differential Analyzer, which computes the deltas between the production and release-candidate build metrics and then flushes them into the data store.

Performance Data Set

This is the core data structure passed around between the various stages of the metrics pipeline and used by every stage to store metadata.
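To illustrate the differential step, the core check reduces to comparing each metric’s percentage delta against the acceptable delta stored in the metrics metadata. A minimal sketch of that comparison (our own simplification, not Kratos code):

// Compare a metric's production value against its release-candidate value.
class DifferentialSketch {
    static boolean withinThreshold(double production, double candidate, double acceptableDeltaPct) {
        if (production == 0.0) {
            return candidate == 0.0; // avoid dividing by zero; any change counts as a delta
        }
        double deltaPct = 100.0 * (candidate - production) / production;
        return Math.abs(deltaPct) <= acceptableDeltaPct;
    }

    public static void main(String[] args) {
        // Hypothetical p99 latency: 120ms in production vs. 131ms on the release
        // candidate with a 5% acceptable delta -> flagged as a regression.
        System.out.println(withinThreshold(120.0, 131.0, 5.0)); // false
    }
}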

Findings

We did a study on how our current approach (manual inspection of graphs) compares to the new Kratos Platform. Here is what we found.

Scaling with increasing # of metrics

The X axis is time in weeks. As you go from left to right in the graph, it is clear that the time taken to analyze and report performance results has gone down drastically (a more than 30x gain). The gain is also consistent as the number of metrics increases.

[Figure: time to analyze and report performance results vs. number of metrics]

Number of performance regressions caught in local/staging performance tests

As the adoption of the platform increased, the number of performance regressions identified during local/staging automated performance tests has increased. This has resulted in fewer production crashes/regressions.

The X axis is time in weeks and the Y axis is the number of performance regressions identified (pre-production) with the old vs. the new approach.

[Figure: performance regressions caught pre-production, old vs. new approach]

Kratos improves the quality of our services and eliminates human error in judging the performance of a service: it provides a framework that does a differential analysis of the performance graphs between two versions of a service, then produces and stores the results in a user-friendly fashion.

We’ve seen great results so far, with a 20x gain in the time taken to analyze. We also have greater ease in catching issues before production deploys – our Relevance Service was at 100% availability during the last holiday season, owing to data feedback from Kratos regarding performance metrics.


PCI at Groupon – the Tokenizer

at June 17th, 2014

Any successful e-commerce company invariably has to become PCI compliant. The Payment Card Industry (PCI) is a credit card industry consortium that sets standards and protocols for dealing with credit cards. One of these standards targeted at merchants is called the PCI-DSS, or PCI Data Security Standards. It is a set of rules for how credit card data must be handled by a merchant in order to prevent that data from getting into the wrong hands.

Groupon has designed a PCI solution called the Tokenizer – a purpose-built system, built with performance, scalability, and fault tolerance in mind, that streamlines this process – and we are seeing great results.

The PCI Security Standards Council certifies independent auditors to assess merchants’ compliance with the PCI-DSS. Companies that handle more than six million credit card transactions in a year are held to the highest standard, known as PCI Level 1.

When faced with becoming PCI compliant, most e-commerce businesses that need access to the original credit card number follow one of two typical paths:

  • Put all of their code and systems under PCI scope. Development velocity grinds to a halt due to the order-of-magnitude increase in overhead compared with typical agile processes.
  • Carve up their app into in-scope (checkout page, credit card number storage, etc.) and out-of-scope (everything else) portions. The once-monolithic application becomes a spaghetti mess of separation and interdependency between very different systems. The “checkout” page stagnates and becomes visually inconsistent with the rest of the site due to the order-of-magnitude increase in overhead noted above.

Groupon faced this dilemma a few years back. Prior to designing a solution, we set out our design goals:

  1. Have as few devices/services in PCI scope as possible.
  2. Make as few changes to existing services as possible.
  3. Maximize protection of our customers’ credit card data.

After many hours meditating at the Deer Head Center for Excellence (see the end of this post) we arrived at our solution.

Groupon’s PCI Solution

The core tenet of our solution was to focus on the credit card data that we had to protect. Rather than re-implement our checkout application, we would shield the application from the credit card data. How would we do this? Read on…

Prior to becoming PCI compliant, users would submit their credit card information directly to the checkout application. With the PCI-compliant solution in place, there is an HTTP proxy layer that transforms the customer’s web request, replacing the credit card number with an innocuous token. The modified request is forwarded to the checkout app, which processes the request and responds to the user via the proxy. Since the proxy’s job is to replace sensitive data with a token, we call it the Tokenizer.

Tokenizer High Level Diagram

Enter the Tokenizer

With this architecture, our PCI footprint is minimal, flexible, and rarely needs to change. The Tokenizer is a purpose-built system, built with performance, scalability, and fault-tolerance in mind. Its only job is to find credit card numbers in specific kinds of requests, replace the credit card numbers with something else, store the encrypted credit card number, and forward the request to another system.

The Tokenizer does not render any HTML or error messages — all of that comes from the application that it is proxying to. Therefore, the user interface components are under complete control of systems that are not in PCI scope, thereby preserving the normal high iteration velocity of a modern e-commerce company.
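As a conceptual sketch of that substitution step (our own illustration, not Groupon’s Tokenizer), assume a hypothetical tokenFor() helper that encrypts and stores the real number and mints a format-preserving token:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenizerSketch {
    // A credit card number is a numeric sequence of 13 to 19 digits.
    private static final Pattern CARD = Pattern.compile("\\b\\d{13,19}\\b");

    // Replace every card-number-shaped sequence in the request body with a token.
    static String tokenizeBody(String requestBody) {
        Matcher m = CARD.matcher(requestBody);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(out, tokenFor(m.group()));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Hypothetical stand-in: a real implementation would encrypt and store the
    // card number, and encode card type, datacenter, sequence number, and last
    // four into the 19-digit token.
    static String tokenFor(String cardNumber) {
        String lastFour = cardNumber.substring(cardNumber.length() - 4);
        return "991011234567891" + lastFour; // 19 digits, begins with "99"
    }
}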

“Yes, But…” or All About Format-Preserving Tokens

You may be asking, “So the credit card number is replaced with something else right? What is it replaced with? How can you satisfy your design goal of having minimal changes to the rest of the system?”

A credit card number is defined as a numeric sequence of 13 to 19 digits. Credit card companies are assigned a number space in which they may issue cards. You may have noticed that most Visa cards start with 4, MasterCard with 5, etc.

There is a special number space that begins with "99" that is by definition unassigned. The tokens that the Tokenizer generates are 19 digits long, and begin with "99". Because the token follows the official definition of a credit card number, it is called a Format-Preserving Token.

Within the 19 digit number, we encode a token version, the original card type, datacenter that created the token, a checksum digit, a sequence number, and the original last four digits of the customer-entered credit card number.

Groupon Token Breakdown:

9911211234567891234
^^                   cardspace ("99")
  ^                  version/colo
   ^^                card type (2)
     ^               luhn
      ^^^^^^^^^      seqno (9)
               ^^^^  lastfour

The only modification required of other systems at Groupon is that they need to recognize these tokens as valid credit cards, and look at the card type field to know the original card type rather than running their normal pattern-matching code for Visa, MasterCard, Discover, AmEx, etc. Usually this is a minor change within a CreditCard class.
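Following the field layout above, decoding a token is a matter of fixed substrings. A sketch (the class and field names are ours, not Groupon’s CreditCard class):

// Decode the token layout shown in the breakdown above: 2-digit card space,
// 1-digit version/colo, 2-digit card type, 1 Luhn digit, 9-digit sequence
// number, and the last four digits of the original card.
class TokenFields {
    final String cardSpace, versionColo, cardType, luhn, seqNo, lastFour;

    TokenFields(String token) {
        if (token.length() != 19 || !token.startsWith("99")) {
            throw new IllegalArgumentException("not a token: " + token);
        }
        cardSpace   = token.substring(0, 2);    // always "99"
        versionColo = token.substring(2, 3);
        cardType    = token.substring(3, 5);
        luhn        = token.substring(5, 6);
        seqNo       = token.substring(6, 15);
        lastFour    = token.substring(15, 19);
    }
}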

Database schemas can stay the same, masked card display routines (only showing the card type and last four digits) may stay the same, even error cases can stay the same.

Error Handling

Since the Tokenizer has pattern-matching rules for detecting card types, it can detect invalid card numbers as well. Rather than encrypting and storing an invalid card number, the Tokenizer will generate a token that is invalid in the same way the original input was invalid, and substitute that token for the invalid card number.

For instance, if the credit card number is too short (e.g. a Visa with 15 digits instead of the typical 16), the Tokenizer will generate a token that is too short (e.g. 18 digits instead of 19). If the card number the customer typed in does not pass the Luhn checksum test, then the token will not pass the Luhn checksum test. If the card is an unrecognizable type, then the token will be an unrecognizable type.

This allows all end-user messaging around card errors to be generated by the out-of-scope systems that the Tokenizer is proxying to. The checkout app will see an invalid card number, and return an appropriate error message via the Tokenizer back to the user. Less PCI-scoped code means that development teams may iterate on the product with less friction.
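For reference, the Luhn checksum mentioned above is a public, well-known algorithm; a standard implementation (not Groupon’s code) looks like this:

class LuhnCheck {
    static boolean passesLuhn(String digits) {
        int sum = 0;
        boolean doubleIt = false; // double every second digit, starting from the right
        for (int i = digits.length() - 1; i >= 0; i--) {
            int d = digits.charAt(i) - '0';
            if (doubleIt) {
                d *= 2;
                if (d > 9) d -= 9;
            }
            sum += d;
            doubleIt = !doubleIt;
        }
        return sum % 10 == 0;
    }

    public static void main(String[] args) {
        System.out.println(passesLuhn("4111111111111111")); // true: classic test number
        System.out.println(passesLuhn("4111111111111112")); // false: checksum broken
    }
}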

Security

Most similar tokenizing systems use a cryptographic hashing algorithm like SHA-256 to transform the card number into something unrecognizable. The issue with hashed card numbers is that the plaintext space is relatively small (numbers with well-known prefixes, 13 to 19 digits long, that pass the Luhn checksum test). Therefore, reverse-engineering hashed credit card numbers is fairly simple with today’s computers. If the hashed tokens fall into the wrong hands, the assumption needs to be that the card numbers have been exposed.

This risk can be mitigated by using a salted hash or HMAC algorithm instead, with the salt treated as an encryption key and split between layers of the system, but the downside of having a token in a very different form than the original card number remains.

Aside from the card type and last four digits, the Groupon token is not derived from the card number at all. There is no way, given any number of tokens, to recover the original credit card numbers. The information simply isn’t there for the bad guys to use. Less information is more secure.

Closing The Loop

At this point we have described how card numbers are replaced with tokens by the Tokenizer. Now we have many backend systems that have tokens stored instead of credit card numbers. You may be wondering how we collect money from tokens. Seems like squeezing water from a rock, right?

There is a complementary, internal-only system called the Detokenizer. The Detokenizer functions in a very similar way to the Tokenizer, just in reverse.

We use our payment processor’s client library to construct a request back to the processor. Usually these requests are SOAP over HTTP. In the credit card number field in the SOAP payload, the out-of-scope system will insert the token. Naturally, we cannot send this request as-is to the payment processor, since the token is meaningless to them.

<?xml> <transaction> <card_number>990110000000121111</card_number>…

“Bzzzzt! That’s not a credit card number!”

Rather than sending the request directly to the payment processor, it is sent via the Detokenizer proxy. The Detokenizer recognizes the destination of the request and looks in a specific place in the payload (the card number field) for the token. If it’s found, the token is used to look up the original encrypted card number from the database, the card number is decrypted, and it replaces the token in the original request payload.

<?xml> <transaction> <card_number>4111111111111111</card_number>…

The Detokenizer then forwards the HTTP request to the payment processor and Groupon gets paid. Make it rain, Groupon.

Overall

Groupon’s current production footprint is on the order of thousands of servers. Our global PCI scope extends to one small source code repository, a dozen application servers, four database servers, and some network gear. Development, code deploys, and configuration changes follow the regimented rules of the PCI-DSS. But since the in-scope footprint is so small, the impact on the rest of the organization is minimal to nil.

There is a very small group of developers and operations staff who maintain these systems on a part-time basis. The rest of the time, these employees enjoy working on more high-velocity projects for the company. This design has meant that Groupon can still run at the speed of a startup, even though it is compliant with these strict industry rules.

And finally, The Deer Head Center For Excellence is Groupon’s in-office cocktail bar. Put up a monitor and serve up your own Deer Head.



DotCi and Docker

at June 12th, 2014

This week, I presented at Dockercon14 on a new project that we just open-sourced at Groupon called DotCi. DotCi is a Jenkins plugin that makes job management easy with built-in GitHub integration, push-button job creation, and YAML powered build configuration and customization. It comes prepackaged with Docker support as well, which means bootstrapping a new build environment from scratch can take as little as 15 minutes. DotCi has been a critical tool for us internally for managing build and release pipelines for the wide variety of technologies in our SOA landscape. We found it so useful that we wanted to share the benefits of DotCi with the wide world of Jenkins users out there. Here are just a few of those benefits:

  • Deep integration with source control – for us that’s GitHub Enterprise
  • Integration with GitHub webhooks
  • Feedback sent to the committer or pusher via email, HipChat, Campfire, etc.
  • Setting of commit and pull request statuses on GitHub
  • Push button build setup
  • Dynamic dependency loading at build runtime
  • Easy, version controlled build configuration
  • Simple parallelization
  • Customizable behavior based on environment variables (e.g. branch name)
  • Docker support!

Groupon Engineering is an early adopter of Docker and we were featured in their news this week.

Go check out DotCi and the Plugin Expansion pack on our public GitHub:

https://github.com/groupon/DotCi
https://github.com/groupon/DotCi-Plugins-Starter-Pack

And try out the 15-minute Cloud CI setup with DotCi, Docker, Jenkins and Digital Ocean here.

Please contact me for comments and any questions.

Happy Testing!



Groupon hosts Berlin Android Meetup Wednesday, May 28

at May 26th, 2014

On the last Wednesday of each month, the Google Developer Group Berlin invites developers, testers, designers and Android enthusiasts for sessions and chats about Android development in an engaging atmosphere.


Next Wednesday, May 28, Groupon will be hosting this Meetup at 7:00 PM in its Berlin office in Mitte:

Groupon GmbH

Oberwallstr 6

10117 Berlin

In this talk we will show, for the first time in Germany, an internal tool we are open-sourcing, used by our automation engineers, manual testers and developers to simulate complicated testing scenarios with our mobile applications.
Join us for:

Bye Bye Charles, Welcome Odo!

Click here to register.

Food and drinks will be provided.

More talks will be announced. Vote on them – suggestions welcome.


Enterprise UX at Groupon SF – designers in the house!

at May 16th, 2014

Groupon Engineering recently hosted an Enterprise UX meetup in our new SF office. Enterprise UX gathers UX designers, product managers, developers, executives and all-around tech folk to discuss and share innovations in user experience. This installment focused on Engagement by Design, and the event ‘sold out’ with close to 200 people in attendance. Tweets and comments on the Enterprise UX meetup page captured the excitement throughout the evening.

Ray Harris

Event looks awesome, someone get me off this waitlist ;)

We kicked off with networking over food and drink, followed by presentations showcasing some seriously cool work in UX and the engineering that brings it to life. Our own Jadam Kahn, Director of Product Design, Internal Tools at Groupon, delved into the tools we developed in house in “Design behind the G-wall.”

Another topic focused on how creating engagement on mobile is driving Groupon’s evolution, with 54% of global transactions on mobile. Finally, Janaki Kumar, Head of Strategic Design Services at SAP Labs, framed her ideas around “Gamification at Work.”

The open, creative space of our new SF office inspired continued dialogue well into the evening. The event was a great success.

Christopher Russell

This meetup was a stunner. Internal Tools? Perfect. Gamification? Janaki, it was like a revelation. Forgive the hyperbole, but it was really great.

Xavier Renom Portet

Great event! Very cool stuff on UX for internal tools and Groupon mobile strategy

Quinne Fokes

Terrific. Best meetup ever.

Thanks to Joe Preston, Enterprise UX founder and CXO at Momentum Design Lab, for collaborating with Groupon Engineering to make for a memorable evening.

Check back on this space for the recording of the evening’s presentations.


Groupon at VentureBeat Mobile Summit

at April 24th, 2014

Last week we sponsored the VentureBeat Mobile Summit in Sausalito to support thought leadership in mobile and the innovations in mobile and tech. As part of our participation I co-chaired a boardroom session on the tools and analytics that enable Groupon to optimize the mobile experience for the consumer. Mobile has always been considered strategic at Groupon as a driver of the evolution from an email business to a local curated marketplace. We’ve approached mobile as a way to experiment with new products and new types of engagement models. With the need to experiment, the ability to track and measure usage has always been instrumental in allowing fast and iterative experimentation. Today, we launch 10-15 experiments in each mobile release. The amount of data we have and the size of our mobile operation are huge – we’re currently at 70 million downloads with nearly 50% of our global transactions on mobile. And we can continue to grow and scale in mobile.


Mihir Shah, our VP and GM of Mobile at Groupon, co-chaired the session on Local Mobile Commerce to explain how Groupon is redefining the intersection between local and mobile commerce. At Groupon, the ability to tailor our inventory based on where you are or what you need at any given moment makes mobile inherently local. Together, local and mobile are about interacting with your surroundings and using geographic data to make information more relevant, more personalized, more immediate.

We’ve already built the world’s largest local marketplace with deals that our customers want to buy when they’re out and about: restaurants, spas, live events, hotels, products and so on. We now need to enhance customer awareness and get people to check Groupon first, making us an integral part of their daily lives.

We’ll do this in two ways: by extending the reach and capabilities of our top-ranked mobile app to better capitalize on these trends and by methodically experimenting to enhance the user experience. We will be sharing the broader boardroom session takeaways from the Summit and the next part of our mobile story at the VentureBeat MobileBeat conference in July.


Groupon Test Engineering showcases its mojo!

at March 26th, 2014

Earlier this month we held our first ever Groupon Test Engineering Meetup at our Palo Alto office. We focused on mobile test automation, which the team has been working on for the past few months.

David Willson talked about Odo, a proxy server that allows testers to manipulate HTTP-based traffic via both a GUI and a REST interface to drive complex testing scenarios, while David van der Bokke delved into Robo-Remote, a remote-control framework for Robotium/UiAutomator that he open-sourced in 2012 as our first open-sourced testing tool! Stay tuned for news on Odo as we work towards open-sourcing it as well.

We had about 45 attendees from Netflix, Bitbar, Google, Bill.com, Yahoo, and more come by, while some of our very own Grouponers came out in support as well. All in all, it was a great success. A few people reached out about opportunities with us, and a mobile-testing start-up talked with us about partnering on Odo development.

At Groupon, we’re not just ensuring our site’s capacity can support extreme bursts in traffic or that the features released on www.groupon.com and our mobile apps deliver great experiences for our consumers and merchants; we’re also firm believers that, as testers, we know what other testers need to get the job done. We’ve been solving complex testing problems with innovative testing solutions, and we’re working on open-sourcing them to give back to the test community at large.

Check out our Groupon Test Engineering Meetup page for information about the event.

And this is just the beginning!


Groupon Test Engineering’s first Meetup event – Palo Alto