On The Subject of Girls, Technology, and Marshmallow Or: how the Evolution of Girl Scouts and STEM is evident at Groupon

at November 14th, 2014

Groupon recently opened its green doors to some of the Girl Scouts’ best and brightest for our Scout Out Engineering event. For the second consecutive year, Groupon Engineering and the Groupon Employee Volunteer Program partnered with the Girl Scouts of Greater Chicago and Northwest Indiana to welcome 5th and 6th graders into the Chicago Groupon office for a morning of learning, fun, and tech engagement.

Scout Out Engineering introduces girls to engineering concepts through a combination of presentations and hands-on learning. Groupon’s goal is to excite these girls about technology and keep them interested in engineering and STEM education.

One tenet of the Girl Scouts that makes them great is their all-inclusive, ‘every girl’ approach. For the Girl Scouts, every girl should be able to participate in any activity regardless of her background or skillset. Last year, Groupon was advised to plan for girls with no internet in their homes, no experience with computers, and no idea who – or what – Groupon was as a tech company. With those guidelines in mind, we planned the program as a hands-on, engineering-centered event that, for a tech company, was strangely devoid of computers.

If the focus of last year’s program was to introduce the idea of STEM education and emphasize its importance, then this year’s focus was to build on that foundation and actually do something about it.

In the six months leading up to our 2014 planning, ideas incubated and matured, technology advanced, and the profile of the ‘every girl’ evolved. In 2014, ‘every girl’ used a computer and a smartphone and was exposed to some aspect of STEM education daily. The Girl Scouts encouraged Groupon to incorporate computers into the program – many of the girls may have already done some form of coding – and there were no limits on what technology the girls could be exposed to.

With these new guidelines we designed a program with a tech-heavy core that better represented the work that happens here at Groupon. Hands-on computer learning took center stage, and the focus on coding gave participants the chance to code alongside top engineers and continue their learning outside Groupon’s green walls. A bridge-building activity became an opportunity for girls to work cross-functionally and employ a few of the key concepts that keep Groupon Engineering running. Girls learned about agile methodologies, iterated on their work, and closed the day with a real, live whiteboarding retrospective session (and, of course, pizza).

Scout Out Engineering at Groupon exposed girls to technology in an immediate and accessible way. It became an event for Groupon employees to use their talents to spark interest in subject matter that they are passionate about, and it gave everyone the opportunity to realize how essential empowering young girls can be. When it comes to STEM education at Groupon, there has always been an abundance of employee support and our support for the Scout Out Engineering event was no different. From the planning team, to speakers, to volunteers, Groupon Engineering was ready and willing to donate time, energy, and resources to teach these girls a thing or two about tech.


GNOME Foundation and Groupon product names (UPDATED)

at November 11th, 2014

UPDATE: There has been some recent confusion around Groupon’s intended use of a product name that the GNOME Foundation believes infringes on its trademarks. Although the GNOME Foundation’s directors notified us that they believed this was the case, we were not able to come to an agreement and were proceeding with the registration of our marks. We apologize for any distress this has caused the GNOME Foundation and the open source community.

We love open source at Groupon. We have open-sourced a number of projects on Groupon’s GitHub. Our relationship with the open source community is more important to us than a product name.

After additional conversations with the open source community and the GNOME Foundation, we have decided to abandon our pending trademark applications for “Gnome.” We will choose a new name for our product going forward. We will continue to work with the GNOME Foundation as we rebrand our product.

Please see our joint statement on the GNOME Foundation’s website and below:

“Groupon has agreed to change its Gnome product name to resolve the GNOME Foundation’s concerns. Groupon is now abandoning all of its 28 pending trademark applications. The parties are working together on a mutually acceptable solution, a process that has already begun.”


Groupon’s Geekon project adds Apache Kafka Support to Facebook’s Presto exabyte-scale analytic SQL engine

at November 9th, 2014

Started as a project at Groupon’s global Geekon hackathon, support for Apache Kafka adds real-time querying capabilities to the Presto SQL query engine.

Presto, originally released by Facebook, is an open source distributed SQL query engine for running interactive analytic queries against data sources of all sizes, ranging from gigabytes to exabytes. Apache Kafka is a high-throughput distributed messaging system.

With the ability to query live data, Presto can now support use cases that were traditionally only available to specialized tools such as Splunk.

Groupon Engineering is planning to use Presto to analyze its real-time event data streams and will replace an existing legacy system. Using Presto will allow engineers and data analysts to correlate current (live) data from Apache Kafka and historic data stored in Apache Hadoop. This capability will allow Groupon to shut down a number of existing legacy systems and reduce operating costs while improving insight into our real-time data flows.

Groupon Engineering is engaged with the community to deliver excellence in open source development.

… and clearly, we are always hiring!


Groupon adopts Kill Bill, the open-source Payments Platform

at November 3rd, 2014

Groupon has always been a committed player in the open source community, both by releasing our tools and libraries to a larger audience and by using popular open source projects. So when we took a step back earlier this year to re-assess our global payments infrastructure, we naturally looked at what the community had to offer. We’re now pleased to announce that we have successfully integrated Kill Bill, the open source billing and payments platform, with a subset of our services, and we are planning a wider rollout.

Kill Bill provides a platform for building billing and payments infrastructures. It offers a framework for handling recurring subscriptions as well as unified APIs to support virtually any kind of payment gateway and payment method in the world, from wire transfers to credit card payments, as well as crypto-currencies and even Apple Pay.

While Kill Bill has been deployed in large-scale infrastructures before (such as at Ning), the Groupon environment is truly unique: Groupon, as the world’s largest marketplace of deals, is present in 45+ countries, with more than 240,000 global, active deals, supporting over 120 payment methods. Our team focused on performance testing the system and made sure that every payment handled is secure and reliable. As part of this process, we discovered the limits of some of the libraries we use, and reported and helped fix bugs in Java 8, JRuby, ActiveRecord and more. The community has been outstanding in this process. Thanks to all of you!

We believe strongly in the exchange of ideas and cooperation between people. If this sounds good to you, we are hiring!


Local Hotspots and Travel Flows, Directly from the Data

at October 29th, 2014

It’s a crucial question in ecommerce: How likely is customer X to buy product Y? For Local, we must of course consider the physical locations of both X and Y. This is the location relevance problem, which is one of the most important ingredients in determining the best deals for each of our users. When we send out emails or return search query results, the deals that we display have to be relevant. To solve this problem we need to know our users’ propensity to travel for the different services that we offer, and having an accurate measurement of these travel patterns helps us to understand demand and thus optimize our sales force.

We cannot simply assume that users will want to stay in their home neighborhoods. People want to get out and explore, and we want them to check Groupon first. One approach is to determine location relevance based on simple distance, but this is an over-simplification. We know that people flock to local hotspots and avoid certain neighborhoods. They are also more likely to travel farther for a rare service, a pricey restaurant, and many leisure activities such as museums and waterparks. Fortunately, we can capture these trends directly from the data. Here’s how.

Getting the data

For this analysis, we require data on where our users are located and what they have purchased. We leverage the fact that users can voluntarily provide us with their locations in the form of ZIP codes or full addresses. For each historical order, there are several variables that we want to track because of their importance in determining whether someone is willing to travel for a Local deal. We consider the following:

  1. the location of the user
  2. the location of the merchant
  3. the service that the merchant provides
  4. the price of the deal

In order to compute useful empirical probabilities regarding how likely users are to travel, we need to group the data points into bins. For the locations, we partition the world into markets (metropolitan areas plus the surrounding suburbs), which are further divided into “submarkets.” For the deals, we have a hierarchical taxonomy with three levels of merchant attribution: the type of services offered (e.g. milkshakes or body wraps), the merchant type (e.g. “Bakery & Desserts” or “Spa Services”), and the general merchant category (e.g. “Food & Drink” or “Beauty / Wellness / Healthcare”). We further define price bins such as “$0-25” on the low end and “$100+” on the high end.

Thus for each user we have a market and a submarket, and for each deal we also have a market and a submarket, in addition to a service (with the associated merchant type and category) and a price bin. Then for each <service, price bin, deal location, user location> combination we can empirically determine the odds that the user will be willing to travel for the type of deal in question, based on what has occurred in the past.
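The binning and counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the production pipeline. It assumes each order has already been reduced to a (service, price bin, deal submarket, user submarket) tuple, and it assumes the probability is normalized over user submarkets within each <service, price bin, deal location> bin, which the text does not spell out. The function name is hypothetical.

```python
from collections import Counter, defaultdict


def travel_probabilities(orders):
    """Estimate, for each deal bin, the share of orders from each user submarket.

    `orders` is an iterable of tuples:
        (service, price_bin, deal_submarket, user_submarket)
    Returns {(service, price_bin, deal_submarket): {user_submarket: share}}.
    The normalization (shares within a deal bin) is an assumption.
    """
    counts = defaultdict(Counter)
    for service, price_bin, deal_sub, user_sub in orders:
        counts[(service, price_bin, deal_sub)][user_sub] += 1

    probabilities = {}
    for deal_bin, by_user_sub in counts.items():
        total = sum(by_user_sub.values())
        probabilities[deal_bin] = {
            user_sub: n / total for user_sub, n in by_user_sub.items()
        }
    return probabilities
```

Calling `travel_probabilities` on a list of historical orders gives one small distribution per deal bin, which is the quantity the later thresholding and hotspot steps operate on.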

Based on these odds we can determine the most popular travel patterns, which will tell us where each city’s hotspots are located. We can further define an effective radius for each individual service, thereby determining how far users are typically willing to travel for, say, aerobics versus paintball.

Not so fast…

There are several subtleties to this analysis. For starters, many users and merchants will have multiple locations in our database. This can happen for instance when a user has multiple addresses registered with us and when merchants have multiple locations. To work around this, we assume the <user location, deal location> pair that corresponds to the shortest distance. In other words, if a user buys a deal that is closer to their home than to their workplace, we assume that they’re traveling from the former. Similarly, if someone buys a deal that has multiple locations, we assume that they’re going to redeem it at the location that is closest to them.
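The shortest-distance rule can be illustrated as follows. This sketch assumes locations are available as (latitude, longitude) pairs and uses the haversine great-circle distance, which is a choice made here for illustration rather than something stated in the text. The function names are hypothetical.

```python
from itertools import product
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))


def closest_pair(user_locations, deal_locations):
    """Pick the (user location, deal location) pair with the shortest distance.

    Both arguments are non-empty lists of (lat, lon) tuples, e.g. a user's
    home and work addresses and a merchant's redemption locations.
    """
    return min(
        product(user_locations, deal_locations),
        key=lambda pair: haversine_miles(*pair),
    )
```

Checking every combination is quadratic in the number of locations, which is fine here because a user or merchant rarely has more than a handful of them.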

A bigger problem is data sparsity. Given the extremely broad variety of services that we offer, we find that some of our <service, price bin, deal location> combinations have too few data points, and thus a poor sampling of the relevant locations of the travel-willing users. For instance, in Chicago and the surrounding suburbs we have 14 submarkets. Thus if we want to determine which submarkets’ users are buying mid-price downtown yoga deals, we need to have far more than 14 data points to get a good sampling. We work around this problem by using geography-independent fallbacks, utilizing our taxonomy. For instance, if we lack sufficient data at the <service, price bin> level, then we collapse out the price bin and only consider the service. If we still lack sufficient data, we then fall back a level in our taxonomy and use the merchant type or the even coarser merchant category.
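The fallback order can be sketched as a simple walk from the most specific bin to the most general one. This is an illustration under assumptions: the `min_points` cutoff, the dictionary layout, and all names are hypothetical, and the text does not say what counts as sufficient data beyond "far more than 14" points in the Chicago example.

```python
# Taxonomy levels, ordered from most to least specific.
FALLBACK_LEVELS = [
    ("service+price", lambda d: (d["service"], d["price_bin"])),
    ("service", lambda d: (d["service"],)),
    ("merchant_type", lambda d: (d["merchant_type"],)),
    ("merchant_category", lambda d: (d["merchant_category"],)),
]


def choose_level(deal, counts, min_points):
    """Return the first (level name, key) that has at least `min_points` orders.

    `deal` is a dict with the four taxonomy fields used above.
    `counts` maps (level name, key) to the number of historical data points.
    Returns None if even the merchant category is too sparse.
    """
    for name, key_fn in FALLBACK_LEVELS:
        key = key_fn(deal)
        if counts.get((name, key), 0) >= min_points:
            return name, key
    return None
```

Keeping the levels in an ordered list makes the fallback policy explicit and easy to change, for example to try the price bin within a merchant type before dropping it entirely.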

Another important issue is outlier deals. Especially amazing deals might draw users from a much wider radius than is typical, which would skew our results. To deal with this we use outlier removal to exclude the very top- and bottom-performing historical deals from our dataset.
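The text does not say which outlier-removal method is used. One simple possibility, shown here only as an assumption, is to rank deals by a performance metric and drop a fixed fraction from each end. The `units_sold` field, the `trim_fraction` parameter, and the function name are all hypothetical.

```python
def trim_outlier_deals(deals, trim_fraction, key=lambda d: d["units_sold"]):
    """Drop the top and bottom `trim_fraction` of deals, ranked by `key`.

    `deals` is a list of dicts; `trim_fraction` is a number in [0, 0.5).
    Returns the remaining deals in ascending order of performance.
    """
    if not 0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    ranked = sorted(deals, key=key)
    k = int(len(ranked) * trim_fraction)  # number of deals cut from each end
    return ranked[k:len(ranked) - k]
```

A rank-based trim like this does not depend on the shape of the performance distribution, which matters when a few blockbuster deals dominate the totals.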

Results

For each <service, price bin, user location, deal location> combination, our result is the probability that a user from that location would be willing to travel for a deal of that service, price bin, and location. To be sure that we’re not just seeing noise and that these travel flows are actual organic tendencies, we say that a flow is important enough to be deemed “travel-worthy” if this probability reaches a threshold of 20% or more. This level was found to be aggressive enough to leave us with only the truly statistically significant flows, yet low enough to give us sufficient useful information on the travel patterns for each city and service.
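Applying the travel-worthy threshold amounts to a filter over the computed probabilities. A minimal sketch, assuming the results are held in a dictionary keyed by (service, price bin, user submarket, deal submarket); the layout and names are hypothetical, while the 20% cutoff comes from the text.

```python
TRAVEL_WORTHY_THRESHOLD = 0.20  # the 20% cutoff described above


def travel_worthy_flows(probabilities, threshold=TRAVEL_WORTHY_THRESHOLD):
    """Keep only flows whose probability reaches the threshold.

    `probabilities` maps
        (service, price_bin, user_submarket, deal_submarket) -> probability.
    """
    return {
        flow: p for flow, p in probabilities.items() if p >= threshold
    }
```

Because the comparison is `>=`, a flow at exactly 20% is kept, matching the "20% or more" wording.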

As expected, we find that users are indeed inclined to travel beyond their home neighborhoods, and that those travel propensities depend on where they live. For instance, the median distance Chicago users are willing to travel is about 5 miles. However, this median depends strongly on user submarket, and is under 2 miles for downtown users but approximately 12 miles for users from the South suburbs.

chicago

Where are these users traveling, and for what? Our results tell us these hotspots as a function of service, and we find that they depend on a combination of merchant density and merchant quality. For more common services, such as steakhouses, we find that users generally travel from areas of lower to higher merchant density. However, users are also willing to travel for particularly great merchants, and this preference is more likely to dictate the travel patterns for more unique services, like museums.

For example, one service in our “Food & Drink” category is “Cupcake.” We can query our results to give us the travel-worthy patterns for this service for Chicago and the lowest price bin.

cupcake4

Chicago’s cupcake hotspots, here marked with red dots and defined as submarkets having at least 3 travel patterns ending there, all contain regions with the highest densities of cupcake-providing merchants. Similarly, for steakhouses we find that the submarkets containing the Magnificent Mile and Naperville are the major hotspots, as any Chicagoan might expect.
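The hotspot definition used here (a submarket with at least 3 travel patterns ending there) is easy to express in code. In this sketch the flows are assumed to be (user submarket, deal submarket) pairs that have already passed the travel-worthy threshold for one service and price bin. Whether a flow that starts and ends in the same submarket counts is not stated in the text, so excluding it is an assumption, and the names are hypothetical.

```python
from collections import Counter


def hotspots(flows, min_inbound=3):
    """Return the submarkets where at least `min_inbound` travel patterns end.

    `flows` is an iterable of (user_submarket, deal_submarket) pairs.
    Flows within a single submarket are ignored (an assumption).
    """
    inbound = Counter(
        deal_sub for user_sub, deal_sub in flows if user_sub != deal_sub
    )
    return {sub for sub, n in inbound.items() if n >= min_inbound}
```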

Which <service, deal location> combinations draw the most travelers? In Chicago in particular, we find that people are flocking to a Murder Mystery dinner spot in the West suburbs, a Hawaiian restaurant in the North suburbs, and the Field Museum and Segway tours downtown.

Despite our strict travel-worthy threshold, we still must verify that these travel patterns are actual organic travel tendencies, as opposed to being due to gaps in our inventory. Fortunately, using census data for each submarket, we find a strong correlation between the resident and business densities and the number of travel patterns that end there, indicating that our hotspots reflect actual travel patterns and are not biased by Groupon offerings. Still, there are two behaviors that we need to stay on the lookout for:

  1. regions with low business density but high travel worthiness that are getting more than their fair share of deals
  2. regions with high business density but low travel worthiness that are getting less than their fair share of deals

In this way we can keep an eye on our inventory and troubleshoot as needed.

As mentioned above, we can also define an effective radius for each <service, price bin, location> combination, to determine how far users from said location are willing to travel for certain types of deals. We define this to be the 75th percentile of all of the relevant user-deal distances in our historical data set. By doing so we find that the lowest-radius services tend to include everyday fitness activities like aerobics classes, gym memberships, and spinning classes, whereas the highest-radius services tend to include weekend leisure activities such as white water rafting, off-roading, and skiing.
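The effective radius is a 75th percentile, so it can be computed directly with Python's standard library. This sketch assumes the user-deal distances for one <service, price bin, location> combination are already collected in a list. The interpolation method is a choice made here, since the text does not specify one, and the function name is hypothetical.

```python
from statistics import quantiles


def effective_radius(distances_miles):
    """Return the 75th percentile of a list of user-deal distances.

    Uses the "inclusive" interpolation method, which treats the data as the
    full population of historical orders. Requires at least two distances.
    """
    # quantiles(..., n=4) returns the three quartile cut points;
    # index 2 is the third quartile, i.e. the 75th percentile.
    return quantiles(distances_miles, n=4, method="inclusive")[2]
```

Using a percentile rather than the maximum keeps a few very long trips from inflating the radius, which is consistent with the outlier concerns raised earlier.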

Next steps

This analysis was performed entirely with subscription and order data, and thus it was limited to a study of the interplay of merchant and user home locations. The expansion of our mobile business provides a huge opportunity for further tracking of travel patterns. Assuming users have given us their explicit consent to track it, GPS data gives us a much finer-grained picture of their behavior, thereby enabling us to learn where users are when they open the app and where they are when they place orders. Coupled with Gnome, we can further empower merchants to build stronger ties with customers who routinely travel to their neighborhoods.


Meet Groupon Engineering: Tim Macdonald

at October 23rd, 2014

Tim Macdonald

LR: How long have you been at Groupon? TM: Just over a year; I started part-time work in September 2013, and then switched to full-time last January.

LR: What are some of the challenges and problems you get to solve in your work as a software engineer at Groupon? TM: I’ve been on several teams, but so far have spent most of my time doing some interesting work with analyzing past deals we’ve run and using that data to predict how new deals would do in the market. It’s of course both challenging and exciting to be working on a suite of new products, and more specifically there were some good problems related to parsing deals – that is, figuring out what exactly they’re selling and how that stacks up to other ones. From a more technical standpoint, I’ve learned a new language that I really love (Clojure), massively improved my skills with another (JavaScript), and generally become more involved with full-stack development.

LR: What do you enjoy most about working at Groupon? TM: There are lots of great things about working at Groupon, but I think the best part is the people I get to work with. It amazes me that the same group of people making great technical decisions and giving great product pushback can always make me laugh or be up for a beer!

LR: You recently took first place at the 2014 U.S. Scottish Fiddle Championship! How did you get into fiddle playing? TM: I’ve been playing the (classical) violin since I was four—I went to see the Tegucigalpa Philharmonic and was entranced. When I was twelve, I was at a Scottish festival, saw the fiddle competition, and thought, “That’s awesome! Wait a second…I play the violin…I could totally learn how to play like that.” So I took private lessons for a year and a half and started going to a summer fiddle camp, and now here I am! For this contest, I was specifically trying to make a point about historically informed performance. All of my pieces were written before 1793, and I was playing on period instruments. The baroque violin is very different from the modern one.

Tim’s first-place finish qualifies him to compete in the 2015 Glenfiddich Fiddle Championship in Scotland next October, when he’ll go up against other maestros from around the world.

Tune in to Tim’s winning performance below, check out this article featuring the fiddle aficionado himself and leave any and all accolades in the comments.


Meet Groupon Engineering Operations: Erica Geil

at October 22nd, 2014

Erica Geil has been at Groupon for over four years. “The thing I love most about my job is what I learn from the people I work with at Groupon. Everyone has different experiences, different interests and that keeps me engaged with the initiatives I oversee. I never have the same day twice,” Erica said. As Sr. Director, Global Engineering Operations, Erica provides input to the project management team responsible for driving global, cross-functional initiatives. She also oversees ETHOS, Groupon’s team focused on engineering training and culture, and she gives guidance to Salesforce engineering and product development, groups that provide critical data to many Groupon tools and systems.

Groupon will be 6 years old in November and Erica has seen the company grow exponentially in her time. Groupon started as a daily deals business and is evolving to a Pull model where users can check Groupon first to discover an inventory of services, goods and travel deals. Watch below to learn more about how Erica’s work in Engineering Operations is part of that growth.



Sharing is Caring: Open Source at Groupon

at October 7th, 2014

Groupon is fueled by open source software. We run on software built in the open, supported by the community, and shared to move technology forward. While we give back when we can to the projects we use and share new creations, the true value resides in the people that make it happen, and that’s why we are pleased to announce Groupon’s new OSS home. It pulls information from GitHub, Open Source Report Card, Stack Overflow, and Lanyrd, with plans to add more information in the future. The purpose is to highlight the people that are actively contributing within Groupon Engineering and celebrate the things they do. This page is the result of Groupon Engineering’s most recent GEEKon, our bi-annual internal hackathon that gives our devs a chance to innovate from the bottom up. We feel that it is important to share this information not only to show the world that we care about open source software, but to encourage more participation and find other amazing people who want to work with us.

Of course, this wouldn’t be in good taste unless we open sourced the code that generates this page. You can find it on GitHub.

We are excited to continue work on this project and even more excited to see additional contributions from folks outside of Groupon! We hope that it can help highlight the people in every company that make open source software what it is and find more meaningful connections between other like-minded people in their communities.

To get a complete picture of what we have already released you can visit the Groupon GitHub page. Be sure to keep an eye out for new releases and updates to our projects. And if you like the work that we have done and would like to work with us, check out our Groupon tech jobs.


Meet Groupon Engineering: Candice Savino

at October 7th, 2014

Candice

LR: How long have you been at Groupon? CS: I’ve been at Groupon for almost 3 years! I started the day after we IPO’d.

LR: What do you do at Groupon? CS: Currently I’m a Sr. Engineering Manager, but I started here as a Software Engineer.

LR: What are some of the challenges you get to solve in your work? CS: Building platforms that allow Groupon to give customers the best experience possible has been challenging but rewarding work. My team built the platform that powers the Holiday shop, Valentine’s Day shop and countless other themes in 33 countries. The challenge behind this was to build a tool flexible and useful to the business without needing constant deployment cycles from engineering teams. This platform gave the control and flexibility to the business while keeping the engineers out of monotonous work.

LR: What do you like most about working at Groupon? CS: The best thing about Groupon is getting the opportunity to rebuild the tech platform at the ground level. Groupon is the fastest-growing company ever and keeping up with the load from a technical perspective is a challenge. Being at Groupon gives you the opportunity to be part of rebuilding a company, which you don’t get to do everywhere. My team is one of the best teams I’ve worked with in my career. Getting to work with each of these talented engineers is always the best part of my day!