Database-driven web applications often maintain a pool of database connections that can be reused for future requests. Re-establishing a new database connection for every web request can be costly for both the web application and the database server.
Yet, an existing database connection in a pool can sometimes be dropped for various reasons. For example, network interruption, idle connection expiration, or failover to a standby replica will necessitate reconnecting to the database.
When an application tries to query the database using a dropped connection, the database client library may raise an exception and the web request will probably fail. The worst case is when a naive database client cannot detect a dropped connection. The application then keeps failing on subsequent requests until it is restarted manually.
How should we prevent failures due to a dropped database connection?
Typical Rails Approach
Rails, by default, checks whether a connection is active by issuing a ping query at the beginning of every web request. If the ping fails, Rails attempts to reconnect. Unfortunately, this ping creates performance overhead on every request, and the overhead can be large: the MySQL adapter pings via the MySQL statistics command, which is an expensive call. There is also a race condition, since a successful ping cannot guarantee that the database connection is still active when the actual query runs.
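To make the cost and the race concrete, here is a minimal sketch of the ping-first strategy. FakeConnection and handle_request_with_ping are hypothetical names invented for this illustration; they are not Rails or MySQL client APIs, and the round-trip counter only approximates real network cost.

```ruby
# Hypothetical stand-in for a database client; not a real adapter API.
class FakeConnection
  attr_reader :round_trips

  def initialize
    @alive = true
    @round_trips = 0
  end

  # Simulate the server or network dropping the connection.
  def drop!
    @alive = false
  end

  # Liveness check: costs one round trip and returns true or false.
  def ping
    @round_trips += 1
    @alive
  end

  def reconnect!
    @round_trips += 1
    @alive = true
  end

  # Real work: costs one round trip and raises if the link is gone.
  def query(sql)
    @round_trips += 1
    raise IOError, "Lost connection to MySQL server" unless @alive
    "result of #{sql}"
  end
end

# Ping-first strategy: verify the connection, then run the request's query.
def handle_request_with_ping(conn, sql)
  conn.reconnect! unless conn.ping
  yield conn if block_given? # hook used below to simulate a drop after the ping
  conn.query(sql)
end

conn = FakeConnection.new
3.times { handle_request_with_ping(conn, "SELECT 1") }
puts conn.round_trips # 6 round trips: 3 pings + 3 queries

# The race: the connection dies after a successful ping.
begin
  handle_request_with_ping(conn, "SELECT 1") { |c| c.drop! }
rescue IOError => e
  puts "request failed despite ping: #{e.message}"
end
```

In this toy model every healthy request pays for two round trips instead of one, and the ping still does not protect the query that follows it.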
Here are some code snippets from various versions of Rails that highlight this issue:
Rails 3 ActiveRecord mysql_adapter.rb (source at GitHub)
    def active?
      if @connection.respond_to?(:stat)
        @connection.stat
      else
        @connection.query 'select 1'
      end

      # mysql-ruby doesn't raise an exception when stat fails.
      if @connection.respond_to?(:errno)
        @connection.errno.zero?
      else
        true
      end
    rescue Mysql::Error
      false
    end
Rails 3 ActiveRecord mysql2_adapter.rb (source at GitHub)
    def active?
      return false unless @connection
      @connection.ping
    end
At Groupon, we have implemented a more efficient solution. Instead of pinging the database on every web request, we simply try the first query. The database connection is valid the majority of the time, so the ping is just a wasteful expense. If the query fails, we catch the “lost connection” exception, reconnect, and retry the query.
We only retry the first query of a web request. A more sophisticated solution could retry any query, but retrying a query in the middle of a transaction can be dangerous. Our solution is like using the first actual (safe) query in place of the ping query, but without the performance overhead. ;-)
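The retry rule described above can be sketched in isolation, without Rails. FlakyConnection, RetryingSession, and LostConnection are hypothetical names made up for this example, and the sketch assumes the client reports a dropped connection by raising a dedicated error class.

```ruby
# Hypothetical error and connection classes, for illustration only.
class LostConnection < StandardError; end

class FlakyConnection
  attr_reader :reconnects
  attr_accessor :open_transactions

  def initialize
    @alive = false # start with a connection that was dropped while idle
    @reconnects = 0
    @open_transactions = 0
  end

  def drop!
    @alive = false
  end

  def reconnect!
    @reconnects += 1
    @alive = true
  end

  def raw_execute(sql)
    raise LostConnection, "MySQL server has gone away" unless @alive
    "ok: #{sql}"
  end
end

class RetryingSession
  def initialize(conn)
    @conn = conn
    @retry_ok = false
  end

  # Call at the start of each request instead of pinging.
  def begin_request
    @retry_ok = true
  end

  def execute(sql)
    # Only the first statement of a request, outside a transaction, may retry.
    retry_ok = @retry_ok && @conn.open_transactions.zero?
    @retry_ok = false
    begin
      @conn.raw_execute(sql)
    rescue LostConnection
      raise unless retry_ok
      retry_ok = false # retry at most once
      @conn.reconnect!
      retry
    end
  end
end

conn = FlakyConnection.new
session = RetryingSession.new(conn)

session.begin_request
puts session.execute("SELECT 1") # reconnects once, then succeeds

conn.drop!
begin
  session.execute("SELECT 2") # second query of the request: not retried
rescue LostConnection => e
  puts "not retried: #{e.message}"
end
```

The healthy path costs nothing extra here: no ping is sent, and the reconnect happens only after a first query has actually failed.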
Example patch for Rails ActiveRecord::ConnectionAdapters
This monkey patch works with both the mysql2 and mysql gems.
    module ActiveRetryConnection
      def self.included(base)
        base.class_eval do
          alias_method(:verify!, :verify_with_deferred_retry!)
          alias_method_chain(:execute, :active_retry)
        end
      end

      # verify is called (checkout_and_verify from connection_pool) at the beginning of request cycle
      # no longer calls active? for pinging the database
      def verify_with_deferred_retry!
        # handle nil @connection for mysql2_adapter
        if @connection.nil?
          reconnect!
        end
        @__retry_ok = true
      end

      def execute_with_active_retry(sql, name = nil)
        # if this is the first sql statement since a verify, it's ok
        # to retry the connection if it's gone away
        retry_ok = @__retry_ok
        @__retry_ok = false

        # do not retry query in a transaction
        if retry_ok && open_transactions > 0
          retry_ok = false
        end

        begin
          return execute_without_active_retry(sql, name)
        rescue ::ActiveRecord::StatementInvalid, ::Mysql::Error => exception
          raise if !(exception.message =~ /(not connected|Can't connect to MySQL|MySQL server has gone away|Lost connection to MySQL server|Packet too large)/i)
          raise unless retry_ok
          retry_ok = false # avoid retry loop; retry exactly once
          reconnect!
          retry
        end
      end
    end
We ran a load test before and after applying this patch. On this application, throughput (measured as requests per minute via NewRelic) doubled!
The before version shows the load test peaking at 50K RPM.
The after version shows the load test peaking at 100K RPM.