Performance Zone is brought to you in partnership with:

I have been working for almost two years now on infrastructure and deployment automation, exploring programmatic solutions to traditional systems administration problems and configuration management. I'm fanatical about testing, the scientific method and building good tools to support awesome   Oliver is a DZone MVB and is not an employee of DZone and has posted 28 posts at DZone. You can read more from them at their website. View Full User Profile

Asynchronous MySQL Queries with Non-Blocking Readiness Checks

02.20.2013
| 2295 views |
  • submit to reddit

Well, despite my best intentions, here I am again writing Ruby. I decided to automate a small part of some data analysis I’ve had to do a few times, starting with the database queries themselves. Unfortunately the data is spread over several hosts and databases and the first implementation simply queried them serially. The next iteration used the mysql2 gem‘s asynchronous query functionality but still naively blocked on the results retrieval rather than polling the IOs to see when they could be read from.

It doesn’t actually add anything to my script to do this, but it seemed like a small learning opportunity and somewhat interesting so here is the guts of that code:

# Set up the three different clients to different hosts, and a starting time
client1 = Mysql2::Client.new()
client2 = Mysql2::Client.new()
client3 = Mysql2::Client.new()
start_time = Time.now()

# Run the first query
query1 = "SELECT SLEEP(5);"
puts "Running Query:\n >> #{query1}\n\n"
client1.query(query1, :async => true)

# Run the second query
query2 = "SELECT SLEEP(10);"
puts "Running Query:\n >> #{query2}\n\n"
client2.query(query2, :async => true)

# Run the third query
query3 = "SELECT SLEEP(15);"
puts "Running Query:\n >> #{query3}\n\n"
client3.query(query3, :async => true)

# Wait for results. Select on query IO sockets to rapidly determine
# when the results are ready
puts "Waiting for results..."
query_ios = [IO::open(client1.socket), IO::open(client2.socket), IO::open(client3.socket)]
orig_ios = query_ios.length
received = 0

while true
  # select with a timeout of 1 second
  ready = IO::select(query_ios, nil, nil, 1)
  next if ready.nil? # next iteration if nothing is ready
  ready_reads = ready[0] # IO::select returns [[rd],[wr],[ex]]

  received += ready_reads.length
  print "\rReceived #{received}/#{orig_ios} results after #{(Time.now - start_time).round} seconds" if ready.length > 0

  # Remove any IOs from the list to select from, if they are ready.
  query_ios -= ready_reads
  break if query_ios.length == 0 # Nothing left to select from
end 
puts ""

# Now all results are ready
res1 = client1.async_results
res2 = client2.async_results
res3 = client3.async_results

The code is pretty simple and the comments should reveal the intent of any confusing lines. The only part that was slightly irritating was receiving file descriptor numbers from Mysql2::Client#socket rather than the IO itself, hence having to re-open the same file descriptor.

In this case I haven’t done anything fancy after checking when the results are ready, but you can see how this could be trivially turned into a system for querying multiple backends for the same data and returning the fastest result which is a quite popular pattern at the moment.



Published at DZone with permission of Oliver Hookins, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)