

Sphinx search performance optimization: multi-threaded search

01.17.2013

This post comes from the MySQL Performance Blog.

Queries in MySQL, Sphinx and many other database or search engines are typically single-threaded. That is, when you issue a single query on your brand new r910 with 32 CPU cores and 16 disks, the maximum that is going to be used to process this query at any given point is 1 CPU core and 1 disk. In fact, only one or the other.

Seriously: if the query is CPU-intensive, it is only going to use 3% of the available CPU capacity (on that same 32-core machine). If it is disk-IO-intensive, it will only use 6% of the available IO capacity (on the 16-disk RAID10, or RAID0 for that matter).

Let me put it another way. If your MySQL or Sphinx query takes 10s to run on a machine with a single CPU core and single disk, putting it on a machine with 32 such cores and 16 such disks will not make it any better.

But you knew this already. The question is – can you do something about it?

In the case of Sphinx – indeed you can! And with very little effort. In fact, it does not require any changes to your application or database whatsoever; it is only a matter of small changes to the Sphinx configuration.

The Plan

First of all, let me explain what we are trying to achieve here.

Sphinx has the ability to do distributed search out of the box – it was designed to scale out that way very early on. If your Sphinx index does not fit on one machine, you index different parts of it on different machines, and then you have an aggregator node that receives the request from the application, issues search requests to all data nodes in parallel, merges the results from all of the data nodes and returns them to the application as if a single server had served the request in the first place.
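For illustration only, a distributed index pointing at remote data nodes might look roughly like this – the host names and part index names below are made up, and 9312 is simply the usual searchd port:

index idx_dist
{
	type = distributed
	# each agent line points at a searchd instance holding one part of the data
	agent = host1:9312:idx_part0
	agent = host2:9312:idx_part1
}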

Well, guess what – you can actually utilize this feature to your advantage even if your data can easily fit into one machine and all you want is your queries to be many times faster. Even more so, Sphinx now supports this out of the box, so you don’t need to pretend you are querying remote nodes.

Also, you get a bonus: once you configure the server for distributed search, you can do indexing in parallel too!

Word of caution: while this technique will improve most types of search queries, there are some that aren’t going to benefit greatly from parallel execution. The reason is that while the search itself is done on the data nodes (even if they are local ones) and in parallel, merging of the results is done by the aggregator and is therefore single-threaded. Merging includes some CPU-intensive operations such as ranking, ordering or even COUNT with GROUP BY, and if the data nodes return large amounts of data to post-process, the aggregator may well become a bottleneck due to its single-threaded nature.

This is, however, easy to check – look at your Sphinx query log and check the number of results matched per query; that should give you a clue.
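If query logging is not enabled yet, it can be switched on in the searchd section – the log path here is just an example:

searchd
{
	query_log = /var/log/sphinxsearch/query.log
}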

Execution

Let us assume you have this one index on one server with the following basic configuration (many irrelevant details omitted):

source src1
{
	type = mysql
	sql_query = SELECT id, text FROM table
}

index idx1
{
	type = plain
	source = src1
}

searchd
{
	dist_threads = 0 # default
}

And now we want it to utilize 3 CPU cores and/or disks on a local machine for this index of ours – idx1. Here’s how we would change the configuration:

source src1
{
	type = mysql
	sql_query = SELECT id, text FROM table
}

source src1p0 : src1
{
	sql_query = SELECT id, text FROM table WHERE id % 3 = 0;
}

source src1p1 : src1
{
	sql_query = SELECT id, text FROM table WHERE id % 3 = 1;
}

source src1p2 : src1
{
	sql_query = SELECT id, text FROM table WHERE id % 3 = 2;
}

index idx1_template
{
	type = plain
	source = src1
}

index idx1p0 : idx1_template
{
	source = src1p0
}

index idx1p1 : idx1_template
{
	source = src1p1
}

index idx1p2 : idx1_template
{
	source = src1p2
}

index idx1
{
	type = distributed
	local = idx1p0
	local = idx1p1
	local = idx1p2
}

searchd
{
	dist_threads = 3
}

And you’re done. Of course, you need to reindex all of the indexes, but you can now do it in parallel – just run a separate indexer for every index idx1p0 through idx1p2.
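For instance, assuming indexer picks up the same sphinx.conf, something along these lines rebuilds all three parts at once and tells the running searchd to pick up the new files:

indexer idx1p0 --rotate &
indexer idx1p1 --rotate &
indexer idx1p2 --rotate &
wait

The application keeps querying idx1 exactly as before; searchd fans the request out to the three local parts and merges the results using dist_threads.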

BTW, using the modulo operator is not necessarily the best way to split the data. You may want the parts to be ranges instead, for example by using a helper table in MySQL to define those ranges and mixing that with sql_query_range, or something else entirely, depending on what your data looks like.
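For the record, a range-based variant of one of the parts could look roughly like this – the id boundaries and step size are made-up numbers, and with sql_query_range in place the main query has to carry the $start/$end macros:

source src1p0 : src1
{
	# this part covers ids 1..1000000; indexer fetches them in batches of sql_range_step
	sql_query_range = SELECT 1, 1000000
	sql_range_step = 100000
	sql_query = SELECT id, text FROM table WHERE id >= $start AND id <= $end
}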

Finishing line

I have always loved how easily Sphinx scales out to as many machines as you need, and I have been running it this way for many years now. However, I don’t think I utilize this feature to make queries even faster on a single machine nearly as often as I should. Well, it’s not like it is slow or anything, but queries are never too fast, are they? :)

Published at DZone with permission of Peter Zaitsev, author and DZone MVB.
