

Load Management Techniques for MySQL

06.04.2012
One of the most frequent performance problems with MySQL occurs when batch jobs, reports and other non-response-time-critical activities overload the system, causing the user experience to degrade.

The first thing you need to know is that this is not a MySQL problem. It might not even be a problem with your MySQL configuration, queries or hardware, though fixing those does help in many cases. However powerful and well-tuned your system is, if you put too heavy a concurrent load on it, response times will increase and the user experience will suffer.

So what can you do to prevent this problem from happening? The answer is simple: throttle the side load so it does not consume too many system resources. Here are some specific techniques to use.

Do not push concurrency too high:
Many developers will test scripts at multiple levels of concurrency and find that doing the work from 32 processes is faster than having just one. This is true if you have the system completely at your disposal. However, if the system also needs to serve other users, you typically need to reduce concurrency to the point where it does not overload the system. Unless it is a really time-critical process, I would not use more than 4 parallel processes heavily writing to the database; a sketch of capping a batch writer this way follows.
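
A minimal sketch in Python of keeping a write-heavy batch job to 4 worker processes. The process_chunk() helper, the work units and any database access inside it are hypothetical placeholders, not part of the original article.

```python
# Minimal sketch: cap a write-heavy batch job at 4 parallel processes.
# process_chunk() is a hypothetical helper that would open its own
# MySQL connection and write one chunk of rows.
from concurrent.futures import ProcessPoolExecutor

def process_chunk(chunk_id):
    # open a connection, write the rows for this chunk, close it
    ...

if __name__ == "__main__":
    chunks = range(100)  # hypothetical list of work units
    # 4 workers is a sane ceiling for background writes; raise it
    # only if the job is genuinely time critical.
    with ProcessPoolExecutor(max_workers=4) as pool:
        list(pool.map(process_chunk, chunks))
```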

Introduce Throttling:
Sometimes even a single process overloads the system too much. In this case, throttling by keeping queries relatively short and introducing "sleeps" between them can be a good idea. It also often helps avoid monopolizing the replication thread. For example, if I need to delete old data, instead of DELETE FROM TBL WHERE ts<'2010-01-01' I will run DELETE FROM TBL WHERE ts<'2010-01-01' LIMIT 1000 in a loop until no more rows need to be deleted. I may inject a "sleep" between iterations that is as long as the query execution itself, so the longer the queries run and the more the system is loaded, the more "rest" it gets. Alternatively, you can look at the Threads_running status variable, which is a very good, simple indicator of the current load, and sleep based on its value; for example, you may choose to pause the script when the load is too high and wait for Threads_running to drop below a certain value. A sketch combining both ideas follows.
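
A minimal sketch in Python of this throttled deletion, assuming the pymysql driver is installed; the connection settings, the table name tbl and the Threads_running threshold of 20 are placeholders for illustration.

```python
import time
import pymysql

# Placeholder connection settings.
conn = pymysql.connect(host="db-master", user="batch", password="secret",
                       database="app", autocommit=True)
cur = conn.cursor()

while True:
    start = time.time()
    deleted = cur.execute("DELETE FROM tbl WHERE ts < '2010-01-01' LIMIT 1000")
    elapsed = time.time() - start
    if deleted == 0:
        break  # nothing left to delete
    # Rest roughly as long as the chunk took: the more loaded the server,
    # the slower the chunk, the longer the pause.
    time.sleep(elapsed)
    # Optionally back off further while the server is busy.
    cur.execute("SHOW GLOBAL STATUS LIKE 'Threads_running'")
    while int(cur.fetchone()[1]) > 20:  # example threshold
        time.sleep(1)
        cur.execute("SHOW GLOBAL STATUS LIKE 'Threads_running'")
```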

Tuning Cron:
It also often helps to look into cron or whatever other scheduling system you're using. Frequently, far too many scripts are started at once, or so close to each other that they overlap and produce the overload. Solutions include spacing them out and introducing some "job control" to ensure scripts that should not run in parallel do not (so you don't get many copies of the same script running at once); a simple lock-file sketch follows. Another simple fix: instead of scheduling a bunch of scripts to start at midnight, 1 AM and 2 AM, I can put them into nightly.sh one after another and schedule that to run at midnight, so the scripts run one after another at their own pace.
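
A minimal sketch in Python of the "job control" idea, using an exclusive lock file so cron cannot pile up overlapping copies of the same script; the lock path and the batch work itself are hypothetical.

```python
import fcntl
import sys

# Hypothetical lock file path; use one per script.
lock = open("/var/run/nightly_report.lock", "w")
try:
    fcntl.flock(lock, fcntl.LOCK_EX | fcntl.LOCK_NB)
except BlockingIOError:
    # A previous cron run is still working; skip instead of piling on.
    sys.exit("previous run still in progress, skipping")

# ... do the actual batch work here ...
```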

Dedicated Slave:
I remember listening to a talk by Cary Millsap once in which he recommended moving load in time and in space as an optimization technique. We have already talked about moving load in time, but we can also move it in space: putting it on a different system, which in the MySQL world is most commonly a dedicated slave. In a lot of environments, especially those without the operational/development discipline to enforce the previous solutions, it can be a life saver. Of course, it only works for read jobs, which is an important limitation. Getting a slave (or several) for batch jobs can help in other ways too; for example, competition for the buffer pool between different kinds of workloads is reduced. A sketch of routing reporting reads to a replica follows.
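
A minimal sketch in Python, again assuming pymysql, of pointing read-only reporting queries at a dedicated replica while the application keeps using the master; the hostnames, credentials and the orders table are hypothetical.

```python
import pymysql

# The application keeps writing to the master...
master = pymysql.connect(host="db-master", user="app", password="secret",
                         database="app")
# ...while heavy read-only reporting goes to a dedicated replica.
replica = pymysql.connect(host="db-replica", user="report", password="secret",
                          database="app")

with replica.cursor() as cur:
    # This scan competes only with other batch work, not with user traffic.
    cur.execute("SELECT COUNT(*) FROM orders WHERE created_at >= '2012-01-01'")
    print(cur.fetchone()[0])
```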

innodb_old_blocks_time:
Surprisingly simple but effective: setting innodb_old_blocks_time=1000 can often help keep batch jobs from washing the buffer pool contents away and thereby making normal user queries a lot more disk-bound and slower. I wrote about it in more detail a few months ago. A sketch of checking and setting it follows.
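
A minimal sketch in Python (pymysql assumed, and the account needs the privilege to change global variables) of checking the current value and setting it to 1000 milliseconds; add the same setting to my.cnf if you want it to persist across restarts.

```python
import pymysql

conn = pymysql.connect(host="db-master", user="admin", password="secret",
                       autocommit=True)
with conn.cursor() as cur:
    cur.execute("SHOW GLOBAL VARIABLES LIKE 'innodb_old_blocks_time'")
    print(cur.fetchone())  # current value
    # 1000 ms: pages pulled in by a scan must keep being used for a second
    # before they can be promoted and displace hot pages in the buffer pool.
    cur.execute("SET GLOBAL innodb_old_blocks_time = 1000")
```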

Finally, let's touch on the discovery question. To manage load you need to understand when the problem is happening in your environment (we want to catch it before users complain, right?) and, if it is happening, exactly which jobs cause the overload. In complex environments this can be a hard question. pt-stalk is a great tool for this purpose: getting it running can help you collect the state of your system when it was overloaded by the side load, and the wealth of data it generates will most likely contain the answers you're looking for.

Published at DZone with permission of Peter Zaitsev, author and DZone MVB. (source)

(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)

Comments

Paul Russel replied on Sun, 2012/06/10 - 9:12am

Another option when using a dedicated slave is to write all batch jobs to use the Gearman MySQL UDF interface to apply update transactions to the master (while having the batch job itself use only the slave connection). You only need to implement a generic Gearman worker that takes the update transactions and paces them against the real master DB. You can configure multiple workers for quicker batch-job execution when the load on the DB is light, and those workers can easily throttle themselves when the DB load increases.
