Tuesday, 26 July 2011

Why SSDs kick ass.

Today I decided to install a Crucial M4 SSD into my 2011 Macbook Pro. It only took around 15 minutes to install the drive and around 30 minutes to restore from last night's Time Machine back up.
I have got to say that it's the best upgrade I have ever made to any personal computer (and that's out of a lot of upgrades).

My Macbook is now booting in 15 seconds and everything on the machine is available instantly.
I ran a disk speed test and the drive pulled off an amazing 195MB/s write and 422MB/s read!!





I also went a little crazy and decided to open nearly all my applications at once:





If you want to see the Geekbench results for my Macbook click here.

I always knew SDDs were awesome from using them with SAN arrays, but it's really nice to use them on a laptop or personal PC. The moral of this story is buy an SSD. I will be replacing all my system drives with them.

As always, thanks for reading!

Friday, 22 July 2011

Diagnosing and fixing MySQL I/O Performance.



Slow Inserts/Selects/Joins? High Load? High I/O wait?  


Hopefully this post will provide a few ideas for diagnosing and fixing some MySQL I/O issues.
All examples are based on InnoDB but most of the principals are the same/similar for MySAM.


Disks and storage can be a huge bottleneck for MySQL databases. Due to the nature of random I/O, disks with low seek times or poorly configured SAN infrastructure can really affect the performance of your MySQL server.


The obvious answer is get faster disks or SSD drives. This might not always be cost effective or practical. Who wants to walk down the corridor to visit purchasing and ask for a few thousand pounds when you can make some improvements from terminal?


Diagnosis:


So, let's look at a couple of tools to diagnose I/O problems:


iostat -x 2


This will give you a sample of your I/O wait (percentage of time that your CPUs are idle while waiting for a disk I/O request) every 2 seconds. Depending on how many cores you have this number will vary. 
With more modern CPUs that have multiple cores a higher I/O wait can be tolerated with out this being a problem. 


iostat -xcknd 1 120 > ioout.txt 


This will give you a 2 minute snapshop of your I/O wait stats and redirect it into a file for you check over.


A high I/O wait time coupled with high load can be a bad sign. On our 48 core systems I see a 7% wait time but a load of 0.2 so it's not much of an issue.


iotop -o -b 


This will give you a list of the processes that are writing to disk and the -b switch will format it nicely for logging over time.


ps -axu


Using the command ps -axu  and looking for commands with the state D will show you processes waiting for disk. Or you can get some help from watch and awk -


watch -n 1 "(ps aux | awk '\$8 ~ /D/  { print \$0 }')"


There are many more tools you could use: vnstat, top, nmon, systat, etc..


These examples will point you in the right direction when looking for I/O issues. The important thing is to remember is that all systems are different and you will need to look into the system and analyse what tolerance levels you can accept.
I am not going to be reckless and tell you what results you should be expecting to see. Performance is subjective right?




Possible Tweaks


Indexes/Queries: 


I will only touch on this subject in this post but a very important part of making sure you are getting the best MySQL performance is to make sure you are indexing on the correct fields for your application.
It is also important to make sure your queries are as optimized and application aware as possible. Make sure you are always using explain extended to make sure you are hitting your indexes.


explain extended SELECT foo FROM bar;


I am going to do another post where I will cover Indexes and queries in much more detail.


Memory:


Adding RAM to your server is a very easy and often cost effective way of reducing the disk I/O dependency of MySQL.
InnoDB uses it's buffer pool to buffer reads and writes, the more you can cache the less strain you put to disk.
If your server is dedicated to MySQL it's normally safe to allocate a large chunk of your memory to your innodb_buffer_pool. You can change this value by editing your my.cnf file.


In the past I have seen people recommending that you should set your  innodb_buffer_pool limit to around 10% larger than your data. This is a good figure but can be difficult if you store a large amount of data, so experiment with the amount of memory you can afford to allocate.
It's very important to make sure you never let your system start swapping, this will cause a huge drop in performance and will just act against the goal you have. If this happens reduce the buffer pool size until you reach the sweet spot.


Splitting I/O:


This will only really work for systems that are storing MySQL data on remote storage like a SAN.
InnoDB creates two log files, (ib_logfile0 , ib_logfile1) InnoDB writes to these log files as well as writing data to it's data files. If both the data and log files are stored on the SAN then you are doing twice as many writes to one set of disks than you need to.
You can move these log files to the local storage on the server by shutting down MySQL, editing my.cnf (innodb_data_file_path) and moving the files. You can also just rename the files and symlink them back to local disk and let MySQL recreate them.


If you are using a SAN you should read my post on Pushing your SAN with Open-iSCSI http://www.samlambert.com/2011/07/how-to-push-your-san-with-open-iscsi_13.html


Swappiness:


If you want to help Innodb keep tables and indexes in memory and further reduce I/O you can change the Linux swappiness value.


Swappiness is a value between 0 and 100. 100 means that Linux will be very aggressive at reclaiming memory from applications and will preemptively swap to disk. 0 will mean that memory won't be reclaimed as often allowing you cache better.


Set your swappiness to 0 by running the command


echo 0 > cat /proc/sys/vm/swappiness




RAMdisk:


For those who cannot Split I/O or just want to utilize more RAM you can force mysql to create temporary files on a RAMdisk.


You can create a 2 gb RAMdisk using the following commands:


1.mkdir -p /db/ramfs && chown mysql:mysql /db/ramfs
2.mount ramfs /db/ramfs -t ramfs -o size=2G
3.vi my.cnf file and add: tmpdir = /db/ramfs
4.service mysql restart


I hope that some of you will of found this post helpful and it will have at least provoked some thoughts about MySQL and I/O.


I will carry on updating this blog offering advice and articles on MySQL, Message queuing, Telemetry, Linux, general bits and bobs.


Thanks for reading! Don't forget to bookmark and share. Any questions please comment.

Thursday, 14 July 2011

How to push your SAN with Open-iSCSI.

We process and house incredible amounts of data inside MySQL servers running on Ubuntu 10.04 and distributed across an HP P4500 Left hand SAN array.

We are fed a constant stream of data from a StormMQ cluster and we process around 7,000,000,000 inserts per week and extract hundreds of gigabytes of data.

I will cover our use of message queuing in more detail in a later post.

The problem:

Over time our data has grown to many terabytes and this has started to affect performance negatively. When doing a single operation we saw normal performance but concurrency became a real struggle.

Thanks the de-coupled nature of our system we could afford to insert data at a reduced rate because our data would queue and wait to be processed, but our data extraction has to be fast.

The Fix:

Over the last month we have worked through our set up to try and meet the performance levels that we expect from the hardware we are running.
After investigating all the usual suspects; low performance, MySQL configs, hardware, networking, SAN performance and compatibility we were left scratching our heads a little.
We were hammering our servers with data and our SAN did not seem to under any load yet we where unable to pull the kind of speeds we needed.

One of the things we wanted to look at was the Open-iSCSI initiator. We started by checking the compatibility matrix for the P4500 which showed Open-iSCSI as officially compatible.
Next we used some google-fu to research Open-iSCSI settings and possible changes. To our surprise this threw up nothing of real value, prompting us to speak with other SAN and Linux engineers to find out how they tuned Open-iSCSI,. Again we were surprised that most people use the defaults and this was sufficient for most cases.

There was nothing left to but team up and experiment with Open-iSCSI and see if we could push our SAN harder. I started by flying to Kansas City MO to meet with Chris Mende from HP (@csmende) and get set up.

We got to work on setting up close monitoring on key system variables and put our system into a state that was safe to test in.

This is a live system with hundreds of distributed systems sending us data and in any normal set up shutting down would require a maintenance window, or a managed loss of data.

At Smith we don’t see the loss of one byte of data as an option, this we why we message queue.

I brought down our data processing applications on the application servers and left StormMQ to queue up all our data and safe guard our bits and bytes.

Next we decided on the key performance metrics we wanted to record and monitor to best analyze the effect of our changes.

For MySQL we chose concurrent query time and insert rate. For the P4500 we chose IOPS, throughput and Qdepth.
To gather this information we used a combination of HP centralized management console. Mytop, MySQL server output, Cacti, and our own in house developed systems.


Initial recording to define the problem :
















As you can see our inserts per second (NOIPS) take a dive along with our Qdepth as soon as we start our selects.

So now it’s changes time. We started by running our tests with the Open-iSCSI default config:










Next we looked at the node.session.cmds_max and node.session.queue_depth settings in /etc/iscsi/iscsid.conf

The default value for these settings with Open-iSCSI is:

node.session.cmds_max = 128

node.session.queue_depth = 32

However we knew from experience that the default max commands per session setting the Microsoft Iscsi initiator is 256, we also knew this setting can could be pushed so we went straight for a values of:

node.session.cmds_max = 1024

node.session.queue_depth = 128

After this change our results were:








These figures looked good so we started up our data imports and ran some big selects:









Success!! We have achieved true concurrency with inserts and selects. So editing our Iscsi config has worked.

Note: Between each test we disconnected from the target and restarted Open-iSCSI.

We are now inserting at an even more incredible rate. This is not just down to our iSCSI config however, we have a very highly tuned MySQL set up coupled with a message queue.

I will follow up on all of these in later posts.