Who is using all my I/O?

Revision as of 21:44, 30 June 2012
OK, so a customer complained about slow I/O on his domain (which is on bowl). It looks like I didn't get to the ticket in time, but I wanted to figure out what I'd have done if I had.

So I log in and look around. According to iostat, the disks are seeing something like 6K blocks per second written, which I don't think will saturate a 4-disk RAID 0+1, even with random writes. (Maybe? But probably not.)

So I watch iostat for a few minutes...

 iostat 1

which spews a bunch of data for every LVM partition. This line...

 dm-59            66.67         0.00      5556.22          0      11168

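is likely the problem; a single guest is writing something like 5500 blocks per second, far more than anyone else. (The columns after the device name are tps, blocks read per second, blocks written per second, total blocks read, and total blocks written.)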
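Eyeballing pages of iostat output gets old fast. Here is a rough sketch of pulling the heaviest writers out automatically. The function name `top_writers` is made up, and it assumes the blocks-written-per-second figure is the fourth field on each device line, as in the output above. It only looks at the last report it sees, since the first report iostat prints is an average since boot.

```shell
# Hypothetical helper: read iostat output on stdin and print the dm-* devices
# from the LAST report, sorted by blocks written per second, busiest first.
# $1 is how many devices to show (default 5).
top_writers() {
    awk '
        BEGIN { n = 0 }
        /^Device/ { n = 0; next }     # header line: a new report starts, drop the old one
        $1 ~ /^dm-[0-9]+$/ { dev[n] = $1; wr[n] = $4; n++ }
        END { for (i = 0; i < n; i++) print wr[i], dev[i] }
    ' | sort -rn | head -n "${1:-5}"
}

# Example (not run here): two reports one second apart, keep the second one.
#   iostat -d 1 2 | top_writers 5
```

A longer interval than one second would probably give a fairer picture of who is consistently busy.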

The question is: how do I figure out who dm-59 is? My LVM partitions are named after the username, but iostat only shows the dm-N name.

As far as I can tell, 59 is the minor number of the device (note: I am not really sure about this), so I do

 dmsetup ls | grep 59

and it says something like

 [root@bowl ~]# dmsetup ls |grep 59
 guests-larry	(252, 259)
 guests-moe	(252, 159)
 guests-curly	(252, 59)

(I've changed the usernames... any resemblance to actual usernames is chance.)

The grep also matches minors 259 and 159, but only curly's entry has a minor number of exactly 59, so the user in question is curly.
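To skip the near-misses that a plain grep drags in, the minor number can be matched exactly. This is only a sketch: `dm_name_for_minor` is a name I made up, and it assumes `dmsetup ls` prints either `name (major, minor)` as above or `name (major:minor)`, which is what some newer versions appear to print.

```shell
# Hypothetical helper: read `dmsetup ls` output on stdin and print the name of
# the device whose minor number is exactly $1.
dm_name_for_minor() {
    awk -v minor="$1" '
        {
            line = $0
            gsub(/[(),:]/, " ", line)   # "(252, 59)" or "(252:59)" becomes " 252 59 "
            k = split(line, f, " ")     # f[1] = name, f[k] = minor number
            if (k >= 3 && f[k] == minor) print f[1]
        }
    '
}

# Example (not run here):
#   dmsetup ls | dm_name_for_minor 59
```

On kernels that expose it, reading `/sys/block/dm-59/dm/name` may give the same answer without parsing anything, but I haven't checked that it exists on this box.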

Now, what to do about this curly? We need a policy for this. I need to find something that will show blocks read/written per hour/day/whatever, then have some policy for limiting people who use more than their fair share.
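One possible starting point for the per-hour or per-day numbers, as a sketch rather than a finished tool: save a copy of `/proc/diskstats` on a schedule and diff two copies. The function name `diskstats_delta` and the snapshot paths in the example are made up. It assumes the usual diskstats layout, where field 3 is the device name, field 6 is sectors read, and field 10 is sectors written. The counters start over at reboot, so a negative number means the box rebooted between snapshots.

```shell
# Hypothetical helper: given two saved copies of /proc/diskstats (older first),
# print "device sectors_read sectors_written" for each dm-* device over the interval.
diskstats_delta() {
    awk '
        $3 !~ /^dm-[0-9]+$/ { next }                  # only device-mapper devices
        NR == FNR { r[$3] = $6; w[$3] = $10; next }   # first file: remember the counters
        ($3 in r) { print $3, $6 - r[$3], $10 - w[$3] }
    ' "$1" "$2"
}

# Example (not run here; the paths are placeholders):
#   cp /proc/diskstats /var/tmp/diskstats.old     # e.g. from an hourly cron job
#   ... an hour later ...
#   cp /proc/diskstats /var/tmp/diskstats.new
#   diskstats_delta /var/tmp/diskstats.old /var/tmp/diskstats.new | sort -k3 -rn
```

The dm-N names in the output can then be mapped back to usernames through the minor number, the same way as with dmsetup above.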

You may try http://wiki.prgmr.com/mediawiki/index.php/Mediate_disk_io_with_ionice as a starting point.