Michael Tokarev | 2 Jul 14:46 2010

slow ext4 O_SYNC writes (why qemu qcow2 is so slow on ext4 vs ext3)


I noticed that qcow2 images, especially fresh ones (which
receive lots of metadata updates), are very slow on my
machine.  And on IRC (#kvm), Sheldon Hearn found that on
ext3 they are fast again.

So I tested different combinations for a bit, and observed
the following:

For a fresh qcow2 file, with default qemu cache settings,
copying the kernel source is about 10 times slower on ext4
than on ext3.  A second copy (rewrite) is significantly
faster in both cases, as expected, but still ~20% slower
on ext4 than on ext3.

Normal cache mode in qemu is writethrough, which translates
to O_SYNC file open mode.

With cache=none, which translates to O_DIRECT,
metadata-intensive writes (fresh qcow2) are about as slow
as on ext4 with O_SYNC, and the rewrite is faster, as
expected, but now there's _no_ difference in speed between
ext3 and ext4.
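
To make the mapping above concrete, here is a rough sketch
of the two cache modes as open(2) flags.  It is only an
illustration of what this message describes, not code from
qemu: the names CACHE_FLAGS and open_flags are made up, and
O_DIRECT is looked up defensively since not every platform
has it.

```python
import os

# Hypothetical table: the cache-mode -> open(2) flag mapping as described
# in this message.  It is not taken from the qemu source.
CACHE_FLAGS = {
    "writethrough": os.O_SYNC,
    # O_DIRECT is missing on some platforms; fall back to no extra flag.
    "none": getattr(os, "O_DIRECT", 0),
}


def open_flags(cache_mode):
    """Return the flags for opening an image read-write in cache_mode."""
    return os.O_RDWR | CACHE_FLAGS[cache_mode]
```

For example, os.open(path, open_flags("writethrough")) would
open the image with O_SYNC set.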

I did a series of straces of the writer processes: the time
spent in pwrite() syscalls is significantly larger for
ext4 with O_SYNC than for ext3 with O_SYNC, by a factor of
about 50.
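
The pwrite() timing can also be checked without qemu.  The
following is a minimal sketch, not the test I ran: the
function name time_pwrites and its parameters are made up,
and it only does sequential appends to a fresh file, which
is a crude stand-in for what a growing qcow2 image does.

```python
import os
import tempfile
import time


def time_pwrites(directory, extra_flags=0, count=64, block=4096):
    """Return total seconds spent inside pwrite() for `count` writes.

    A scratch file is created in `directory`, so the result reflects the
    filesystem that directory lives on.  The file is removed afterwards.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    # Reopen with the flags under test (e.g. os.O_SYNC).
    fd = os.open(path, os.O_WRONLY | extra_flags)
    buf = b"\0" * block
    spent = 0.0
    try:
        for i in range(count):
            start = time.perf_counter()
            os.pwrite(fd, buf, i * block)  # each write extends the file
            spent += time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return spent
```

Calling time_pwrites(dir, os.O_SYNC) once with dir on an
ext3 mount and once on an ext4 mount should give numbers
comparable to the strace timings; a call without O_SYNC
gives a baseline.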

Also, with the slower I/O on ext4, qemu-kvm starts more
I/O threads, which, it seems, slows the whole thing down
even further: I changed max_threads from the default of 64
to 16, and the speed improved slightly.  Here the difference
is again quite significant: on ext3 qemu spawns only 8
threads, while on ext4 all 64 I/O threads are spawned almost
immediately.

So I've two questions:

 1.  Why is ext4 O_SYNC so slow compared with ext3 O_SYNC?
   This is observed on 2.6.32 and 2.6.34 kernels; barriers
   or data={writeback|ordered} made no difference.  I tested
   the whole thing on a partition on a single drive; sheldonh
   used ext[34]fs on top of lvm on a raid1 volume.

 2.  The number of threads spawned for I/O... this is a good
   question, how to find an adequate cap.  Different hw has
   different capabilities, and we may have more users doing
   I/O at the same time...