15 Feb 23:44
Challenge: Make Tux3 work well with flash disks
Daniel Phillips <phillips <at> phunq.net>
2009-02-15 22:44:37 GMT
Hi all,

Please see this well-written analysis of performance loss as a new-generation Intel flash disk "ages":

http://www.pcper.com/article.php?aid=669
"Long-term performance analysis of Intel Mainstream SSDs"

Though I have not analyzed the issues completely at this time, I have the feeling Intel made a slight mistake in the way they combine writes. I think that what they do is this: they have a "current" flash block, which starts fully erased, then each write transfer is appended to it until it is full. So writes are combined in write order, which is a lot like the deduplication plan the Pune Institute students are pursuing. The bucket idea is likely to have advantages and drawbacks similar to Intel's SSD write strategy.

The problem in both cases is the effect of rewrites, which cause data to be relocated away from its original position, leaving holes at the original position. This may not be as big a problem with deduplication if the target application is mainly archival, but it is a serious and visible problem with a flash device that intends to act like a disk drive.

What happens is, when Intel's disk fills and ages, the best candidate block for erasing will have a high percentage of valid data on it, which has to be copied to a new location. The performance of the disk under a steady write load will thus drop to a fraction of the erase speed, because a portion of the space recovered by erasing has to be used to store valid data relocated from candidate erase blocks.

If my understanding of the issue is correct, then the big problem is that Intel relies only on the order written to decide how data should be grouped together on flash blocks. The grouping really needs to incorporate spatial adjacency as well, to maximize the chance that an entire flash block, or at least a large portion of it, will be rewritten in future, thus lowering the portion of data that has to be relocated.
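As a back-of-envelope sketch of the argument above (not a model of Intel's actual firmware, which is not known here): assume every erased block must first take back the fraction of still-valid data it held, so only the remainder is available for new writes. The function names and the single `valid_fraction` parameter are hypothetical, introduced only for illustration.

```python
def effective_write_rate(erase_rate_mb_s: float, valid_fraction: float) -> float:
    """Host-visible write rate under a steady write load.

    Assumption: each reclaimed block had `valid_fraction` of its space
    occupied by live data, which must be rewritten elsewhere, so only
    (1 - valid_fraction) of the erased space holds new host data.
    """
    if not 0.0 <= valid_fraction <= 1.0:
        raise ValueError("valid_fraction must be between 0 and 1")
    return erase_rate_mb_s * (1.0 - valid_fraction)


def write_amplification(valid_fraction: float) -> float:
    """Flash bytes written per byte of new host data, same assumption."""
    if not 0.0 <= valid_fraction < 1.0:
        raise ValueError("valid_fraction must be in [0, 1)")
    return 1.0 / (1.0 - valid_fraction)
```

With an illustrative erase rate of 100 MB/s, a victim block that is 90% valid leaves roughly 10 MB/s for new data, while a block that was entirely rewritten (0% valid) leaves the full 100 MB/s. Grouping data that tends to be rewritten together is what pushes `valid_fraction` toward zero.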
One piece of this story I have not figured out yet is why combining writes is a big performance win for the Intel flash disk. I suspect that it actually is not a big advantage, and that this technique was just the easiest thing to implement. On an initially empty drive, it benchmarks well, just as our current next-available allocation policy will perform well initially and steadily worsen as the filesystem ages. I hope somebody will eventually enlighten me about whether there is some other advantage to write combining that I have not yet perceived. Until that happens, I am proceeding on the assumption that Intel's strategy is suboptimal and will soon need to be improved to avoid further criticism of long-term performance characteristics.

Anyway, my tentative conclusion is that flash disks will not in fact completely liberate filesystem designers from issues of spatial organization: Intel will ultimately be forced to redesign their flash write algorithms, and filesystem designers will need to keep thinking about layout issues. In other words, as the world moves to solid state storage, the importance of spatial optimization will not be reduced; only the parameters of the problem will change. For flash, even though seeking is not a problem, we still need to try to maximize the likelihood that physically adjacent data is rewritten at the same time. This assumes that Intel will modify their write algorithm to rely on that, which looks like a pretty safe bet right now. I think we are talking about performance differences approaching an order of magnitude between the best and worst algorithms, making this an important issue that will only get more important.

To be sure, we have more pressing issues than flash performance just now. However, I would like to see our thinking on this subject progress in the background as we work on other things. Anybody who wants to jump in at this point, please do.

Regards,

Daniel