John Caruso | 18 Aug 21:31

Data "corruption" with fastpath caching

Consider the following pseudocode snippet:

     <<...generate file $myfile in some way...>>
     ns_returnfile 200 text/plain $myfile
     ns_unlink $myfile

If this snippet is executed in a tight loop on a Linux system, the chances 
of returning the wrong results are very high due to AOLserver's fastpath 
caching, which requires the following four attributes to be identical to 
consider a new file to be a cache hit (as per the FastReturn function in 
fastpath.c):

1) Same device number
2) Same inode number
3) Same modification time (at one-second resolution)
4) Same size
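
A minimal sketch in C of the four-way comparison listed above.  This is
not the actual FastReturn code; the CacheKey struct and both function
names are hypothetical and only illustrate that file contents are never
part of the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical cache entry: the identity recorded when a file was cached. */
typedef struct {
    dev_t  dev;    /* 1) device number                      */
    ino_t  ino;    /* 2) inode number                       */
    time_t mtime;  /* 3) modification time, whole seconds   */
    off_t  size;   /* 4) size in bytes                      */
} CacheKey;

/* Build a key from a stat result. */
static CacheKey key_from_stat(const struct stat *st)
{
    assert(st != NULL);
    CacheKey k = { st->st_dev, st->st_ino, st->st_mtime, st->st_size };
    return k;
}

/* A "hit" needs all four attributes to match; the data itself is not compared. */
static bool is_cache_hit(const CacheKey *cached, const struct stat *st)
{
    assert(cached != NULL && st != NULL);
    return cached->dev   == st->st_dev
        && cached->ino   == st->st_ino
        && cached->mtime == st->st_mtime
        && cached->size  == st->st_size;
}
```

Two different files that happen to agree on all four fields are
indistinguishable to a check of this shape.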

Assuming $myfile is always on the same filesystem, number 1 is taken care 
of.  Linux reuses inode numbers, so deleting and re-creating $myfile will 
typically yield a file with the same inode, which takes care of number 2.  
So in this example, a file created within a given second that contains the 
same amount of data as a preceding file created within that same second 
will be considered identical, and will be erroneously served from cache.

This isn't just a hypothetical, BTW; a client of mine ran into this issue 
and spent many weeks trying to figure out what was happening before 
tracing it back to AOLserver's fastpath caching.  And the issue had 
existed for many years without being detected.

I'm mainly bringing this up to shine a light on the issue and see what 
other people's views are.  It's potentially a very serious issue given 
that it may silently "corrupt" data, and the fact that fastpath caching is 
enabled by default means that people may run into it without even knowing 
they're exposed to the danger.  The best workaround I can think of (short 
of a checksum, which would defeat the purpose of caching in the first 
place) would be to check that the mtime or ctime of the file is at least 
some threshold number of seconds (e.g. 1 or 2) earlier than the current 
time, and not serve the file from cache if it isn't.  In other words, a 
file would have to be at least X seconds old (which could be a 
configurable value) before it could be served from the cache rather than 
from disk.
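
A minimal sketch of such an age check in C.  The function name and the
min_age parameter are hypothetical (this is not an existing AOLserver
option), and using the newer of mtime and ctime is one possible reading
of "mtime or ctime".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical guard: allow the cache only if the file's most recent
 * timestamp (mtime or ctime, whichever is newer) is at least min_age
 * seconds before `now`.  A timestamp in the future also fails. */
static bool old_enough_to_cache(const struct stat *st, time_t now,
                                time_t min_age)
{
    assert(st != NULL);
    time_t newest = st->st_mtime > st->st_ctime ? st->st_mtime
                                                : st->st_ctime;
    return now - newest >= min_age;
}
```

With min_age of at least 1, a file is only cached once its second has
passed, so a later file reusing the same inode should normally carry a
newer mtime and miss the cache.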

Thoughts?

- John

