Lasse Kliemann | 9 Mar 2008 22:45

Re: refuse to update certain files upon extraction

* Message by -Lasse Kliemann- from Sun 2008-03-09:

> There was no error message (tried -vv, but not -debug). Some time passed 
> since, and I do not have those files anymore. But I found a way to reproduce 
> this, or at least something similar. How to do this is slightly complicated, 
> and I do not understand it fully myself, so I go right to the results. There 
> is a level 0 dump and a level 1 dump, and a file F that changed between the 
> dumps. If the level 1 dump is extracted with 'star -xpU', then the file F is 
> restored to the state it was when the dump was taken. However, if the two 
> dumps are extracted onto each other with 'star -xpU -restore', mysteriously F 
> is made a hardlink to some other file, and hence of course is not restored to 
> its correct state.
> 
> Now, the dumps are taken off from a Linux LVM filesystem snapshot. As you 
> know, I already discovered irregularities with those (but could not yet
> investigate it further satisfactorily due to time constraints). Would you 
> suggest that the above problem is caused by a faulty filesystem snapshot 
> implementation, or might this be a problem in star? You may inspect the 
> tarfiles if you wish, it's just about 2 MB; I uploaded them to 
> 
> http://unix.plastictree.net/tmp/20080309/dump0.tar
> http://unix.plastictree.net/tmp/20080309/dump1.tar
> 
> The file to look out for is `send-backup-test/supervise/pid'. It is made a 
> hardlink to `send-backup-test/log/supervise/pid' upon restore as described 
> above.

I've tracked this thing down further, and now I have a simple way to 
reproduce this, and it doesn't even involve snapshots. Attached is a small 
program rename.c. Assume that this program is available under the command 
name `rename'. 

1. Take some empty filesystem (or an empty directory; this works roughly as 
   well with partial dumps, in which case skip the mount steps below).

   Let i=0.

2. Mount that filesystem read-write.

3. cd into the filesystem and call rename with two random parameters, e.g.:

     rename $RANDOM $RANDOM

4. Leave the filesystem and remount it read-only. Take a level i dump. I 
   always use '-c -acl -link-dirs -xdev -wtardumps'.

5. i=i+1. Goto 2 if i < 3.

Now, I have three dumps: level 0, level 1, and level 2. I restore them onto 
an empty filesystem and compare with the original. Most likely, the result 
will be something like this:

  a: different nlink,data,mtime
  x: different mtime
  sub/: different mtime
  sub/b: different nlink,hardlink,mtime

A closer inspection will most likely reveal that `a' and `sub/b' have the 
same inode.
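
The "closer inspection" can be done with a couple of stat() calls. The 
following is only a minimal sketch, not part of star or of the attached 
program; the helper names print_inode and same_inode are made up for 
illustration. It prints the inode number and link count of a path and 
reports whether two paths refer to the same file (same device and inode).

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: print inode number and link count of path.
 * Returns 0 on success, -1 if stat() fails. */
int print_inode(const char *path) {
	struct stat st;

	if (stat(path, &st) != 0) {
		perror(path);
		return -1;
	}
	printf("%s: inode=%llu nlink=%lu\n", path,
	       (unsigned long long)st.st_ino, (unsigned long)st.st_nlink);
	return 0;
}

/* Hypothetical helper: returns 1 if both paths name the same file
 * (same device and same inode number), 0 if they differ, and -1 if
 * either path cannot be examined. */
int same_inode(const char *p1, const char *p2) {
	struct stat s1, s2;

	if (stat(p1, &s1) != 0 || stat(p2, &s2) != 0)
		return -1;
	return s1.st_dev == s2.st_dev && s1.st_ino == s2.st_ino;
}
```

Called from the top of the restored tree, same_inode("a", "sub/b") 
returning 1 would confirm that the two names have become hardlinks to each 
other, and print_inode on both names would show the raised link count.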

I've tested this on recent Linux with ext2, ext3, xfs, and reiserfs. 

I've also tested it on SunOS 5.10, but only for the case of partial dumps 
(I don't have root access to Solaris machines). One thing is special on 
Solaris: the call to rename must happen directly after taking the dump.  
Inserting a 'sleep 1' after the dump seems to be enough to circumvent the 
problem on Solaris (at least for partial dumps). (By the way, I've found more 
problems with such 'rapid' dumps, which I will describe separately.)

Star version is 1.5a87.

I hope there is a way to track this down to its source and fix it. If you 
need more information, please ask.

Lasse

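/* rename.c: write files under a temporary name and rename() them into place. */
#include <unistd.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

int main(int argc, char** argv) {
	int fd;

	/* Two arguments are required; they become the contents of a and sub/b. */
	if (argc < 3) {
		fprintf(stderr, "usage: %s string1 string2\n", argv[0]);
		return 1;
	}

	/* x is rewritten in place on every run. */
	fd = open("x", O_WRONLY | O_NDELAY | O_TRUNC | O_CREAT, 0644);
	write(fd, "x", 1);
	close(fd);

	/* a is replaced by renaming a freshly written file over it. */
	fd = open("new", O_WRONLY | O_NDELAY | O_TRUNC | O_CREAT, 0644);
	write(fd, argv[1], strlen(argv[1]));
	close(fd);
	rename("new", "a");

	/* sub/b is replaced the same way; mkdir simply fails if sub exists. */
	mkdir("sub", S_IRWXU);
	fd = open("new", O_WRONLY | O_NDELAY | O_TRUNC | O_CREAT, 0644);
	write(fd, argv[2], strlen(argv[2]));
	close(fd);
	rename("new", "sub/b");

	return 0;
}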
_______________________________________________
Star-users mailing list
Star-users <at> lists.berlios.de
https://lists.berlios.de/mailman/listinfo/star-users
