Julian Foad | 8 Feb 16:44 2010

Re: [RFC] MTime functional specifications (v2.0)

Ed, thanks for this. Here are lots of comments and questions. If you
could update the doc wherever it makes sense, think about the hard parts
and come up with proposals for them, and ask any questions wherever it's
not clear, that would be great.

On Fri, 2010-02-05, Ed wrote:
> +OUTLINE OF A
> +
> +FUNCTIONAL (BEHAVIOURAL) SPECIFICATION FOR
> +
> +MTIME PRESERVATION (ISSUE #1256)
> +
> +
> +* Introduction
> +
> +Issue #1256 was entered into the Subversion issue tracker on 24th April 
> +2003 and has been opened since then.  A few patches have been submitted; 
> +but, have not been applied to the source code due to the insufficiency of 
> +said patches in one way or another.  In the meantime, users have made 
> +comments on the issue and several discussions have been put forth within 
> +the mailing lists; all to no conclusion.
> + 
> +This functional specification which will be submitted for RFC on the dev 
> +list:
> + 
> +     1) To generate a practical and user-friendly resolution to the issue.
> +     2) To formulate an implementation plan to allow the developers
> +        a clear-cut specification to work with.
> +        
> +
> +* Current Subversion behaviour
> +
> +The mtime of any file is the time when the file is checked out, updated 
> +or merged. Subversion does not save the mtime upon commit, so a file which 
> +is commited loses its mtime and when checked out, the mtime is set to the 
> +checkout time.  This behaviour is nonconducive to users who want to keep 
> +track of the mtimes of each file and/or directory.
> + 
> +There is a work-around to this issue (taken from IcePic's user response 
> +on IRC). Suppose we have a file called foo.txt and we wish to maintain its 
> +mtime attribute.  We create a pre-commit script which takes the mtime (and 
> +other information) of foo.txt and store it in another file, say mtime.txt 
> +and then commit foo.txt plus mtime.txt.  Then when updating or merging, 
> +another script grabs foo.txt and mtime.txt and then gets and saves the 
> +original mtime for foo.txt as retrieved from mtime.txt.  This is not
> +efficient and quite a tedious way of working around this issue.
> + 
> +* High-level semantics we are trying to achieve:
> +
> +    - Whenever Subversion puts or modifies a file (or directory) in 
> +      the WC, it shall set the node's mtime in the WC to the mtime 
> +      recorded for that node as given by the server. 

No, I don't agree. For example, when "svn merge" or "svn patch" modifies
a file, I think Subversion should not set the node's mtime to the mtime
recorded for that node. I would expect it to use the current time,
probably. Do you agree? Are there other similar situations?

Let's have a "Definitions within this document" section.

Can we define "modifies a file" and "modifies a directory"? How about:

  - Subversion "modifies" a file when Subversion changes the file's text
(i.e. content), even if it is just updating the file's keywords or EOL
style.

  - Subversion "modifies" a directory ... never.

I don't know if those definitions are good; they're just a starting
point for deciding what the right definitions are. They assume that we
don't want to count property modifications as part of the data that
mtime refers to, but another option would be to count property mods
too. For directories, another option would be to include changes to the
list of child entries.

> +    - Whenever Subversion modifies a file (or directory) in the WC, and
> +      there are local changes to that node, such as when updating a file 
> +      that the user has been editing, it shall update the mtime property.

The modification could cause local changes to exist or to stop existing.
To clarify, say "[...], and *after this modification* there are local
changes [...]".

"It shall update the mtime property" ... to what? The current time, I
presume. And I presume it should set the node's mtime to the current
time as well.

For a file with local mods, this rule conflicts with the first rule
above: if Subversion modifies such a file, does it set the node's mtime
to the recorded mtime or set the recorded mtime to the node's mtime?
Please clarify.
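
To make the conflict concrete, here is a rough sketch of the two rules
as I read them. The class and function names are made up for
illustration, and the second rule uses my presumption that "update"
means "set to the current time"; none of this is an agreed design.

```python
from dataclasses import dataclass


@dataclass
class Node:
    node_mtime: float      # mtime of the file on disk
    recorded_mtime: float  # mtime recorded for the node by Subversion


def apply_rule_1(node: Node) -> Node:
    """First rule: the node's mtime is set to the recorded mtime."""
    return Node(node_mtime=node.recorded_mtime,
                recorded_mtime=node.recorded_mtime)


def apply_rule_2(node: Node, now: float) -> Node:
    """Second rule (presumed reading): both values become the current time."""
    return Node(node_mtime=now, recorded_mtime=now)


# A locally modified file that an update has just modified.
before = Node(node_mtime=1000.0, recorded_mtime=900.0)
now = 2000.0

after_rule_1 = apply_rule_1(before)
after_rule_2 = apply_rule_2(before, now)

# Both rules claim this case, yet they leave the node in different states.
print(after_rule_1 != after_rule_2)
```

The spec needs to say which of the two outcomes wins for such a file.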

> +    - Backward compatibility issues: 
> +    
> +       o If the mtime/origmtime property hasn't been set for a node

What's "origmtime" or "origtime"? They are not mentioned until this
point.

> +         (most probably because it was stored in the repository
> +         prior to this feature being implemented), the mtime/origtime
> +         can be set to the current date of modification.

We'll need to specify what cases this rule will apply to, but at first
it's more important to define how it works among clients that have
implemented this support.

> +* Specification of the behaviour in all the cases:
> +
> +Data Storage:
> +
> +  Mtime shall be stored in a versioned property named 'svn:mtime'. 
> +  Any file or directory may have this property. The format of the 
> +  property value is 'YYYY-MM-DDTHH:MM:SS.UUUUUU' of which the time 
> +  is UTC.
> +
> +Behaviour of Each Action:
> +
> +  The behaviour of each svn action that may affect a node in the 
> +  WC, is for x where x is a member of {file, dir}:
> +    
> +          CT  = current time
> +         M(x) = mtime of x
> +         R(x) = recorded mtime 'svn:mtime'
> +         
> +  All initial values prior to the actions are set to NULL and it is
> +  assumed that all the following functions are done with all the
> +  necessary checking, such that as an example, after doing a
> +  'svn add',  svn will complain if the user repeats the command.

Talking of "svn add", can you add it to this list? And "copy", "update",
and "merge". According to the "High Level Semantics" section, all of
these commands (and more, probably) should have some effect.

> +       - import
> +          Let f_import(x) be the following process:
> +          	 1a) if x is already versioned, exit.
> +          	 1b) Otherwise, get M(x)
> +              2) Set R(x) = M(x)
> +       
> +          o if x = file, then f_import(x).
> +          o if x = dir, then recursively f_import(x)

OK.

> +       - checkout
> +          Let f_checkout(x) be the following process:
> +              1) Get R(x)
> +              2) Set M(x) = R(x)
> +              
> +          o if x = file, then f_checkout(x).
> +          o if x = dir, then recursively f_checkout(x)

OK.

> +       - commit
> +          Let f_commit(x) be the following process:
> +          	 1) Get M(x)
> +          	 2) Set R(x) = M(x)
> +          	   
> +          o if x = file, then f_commit(x).
> +          o if x = dir, then recursively f_commit(x)

Potentially OK, but it doesn't seem to match up very well with how I
understood the "High Level Semantics" section.

> +       - rename
> +          [Feature not yet available.  Equivalent to
> +               a) svn cp x y
> +               b) svn del x
> +           but there should be a historical link between
> +           y and x.  See issue #898.]
> +          i.e. svn rename x y
> +          Let f_rename(x,y) be the following process:
> +          
> +              1) Get R(x).
> +              2) Set R(y) = R(x)

OK.
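
For reference, the four processes above can be sketched in a few lines
of Python. This is only a model of the M(x)/R(x) bookkeeping, not a
proposal for the implementation: the "recorded" dict is a hypothetical
stand-in for the 'svn:mtime' property, "already versioned" is
approximated as "already has an entry in that dict", and the recursion
into directories is left out.

```python
import os
import tempfile

# Hypothetical stand-in for the 'svn:mtime' property store: path -> R(x).
recorded = {}


def get_m(path):
    """M(x): the mtime the operating system reports for x, in nanoseconds."""
    return os.stat(path).st_mtime_ns


def set_m(path, mtime_ns):
    """Set M(x), leaving the access time as it was."""
    atime_ns = os.stat(path).st_atime_ns
    os.utime(path, ns=(atime_ns, mtime_ns))


def f_import(path):
    if path in recorded:          # 1a) already versioned (assumed test): exit
        return
    recorded[path] = get_m(path)  # 1b) get M(x); 2) set R(x) = M(x)


def f_checkout(path):
    set_m(path, recorded[path])   # 1) get R(x); 2) set M(x) = R(x)


def f_commit(path):
    recorded[path] = get_m(path)  # 1) get M(x); 2) set R(x) = M(x)


def f_rename(old, new):
    recorded[new] = recorded[old]  # 1) get R(x); 2) set R(y) = R(x)


with tempfile.TemporaryDirectory() as wc:
    foo = os.path.join(wc, "foo.txt")
    with open(foo, "w") as f:
        f.write("hello\n")
    set_m(foo, 1_265_000_000 * 10**9)  # pretend the file was last edited then
    f_import(foo)                      # R(foo) now holds that mtime
    set_m(foo, 1_265_100_000 * 10**9)  # something later touches the file
    f_checkout(foo)                    # M(foo) is restored from R(foo)
    restored = get_m(foo)
```

Written out like this, it is easy to see that "commit" always overwrites
R(x) with whatever M(x) happens to be at commit time.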

> +                 
> +  If the 'svn:mtime' property does not exist for a file or
> +  directory, it means either it (file or directory) is not 
> +  versioned, or it existed in the repository and WC before 
> +  this feature was created.  Since the original information
> +  pertaining to the file/directory is lost, the options are
> +  either to store the current mtime as the original mtime
> +  or completely ignore the 'svn:mtime' property for this
> +  file or directory.  This functional specification takes 
> +  the tact of setting the 'svn:mtime' property to the current 

("takes the tack")

> +  mtime as it will give the user at least a starting point 
> +  to which to make their statistical/informational mtime references.

I'm not sure whether we'll want it to do that by default. We can come
back to this after we've got the basic operation solid.

> +* Controlling the behaviour:
> +
> +  Since 'svn:mtime' is a property, its behaviour can be controlled
> +  with any command that deals with properties, i.e. svn pset.

What we want to know about controlling the behaviour is how we can
command Subversion to start using svn:mtime properties or to stop using
them, and at what granularity (per user, per repository, per directory,
per file?). For example, a software developer would often want to be
able to specify that the properties shall not be used (at least for
updates), whereas a documentation person using the same repository would
probably want them enabled.

Look at the way that optional features such as "svn:ignore", auto-props
and use-commit-times are controlled. There is a mixture of client-config
options, per-directory/per-file properties, and command-line switches.

> +  For instance, the user wanting to set the 'svn:mtime' for an 
> +  already versioned file which has none of the 'svn:mtime' properties 
> +  set. 
> +         
> +         svn pset svn:mtime '11-11-2009T12:22:00.000000' foo.txt
> +
> +  "How does it interact with the "use-commit-times" option?"     
> +
> +    With this option, use-commit-times will override all Subversion   
> +  conditions as mentioned above.  

Sorry, I don't understand that sentence. As far as I can see,
use-commit-times was not mentioned above. Please give more details
(maybe in the detailed description section).
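
A side note on the "svn pset" example: the value shown there is written
with the year last, which does not match the 'YYYY-MM-DDTHH:MM:SS.UUUUUU'
format given under "Data Storage". A small sketch of formatting and
checking a value against that format follows; the helper names are
hypothetical, and I am assuming the format is meant literally, with six
digits of microseconds and the time in UTC.

```python
from datetime import datetime, timezone

# strftime/strptime pattern for 'YYYY-MM-DDTHH:MM:SS.UUUUUU'
SVN_MTIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%f"


def format_mtime(moment):
    """Render an aware datetime as a property value, converted to UTC."""
    return moment.astimezone(timezone.utc).strftime(SVN_MTIME_FORMAT)


def parse_mtime(value):
    """Parse a property value; raises ValueError if it is malformed."""
    parsed = datetime.strptime(value, SVN_MTIME_FORMAT)
    return parsed.replace(tzinfo=timezone.utc)


example = format_mtime(datetime(2009, 11, 11, 12, 22, 0, tzinfo=timezone.utc))
print(example)

try:
    parse_mtime("11-11-2009T12:22:00.000000")  # the value from the example
    example_value_is_valid = True
except ValueError:
    example_value_is_valid = False
print(example_value_is_valid)
```

Whatever the client ends up accepting, the doc should use a value in the
documented format in its own examples.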

> +* Concerns
> +
> +  From the desc4 of Issue #1256,
> +  
> +   " 1) Preserving file timestamps on import and add, so when
> +       the file is checked out again, the working file gets the
> +       same timestamp as when it was imported/added. "
> +       
> +  When it means 'added', the author takes the position of when
> +  the file/directory is 'committed'.  The rationale is as
> +  follows.  The question is whether the user wishes to keep the 
> +  mtime of the add or is the commit 'mtime' sufficient enough?  
> +  This stems from the fact that the scheduled 'add' datetime 
> +  is not the same as the actual commit datetime.  A developer 
> +  can schedule an add and one day later, commit the actual changes. 
> +  Should there be another option for allowing the user to use the
> +  add 'mtime' and not the commit 'mtime'?  If the add 'mtime'
> +  is requested, where is this information stored?  It does
> +  require a temporary storage.  This convolutes the whole
> +  issue and since only one 'mtime' was requested, it would
> +  simply matters by just using the commit 'mtime'.

It sounds like you are saying that the combination of "svn add FILE"
followed by "svn commit FILE" should lead to the file's "mtime" property
recording the time of the commit.

When an unversioned file is added, and thus becomes versioned, its mtime
already exists (in the operating system) and I believe that is the time
we want to record as the file's "mtime" property. I don't understand why
we would want to record the commit time instead.

Note: The commit time is already recorded for every commit, in the
"svn:date" revision property.

- Julian

