Paul Anderson | 1 Oct 08:42

Re: Config-mgmt Digest, Vol 31, Issue 6


On 30 Sep 2008, at 20:00, Lamont Granquist <lamont@...>  
wrote:

> if you want to measure a useful statistic, measure the complexity  
> of the configuration state when it comes to deltas between boxes.
> as an example:
>
> if you have 30,000 servers with the same sshd_config file that is  
> highly compressible information and is a very low amount of state.
>
> if you have 30,000 servers with the same sshd_config file except  
> one group of boxes which have a different sshd_config file then  
> that is only slightly less compressible, but it roughly doubles the  
> compressed state (considering the sshd_config files to be opaque).
>
> if you have 30,000 servers with most having the same sshd_config  
> file except you've got 10 unique groups but those groups all share  
> the same slightly different sshd_config file, then that is even  
> less compressible.
>
> if you have 30,000 servers and have a standardized default, but 10
> different groups have 10 differently unique sshd_config files then  
> you've degraded your compressibility even more.
>
> if you have 30,000 servers and have 10 different sshd_configs, but  
> have 6,000 different groups of servers and have no sane default and  
> each server group gets one of the 10 different config files, then  
> you've got a huge amount of config state (equivalent to 6,000  
> individual servers with each server having one of 10 different  
> config files, all maintained by hand).
>
> then you can measure commits as to if they increase or decrease the  
> complexity of the config state.

I think this is going in the right direction, but I still think that
dealing with "files" as opaque objects is misleading - a "file" is
rather an arbitrary unit - it is just a convenient way of bundling
together some information, which may or may not be conceptually
related in configuration terms.
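
To make the opaque-file measure concrete, here is a minimal sketch of
how it might be computed. The function name, host names and file
contents are all invented for illustration; nothing here comes from a
real tool. It counts distinct file contents and the bytes needed to
keep one copy of each:

```python
import hashlib


def opaque_state(files_by_host):
    """Treat each file as an opaque blob.

    Returns (number of distinct variants, bytes needed to store
    one copy of every distinct variant).
    """
    variants = {}
    for content in files_by_host.values():
        # Identical contents hash to the same key, so each variant is kept once.
        variants[hashlib.sha256(content).hexdigest()] = len(content)
    return len(variants), sum(variants.values())


# Hypothetical file contents: the two variants differ in a single word.
base = b"PermitRootLogin no\nPasswordAuthentication no\n"
alt = b"PermitRootLogin no\nPasswordAuthentication yes\n"

# 30,000 hosts sharing one file.
uniform = {f"host{i}": base for i in range(30000)}

# The same hosts, except one group of 50 gets the other file.
one_group = dict(uniform)
for i in range(50):
    one_group[f"host{i}"] = alt

print(opaque_state(uniform))
print(opaque_state(one_group))
```

The second case stores roughly twice as many bytes as the first, even
though the two variants differ by one setting. That is the part an
opaque measure cannot see.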

Consider this example ...

If your 30,000 servers included 100 arbitrarily different sendmail.cf
files, then there could be very big (complex) differences between the
configuration states. You might say that the configuration of the
mail subsystem was quite complex (in fact, you don't actually *know*
how complex it is without studying the contents of the files).

However, if all those machines shared a common sendmail.cf "template"  
which was guaranteed to vary only in the name of the mail relay, then  
this would seem to be conceptually much simpler.

If the name of the mail relay was automatically computed by the
configuration system, by some simple algorithm (depending on, say,
the network segment), rather than being entered by hand, I would
consider the configuration less complex still ...
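
As a rough sketch of that last case: the template text, the /24
segment size, the relay naming scheme and the example.org domain below
are all assumptions made up for the example, not anything a real site
or configuration system uses.

```python
import ipaddress
from string import Template

# Hypothetical shared template; a real sendmail.cf is far longer.
# Only the relay name varies between hosts.
SENDMAIL_TEMPLATE = Template("# shared template (illustrative only)\nDS${relay}\n")


def relay_for(host_ip, prefix_len=24):
    """Derive the relay name from the host's network segment (assumed /24)."""
    net = ipaddress.ip_network(f"{host_ip}/{prefix_len}", strict=False)
    segment = str(net.network_address).replace(".", "-")
    return f"relay-{segment}.example.org"


def render(host_ip):
    """Fill the shared template with the computed relay for this host."""
    return SENDMAIL_TEMPLATE.substitute(relay=relay_for(host_ip))


hosts = ["10.1.2.17", "10.1.2.99", "10.1.3.5"]
rendered = {ip: render(ip) for ip in hosts}

# Distinct files on disk: one per network segment.
distinct_files = len(set(rendered.values()))
print(distinct_files)
```

Counted as opaque files, this gives one variant per segment. Counted
as configuration, it is one template and one rule, with nothing
entered by hand per host.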

I don't think you can get away from the fact that we are interested
(usually, I think) in the *meaning* of the configuration, rather than
the bits on the disk. Big differences in the meaning can often lead
to quite small changes on the disk (a different LDAP server, or
automount map?) and vice versa ...

   Paul

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
