1 Oct 08:42
Re: Config-mgmt Digest, Vol 31, Issue 6
On 30 Sep 2008, at 20:00, Lamont Granquist <lamont@...> wrote:

> if you want to measure a useful statistic, measure the complexity
> of the configuration state when it comes to deltas between boxes.
> as an example:
>
> if you have 30,000 servers with the same sshd_config file that is
> highly compressible information and is a very low amount of state.
>
> if you have 30,000 servers with the same sshd_config file except
> one group of boxes which have a different sshd_config file then
> that is only slightly less compressible, but it roughly doubles the
> compressed state (considering the sshd_config files to be opaque).
>
> if you have 30,000 servers with most having the same sshd_config
> file except you've got 10 unique groups but those groups all share
> the same slightly different sshd_config file, then that is even
> less compressible.
>
> if you have 30,000 servers and have a standardized default, but 10
> different groups have 10 differently unique sshd_config files then
> you've degraded your compressibility even more.
>
> if you have 30,000 servers and have 10 different sshd_configs, but
> have 6,000 different groups of servers and have no sane default and
> each server group gets one of the 10 different config files, then
> you've got a huge amount of config state (equivalent to 6,000
> individual servers with each server having one of 10 different
> config files, all maintained by hand).
>
> then you can measure commits as to if they increase or decrease the
> complexity of the config state.

I think this is going in the right direction, but treating "files" as opaque objects is still misleading. A "file" is a rather arbitrary unit: it is just a convenient way of bundling together some information, which may or may not be conceptually related in configuration terms. Consider this example ...
If your 30,000 servers included 100 arbitrarily different sendmail.cf files, then there could be very big (complex) differences between the configuration states. You might say that the configuration of the mail subsystem was quite complex (in fact, you don't actually *know* how complex it is without studying the contents of the files).

However, if all those machines shared a common sendmail.cf "template" which was guaranteed to vary only in the name of the mail relay, then this would seem to be conceptually much simpler. If the name of the mail relay was automatically computed by the configuration system, by some simple algorithm (depending on, say, the network segment), rather than being entered by hand, I would consider the configuration simpler still.

I don't think you can get away from the fact that we are interested (usually, I think) in the *meaning* of the configuration, rather than the bits on the disk. Big differences in the meaning can often lead to quite small changes on the disk (a different LDAP server, or automount map?) and vice versa.

Paul

-- 
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
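
To make the sendmail.cf example concrete, here is a rough Python sketch of the two ways of counting. Everything in it is an assumption for illustration: the two-line template is only a stand-in for a real sendmail.cf, and `relay_for`, the `example.org` host names and the 100 network segments are hypothetical. It only shows that an opaque, file-level count can report 100 distinct objects where the meaning is one template plus one rule.

```python
import hashlib
from string import Template

# Hypothetical stand-in for a shared sendmail.cf template. The only part
# that is allowed to vary between machines is the mail relay name.
TEMPLATE = Template("# settings common to every host\nDS$relay\n")


def relay_for(segment: int) -> str:
    """Hypothetical rule: compute the relay name from the network segment."""
    return f"relay{segment}.example.org"


def render(segment: int) -> str:
    """Produce the file contents that would land on disk for one segment."""
    return TEMPLATE.substitute(relay=relay_for(segment))


def count_opaque(files: list[str]) -> int:
    """File-level view: number of distinct files, compared only by hash."""
    return len({hashlib.sha256(f.encode()).hexdigest() for f in files})


# Assumed layout: 100 network segments, one rendered file per segment.
files = [render(segment) for segment in range(100)]
distinct_files = count_opaque(files)

# Meaning-level view: what a person actually has to maintain.
semantic_state = {"templates": 1, "rules": 1, "hand_entered_values": 0}

print(distinct_files)   # 100 distinct files on disk
print(semantic_state)   # one template, one rule, nothing typed in by hand
```

Counted as opaque files, this looks exactly like 100 hand-maintained sendmail.cf variants; counted by meaning, it is a single template and a single rule.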