The aim of this talk is to examine a couple of basic examples of everyday work that a System Administrator has to carry out. The hope is that this will illuminate the shortcomings of many, typically home-brew, solutions based on what I am referring to as "hacky scripting". The "hacky scripting" bit is deliberately intended to be mildly provocative, by the way! After looking at the examples we will discuss the LCFG system configuration framework and how it can help the system administrator avoid problems and tedium.

Above all I strongly believe that the life of a System Administrator should never be dull!

Joke with appropriate screenshots: Life should be full of watching soaps on the BBC iPlayer, reading xkcd, drinking coffee and playing xbill!
(Or if you are a Googler, it's got to be pool, table football and Mario Kart!)

Jokes aside, a good System Administrator will always develop solutions that allow them to avoid doing tediously repetitive tasks time and time again. They will also develop solutions which avoid common errors and reduce the likelihood of the brown stuff hitting the proverbial. Any person or team who has been in the business for a few years will have an arsenal of scripts ready for any job. Typically these will be written in shell, Perl, Python or Ruby, depending on their age and heritage.

Example - Managing /etc/resolv.conf

As a first example of why hacking a few scripts together is rarely enough to achieve all the outcomes a System Administrator wants, I want to look at managing the resolver configuration file, /etc/resolv.conf. I suspect/hope all Unix system administrators have at least encountered this at some point. It's a nice simple text file, which has even been designed to be human-readable, with just five options according to the manpage on my SL5 machine. Here's a brief glimpse in case you're none the wiser:
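
For illustration, a typical file might look like the following. The search domain and the addresses are placeholders taken from the documentation ranges, not values from the talk or from any real site.

```
search example.org
nameserver 192.0.2.1
nameserver 192.0.2.2
```

Each nameserver line names one DNS server, and the resolver tries them in the order listed.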


We can imagine a situation in which the SA who manages the local DNS service has to change the IP address of a DNS server. The editing of the resolv.conf file which must accompany this change can then be thought of as two simple operations:

  1. Remove old nameserver entry
  2. Add new nameserver entry

Manual Approach

Initially this might look like such a trivial problem that anything beyond simply editing the file seems a total waste of time. If you don't already have a suitable script in your toolbox, hacking one together might take quite a bit longer than manually editing the file on quite a few boxes. For the completely manual approach I think it stacks up a bit like this:

  • One machine == easy, no point doing it any other way
  • Several machines == boring
  • Many machines == utterly tedious

I am deliberately considering this in terms of how an SA is likely to feel. I know I have a short attention span when it comes to jobs like this; I'd much rather be coding up a smart solution. The duller a task is, the more likely I am to create some tools to do the job for me. A great idea has come out of the Pugs part of the Perl 6 project: "Optimize for fun". That's often how I think SA life should work.

For this example I am going to consider a set of machines comparable to that in the School of Informatics. We probably have about 1000 managed machines, which is a very long way into the "tedious" category.

Joke slide: "I know Shell/Perl/Python/Ruby* and I'm not afraid to use it!!"

* Delete as appropriate

A "hacky" solution


  1. Create a list of all your machines
  2. Write a new config file
  3. Write a short script based around scp

I am going to assume that the SA has created a simple way in which the user can become root that does not require a password. It doesn't matter how good your script is: if you still need to enter the root password 1000 times, it's still eyewateringly dull. A possible solution:


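#!/bin/sh
set -e

# Assumption: the listing lost its file name; resolv.conf here is illustrative.
conf=resolv.conf

for machine in $(cat machines.txt); do
    echo "$machine"
    scp "$conf" "$machine:/tmp"
    ssh "$machine" "nsu -c 'cp /tmp/$conf /tmp/foo'"
done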


  • What about machines which are down or uncontactable?
    • Need to timeout the scp command
    • Need to keep a list so we can go back later
  • What about machines for which you are not directly responsible?
    • Not easy to delegate parts of root access
  • What about new machines?
    • Need to guarantee that new machines are configured appropriately
    • Preferably automatically
    • Need to avoid people not reading updated howto pages and doing it the "usual way"
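
The first of these questions can at least be patched over in the script itself. The sketch below is one possible shape, not a recommendation: the function names, the PUSH_CMD override and the file names are all invented for illustration. It relies on the OpenSSH options ConnectTimeout and BatchMode so that an unreachable host fails quickly instead of hanging or prompting, and it records every failure for a later retry. Note that it drops "set -e", since a single dead machine should not abort the whole run.

```shell
#!/bin/sh
# Sketch only: push_one, push_all and PUSH_CMD are hypothetical names.

# push_one HOST FILE
# Copy FILE to HOST:/tmp/, giving up if no connection is made in 10 seconds.
# PUSH_CMD may be set to another command (e.g. "true") for a dry run.
push_one() {
    ${PUSH_CMD:-scp -q -o BatchMode=yes -o ConnectTimeout=10} "$2" "$1:/tmp/"
}

# push_all LIST FILE FAILED
# Try every host named in LIST (one per line) and write the names of the
# hosts that could not be reached to FAILED.
push_all() {
    : > "$3"
    while read -r machine; do
        [ -n "$machine" ] || continue          # skip blank lines
        # Redirect stdin so the copy command cannot swallow the host list.
        if ! push_one "$machine" "$2" </dev/null; then
            echo "$machine" >> "$3"
        fi
    done < "$1"
}
```

A run might look like `push_all machines.txt resolv.conf failed.txt`, after which failed.txt can be fed back in as the machine list once the missing hosts return. The questions about delegation and new machines are not something a loop like this can answer.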

Extending the Example

What happens if a new policy is introduced on how the contents of the resolv.conf file are handled? The ordering of the nameserver options is important and controls which DNS server is contacted first by the resolver. It is often considered a good idea to randomize the order of the nameservers across the set of machines so that if a DNS server goes AWOL then not all machines will be affected.

This introduces a new level of complexity to the problem, but with the power of your favourite scripting language it is not insurmountable.

You could, for instance, generate a list of all possible files (2 servers gives 2 files, 3 servers gives 6) and have a script randomly select one to be pushed out by scp. You still have the various management problems related to machines being turned off or uncontactable though.
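
A simpler variant, sketched below, avoids pre-generating the files at all by deriving the order from the host name. This is a swapped-in technique rather than the random selection described above: it rotates the server list by an offset computed from a checksum of the name, so it produces only as many orderings as there are servers (3 for 3 servers, not all 6) and each host always gets the same one. The function name and the addresses are placeholders.

```shell
#!/bin/sh
# Sketch only: rotate_servers is a hypothetical helper and the addresses
# are documentation-range placeholders.

# rotate_servers HOST SERVER...
# Print one "nameserver" line per SERVER, starting at an offset derived
# from a checksum of HOST and wrapping around the list.
rotate_servers() {
    host=$1
    shift
    n=$#
    [ "$n" -gt 0 ] || return 1
    # cksum prints "<crc> <size>"; keep only the CRC.
    crc=$(printf '%s' "$host" | cksum | cut -d' ' -f1)
    offset=$((crc % n))
    i=0
    while [ "$i" -lt "$n" ]; do
        idx=$(( (offset + i) % n + 1 ))
        eval "server=\${$idx}"               # pick the idx-th argument
        echo "nameserver $server"
        i=$((i + 1))
    done
}

rotate_servers "$(uname -n)" 192.0.2.1 192.0.2.2 192.0.2.3
```

Because the result depends only on the host name, re-running the script never reshuffles a machine, which makes the output easier to audit than a truly random choice. It does nothing, however, for the delivery problems.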


  • Imagine having to generate files which contain information related to each individual host
  • Do you have a script for every managed file?
  • How do you share your scripts?
  • Documentation?

Managing Daemons

Handling packages

-- Main.squinney - 12 Mar 2008

Topic revision: r1 - 2008-03-12 - squinney