Shell programming

Bourne shell

The Bourne shell was created early in Unix's history. It provided a command-line environment that is still heavily used throughout Unix and Linux. It also gave rise to the Obfuscated C contest.

Origins of Obfuscated C contest

From http://www.ioccc.org/faq.html

Q: How did the IOCCC [Obfuscated C competition] get started?
A: One day (23 March 1984 to be exact), back Larry Bassel and I (Landon Curt Noll) were working for National Semiconductor's Genix porting group, we were both in our offices trying to fix some very broken code. Larry had been trying to fix a bug in the classic Bourne shell (C code #defined to death to sort of look like Algol) and I had been working on the finger program from early BSD (a bug ridden finger implementation to be sure). We happened to both wander (at the same time) out to the hallway in Building 7C to clear our heads. We began to compare notes: ''You won't believe the code I am trying to fix''. And: ''Well you cannot imagine the brain damage level of the code I'm trying to fix''. As well as: ''It more than bad code, the author really had to try to make it this bad!''.

"This program is a steganography application for embedding an image or text into another image as well as extracting the embedded image or text back. The program stores the embedded image or text in the low bits of the RGB values."
Also
donut.c

The Bourne shell has been rewritten in several different versions: zsh, ksh, ash, ... We will use the Bourne Again Shell, bash.

Shell scripts

A shell script is a file containing a sequence of commands. It doesn't need a special filename extension. The first line says which shell to run:

#!/bin/bash

echo Welcome to a shell script
cd
echo Home directory has files
ls

If the script is called e.g. my_script, invoke it under a new shell process by

bash my_script
or change its mode to executable and run it directly
# change its mode once
chmod a+x my_script
./my_script
./my_script
(The ./ is needed when the current directory is not in your PATH.)

To run the script within the current shell, use '.':

. my_script
(This is usually used to include shell functions into a shell script - see later)
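
The difference between running a script with bash and running it with '.' shows up as soon as the script sets a variable. The following is a minimal sketch: the temporary file created with mktemp and the variable name greeting are made up for illustration.

```shell
# Hypothetical demo: a one-line script in a temporary file that sets a variable.
unset greeting
demo=$(mktemp)
echo 'greeting="set by the script"' > "$demo"

# Run in a new shell process: the variable is set there, not in our shell.
bash "$demo"
after_bash=${greeting:-unset}

# Run within the current shell: the variable is now set here.
. "$demo"
after_dot=${greeting:-unset}

echo "after bash: $after_bash"
echo "after '.':  $after_dot"
rm -f "$demo"
```

The first echo should report unset and the second the value assigned inside the script.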

A script to see current date, time, username, and current directory:

#!/bin/bash

date
whoami
pwd

Common commands used in shell scripts

ls
rm
mv
cp
cd
pwd
mkdir
rmdir
echo
grep
sed
basename
wc
test

Variables

Every programming language has some notion of variables to store and retrieve values. They can be typed or untyped. They may or may not need to be declared. They may or may not need to be initialised. You may or may not be allowed to change the type stored.

Variables in the Unix shells

Setting and accessing

To assign a shell variable use '=' with no spaces on either side

x=1   # ok
xyz="some text" # ok
x =1  # no - runs a command called x with the argument =1
x= 1  # no - runs a command called 1 with x set to the empty string

To access the value of a variable, prefix it with a '$'

x=1
y=$x
echo Values of x and y are $x, $y

You can see all the variables defined in the current shell by

set 
Many of these variables are set by bash when it runs, such as BASH_ARGC, HISTFILE, HOSTNAME, MACHTYPE, PS1, etc. (see "man bash").

Values of variables by default are not seen in child shells (new shell processes started from the current one)

x=1
echo $x
# should have printed '1'
bash
echo $x
# should have printed a blank line

Environment

The environment contains all the "global" variables. These are shown by the command

env
Exported variables are visible in child shells and in every other program started from the shell.

You can add to the environment by the command export

x=1
export x
export y=2

A one-off mechanism can add to the environment for one command:

x="local export" bash
echo $x
# should have printed local export
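
The three cases (ordinary variable, exported variable, one-off export) can be compared without starting an interactive child shell. This is a sketch only: the variable names a, b and c are made up, and bash -c is used to start a child shell that runs a single command.

```shell
unset a b c
a=1                 # ordinary shell variable
b=2; export b       # exported into the environment

# bash -c starts a child shell; the single quotes stop our own shell
# from expanding the variables before the child sees them.
child_a=$(bash -c 'echo "${a:-unset}"')
child_b=$(bash -c 'echo "${b:-unset}"')
child_c=$(c=3 bash -c 'echo "${c:-unset}"')   # one-off export for this command only

echo "child sees a=$child_a b=$child_b c=$child_c"
echo "parent still has c=${c:-unset}"
```

The child should see only b and c, and c should not exist in the parent afterwards.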

You can print all current environment variables with the command env. You see fewer variables than with set, as env only includes the variables in the environment, and not the non-environment variables set by the shell (or yourself).

Common variables

Major variables are

Quoting of variables

Motivation: you may set

my_music="My Music"
and then want to change to that directory
cd $my_music
breaks as it expands to cd My Music: the shell passes two separate words, My and Music, rather than the single directory name. You need to quote it:
cd "$my_music"

Interpretation of a single character is turned off by prefixing it with a backslash `\' as in

echo the amount is \$20
To echo a `\' itself, use \\.

Enclosing something in single quotes '...' turns off all interpretation, including interpretation of $, * and \.

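Enclosing something in double quotes "..." still allows $variable substitution (and command substitution), but turns off filename expansion with * and the splitting of the value into separate words.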

echo The cost is $20
echo The cost is \$20
echo "The cost is $20"
echo 'The cost is $20'
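
What the four echo commands print depends on the value of $2, because $20 is read as $2 followed by a 0. The sketch below sets two made-up positional parameters with set -- so that $2 has a known value, then captures each result.

```shell
# Hypothetical positional parameters, so that $2 has a known value.
set -- first second

plain=$(echo The cost is $20)       # $2 followed by 0
escaped=$(echo The cost is \$20)    # the backslash protects the $
double=$(echo "The cost is $20")    # $2 is still expanded inside "..."
single=$(echo 'The cost is $20')    # nothing is expanded inside '...'

echo "$plain"      # The cost is second0
echo "$escaped"    # The cost is $20
echo "$double"     # The cost is second0
echo "$single"     # The cost is $20
```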

Arcane stuff

Valid characters in variable names are alphanumerics and '_'. An expression like $x.txt is unambiguously the value of x followed by ".txt". Sometimes you might need other text and need to disambiguate the variable from the text: ${x}_txt

If you have an un-assigned variable then its value is "". If you want a default value, then use ${x:-default}

unset x
echo ${x:-bummer}

If you want to replace some text in the value of a variable (e.g. a file extension) then substitute it

x=abcd.txt
echo ${x/.txt/.doc}
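
The expansions in this section can be checked side by side. This is an illustrative sketch with made-up variable names; the last line uses ${file%.txt}, a suffix-stripping form that is not used in the text above but also works in shells other than bash.

```shell
x=report
unset x_txt colour

wrong="$x_txt"              # the shell looks for a variable named x_txt: empty
right="${x}_txt"            # the braces end the name after x: report_txt

shade="${colour:-blue}"     # blue, because colour is unset
still="${colour:-unset}"    # :- does not assign, so colour is still unset

file=abcd.txt
renamed="${file%.txt}.doc"  # % strips a trailing match: abcd.doc

echo "wrong='$wrong' right='$right' shade='$shade' renamed='$renamed'"
```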

I/O redirection

Example

  1. ls > tmp
     saves ls output in tmp
  2. ls -l >> tmp
     appends a long listing to this
  3. ls | wc -l
     counts the number of files in the current directory
  4. ls -t | head -1
     shows the newest file in the current directory (ls -t lists the most recently modified file first)
  5. ls -t | tail -1
     shows the oldest file in the current directory
  6. ls -S | sed 1q
     shows the largest file in the current directory
  7. ls -l | sort -n -k 5 | sed 's/.* //' | tail -1
     shows the largest file in the current directory
  8. man cp | lpr
     sends the man page for cp to the printer
  9. tr '\040\011' '\012\012' < file |
     sort | uniq
     produces an alphabetic listing of the distinct words in `file' (assuming words are separated by spaces or tabs).
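
Several of these pipelines can be tried safely in a scratch directory. The sketch below is only an illustration: the directory comes from mktemp -d, the file names are made up, and touch -t is used to give the files known modification times.

```shell
# Hypothetical scratch directory with two files of known size and age.
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'a fairly long line of text\n' > big.txt
printf 'x\n' > small.txt
touch -t 202001010000 big.txt      # modified 1 Jan 2020 (older)
touch -t 202101010000 small.txt    # modified 1 Jan 2021 (newer)

count=$(ls | wc -l)            # number of files
newest=$(ls -t | head -1)      # most recently modified
oldest=$(ls -t | tail -1)      # least recently modified
largest=$(ls -S | sed 1q)      # biggest file

echo "count=$count newest=$newest oldest=$oldest largest=$largest"
cd "$OLDPWD" && rm -r "$dir"
```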

Arcane redirections

Each Unix process normally opens three "files": standard input (stdin), standard output (stdout) and error output (stderr). Unix keeps track of open files through a table of "file descriptors" which is an array numbered from zero upwards.

File descriptor 0 is stdin.
File descriptor 1 is stdout.
File descriptor 2 is stderr.

You can control the output of stdout and stderr separately by using the file descriptor

rm file 2> /tmp/errors
saves error messages in /tmp/errors
rm file 2> /dev/null
discards error messages

This is very obscure: if you want stdout and stderr to be saved to the same file, redirect stdout first and then point stderr at it with 2>&1 (the order matters):

ls > tmp 2>&1

If you have a pipeline, it links stdout of one process to stdin of the next one. stderr goes to the console. If you want stderr to go down the same pipeline, use the same trick

ls 2>&1 | ...
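
To see where each stream ends up, it helps to have a command that writes to both. The sketch below uses a made-up shell function called talk for that purpose (functions are only used here as a convenience) and temporary files from mktemp.

```shell
# Hypothetical helper that writes one line to stdout and one to stderr.
talk() {
    echo "normal output"
    echo "error output" >&2
}

out=$(mktemp); err=$(mktemp); both=$(mktemp)

talk > "$out" 2> "$err"      # the two streams go to separate files
talk > "$both" 2>&1          # both streams go to the same file
piped=$(talk 2>&1 | wc -l)   # both streams go down the pipeline

out_text=$(cat "$out")
err_text=$(cat "$err")
both_lines=$(wc -l < "$both")

echo "stdout file: $out_text"
echo "stderr file: $err_text"
echo "lines in combined file: $both_lines, lines through pipe: $piped"
rm -f "$out" "$err" "$both"
```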

More arcane stuff: here documents

Sometimes you know what the input to a program should be, and want to hard-code it into your shell script.

wc <<END
hard-coded text
to be the input to
the wc command
END
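
A here document can feed any command that reads standard input, including a loop. As a small sketch with made-up numbers, the following adds up three hard-coded values using while and read, both of which are described later in these notes.

```shell
total=0
while read -r n
do
    total=$((total + n))     # arithmetic expansion; see the Arithmetic section
done <<END
10
20
30
END

echo "total is $total"       # total is 60
```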

Command substitution (grave command)

There are many occasions when you want to execute a command and keep the result around. For example, you may want to keep the list of files in the current directory stored in a variable. Command substitution, written with grave accents (backquotes) `...`, runs the command between the accents and replaces the whole expression with that command's output.

Store today's date in a variable

TODAY=`date`
echo $TODAY

How big is the binary for bash?

ls -l `which bash`
ls -l `which bash` | cut -d ' ' -f 5

Create every file listed in the file list_of_files

touch `cat list_of_files`

And remove them all

rm `cat list_of_files`

An alternative notation is "$(...)"

rm $(cat list_of_files)
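
The touch and rm examples can be tried without cluttering the current directory. This is a sketch under assumptions: the scratch directory comes from mktemp -d, the three file names are made up, and none of the paths contains spaces (an unquoted substitution is split into words at whitespace).

```shell
dir=$(mktemp -d)                 # hypothetical scratch directory
list="$dir/list_of_files"
printf '%s\n' "$dir/one" "$dir/two" "$dir/three" > "$list"

touch $(cat "$list")             # unquoted on purpose: one argument per word
created=$(ls "$dir" | wc -l)     # the three new files plus the list itself

rm $(cat "$list")
remaining=$(ls "$dir" | wc -l)   # only the list is left

today=$(date +%Y-%m-%d)          # output kept in a variable for later use
echo "created=$created remaining=$remaining today=$today"
rm -r "$dir"
```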

Positional parameters

A shell script/batch file is a file containing commands to execute. You run the script by calling bash, or by making it executable and running it (see Shell scripts above).

A command can be run with parameters

#!/bin/bash

echo the number of parameters is $#
echo name of this script is $0
echo the first parameter is $1
echo "the string of all parameters is $* ($1 $2 $3...)"
echo "the strings of all parameters are $@"
Save this as e.g. param_test, make it executable, and run by
./param_test a bcd efg
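
The difference between $* and $@ only shows when they are quoted and a parameter contains a space. The sketch below uses a made-up helper function, count_args, that prints how many arguments it was given, and sets two example parameters with set --.

```shell
# Hypothetical helper: prints how many arguments it received.
count_args() { echo $#; }

set -- "a b" cd      # two parameters, the first contains a space

star=$(count_args "$*")     # 1: everything joined into a single string
at=$(count_args "$@")       # 2: each parameter kept as its own word
bare=$(count_args $*)       # 3: unquoted, so "a b" is split at the space

echo "\"\$*\" gives $star, \"\$@\" gives $at, unquoted \$* gives $bare"
```

For this reason "$@" is usually the safer choice when passing parameters on to another command.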

The shift command drops $1 and renames $2 as $1, $3 as $2, etc.

#!/bin/bash

echo the number of parameters is $#
echo name of this script is $0
echo the string of all parameters is $*

shift

echo After shift...
echo the number of parameters is $#
echo name of this script is $0
echo the string of all parameters is $*
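
The effect of shift can also be seen without a separate script file. In this sketch the three parameters are made up and set with set --; shift 2, which drops two parameters at once, is shown as well.

```shell
set -- one two three     # hypothetical parameters

before="$# $1"           # 3 one
shift
after="$# $1"            # 2 two
shift 2                  # shift can also drop several parameters at once
left=$#                  # 0

echo "before: $before, after: $after, left: $left"
```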

Arithmetic

In any of the shells, arithmetic may be done using the expr command
x=2
y=`expr $x + 2`
bash also allows you to do arithmetic using the let command
x=3
let y=$x+4
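
A third notation, not mentioned above, is arithmetic expansion $(( ... )), which works in bash and in other modern shells. The following sketch compares it with expr; the values are arbitrary.

```shell
x=2
y=`expr $x + 2`        # 4
z=$((x * 3 + 1))       # 7; variables need no $ inside $(( ))
half=$((7 / 2))        # 3; shell arithmetic is integer only
prod=`expr $x \* 5`    # 10; * must be escaped so the shell does not expand it

echo "$y $z $half $prod"
```

Note that expr needs spaces around each operator, while $(( )) does not.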

Sequencing

Commands are normally one per line. Multiple sequential commands may appear on one line separated by a semi-colon
ls; echo "files listed"

For loop

The syntax of this command is
for vbl [in values]
do
        commands...
done
NB: the words ``for'', ``do'', ``done'' must each be the first word of a command, i.e. at the start of a line or straight after a ';'. The commands can be any set of commands.

Example

for i in a b c
do
  echo $i
done

An alternative form uses ';' sequencing

for i in a b c; do echo $i; done

Example

for i in `ls ~/..`
do
        echo "A user is $i"
done
A common case is to loop through all the command line arguments (positional parameters). Quoting "$@" keeps arguments that contain spaces intact.
for i in "$@"
do
    echo "an arg was $i"
done
A shorthand for this is to omit the in "$@":
for i
do
    echo "an arg was $i"
done

Nested

For loops can be nested

for i in a b c
do
    for j in 1 2 3 4
    do
        echo $i $j
    done
done

Exit codes

There are conditional expressions and while loops which use Boolean values. Commands act on files and produce visible output. Where is the Boolean value from running a command?

Every command succeeds or fails, e.g. rm may succeed at removing one file, but fail to remove another because of permission problems. An exit code holds this value.

The exit code is not visible. The exit code of the last command is stored in the shell variable ``?''. A value of zero stands for success, anything else for failure.

cd /tmp
echo > tmp$$
rm tmp$$
echo "Exit code of ok rm is $?"
rm tmp$$
echo "Exit code of failed rm is $?"
Exit codes are often documented only briefly, typically in an EXIT STATUS section of the man page. The commands ``test'' and ``expr'' are the best documented, because they are often used in Boolean expressions.
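
The standard commands true and false do nothing except succeed and fail, which makes them handy for experimenting with $?. In this sketch the path /no/such/directory is assumed not to exist, and || is used so that the exit code is captured only when the command on its left fails.

```shell
true
ok=$?                      # 0: true always succeeds

bad=0
false || bad=$?            # 1: false always fails

missing=0
# Hypothetical path that is assumed not to exist.
ls /no/such/directory 2> /dev/null || missing=$?   # non-zero on failure

echo "ok=$ok bad=$bad missing=$missing"
```

The exact non-zero value returned by ls depends on the implementation, so scripts should normally only check for zero or non-zero.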

Test command

Test is used to give boolean values/exit codes for common file tests:

An alternative notation is [ ... ]

[ -f file ]
Note spaces around ' [ ' and ' ] '

Test can also be used for simple string tests and numeric tests
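
A few of the more common file, string and numeric tests are sketched below. The helper function yesno is made up for this example: it runs a command and prints yes or no according to its exit code. The file being tested is an empty temporary file from mktemp.

```shell
# Hypothetical helper: runs a command and reports its exit code as yes/no.
yesno() { if "$@"; then echo yes; else echo no; fi; }

f=$(mktemp)                      # an empty temporary file

is_file=$(yesno test -f "$f")    # yes: it is an ordinary file
is_dir=$(yesno [ -d "$f" ])      # no:  it is not a directory
has_size=$(yesno [ -s "$f" ])    # no:  its size is zero
same=$(yesno [ "abc" = "abc" ])  # yes: string equality
empty=$(yesno [ -z "" ])         # yes: the string has zero length
less=$(yesno [ 3 -lt 10 ])       # yes: numeric comparison
wrong=$(yesno [ 3 -gt 10 ])      # no

echo "$is_file $is_dir $has_size $same $empty $less $wrong"
rm -f "$f"
```

The full list of tests is in "man test".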

While command

The syntax is
while commands
do
        commands
done
The list of commands in the Boolean part is executed each time round the loop. Usually this list is just one command, but it may be a pipeline. The exit code of the last command in this list is used as the Boolean value. If it is True (exit code zero), the commands in the body are executed.

Example

Print the first 20 integers
i=1
while test $i -le 20
do
        echo $i
        let i=$i+1
done

Print the pattern

1
22
333
4444
55555
Script is
n=1
while [ $n -le 5 ]
do
    m=1
    while [ $m -le $n ]
    do
        echo -n $n
        let m=$m+1
    done
    echo
    let n=$n+1
done
I/O can be redirected from an entire while loop:
ls |
while read file
do
  echo "A file is $file"
done
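
One thing to watch when a loop is fed by a pipe: in bash (with default settings) the loop then runs in a separate process, so variables changed inside it are lost when it finishes. The sketch below, using made-up input and a temporary file, shows the effect and one way round it, which is to redirect the loop's input from a file instead.

```shell
# Loop at the end of a pipeline: in bash the count is lost afterwards.
count=0
printf 'a\nb\nc\n' | while read -r line
do
    count=$((count + 1))
done
after_pipe=$count          # 0 in bash

# Same loop reading from a file: it runs in the current shell.
tmp=$(mktemp)
printf 'a\nb\nc\n' > "$tmp"
count=0
while read -r line
do
    count=$((count + 1))
done < "$tmp"
after_redirect=$count      # 3

echo "after pipe: $after_pipe, after redirect: $after_redirect"
rm -f "$tmp"
```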

Conditional statement

The syntax of this command is
if commands
then
        commands
fi
with variations such as
if commands
then
        commands
else
        commands
fi

Example

If a script requires at least one parameter then this test should be used:
if [ $# -lt 1 ]
then
        echo "Usage: $0 files..."
        exit 1
fi
If a command requires that a file given as a command line parameter is readable
if [ -r "$1" ]
then
    echo "$1 is readable"
fi

Example

Report on file types

for f in *
do
    # "$f" is quoted so that file names containing spaces are handled correctly
    if [ -r "$f" ]
    then
        echo "$f is readable"
    fi
    if [ -d "$f" ]
    then
        echo "$f is a directory"
    elif [ -f "$f" ]
    then
        echo "$f is an ordinary file"
        if [ -s "$f" ]
        then
            echo "$f has non-zero size"
        fi
    fi
done

Case statement

The syntax is
case value in
  pattern) commands;;
  ...
  pattern) commands;;
esac
The value is some string or integer. The patterns use the shell globbing mechanism of * and ?. The value is matched against the patterns strictly top-to-bottom, and the first match is used.

Example

Test if variable x contains a single character value, two characters, or more:
case $x in
  ?)  echo has one character;;
  ??) echo has two characters;;
  *)  echo has more than two characters;;
esac

A simple menu:

echo "Enter a single letter
  d date
  p print working directory"
while read ch
do
  case $ch in
    d) date;;
    p) pwd;;
    *) echo Illegal command;;
  esac
done
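
Case patterns are often used to classify file names. The sketch below wraps a case statement in a made-up function called classify; it also uses two pattern features not shown above, | to separate alternatives and [0-9] to match one character from a set.

```shell
# Hypothetical helper: describe a name according to the first matching pattern.
classify() {
    case $1 in
        *.txt)     echo "text file" ;;
        *.c|*.h)   echo "C source" ;;            # | separates alternative patterns
        [0-9]*)    echo "starts with a digit" ;;
        *)         echo "unknown" ;;
    esac
}

classify notes.txt    # text file
classify main.c       # C source
classify 9lives       # starts with a digit
classify 9.txt        # text file (the first matching pattern wins)
classify README       # unknown
```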

grep

grep ("global regular expression print") finds lines in a file that match an expression and prints them

grep root /etc/passwd
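
Because the contents of /etc/passwd differ from machine to machine, the sketch below uses a temporary file with three made-up entries instead. It also shows two common options: -c counts matching lines and -v selects the lines that do not match.

```shell
# Hypothetical sample data in the style of /etc/passwd.
tmp=$(mktemp)
cat > "$tmp" <<END
root:x:0:0:root:/root:/bin/bash
alice:x:1000:1000::/home/alice:/bin/bash
bob:x:1001:1001::/home/bob:/bin/sh
END

roots=$(grep -c root "$tmp")            # 1 line mentions root
bash_users=$(grep -c 'bash$' "$tmp")    # 2 lines end in bash ($ anchors the end of line)
others=$(grep -v root "$tmp" | wc -l)   # 2 lines do not mention root

echo "roots=$roots bash_users=$bash_users others=$others"
rm -f "$tmp"
```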

sed

sed, short for "stream editor", allows you to write tiny programs that edit a piece of text or a text file. It can read the text either from a file

sed ... file
or have text sent to its standard input
echo ... | sed ...

Simple substitution is done by s/old/new/. Change the first 'a' to 'A' on each line
echo "a small piece of text" | sed 's/a/A/'
Works on multiple lines
echo "a
and
another
a" | sed 's/a/A/'
Changes all 'a's to 'A' on each and every line
echo "a small piece of a text" | sed 's/a/A/g'

Deleting a piece of text is often done by substituting it by nothing

echo "a small piece of text" | sed 's/a//'

Patterns are often used

Delete from the first space to the end of the line

echo "text with several spaces" | sed 's/ .*//'
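
The results of the substitutions above can be captured in variables and compared. This is a sketch using the same sample strings, with made-up variable names.

```shell
first=$(echo "a small piece of a text" | sed 's/a/A/')      # first 'a' only
every=$(echo "a small piece of a text" | sed 's/a/A/g')     # every 'a'
gone=$(echo "a small piece of text" | sed 's/a//')          # first 'a' deleted
word=$(echo "text with several spaces" | sed 's/ .*//')     # cut at first space

echo "[$first]"    # [A small piece of a text]
echo "[$every]"    # [A smAll piece of A text]
echo "[$gone]"     # [ small piece of text]
echo "[$word]"     # [text]
```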

Miscellaneous examples

Example

Test a Java program

#!/bin/bash

if [ $# -lt 1 ]
then
   echo Usage $0 java file
   exit 1
fi

sourcefile=$1

javac $sourcefile
execfile=`basename $sourcefile .java`

n=1
while test -f input$n
do
  java $execfile <input$n > output$n
  if cmp output$n result$n
  then
     echo Test $n passed
  else
     echo Test $n failed
  fi
  let n=$n+1
done

Example

Read integers one per line from standard input until end-of-file (control-D) and print the sum
sum=0
while read x
do
    let sum=$sum+$x
done
echo "The sum was $sum"

Example

Print all IP addresses. The command ifconfig lists all information about IP addresses. This script extracts just the IP address:

ifconfig | grep 'inet addr' |
while read line
do
    addr=`echo "$line" | sed -e 's/.*inet addr://' -e 's/ .*//'`
    echo $addr
done

Example

Write a shell script that takes one parameter. This parameter is a command that is to be run on all ordinary files in the current directory, and recursively in every subdirectory. (Note that the command must be in one of the absolute directories of your search path because the current directory is changed on each recursive call.)
script=$0
command=$1
for f in *
do
    if [ -d $f ]
    then
        cd $f
        $script $command
        cd ..
    else
        $command $f
    fi
done
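
The same job can be done with the standard find command rather than a recursive script. This is a different technique from the one above, shown here only as a sketch: the scratch directory and file names are made up, and echo stands in for the command to be run on each file.

```shell
# Hypothetical directory tree: one file at the top level, two in a subdirectory.
dir=$(mktemp -d)
mkdir "$dir/sub"
touch "$dir/a" "$dir/sub/b" "$dir/sub/c"

# -type f selects ordinary files; -exec runs the command once per file,
# with {} replaced by the file's path.
find "$dir" -type f -exec echo visiting {} \;

found=$(find "$dir" -type f -exec echo visiting {} \; | wc -l)
echo "ordinary files found: $found"
rm -r "$dir"
```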

Example

Services may need to be stopped, paused or restarted. To do this, the process id is required. Most services store their process ids in files in /var/run/ as <service-name>.pid. This script will find the process ids of all services that are running:


for service_pid in /var/run/*.pid /var/run/*/*.pid
do
        [ -f "$service_pid" ] || continue   # skip a pattern that matched nothing
        service=`echo "$service_pid" | sed -e 's/.*\///' -e 's/\.pid$//'`
        echo "Service $service has PID" `cat "$service_pid" 2> /dev/null`
done

Example

From the startup script for Apache under Ubuntu, checking that you have Apache set up

[ ! -d /etc/apache2 ] && echo "Apache not installed" && exit 1

if [ `ls -1 /etc/apache2/sites-enabled/ | wc -l | sed -e 's/ *//;'` -eq 0 ] 
then
    echo "You haven't enabled any sites yet, so I'm not starting apache2." 
    echo "To add and enable a host, use addhost and enhost."
    exit 0
fi

echo Apache sites installed