This lecture looks at more shell programming.
It covers more advanced features which may be needed
on occasion.
Debugging
A shell script can be debugged by turning tracing on. When a script is being
traced, each command, complete with its arguments,
is printed before it is executed.
for i in 1 2 3
do echo $i
done
is traced as
+ echo 1
1
+ echo 2
2
+ echo 3
3
Tracing is turned on by
set -x
and off by
set +x
or a shell script, say ``script1'', may be run in debug mode by
bash -x script1
The basic sequential mechanism is to have commands on successive lines, or separated
by `;'. There are additional ways of combining commands. To run a command asynchronously
command &
The command then runs in parallel with whatever else is done, timesharing the
computer.
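For example (a minimal sketch; `sleep' stands in for a long-running command):

```shell
# Start a long-running command in the background
sleep 2 &
bgpid=$!              # $! is the process id of the last background command

echo "background job is process $bgpid"

# The shell is free to do other work here, in parallel with the job.

wait $bgpid           # block until the background job finishes
echo "job finished with status $?"
```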
The sequence
command1 && command2
first executes command1. If its exit code is zero (it succeeds) then command2
is executed. If the exit code is non-zero, then command2 is not executed
and execution continues with the next command. It is equivalent to
if command1
then
command2
fi
It is typically used as a shorthand for things like
test -f file && rm file
which attempts to remove the file only if it exists.
The sequence
command1 || command2
executes command1 and, if command1 fails (exits non-zero), executes command2. For example
test -f file || \
(echo can\'t find file && exit 1)
Enclosing a command in (...) executes the command in a subshell. This is useful
if you want to make changes to the environment for some commands without affecting
others. For example, to store the parent directory in a variable
ParentDir=`(cd ..; pwd)`
This is easier than using a sed command which looks for a `/' and then
characters that aren't a `/':
ParentDir=`pwd | sed 's/\/[^/]*//'`
Variables that you create are by default local to that shell. To make a variable
global so that it is visible in subshells it must be exported
amt=20
export amt
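A quick check of the difference (a sketch; `sh -c' starts a child shell):

```shell
amt=20
export amt            # amt is now visible in child processes

local_only=5          # not exported, so invisible in children

# The child shell sees the exported variable but not the local one
sh -c 'echo "amt=$amt local_only=$local_only"'
# prints: amt=20 local_only=
```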
The read command reads a line of input and assigns the first word to the first
variable, the second to the second, and so on. If there are more words on the
input line than there are variables, the rest of the line is assigned to the
last variable.
read word1 word2 rest
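For example, a sketch of how the words are distributed:

```shell
# Extra words beyond the number of variables all go to the last variable
echo "one two three four five" | {
    read word1 word2 rest
    echo "word1=$word1"      # one
    echo "word2=$word2"      # two
    echo "rest=$rest"        # three four five
}
```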
The shift command shifts all positional arguments to the left, i.e. it loses
the old $1, renames $2 as $1, $3 as $2, and so on. As a result, $* now contains
as $1, $2, ... what used to be $2, $3, ...
echo $1
shift
echo 'args $2 upwards are' $*
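shift is commonly used to walk through all the arguments in a loop (a sketch,
saved as, say, args.sh):

```shell
# Process each argument in turn, discarding it with shift
while [ $# -gt 0 ]
do
    echo "processing $1"
    shift
done
```

Running `sh args.sh a b c' prints one line per argument.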
The trap command traps a signal. Signals are sent to a process, usually to tell
it to give up. Some signals are SIGHUP (the serial line has hung up), SIGINT
(control-C) and SIGFPE (floating point exception). Signals are also known by
their numbers; SIGINT is 2. If a signal is trapped, the given code is executed
when the signal occurs.
trap 'echo "cleaning up"
rm tmp*
exit 2' 2
Functions may be defined within the shell using the keyword function
function ll {
ls -l $*
}
The positional parameters $1, $2, ..., $*, $# refer now to the arguments with
which the function is called.
A restricted shell that only accepts a small number of commands:
while true
do
echo "Commands are:
edit file
list
quit"
read comm arg
case $comm in
edit) ted $arg;;
list) ls -l;;
quit) exit 0;;
esac
done
Sum the size in bytes of all non-directory files in the current directory
function fourth_arg {
echo $4
}
sum=0
for file in *
do
if [ ! -d "$file" ]
then
file_ls=`ls -l "$file"`
size=`fourth_arg $file_ls`
let sum=$sum+$size
fi
done
echo $sum
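Parsing the output of ls -l is fragile: on many modern systems the size is the
fifth field, not the fourth, because a group column is included. A sketch that
avoids the problem by counting the bytes directly with wc -c:

```shell
# Sum the sizes of all non-directory files without parsing ls -l
sum=0
for file in *
do
    if [ ! -d "$file" ]
    then
        size=`wc -c < "$file"`    # wc -c counts bytes
        sum=$((sum + size))
    fi
done
echo $sum
```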
Automated testing of programs may be done easily. Suppose that you have a set
of files called testdata1, testdata2, ... that are the input files for a program
called ``program''. The correct output results are in files result1, result2,
... To test the program, run it with each test file as input
and compare the output with the corresponding result file. The file testall contains:
i=1
while [ -f testdata$i ]
do
test1 $i
let i=$i+1
done
The file test1 contains
program < testdata$1 > results
if [ ! -f results ]
then
echo "Test $1 had no result"
exit 1
fi
if cmp results result$1
then
echo "Result of test $1 ok"
rm -f results
exit 0
else
echo "Test $1 failed"
echo "Differences are:"
diff results result$1
rm -f results
exit 2
fi
Your program should exit with 0 if successful, some other number
otherwise. An exit statement without a code exits with the
status of the last command executed, which is rarely what you want.
If you give an explicit exit code in one place, give one for
every exit statement.
Checking arg counts
if [ $# -ne 3 ]
then
echo "Usage: $0 arg1 arg2 arg3"
exit 1
fi
or
test $# -ne 3 && echo "Usage: $0 a1 a2 a3" && exit 1
Whenever I go for promotion interviews, one question I get is:
"Is there any value in placing lectures, software, etc, on the
Internet?" One way of answering this is by student surveys.
Another is by analysis of the access_log of my
Web server. Each entry occupies a single line and contains
information about the machine, the date and the document accessed.
It does not contain information about the identity of
the user or how long they spent with the document.
The file is in time-stamp order, by date and time.
I have written a collection of shell and Perl scripts to attempt
to analyse this data. They are fairly typical scripts.
Daily access
This script takes access_log and writes to standard
output a list of dates and lecture identifiers, organised for each lecture
by date.
The script is
for year in 1994 1995 1996
do
for lecture in l1_1 l1_2 l2_1 l2_2 l3_1 l3_2 \
l4_1 l4_2 l5_1 l5_2 l6_1 l6_2 \
l7_1 l7_2 l8_1 l8_2 l9_1 l9_2 l12_1 l12_2 \
l13_1 l14_1 l14_2 l15_1
do
grep "$year.*$lecture.html" < access_log
done |
sed 's/.*\[\(...........\).*OS\/\(.*\).html.*/\1 \2/'
done
Explanation: The grep pattern is based on the year plus
intervening text plus the lecture (document name).
The sed pattern looks for the `['
then captures the next eleven characters (the date) as \1. It then skips to
OS/ (written OS\/) and captures the document name as \2. The remainder of the
line is matched. In the replacement pattern, only the date and document name
are retained.
The output is organised with year as the primary sort key (the outer for loop),
then by document name (the inner loop),
and finally by date, because that is the original order
of access_log.
Access by week
This contains more obscure commands. The intent is to take the
output from the last command and produce a count of accesses by
week and year for each lecture (weekly.sh).
The input to this script is the output of the last script.
In each line of output, the first field is the count of accesses, given by
uniq -c; the rest is the result of the echo statement.
Explanation: The cut command extracts fields from strings,
and is sometimes simpler than sed. cut -d/ uses `/' as the delimiter,
since the dates are of the form 29/Jun/1995. The -f1 option
extracts the first field, and so on.
The date command handles date formatting. The
+%U option prints the week for that date.
The command uniq manages runs of identical lines.
Usually it collapses duplicates. The option -c also
prefixes each line with the number of duplicates.
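The script itself is not reproduced here; the following is a sketch consistent
with the description, assuming input lines of the form `29/Jun/1995 l1_1' (as
produced by the previous script) and GNU date's -d option:

```shell
# Sketch of weekly.sh: count accesses per week and year for each lecture
while read date lecture
do
    day=`echo $date | cut -d/ -f1`        # e.g. 29
    month=`echo $date | cut -d/ -f2`      # e.g. Jun
    year=`echo $date | cut -d/ -f3`       # e.g. 1995
    week=`date -d "$day $month $year" +%U`
    echo "week $week of $year: $lecture"
done | sort | uniq -c
```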
Machine+date+url
The next script (machine+date+url.sh) is a straight sed command
fed into sort.
The script after that takes its output and produces a table
of accesses per day per machine, eliminating the UC machines.
It is a straight pipeline (count_day_access.sh):
sed 's/ [^ ]*$//' | grep -v canberra.edu.au |
uniq -c | sed 's/..\/.*//' |
sort | uniq -c
The first sed eliminates the URL at the end of each line.
The grep -v discards local accesses.
The uniq -c counts repeats of the day field plus machine.
The next sed deletes two characters followed by a `/' followed by anything.
Output is