The sequence
set
exit without a code leaves the script with the status of the last
command it ran, which is effectively undefined for the caller. If you use
an exit code in one place, you should give one for all exit statements.
if [ $# -ne 3 ]
then
    echo "Usage: $0 arg1 arg2 arg3"
    exit 1
fi
or
test $# -ne 3 && echo "Usage: $0 a1 a2 a3" && exit 1
if [ ! -r "$1" ]
then
    echo "$1 is unreadable"
    exit 2
fi
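A caller sees these codes through $?. The following is a minimal sketch, not part of the original scripts: check_args is a hypothetical function standing in for a script, so it uses return where a script would use exit, and it sends its messages to standard error.

```shell
# check_args is a hypothetical stand-in for a script.
# It uses "return" where a real script would use "exit".
check_args() {
    if [ $# -ne 3 ]
    then
        echo "Usage: check_args arg1 arg2 arg3" >&2
        return 1
    fi
    if [ ! -r "$1" ]
    then
        echo "$1 is unreadable" >&2
        return 2
    fi
    return 0
}

# In "cmd || other", $? inside "other" is the status of cmd.
check_args only_one_arg 2>/dev/null || echo "failed with status $?"
```

Because every failure path has its own code, the caller can tell a usage error (1) from an unreadable file (2).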
echo "..." | ...
cat file | ...
The first sends an explicit string down a pipeline, the second sends the contents of a file down a pipeline.
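As a small illustration of the two forms, here is a sketch that feeds wc from each. The temporary file and the variable names words and lines are assumptions made for the example.

```shell
# Create a small scratch file to play the role of "file".
tmpfile=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmpfile"

# An explicit string sent down a pipeline.
words=$(echo "hello world" | wc -w | tr -d ' ')

# The contents of a file sent down a pipeline.
lines=$(cat "$tmpfile" | wc -l | tr -d ' ')

echo "words=$words lines=$lines"
rm -f "$tmpfile"
```

The tr -d ' ' step only strips the padding that some versions of wc put in front of the number.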
sed is often used to delete text from lines, or to keep
some text in lines.
To delete text, you match the text and replace it with nothing.
sed 's/text//'
To keep text, you match it and save it as \1, \2, etc. The rest of the text is matched as well. The replacement text is only the saved stuff.
sed 's/stuff\(text\)stuff/\1/'
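Both uses can be tried on short strings. This is only a sketch; the sample strings and the variable names deleted and kept are invented for illustration.

```shell
# Delete text: match it and replace it with nothing.
deleted=$(echo "hello cruel world" | sed 's/cruel //')

# Keep text: match the whole line, save the wanted part as \1,
# and use only \1 in the replacement.
kept=$(echo "name=fred;" | sed 's/name=\(.*\);/\1/')

echo "$deleted"
echo "$kept"
```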
for or while loops.
Write a shell script that lists all the executable files in the current directory
for file in `ls`   # or for file in *
do
    if [ -x "$file" ]
    then
        echo "$file"
    fi
done
or
ls |
while read file
do
    if [ -x "$file" ]
    then
        echo "$file"
    fi
done
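Directories normally have execute permission too, so both loops above will also list them. If only ordinary files are wanted, a -f test can be added. The sketch below is one possible variant, not the original exercise answer; list_executables is a hypothetical function name, and it takes the directory as an argument so that it can be tried on any directory.

```shell
# list_executables DIR: print the regular files in DIR that are executable.
list_executables() {
    for file in "$1"/*
    do
        # -f rejects directories, -x checks execute permission.
        if [ -f "$file" ] && [ -x "$file" ]
        then
            echo "$file"
        fi
    done
}

list_executables .
```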
access_log to my
Web server.
The access_log contains entries such as
hickory.canberra.edu.au - - [26/Jul/1994:12:50:03 +1000]
"GET /OS/l3_1.html HTTP/1.0" 200 6402
hickory.canberra.edu.au - - [26/Jul/1994:12:50:03 +1000]
"GET /OS/l3_1.html HTTP/1.0" 200 6402
vine.canberra.edu.au - - [26/Jul/1994:13:12:00 +1000]
"GET /OS/l1_1.html HTTP/1.0" 200 8041
ironwood.canberra.edu.au - - [26/Jul/1994:15:56:31 +1000]
"GET /OS/l1_1.html HTTP/1.0" 200 8041
(each entry on one line only).
This contains information about machine, date and document accessed.
It does not contain information about the identity of
the user or how long they spent with the document.
The file is written in time-stamp order, that is, sorted by date and then time.
I have written a collection of shell and Perl scripts to attempt to analyse this data. They are fairly typical scripts.
The first script reads access_log and writes to standard
output a list of dates and lecture identifiers, organised for each lecture
by date.
The first lines of output are
22/Jul/1994 l1_1
26/Jul/1994 l1_1
26/Jul/1994 l1_1
26/Jul/1994 l1_1
26/Jul/1994 l1_1
27/Jul/1994 l1_1
27/Jul/1994 l1_1
27/Jul/1994 l1_1
27/Jul/1994 l1_1
27/Jul/1994 l1_1
27/Jul/1994 l1_1
27/Jul/1994 l1_1
27/Jul/1994 l1_1
27/Jul/1994 l1_1
27/Jul/1994 l1_1
27/Jul/1994 l1_1
27/Jul/1994 l1_1
27/Jul/1994 l1_1
28/Jul/1994 l1_1
. . .
21/Jul/1994 l1_2
27/Jul/1994 l1_2
27/Jul/1994 l1_2
27/Jul/1994 l1_2
27/Jul/1994 l1_2
The script (daily.sh) is
for year in 1994 1995 1996
do
    for lecture in l1_1 l1_2 l2_1 l2_2 l3_1 l3_2 \
                   l4_1 l4_2 l5_1 l5_2 l6_1 l6_2 \
                   l7_1 l7_2 l8_1 l8_2 l9_1 l9_2 l12_1 l12_2 \
                   l13_1 l14_1 l14_2 l15_1
    do
        grep "$year.*$lecture.html" < access_log
    done |
    sed 's/.*\[\(...........\).*OS\/\(.*\).html.*/\1 \2/'
done
Explanation: The grep pattern is based on the year plus
intervening text plus the lecture (document name).
The sed pattern looks for the '['
then captures the date as \1. It then skips to OS/ (as OS\/)
and captures the document name as \2. The remainder of the line is
matched. In the replacement pattern, only the date and document name
are retained.
The output is organised by year as primary sort key (outer for loop).
Within each year it is ordered by document name (inner for loop).
Within each document it is ordered by date, because that is the original
order of access_log and grep preserves it.
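The sed expression from daily.sh can be tried on a single entry. The sketch below uses one of the sample access_log lines shown earlier; the variable names line and extracted are assumptions for the example.

```shell
# One access_log entry, on a single line as in the real file.
line='hickory.canberra.edu.au - - [26/Jul/1994:12:50:03 +1000] "GET /OS/l3_1.html HTTP/1.0" 200 6402'

# \1 is the 11 characters after '[' (the date),
# \2 is the text between "OS/" and ".html" (the document name).
extracted=$(echo "$line" | sed 's/.*\[\(...........\).*OS\/\(.*\).html.*/\1 \2/')

echo "$extracted"
```

For this entry the result is the date followed by the lecture identifier, 26/Jul/1994 l3_1.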
The output is of the form
1 Week: 29 Year: 1994 l1_1
22 Week: 30 Year: 1994 l1_1
10 Week: 31 Year: 1994 l1_1
1 Week: 32 Year: 1994 l1_1
5 Week: 33 Year: 1994 l1_1
10 Week: 34 Year: 1994 l1_1
7 Week: 35 Year: 1994 l1_1
3 Week: 36 Year: 1994 l1_1
7 Week: 37 Year: 1994 l1_1
3 Week: 38 Year: 1994 l1_1
4 Week: 39 Year: 1994 l1_1
4 Week: 40 Year: 1994 l1_1
3 Week: 42 Year: 1994 l1_1
The first entry is the count of accesses, given by uniq -c.
The rest is the result of the echo statement.
The input to this script is the output from the last script.
while read date lecture
do
    day=`echo "$date" | cut -d/ -f1`
    month=`echo "$date" | cut -d/ -f2`
    year=`echo "$date" | cut -d/ -f3`
    echo Week: `date -d "$month $day, $year" +%U` \
         Year: $year $lecture
done < daily.sh.out |
uniq -c
Explanation: The cut command extracts fields from strings, and is
sometimes simpler than sed. cut -d/ uses '/' as delimiter,
since the dates are of the form 29/Jun/1995. The -f1 option
extracts the first field, etc.
The date command handles date formatting. The -d option (a GNU
extension) supplies the date to format, and the +%U format prints
the week number of the year for that date, with weeks starting on Sunday.
The command uniq works on runs of adjacent identical lines.
Usually it collapses each run to a single line. The option -c also
prefixes each line with the number of times it occurred.
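The cut and uniq -c steps can be tried on their own. This is a sketch with made-up input; the names date_str and counts are assumptions, and the awk at the end is there only to strip the padding that uniq -c puts before each count.

```shell
# Split a date of the form DD/Mon/YYYY into its three fields.
date_str="26/Jul/1994"
day=$(echo "$date_str" | cut -d/ -f1)
month=$(echo "$date_str" | cut -d/ -f2)
year=$(echo "$date_str" | cut -d/ -f3)
echo "day=$day month=$month year=$year"

# uniq -c collapses each run of identical adjacent lines
# and prefixes it with the length of the run.
counts=$(printf '%s\n' "Week: 29" "Week: 30" "Week: 30" "Week: 30" |
    uniq -c |
    awk '{print $1, $2, $3}')
echo "$counts"
```

Note that uniq only compares neighbouring lines, which is why the weekly script relies on its input already being grouped by lecture and date.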
sed command,
fed into sort
# "anything" - - [ "11 date chars" chars OS/ "URL" chars
sed 's/\(.*\)- - \[\(...........\).*OS\/\(.*\.html\).*/\2 \1 \3/' < access_log | sort
The output is
01/Apr/1996 cherry.canberra.edu.au OS.html
01/Apr/1996 cherry.canberra.edu.au assign1.94.html
01/Apr/1996 cherry.canberra.edu.au assignments.94.html
01/Apr/1996 cherry.canberra.edu.au assignments.94.html
01/Apr/1996 cherry.canberra.edu.au assignments.html
01/Apr/1996 cherry.canberra.edu.au assignments.html
01/Apr/1996 chiron.ringworld.com.au OS.html
01/Apr/1996 chiron.ringworld.com.au OS.html
01/Apr/1996 coho.stanford.edu aut_index.html
01/Apr/1996 dialup.bellatlantic.com l7_2.html
01/Apr/1996 dialup19.x25.infoweb.or.jp OS.html
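sed 's/ [^ ]*$//' | grep -v canberra.edu.au | uniq -c | sed 's/..\/.*//' | sort | uniq -c
The first sed eliminates the URL at the end. The grep discards local accesses. The first uniq counts repeats on the day field+machine. The next sed deletes two chars followed by '/' followed by anything, that is, the date and everything after it, leaving only the count. The final sort and uniq then tally how many day+machine pairs had each count. Output is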
3794 1
1548 2
941 3
584 4
410 5
277 6
204 7
155 8
120 9
. .
. .
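The same pipeline can be followed on a handful of lines. The sketch below uses five entries in the format of the sorted listing above; the variable names sample and histogram are assumptions, and the trailing awk only normalises the padding from uniq -c.

```shell
# Five sorted "date machine URL" lines, one of them a local access.
sample='01/Apr/1996 cherry.canberra.edu.au OS.html
01/Apr/1996 chiron.ringworld.com.au OS.html
01/Apr/1996 chiron.ringworld.com.au OS.html
01/Apr/1996 coho.stanford.edu aut_index.html
01/Apr/1996 dialup.bellatlantic.com l7_2.html'

histogram=$(echo "$sample" |
    sed 's/ [^ ]*$//' |          # drop the URL
    grep -v canberra.edu.au |    # drop local accesses
    uniq -c |                    # accesses per day+machine
    sed 's/..\/.*//' |           # keep only that count
    sort |
    uniq -c |                    # how many pairs had each count
    awk '{print $1, $2}')

echo "$histogram"
```

Here two day+machine pairs made one access and one pair made two, so the result is the two lines "2 1" and "1 2", read in the same way as the full output above.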
This page is http://pandonia.canberra.edu.au/OS/l13_1.html, copyright Jan Newmarch.
It is maintained by Jan Newmarch.
email: jan@ise.canberra.edu.au
Web: http://pandonia.canberra.edu.au/
Last modified: 5 August, 1996