CPE 3007 Program Development using Unix

Assignment Two
Semester 1, 2003

Introduction

The purpose of this assignment is to write a simple C program. The program checks the local files of a Web site to see whether every file they link to is present.

Details

Your program should consist of a single C source file, which may be made up of as many C functions as you wish.

It should be run with one or more command line arguments, which are the names of files. Each file name should end in ".html" or ".htm", in any mixture of upper and lower case.
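As an illustration of the file name rule, the following is a minimal sketch of a case-insensitive suffix test. The function names has_html_suffix and equal_ignore_case are hypothetical and not part of the assignment; other approaches would work equally well.

```c
#include <ctype.h>
#include <string.h>

/* Compare two strings ignoring case; returns 1 if equal, 0 otherwise. */
static int equal_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Returns 1 if name ends in ".html" or ".htm" in any mixture of case. */
int has_html_suffix(const char *name)
{
    /* Only the text from the last '.' onwards is the suffix. */
    const char *dot = strrchr(name, '.');

    if (dot == NULL)
        return 0;
    return equal_ignore_case(dot, ".html") || equal_ignore_case(dot, ".htm");
}
```

With this sketch, has_html_suffix("Index.HTML") is true, while has_html_suffix("notes.txt") and has_html_suffix("a.html.bak") are false.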

The program should examine each file and find all the hypertext references in the file. These are of the form <a href="url"> or <img src="url">. For simplicity, assume that the tag and attribute names are exactly as above (all lower case, one space after tag name, no space before or after "="). Also for simplicity you may assume the complete reference is all on one line.
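One possible way to pull the references out of a single line, under the simplifying assumptions above, is sketched here. The function name next_reference and its buffer-based interface are assumptions made for illustration; the sketch relies on the whole reference being on one line.

```c
#include <stddef.h>
#include <string.h>

/*
 * Find the next <a href="url"> or <img src="url"> at or after p.
 * On success, copy the URL into url (at most size-1 characters, always
 * NUL-terminated) and return a pointer just past the closing quote.
 * Return NULL when the line holds no further complete reference.
 */
const char *next_reference(const char *p, char *url, size_t size)
{
    static const char *const patterns[] = { "<a href=\"", "<img src=\"" };
    const char *best = NULL;
    const char *start;
    const char *end;
    size_t best_len = 0;
    size_t len;
    size_t i;

    if (size == 0)
        return NULL;

    /* Pick whichever pattern occurs earliest in the remaining text. */
    for (i = 0; i < sizeof patterns / sizeof patterns[0]; i++) {
        const char *hit = strstr(p, patterns[i]);
        if (hit != NULL && (best == NULL || hit < best)) {
            best = hit;
            best_len = strlen(patterns[i]);
        }
    }
    if (best == NULL)
        return NULL;

    start = best + best_len;
    end = strchr(start, '"');
    if (end == NULL)
        return NULL;            /* no closing quote on this line */

    len = (size_t)(end - start);
    if (len >= size)
        len = size - 1;         /* truncate over-long URLs */
    memcpy(url, start, len);
    url[len] = '\0';
    return end + 1;
}
```

Calling the function repeatedly, each time passing the pointer it returned, walks through every reference on the line until it returns NULL.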

Only look at local references, i.e. if a URL starts with "http://" then discard it. A local reference will be one of the following

If it is an image reference, just check that the file exists. If it is an HTML reference (ends in ".htm" or ".html"), you should check that the file exists and scan that file for references as well (recursively).
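The checks described above might be sketched as follows. All names here (is_remote, file_exists, mark_visited, MAX_VISITED) are hypothetical. The table of visited files is an addition that the assignment does not ask for: it is included on the assumption that two pages may link to each other, in which case a recursive scan would otherwise never terminate. The sketch does not resolve a reference relative to the directory of the file containing it.

```c
#include <string.h>
#include <unistd.h>

#define MAX_VISITED 256     /* arbitrary limits chosen for this sketch */
#define MAX_PATH_LEN 256

static char visited[MAX_VISITED][MAX_PATH_LEN];
static int visited_count = 0;

/* Returns 1 if url starts with "http://", i.e. it is not local. */
int is_remote(const char *url)
{
    return strncmp(url, "http://", 7) == 0;
}

/* Returns 1 if path exists (a directory counts too), 0 otherwise. */
int file_exists(const char *path)
{
    return access(path, F_OK) == 0;
}

/*
 * Record path as scanned. Returns 1 the first time a path is seen.
 * Returns 0 if it was already recorded, or if it cannot be recorded
 * because the table is full or the path is too long, so the caller
 * skips the recursive scan in those cases.
 */
int mark_visited(const char *path)
{
    int i;

    for (i = 0; i < visited_count; i++)
        if (strcmp(visited[i], path) == 0)
            return 0;
    if (visited_count == MAX_VISITED || strlen(path) >= MAX_PATH_LEN)
        return 0;
    strcpy(visited[visited_count++], path);
    return 1;
}
```

A scanning function would then skip any URL for which is_remote is true, report it if file_exists is false, and recurse into an HTML file only when mark_visited returns 1.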

If a reference is not found, then a message should be printed, saying what the reference is, and what file contains the reference.
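A message of this kind could be produced as in the sketch below. The wording of the message, the function names format_missing and report_missing, and the choice of standard output are all assumptions; the assignment only requires that both the reference and the containing file are named.

```c
#include <stdio.h>

/*
 * Build the message for a missing reference into buf.
 * Returns the value of snprintf: the length the full message needs,
 * not counting the terminating NUL.
 */
int format_missing(char *buf, size_t size,
                   const char *reference, const char *source_file)
{
    return snprintf(buf, size, "%s: missing reference \"%s\"",
                    source_file, reference);
}

/* Print the message for a missing reference on standard output. */
void report_missing(const char *reference, const char *source_file)
{
    char msg[1024];

    format_missing(msg, sizeof msg, reference, source_file);
    printf("%s\n", msg);
}
```

For example, report_missing("pic.gif", "index.html") prints the line: index.html: missing reference "pic.gif"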

Errors

Your program should handle errors such as incorrect usage, arguments not being filenames, etc. If an error occurs, you should decide if it is a fatal or non-fatal error and treat it accordingly. Your program should exit with meaningful exit codes in all cases.
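One way to organise exit codes is sketched below. The specific values, the STATUS_ names, the helper functions, and the program name "linkcheck" in the usage message are illustrative assumptions; the assignment leaves these decisions to you.

```c
#include <stdio.h>

/* Illustrative exit codes; the assignment does not prescribe values. */
enum {
    STATUS_OK = 0,        /* every reference was found */
    STATUS_MISSING = 1,   /* at least one reference is missing */
    STATUS_BAD_FILE = 2,  /* an argument could not be read (non-fatal) */
    STATUS_USAGE = 3      /* incorrect usage (fatal) */
};

/* Fatal check: returns STATUS_USAGE if no file names were given. */
int check_usage(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "Usage: %s file.html ...\n",
                argc > 0 ? argv[0] : "linkcheck");
        return STATUS_USAGE;
    }
    return STATUS_OK;
}

/* Non-fatal errors: keep the more serious of two status values. */
int worse(int current, int new_status)
{
    return new_status > current ? new_status : current;
}
```

In this arrangement, main would return immediately when check_usage fails, and otherwise carry on through all arguments, combining each result with worse so that the final exit code reflects the most serious problem encountered.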

Due date

The assignment is due on Friday 16 May, in your tutorial.


Jan Newmarch (http://jan.newmarch.name)
jan@newmarch.name
Last modified: Wed Mar 14 11:49:08 EST 2001
Copyright © Jan Newmarch