Introduction
This lecture deals with arrays. While arrays are simple, in C they get
mixed up with pointers, and this takes some adjustment of concepts.
The use of pointers for arrays is an element of C style.
Strings are arrays of chars, terminated in a special way.
Arrays
Definition
Arrays always have a lower bound of zero (there is a reason, to do with pointers).
So you only have to state the number of elements:
int numbers[20];
char *ch_ptr[30];
declares an array of 20 integers and an array of 30 pointers to characters.
The indices of an array are
0 .. (size-1)
A typical array walk is
for (n = 0; n < 20; n++)
numbers[n] = n;
Initialisation
Arrays can be initialised as above. At definition time this can also be done
if you know all the values.
int numbers[4] = {0, 1, 2, 3};
sets numbers[0] = 0, numbers[1] = 1, etc.
The size can be deduced from the initialiser list:
int numbers[] = {0,1,2,3};
Sizeof
The size in bytes of anything may be found by using the ``sizeof'' operator.
So if ``ch'' is of type char, ``sizeof(ch)'' is one (byte). The sizeof
an array is the total size in bytes of the whole array, and the sizeof
an element is the amount of space occupied by one element.
So if you forget how many elements an array ``a'' has,
num = sizeof(a)/sizeof(a[0]);
Function parameters
When arrays are used as function parameters, the size is omitted
main(int argc, char *argv[])
makes argv an array (of some size) of pointers to chars.
/* read a set of numbers and
* find the range between
* largest and smallest
*/
#include <stdio.h>
#define MAX 100
int spread(int b[], int size);
int
main(int argc, char *argv[])
{
int n;
int count = 0;
int a[MAX];
while (count < MAX &&
scanf("%d", &n) == 1) {
a[count++] = n;
}
printf("spread was %d\n",
spread(a, count));
return 0;
}
int spread(int b[], int size)
{
int lo, hi, i;
if (size == 0)
return 0;
lo = hi = b[0];
for (i = 0; i < size; i++) {
if (b[i] > hi)
hi = b[i];
if (b[i] < lo)
lo = b[i];
}
return (hi - lo);
}
Arrays and pointers
The name of an array - just by itself - is the address of the base of the array
a == &a[0]
*a == a[0]
In general,
a + n == &a[n]
*(a + n) == a[n]
The ``spread'' function could have been written
int spread(int b[], int size)
{ int lo, hi, i;
if (size == 0)
return 0;
lo = hi = b[0];
for (i = 0; i < size; i++) {
if (*(b+i) > hi)
hi = *(b+i);
if (*(b+i) < lo)
lo = *(b+i);
}
return (hi - lo);
}
When an array is passed as a parameter to a function, its address is passed.
This is a pointer value. The function, instead of declaring the parameter as
an array could instead have declared it as a pointer.
This is a very common practice. Many library functions declare their arguments
as pointers, but you have to pass in an array.
Once a parameter is a pointer, you can do pointer manipulations instead of
costly array indexing. Here is another ``spread'':
int spread(int *b, int size)
{ int hi, lo, i;
if (size == 0)
return 0;
lo = hi = *b;
for (i = 0; i < size; i++) {
if (*b > hi)
hi = *b;
if (*b < lo)
lo = *b;
b++;
}
return (hi - lo);
}
This style of coding is very common. Compare these two program fragments
int a[] = {1, 2, 3};
void printit(int *a)
{ int n;
for (n = 0; n < 3; n++)
{
printf("%d\n", *a);
a++;
}
}
int main(int argc, char *argv[])
{
printit(a);
}
versus
int a[] = {1, 2, 3};
int main(int argc, char *argv[])
{
int n;
for (n = 0; n < 3; n++)
{
printf("%d\n", *a);
a++;
}
}
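The second is wrong: there a is an array name, not a pointer variable, so
``a++'' does not compile. The first is good C style, because the parameter a
is a pointer variable that can be changed.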
Strings
A string is an array of chars, terminated by a null character. These are equivalent:
char str[] = "hello";
char str[] = {'h', 'e', 'l',
'l', 'o', '\0'};
You can always count on the null char being at the end of a string (except
sometimes). If you create a string you should ensure that you null-terminate
it. Otherwise, everything breaks.
Because strings end in null, a string walk looks like
while (*str != '\0')
str++;
Here is strlen
int strlen(char *str)
{ int length = 0;
while (*str != '\0') {
str++;
length++;
}
return length;
}
Now this is where you get to see some of the special lurks that C has. You
can combine the increment into the loop:
int strlen(char *str)
{ int length = 0;
while (*str++ != '\0')
length++;
return length;
}
Indeed, since '\0' is zero and zero counts as false,
int strlen(char *str)
{ int length = 0;
while (*str++)
length++;
return length;
}
Here is the compact form of strcpy (this version takes the source first;
the library strcpy takes the destination first)
void strcpy(char *from,
char *to)
{
while (*to++ = *from++)
; /* empty body */
}
You do eventually get used to this. However, you could always use the more
readable versions!
Library functions
Here are some of the library functions that declare their parameters as pointers
but actually expect an array:
FILE *fopen(char *filename,
char *mode)
int puts(char *s)
char *strcpy(char *s1,
char *s2)
int atoi(char *nptr)
void *bsearch(void *key,
void *base, ...
Here filename, mode, s, s1, s2, nptr are all strings. base is the address of
the array to be binary searched. key, however, is a pointer to an object, not
necessarily an array.
Note that the functions sometimes return pointers. These may or may not be
arrays. fopen returns a pointer to a structure (record) of type FILE, bsearch
returns the address of the element found, whereas strcpy returns the address
of the array s1.
Command line arguments
When a C program is compiled and run as a command, the command line parameters
are available inside the program by the arguments to the main function
int main(int argc,
char *argv[])
argc is the number of command-line arguments (including the command). For example,
if your C program was called ``ask'', and you called it by
ask anybody there
Then argc would be 3. The argv array contains 3 pointers to chars, each
pointing to the first character of a string. The values are:
argv[0] == "ask"
argv[1] == "anybody"
argv[2] == "there"
A program to print out the command line arguments is
#include <stdio.h>
#include <stdlib.h>
int main (int argc,
char *argv[])
{ int i;
printf("Args to command:\n");
for (i = 0; i < argc; i++) {
printf("arg %d is %s\n",
i, argv[i]);
}
exit(0);
}
An equivalent program, using pointers instead of arrays, is
#include <stdio.h>
#include <stdlib.h>
int main (int argc,
char **argv)
{ int i;
printf("Args to command:\n");
for (i = 0; i < argc; i++, argv++) {
printf("arg %d is %s\n",
i, *argv);
}
exit(0);
}
Dynamic memory allocation
Programs cannot rely on just local and global data structures. It is
often necessary to create dynamic chunks of memory. The functions to
manipulate dynamic memory are malloc and free.
#include <stdlib.h>
void *malloc(size_t size);
void free(void *ptr);
The malloc function takes one argument, which is the number of bytes of
dynamic storage to allocate. It returns a pointer to the base of this
memory, or NULL if the memory cannot be allocated. Note that the return
type is void *. This means that it is not specified what type the pointer
points to - you have to specify this.
For example, to create a block of 20 characters
char *pc;
pc = (char *) malloc(20);
To create a block of 20 integers
int *pn;
pn = (int *) malloc(20 * sizeof(int));
To free these when no longer needed,
free((void *) pc);
free((void *) pn);
You use malloc usually to create dynamic structures or dynamic arrays.
In the second use, you are returned a pointer to a block of memory.
You can treat the pointer as the base of an array (but the pointer can
be changed), or just as a pointer
int *pn;
int n;
pn = (int *) malloc(20 * sizeof(int));
for (n = 0; n < 20; n++)
pn[n] = 0;
/* this loses the base of the block */
for (n = 0; n < 20; n++)
*pn++ = 0;
Example
Here is an array version of strcat
char *strcat(char *s1, char *s2)
{
char *p;
int l1 = strlen(s1);
int l2 = strlen(s2);
int n;
p = (char *) malloc(l1 + l2 + 1);
for (n = 0; n < l1; n++)
p[n] = s1[n];
/* add s2 after s1 */
for (n = 0; n < l2; n++)
p[n + l1] = s2[n];
/* null terminate string */
p[l1 + l2] = '\0';
return p;
}
Here is a pointer version of strcat,
char *strcat(char *s1, char *s2)
{
char *base, *p;
int l1 = strlen(s1);
int l2 = strlen(s2);
base = p =
(char *) malloc(l1 + l2 + 1);
while (*p++ = *s1++)
; /* empty */
p--; /* back up over the null just copied */
/* continue adding s2 to p */
while (*p++ = *s2++)
; /* empty */
return base;
}
Conclusion
An array name acts as a constant pointer to the first element of a block of fixed size.
Arrays can be manipulated using common array notation.
Pointer variables (such as formal function parameters) can be set to
array addresses, allowing C pointer mechanisms to be used to manipulate arrays.
This is confusing, but is the key to C programming style, and an
understanding of C libraries. Strings are arrays of char, null terminated.
Structures
Structures are the C equivalent of records. A structure type is defined by
struct struct-name {
type field-name;
type field-name;
...
};
e.g.
struct student_type {
char name[20];
int ID;
};
Elements of that type are defined by
struct student_type fred, bill,
all_students[100];
Because it is tedious to have to remember to use the word ``struct'' in these,
the structure is often ``typedef''-ed to avoid this:
typedef struct student_type {
char name[20];
int ID;
} student_type;
student_type fred, bill;
You access fields of a structure with the ``.'' notation:
fred.ID = 891234;
strcpy(fred.name, "fred");
It is common to have pointers to structures. The straightforward notation is
clumsy, so a shorthand is available
(*student_ptr).ID = ...
student_ptr->ID = ...
(Note that the student_ptr must be pointing to a valid record!)
Example:
Some functions to manipulate student structures.
void print(student_type *s)
{
printf("Name: %s\n",
s->name);
printf("ID: %d\n", s->ID);
}
student_type *
read(student_type *s)
{ int ID;
char name[20];
if (scanf("%d %19s",
&ID, name) != 2)
return NULL;
s->ID = ID;
strcpy(s->name, name);
return s;
}
A sample program is
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define SIZE 100
/* a program to read and print at
* most 100 student records
*/
typedef struct student_type {
char name[20];
int ID;
} student_type;
void print_student(student_type *s)
{
printf("Name: %s\n",
s->name);
printf("ID: %d\n", s->ID);
}
student_type *
read_student(student_type *s)
{ int ID;
char name[20];
printf("Enter ID and name\n");
if (scanf("%d %19s",
&ID, name) != 2)
return NULL;
s->ID = ID;
strcpy(s->name, name);
return s;
}
int main(int argc, char *argv[])
{
student_type students[SIZE];
int count = 0;
int n;
while (count < SIZE)
{
if (read_student(students +
count) == NULL)
break;
count++;
}
for (n = 0; n < count; n++)
print_student(students + n);
exit(0);
}
Example:
Printing the current date. The standard library has a number of time-related
functions. The first is
#include <time.h>
time_t time(time_t *timer)
This returns the current time, in some unspecified format. This can be changed
into a known format by functions such as
#include <time.h>
struct tm *localtime(const time_t *timer)
The structure tm has fields
struct tm {
int tm_sec; /* 0..61 */
int tm_min; /* 0..59 */
int tm_hour;/* 0..23 */
int tm_wday; /* 0..6 */
int tm_mon; /* 0..11 */
...
};
This allows you access to the localtime. For example
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
int current_day(void)
{ struct tm *local;
time_t t;
t = time(NULL);
local = localtime(&t);
return local->tm_wday;
}
int main(int argc, char *argv[])
{
printf("Today: %d\n",
current_day());
exit(0);
}
Example:
Structures can be recursive, as in lists or trees. You need to use pointers
inside the data structure. Some dynamic list functions:
#include <stdio.h>
#include <stdlib.h>
typedef struct list {
int elmt;
struct list *next;
} list_elmt, *list_ptr;
list_ptr new_elmt(int n)
{ list_ptr p;
p = (list_ptr) malloc(
sizeof(list_elmt));
if (p != NULL) {
p->elmt = n;
p->next = NULL;
}
return p;
}
void print_list(list_ptr p)
{
while (p != NULL) {
printf(" %d", p->elmt);
p = p->next;
}
}
list_ptr make_list(void)
{ /* create a list storing 0..9
(or as much of it as
possible).
*/
list_ptr start_list, p;
int n;
start_list = p = new_elmt(0);
if (p == NULL)
return NULL;
for (n = 1; n < 10; n++) {
p->next = new_elmt(n);
if (p->next == NULL)
break;
p = p->next;
}
return start_list;
}
int main(int argc, char *argv[])
{ list_ptr p;
p = make_list();
print_list(p);
exit(0);
}
Preprocessor
The first stage of compilation is to pass the source through the preprocessor.
This expands out certain symbols and produces another C source file (that you
do not normally see).
Include files
The statement
#include file
reads in the contents of the file at that point. These should be specification
files, giving details of data-types, function declarations, etc. The filename
can either be enclosed in double quotes "..." or in angle brackets <...>
#include "myspec.h"
#include <stdio.h>
Names in quotes normally refer to header files in your current directory, names
in angle brackets refer to files located in a standard place (usually /usr/include
on Unix).
Defines
If a piece of text is #define'd, then whenever that piece of text is encountered,
the remainder of the line following is substituted for it
#define MAX_SIZE 10
#define WARNING \
printf("Warning!!!\n");
if (x == 0)
WARNING
If the thing being defined has parameters then it acts as a macro and parameter
substitution is performed
#define SUM(x, y) x + y
a = SUM(b, c);
Macros are useful, but they can be a source of obscure errors:
a = SUM(b, c) * d;
becomes
a = b + c * d
Prevent this (and similar things) by enclosing everything in brackets
#define SUM(x, y) ((x) + (y))
Macros that use their arguments more than once can go wrong when used in situations
with side-effects. Here getchar() may be called twice, so that two characters are read:
#define islower(x) \
((x) >= 'a' && \
(x) <= 'z')
if (islower(getchar())) ...
Conditional compilation
The ifdef construct allows the preprocessor to keep or omit pieces of code.
I often have this:
#define DEBUG
#ifdef DEBUG
printf("Reached this bit\n");
#endif
Multiple files
A C program can be spread across many files.
When a variable or function is declared static, it is not visible outside of
its own file. This allows functions to be grouped together as a ``package''.
Here is a simple stack package:
#define SIZE 10
static int TOS = 0;
static int stack[SIZE];
int push(int n)
{
if (TOS == SIZE)
/* full up, return false */
return 0;
stack[TOS++] = n;
return 1;
}
int pop(int *n)
{
if (TOS == 0)
return 0;
*n = stack[--TOS];
return 1;
}
For completeness, this should have a specification file ``stack.h'' containing
extern int push(int n);
extern int pop(int *n);
Multiple files can be compiled all at once by placing them all on the command
line:
gcc -o prog src1.c src2.c ...
Make
There are smarter methods to avoid unnecessary compilations, which avoid
having to compile all the source files at once. By hand, a smarter method
of compiling three files to make one executable is
gcc -c src1.c
gcc -c src2.c
gcc -c src3.c
gcc -o prog src1.o src2.o src3.o
When any one of the files changes, only one of the three ``conditional''
compiles has to be repeated, plus the final link compile.
This can be automated using the ``make'' command. This expects a file
``Makefile'' which contains dependency instructions. These are of the
form
file : files it depends upon
instructions to bring it up to date
For example
OBJS = src1.o src2.o src3.o
CFLAGS = -g
prog : $(OBJS)
gcc -o prog $(CFLAGS) $(OBJS)
src1.o : src1.c
gcc -c $(CFLAGS) src1.c
src2.o : src2.c
gcc -c $(CFLAGS) src2.c
src3.o : src3.c
gcc -c $(CFLAGS) src3.c
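Then whenever you make a change, running ``make'' automatically figures
out which commands to run. With no arguments, make brings the first target
in the file up to date, so the rule for the final program should come first.
In a real Makefile each instruction line must begin with a tab character.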
make has inbuilt rules about many things, including how to compile C files.
The above can be abbreviated to
OBJS = src1.o src2.o src3.o
CFLAGS = -g
prog : $(OBJS)
gcc -o prog $(CFLAGS) $(OBJS)
System doco
UNAME(2V) SYSTEM CALLS UNAME(2V)
NAME
uname - get information about current system
SYNOPSIS
#include <sys/utsname.h>
int uname (name)
struct utsname *name;
DESCRIPTION
uname() stores information identifying
the current operating system in the
structure pointed to by name.
uname() uses the structure defined in
<sys/utsname.h>, the members of which
are:
struct utsname {
char sysname[9];
char nodename[9];
char nodeext[65-9];
char release[9];
char version[9];
char machine[9];
}
uname() places a null-terminated character
string naming the current operating
system in the character array sysname;
this string is SunOS on Sun systems.
nodename is set to the name that the
system is known by on a communications
network; this is the same value as is
returned by gethostname(2). release
and version are set to values that further
identify the operating system.
machine is set to a standard name that
identifies the hardware on which the
SunOS system is running. This is the same
as the value displayed by arch(1).
RETURN VALUES
uname() returns:
0 on success.
-1 on failure.
SEE ALSO
arch(1), uname(1), gethostname(2)
This doco defines the header file to use and the calling syntax of the function
(note that it uses ``old style'' C syntax in which the parameter types are
listed after the function).
The description shows what the structure is. If you aren't told it, then you
probably don't need to know it. The return values are shown generally indicating
success or fail. The See Also points you to other relevant functions. From
this we can write
#include <stdio.h>
#include <stdlib.h>
#include <sys/utsname.h>
int
main(int argc, char *argv[])
{
struct utsname info;
if (uname(&info) == -1) {
fprintf(stderr,
"no name??\n");
exit(1);
}
printf("sys name: %s\n",
info.sysname);
exit(0);
}
Advanced stuff - function pointers
A function starts at an address in memory. When it is called,
execution jumps to that address and executes code from that
point. So in C, a function is an address, and calling the
function is a jump to that address.
You can store the address of a function in a function pointer
variable, and call it by dereferencing that address
int f();
int (*fp)();
fp = f; /* assign address of f to fp */
(*fp) (); /* call the function pointed to */
C and objects
C is not an O/O language. It doesn't have classes,
instances of classes (objects) or methods. But you can fake them
by using structures for classes and function pointers for methods
typedef struct person {
int age;
int (*getAge) (struct person *);
void (*setAge) (struct person *, int);
} person;
int getAge(person *p) {
return p->age;
}
void setAge(person *p, int age) {
p->age = age;
}
person p;
p.setAge = setAge;
p.getAge = getAge;
(*(p.setAge)) (&p, 20);
Faking inheritance
Inheritance can be faked by building up structures that contain
parts for each bit of the inheritance chain. For example, to build
a chain of employee inheriting from person, the following could be done
typedef struct personPart {
int age;
} personPart;
typedef struct person {
personPart p;
} person;
typedef struct employeePart {
char *job;
} employeePart;
typedef struct employee {
personPart p;
employeePart emp;
} employee;
Then you can access new fields and inherited fields of employee
by
employee em;
em.p.age = 20;
em.emp.job = "clerk";
email:
jan@newmarch.name
Web:
http://jan.newmarch.name/
Last modified: 9 April, 2001