Introduction

This lecture deals with arrays. While arrays are simple, in C they get mixed up with pointers, and this takes some adjustment of concepts. The use of pointers for arrays is an element of C style. Strings are arrays of chars, terminated in a special way.

Arrays

Definition

Arrays always have a lower bound of zero (there is a reason, to do with pointers). So you only have to state the number of elements:

    int numbers[20];
    char *ch_ptr[30];

declares an array of 20 integers and an array of 30 pointers to characters. The indices of an array are 0 .. (size-1). A typical array walk is

    for (n = 0; n < 20; n++)
        numbers[n] = n;

Initialisation

Arrays can be initialised by a loop, as above. If you know all the values, this can also be done at definition time:

    int numbers[4] = {0, 1, 2, 3};

sets numbers[0] = 0, etc. The size can be deduced from the initialiser list:

    int numbers[] = {0, 1, 2, 3};

Sizeof

The size in bytes of anything may be found by using the ``sizeof'' operator. So if ``ch'' is of type char, ``sizeof(ch)'' is one (byte). Applied to an array, sizeof gives the total size in bytes of the whole array. Applied to one element, it gives the amount of space occupied by that element. So if you forget how many elements the array has,

    num = sizeof(a)/sizeof(a[0]);
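The element-count idiom above can be wrapped up as in the following minimal sketch. The names NELEMS, table, table_count and table_bytes are made up for illustration and are not part of the lecture or of any library.

```c
#include <stddef.h>

/* NELEMS is a hypothetical helper name, not a standard macro. */
#define NELEMS(a) (sizeof(a) / sizeof((a)[0]))

static int table[] = {10, 20, 30, 40, 50};

/* Number of elements in table, worked out by the compiler. */
size_t table_count(void)
{
    return NELEMS(table);
}

/* Total size of table in bytes: element count times element size. */
size_t table_bytes(void)
{
    return sizeof(table);
}
```

This only works where the array definition itself is visible. Inside a function that receives the array as a parameter, the parameter is a pointer, and sizeof gives the size of the pointer instead.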

Function parameters

When arrays are used as function parameters, the size is omitted:

    int main(int argc, char *argv[])

makes argv an array (of some size) of pointers to chars.

    /* read a set of numbers and
     * find the range between
     * largest and smallest
     */
    #include <stdio.h>

    #define MAX 100

    int spread(int b[], int size);

    int main(int argc, char *argv[])
    {
        int n;
        int count = 0;
        int a[MAX];

        /* stop at end of input, or at the first item that is not a number */
        while (count < MAX && scanf("%d", &n) == 1) {
            a[count++] = n;
        }
        printf("spread was %d\n", spread(a, count));
        return 0;
    }

    int spread(int b[], int size)
    {
        int lo, hi, i;

        if (size == 0)
            return 0;
        lo = hi = b[0];
        for (i = 0; i < size; i++) {
            if (b[i] > hi)
                hi = b[i];
            if (b[i] < lo)
                lo = b[i];
        }
        return (hi - lo);
    }

Arrays and pointers

The name of an array - just by itself - is the address of the base of the array (apart from a few contexts such as sizeof, where it stands for the whole array):

    a == &a[0]
    *a == a[0]

In general,

    a + n == &a[n]
    *(a + n) == a[n]

The ``spread'' function could have been written

    int spread(int b[], int size)
    {
        int lo, hi, i;

        if (size == 0)
            return 0;
        lo = hi = b[0];
        for (i = 0; i < size; i++) {
            if (*(b+i) > hi)
                hi = *(b+i);
            if (*(b+i) < lo)
                lo = *(b+i);
        }
        return (hi - lo);
    }

When an array is passed as a parameter to a function, its address is passed. This is a pointer value. The function, instead of declaring the parameter as an array, could have declared it as a pointer.

This is a very common practice. Many library functions declare their arguments as pointers, but you have to pass in an array.
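As a small sketch of this interchangeability, the two functions below do the same job. One declares its parameter as a pointer but uses array notation, the other declares it as an array but uses pointer arithmetic. The names sum_index and sum_pointer are invented for this illustration.

```c
/* sum_index and sum_pointer are illustrative names. */

/* Pointer parameter, array notation in the body. */
int sum_index(int *b, int size)
{
    int i, total = 0;

    for (i = 0; i < size; i++)
        total += b[i];
    return total;
}

/* Array parameter, pointer arithmetic in the body. */
int sum_pointer(int b[], int size)
{
    int i, total = 0;

    for (i = 0; i < size; i++)
        total += *(b + i);
    return total;
}
```

Either function can be called with the name of an array, as in sum_index(a, 4), or with an address inside one, as in sum_index(a + 1, 3), which sums the elements from a[1] onwards.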

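Once a parameter is a pointer, you can do pointer manipulations instead of array indexing, which was traditionally the costlier of the two (modern optimising compilers usually generate much the same code for both). Here is another ``spread'':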

    int spread(int *b, int size)
    {
        int hi, lo, i;

        if (size == 0)
            return 0;
        lo = hi = *b;
        for (i = 0; i < size; i++) {
            if (*b > hi)
                hi = *b;
            if (*b < lo)
                lo = *b;
            b++;
        }
        return (hi - lo);
    }

This style of coding is very common. Compare these two program fragments

    int a[] = {1, 2, 3};

    void printit(int *a)
    {
        int n;

        for (n = 0; n < 3; n++) {
            printf("%d\n", *a);
            a++;
        }
    }

    int main(int argc, char *argv[])
    {
        printit(a);
    }

versus

    int a[] = {1, 2, 3};

    int main(int argc, char *argv[])
    {
        int n;

        for (n = 0; n < 3; n++) {
            printf("%d\n", *a);
            a++;    /* error: an array name cannot be changed */
        }
    }

The second is wrong, because the array name a is a fixed address and cannot be incremented. The first is good C style: the parameter a is a pointer variable holding a copy of that address, and it can be changed freely.

Strings

A string is an array of chars, terminated by a null character. These are equivalent:

    char str[] = "hello";
    char str[] = {'h', 'e', 'l', 'l', 'o', '\0'};

You can count on the null char being at the end of a string constant, but not at the end of a char array you have filled in yourself. If you create a string you should ensure that you null-terminate it. Otherwise, any function that walks the string runs off the end of the array. Because strings end in null, a string walk looks like

    while (*str != '\0')
        str++;

Here is strlen

    int strlen(char *str)
    {
        int length = 0;

        while (*str != '\0') {
            str++;
            length++;
        }
        return length;
    }

Now this is where you get to see some of the special lurks that C has. You can combine the increment into the loop:

    int strlen(char *str)
    {
        int length = 0;

        while (*str++ != '\0')
            length++;
        return length;
    }

Indeed, since '\0' is 0, and 0 counts as false,

    int strlen(char *str)
    {
        int length = 0;

        while (*str++)
            length++;
        return length;
    }

Here is the compact form of strcpy

    void strcpy(char *from, char *to)
    {
        while (*to++ = *from++)
            ;   /* empty body */
    }

(Note the argument order here: the library strcpy takes the destination first.) You do eventually get used to this. However, you could always use the more readable versions!
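The advice about null-terminating strings you build yourself can be sketched as follows. The function copy_prefix is a hypothetical helper written for this illustration, not a library routine, and it assumes the caller supplies a destination with room for n + 1 chars.

```c
/* copy_prefix is a hypothetical helper: it copies at most n characters
 * of src into dst and always null-terminates the result.
 * Assumption: dst has room for n + 1 chars. */
void copy_prefix(char *dst, const char *src, int n)
{
    int i;

    for (i = 0; i < n && src[i] != '\0'; i++)
        dst[i] = src[i];
    dst[i] = '\0';   /* without this, dst is not a valid string */
}
```

Calling copy_prefix(buf, "hello", 3) leaves the string "hel" in buf. The final assignment is the important part: the loop copies only ordinary characters, so the terminator has to be written explicitly.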

Library functions

Here are some of the library functions that declare their parameters as pointers but actually expect an array:

    FILE *fopen(char *filename, char *mode)
    int puts(char *s)
    char *strcpy(char *s1, char *s2)
    int atoi(char *nptr)
    void *bsearch(void *key, void *base, ...

Here filename, mode, s, s1, s2, nptr are all strings. base is the address of the array to be binary searched. key, however, is a pointer to an object, not necessarily an array. Note that the functions sometimes return pointers. These may or may not be arrays. fopen returns a pointer to a structure (record) of type FILE, bsearch returns the address of the element found (or NULL if there is none), whereas strcpy returns the address of the array s1.
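To make the roles of key and base concrete, here is a minimal sketch that calls bsearch on a sorted array of ints. The names compare_ints and find_index are invented for this example. The subtraction of two pointers to obtain an index goes slightly beyond what the lecture covers.

```c
#include <stdlib.h>

/* Comparison function in the form bsearch expects: negative, zero or
 * positive as *a is less than, equal to or greater than *b. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

/* find_index is an illustrative wrapper: it returns the index of key
 * in the sorted array base, or -1 if key is absent. */
int find_index(int key, int *base, int count)
{
    int *found = bsearch(&key, base, count, sizeof(int), compare_ints);

    if (found == NULL)
        return -1;
    return (int) (found - base);   /* pointer difference gives the index */
}
```

Here &key is a pointer to a single object, while base is the name of an array, which matches the distinction drawn above.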

Command line arguments

When a C program is compiled and run as a command, the command line parameters are available inside the program through the arguments to the main function

    int main(int argc, char *argv[])

argc is the number of command-line arguments (including the command). For example, if your C program was called ``ask'', and you called it by

    ask anybody there

then argc would be 3. The argv array contains 3 pointers to chars, each pointing to a string. The values are:

    argv[0] == "ask"
    argv[1] == "anybody"
    argv[2] == "there"

A program to print out the command line arguments is

    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char *argv[])
    {
        int i;

        printf("Args to command:\n");
        for (i = 0; i < argc; i++) {
            printf("arg %d is %s\n", i, argv[i]);
        }
        exit(0);
    }

An equivalent program, using pointers instead of arrays, is

    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int i;

        printf("Args to command:\n");
        for (i = 0; i < argc; i++, argv++) {
            printf("arg %d is %s\n", i, *argv);
        }
        exit(0);
    }
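Since argv is an ordinary array of strings, it can be handed on to other functions just as main receives it. The sketch below adds up the lengths of the arguments. The name total_length is made up for this illustration.

```c
#include <string.h>

/* total_length is an illustrative name: it adds up the lengths of the
 * argc strings in argv, as main could do for its own arguments. */
int total_length(int argc, char *argv[])
{
    int i, total = 0;

    for (i = 0; i < argc; i++)
        total += (int) strlen(argv[i]);
    return total;
}
```

For the ``ask anybody there'' command above, total_length(argc, argv) would give 3 + 7 + 5 = 15.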

Dynamic memory allocation

Programs cannot rely on just local and global data structures. It is often necessary to create dynamic chunks of memory. The functions to manipulate dynamic memory are malloc and free.

    #include <stdlib.h>

    void *malloc(size_t size);
    void free(void *ptr);

The malloc function takes one argument, which is the number of bytes of dynamic storage to allocate. It returns a pointer to the base of this memory. Note that the return type is void *. This means that it is not specified what type the pointer is pointing to - you have to specify this.

For example, to create a block of 20 characters

    char *pc;
    pc = (char *) malloc(20);

To create a block of 20 integers

    int *pn;
    pn = (int *) malloc(20 * sizeof(int));

To free these when no longer needed,

    free((void *) pc);
    free((void *) pn);
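One detail these fragments leave out is that malloc returns NULL when it cannot supply the memory. A minimal sketch of an allocation that checks for this is given below. The name make_zeroed is invented for the example, and count is assumed to be positive. The cast on malloc is omitted here, which C permits because void * converts to any object pointer type.

```c
#include <stdlib.h>

/* make_zeroed is an illustrative name. It returns a block of count
 * ints, all set to zero, or NULL if the allocation fails.
 * Assumption: count is positive. The caller must free() the result. */
int *make_zeroed(int count)
{
    int n;
    int *block = malloc(count * sizeof(int));

    if (block == NULL)
        return NULL;          /* malloc could not find the memory */
    for (n = 0; n < count; n++)
        block[n] = 0;
    return block;
}
```

A caller would test the result before using it, and pass it to free when finished.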

You usually use malloc to create dynamic structures or dynamic arrays. In the second use, you are returned a pointer to a block of memory. You can treat the pointer as the base of an array (but the pointer can be changed), or just as a pointer:

    int n;
    int *pn;

    pn = (int *) malloc(20 * sizeof(int));
    for (n = 0; n < 20; n++)
        pn[n] = 0;

    /* this loses the base of the block */
    for (n = 0; n < 20; n++)
        *pn++ = 0;
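One common way to avoid losing the base of the block is to walk with a second pointer. The following is a sketch under that assumption. The name fill_squares is made up for the example, and count is assumed to be positive.

```c
#include <stdlib.h>

/* fill_squares is an illustrative name. It allocates count ints and
 * stores n*n in element n, walking with a second pointer so that the
 * base of the block is kept for the caller (and for free()). */
int *fill_squares(int count)
{
    int n;
    int *base = malloc(count * sizeof(int));
    int *p = base;            /* walking pointer; base is never changed */

    if (base == NULL)
        return NULL;
    for (n = 0; n < count; n++)
        *p++ = n * n;
    return base;
}
```

Because base still points at the start of the block, the caller can index it as an array and later hand the same address to free.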

Example

Here is an array version of strcat

    char *strcat(char *s1, char *s2)
    {
        char *p;
        int l1 = strlen(s1);
        int l2 = strlen(s2);
        int n;

        p = (char *) malloc(l1 + l2 + 1);
        for (n = 0; n < l1; n++)
            p[n] = s1[n];
        /* add s2 after s1 */
        for (n = 0; n < l2; n++)
            p[n + l1] = s2[n];
        /* null terminate string */
        p[l1 + l2] = '\0';
        return p;
    }

Here is a pointer version of strcat,

    char *strcat(char *s1, char *s2)
    {
        char *base, *p;
        int l1 = strlen(s1);
        int l2 = strlen(s2);

        base = p = (char *) malloc(l1 + l2 + 1);
        while (*p++ = *s1++)
            ;   /* empty */
        /* step back over the null copied from s1 */
        p--;
        /* continue adding s2 to p */
        while (*p++ = *s2++)
            ;   /* empty */
        return base;
    }

Both versions return a newly allocated string, which the caller should free when it is no longer needed. The library strcat instead appends s2 to s1 in place.

Conclusion

An array name acts as a pointer to a fixed address, the base of a block of fixed size. Arrays can be manipulated using common array notation. Pointer variables (such as formal function parameters) can be set to array addresses, allowing C pointer mechanisms to be used to manipulate arrays. This is confusing, but it is the key to C programming style and to an understanding of the C libraries. Strings are arrays of char, null terminated.

http://pandonia.canberra.edu.au/OS/l5_1.html, copyright Jan Newmarch.

It is maintained by Jan Newmarch.
email: jan@ise.canberra.edu.au
Web: http://pandonia.canberra.edu.au/

Last modified: 14 August, 1995