This is a preliminary version!
Hide style3dev.css to get rid of the annotations!

Lesson 5: Linked lists

Purpose: Introduce linked lists.
References: The tutorialspoint about

There is also relevant material in wikibooks: C Programming.

Introduction

One way to keep track of a bunch of values of the same type is to use an array.

The elements in an array are stored in an continuous area in the computer memory.

Example:

int a[5] = {3, 0, 5, 0, 0};
which can be illustrated bild

Advantages with arrays:

  1. Storage efficiency. Almost no storage overhead if we know how many elements there are to be stored.
  2. Fast (constant time) access to elements using index no matter how large the array is.

Disadvantages with arrays:

  1. All elements have to be of the same type.
  2. The size must be known at creation time.
  3. If we want to insert new values or delete old values already stored elements have to be moved.

The first disadvantage can be handled by storing pointers in the array. Example: An array of strings

char *a[5]; a[0] = "Ora"; a[1] = "et"; a[2] = "labora!"; a[3] = NULL;
which can be illustrated bild
Remember that these statements are pointer assignments!

Linked lists

The idea in a linked list is to let every element have a pointer to the following element.

Suppose we want to represent a queue of persons. We can add a pointer in the struct representing a person:

bild linkedList1.png

Each struct in the list contains a field, for example named next, pointing to the successor. The variable first keeps track of the first person.

However, a next field is not a very natural property for person. A more natural solution is to create a special struct for keeping track of the elements a list:

bild

The list structs could for example, be declared:

struct ListNode { struct Person *data; struct ListNode *next; }

Many programming languages have such structures defined but, fortunately :-), in C we have to do everything ourselves.

Example: A list of integers

We will work with a list storing integers. Although it may seem as a very limited example, it will illustrate many of the fundamental techniques for handling lists.

We will build a structure of this type:

bild

To implement this in C we use the following declarations:

struct Node { int data; struct Node *next; }; typedef struct Node Node;

It is convenient to have a function for creating new list nodes, a so called constructor:

CodeExplanation
Node *cons(int data, Node *next) { Returns a pointer to the allocated Node
Node *n = (Node *) malloc(sizeof(Node)); Allocates memory for a Node struct
if (n == NULL) { printf("Out of memory in cons\n"); exit(1); In case of the (very unlikely) event that we ran out of memory.
} else { (*n).data = data; (*n).next = next; } return n; } Stores the parameter values for data and next pointer in the allocated Node and returns the pointer to it.

A function for printing lists is also nice to have:

CodeExplanation
void print(Node *f) { Gets a pointer to the first node
printf("["); Enclose the list by hard parentheses
while (f) { Iterate as long as f does not contain NULL
printf("%d", (*f).data); Print the data value in this node
if ((*f).next) { printf(", "); } Add a comma and a space unless it is the last node
f = (*f).next; } Advance the pointer to the next node and take the next iteration
printf("]"); } Print the closing parentheses and leave

And, finally, a small test program:

CodeExplanation
int main() { Node *list = NULL; for (int i = 10; i > 0; i--) { list = cons(i, list); } print(list); printf("\n"); } Puts the numbers 10, 9, 8, ... , 2, 1 successively as the first element in
the list using the cons function and prints it.
You can get the source code from this link.

An add function

Although a list can be build by the cons-method as done in the test program above, we would like to have a more general function for adding data to the list.

We will write an add function which, in it's first version put the new element at the first position in the list. (This is actually exactly what was done by the statement list = cons(i, list); in the main function above but it is of interest to study the parameter passing technique.)

We would like to be able to say something like  add(list, data)  so that the loop in the main above can be written
Node *list = NULL; for (int i = 10; i > 0; i--) { addFirst(list, i); }

The problem is that the list parameter is passed by value and can not be changed by the add function. We could solve this problem by letting add return a pointer to the new first node in which case we have to call the function   list = new(list, i);. We will, however, take another approach.

Instead of passing a pointer to the first node in the list, we pass the address to the pointer to the first node in the list. This makes it possible for the add function to change the pointer to the first node.

void addFirst(Node **f, int x) { *f = cons(x, *f); }
So, if  f  is a node then  *f  is a pointer to a node and  **f  is a pointer to a pointer to a node.

This function is in the code we linked above. The code also includes two implementations of a function that adds a new node at the end of the list.

Exercise: Make sure you understand the two addLast functions!

Some new notations

There is an operator -> which is quite useful for accessing fields in dynamically allocated structs.

If p is a pointer to a struct with a field named x we can write p->x instead of (*p).x 

We will introduce that in our code but we will also define a new type alias for links to nodes with help of typedef. We can actually define both the Node struct, the Node type and the link in the same typedef statement:

typedef struct Node { int data; struct Node *next; } Node, *link;
With this new notation and typedef the cons function can be written:
New version Old version
link cons(int data, link next) { link n = (link) malloc(sizeof(Node)); if (n == NULL) { printf("Out of memory in cons\n"); exit(1); } else { n->data = data; n->next = next; } return n; }
Node *cons(int data, Node *next) { Node *n = (Node *) malloc(sizeof(Node)); if (n == NULL) { printf("Out of memory in cons\n"); exit(1); } else { (*n).data = data; (*n).next = next; } return n; }

We will use the new notation but you can use either one (even in the same code) - it is mostly a matter of personal taste.

Here is a link to integer list program written with the new notation. It also contains some more functions that we will discuss now.

A remove function

This function removes the first node from the list and returns the value stored in the removed node.
CodeExplanation
int removeFirst(link *first) { We must give the address to the pointer to the first node to be able to change it.
assert(*first); Abort if the list is empty
link toBeReleased = *first; Set a pointer to the node which should be freed
*first = (*first)->next; Change the pointer to the first node
int result = (*toBeReleased)->data; Take out the intended result (must be done before the node is freed).
free(toBeReleased); Release the former first node.
return result; } And return the value store in the released node
Note that it is important that the function releases the area previously allocated by malloc. The following implementation
int removeFirst(link *n) { assert(*n); int result = (*n)->data; *n = (*n)->next; return result; }
will work but will cause a memory leak which, in the end, could cause program failure.

Recursive or iterative functions?

Many function for list handling are elegantly written using recursion.

Example: Make a copy of the list

link cloneRec(link f) { if (f == NULL) { return NULL; } else { return cons(f->data, cloneRec(f->next)); } }
Compare with the iterative variant:
link cloneIter(link first) { if (first==NULL) { return NULL; } else { link result = cons(first->data, NULL); link last = result; for (link n = first->next; n != NULL; n = n->next) { last = last->next = cons(n->data, NULL); } return result; } }

A sorted list

Suppose we want to keep the list sorted with the data in increasing order:

bild

Then we need the function e.g. named addSorted instead of addFirst.

A new node could go into the beginning, somewhere in the interior or at the end of the list:

bild

Obviously, we have to change a pointer to get the new node into the list. It could be the first-pointer or one of the next-pointers in the nodes:

bild

We could the solve the problem by iterating over node pointers. Note that there are 5 nodes but 6 node pointers in the example.

The variable n in the figure above is supposed to point to node pointers. Thus it should be declared link * or, equivalent, Node **.

A pure iteration over the node pointers can be written:

Node **n = &first; while (*n != NULL) { // do something n = &((**n).next); }
link *n = &first; while (*n != NULL) { // do something n = &((*n)->next); }
Explanation of the expression
&((**n).next):

By assigning the value of &((**n).next) to n makes n point to the next pointer to a node.

Explanation of the expression
&((*n)->next):

By assigning the value of &((*n)->next) to n makes n point to the next pointer to a node.

To add a new value x in a sorted list keeping the list sorted we have to find the address of the pointer to the first node with a value greater than x since that is the pointer to be set to the new node. Before we look the value we have to make sure that we are not the last node.

void addSorted(Node **n, int x) { while (*n != NULL && x > (**n).data) { n = &((**n).next); } *n = cons(x, *n); }
void addSorted(link *n, int x) { while (*n != NULL && x > (*n)->data) { n = &((*n)->next); } *n = cons(x, *n); }

Note the order in the while-condition: We must check *n against NULL before we try to access the target of the pointer.

This function can also nicely be expressed with recursion:

void addSortedRecursive(link *n, int x) { if (*n == NULL || x < (*n)->data) { *n = cons(x, *n); } else { addSortedRecursive(&((*n)->next), x); } }
It probably needs some thinking to understand the code.

The program also contains a toString-function which is somewhat more useful than the print function. It makes use of the library function sprintf which works as printf but stores the result in string (character array) instead of writing it to standard output. Try to understand it!

Exercises

Try both iterative and recursive solutions!
  1. Write a function int isEmpty(link first) that returns 1 if the list has no elements, else 0.
  2. Write a function int size(link first) that returns the number of elements in the list.
    Given this function, is there any point in having the previous function isEmpty?
  3. Write a function int contains(link first, int x) that returns 1 if the value x is in the list, else 0.
  4. Write a function int indexOf(link first, int x) that returns the index (position) of the first occurrence of x in the list. What shall the function do if x is not in the list?
  5. Write a function int getLast(link first) that returns the data stored in the last element in the list. What shall the function do if the list is empty?
  6. Write a function int get(link first, int n) that returns the value stored in the n:th position in the list. The positions are numbered 0, 1, 2, ... as in arrays.
  7. Write a function int maximum(link first) that returns the maximum value stored in the list.
  8. Write a function that adds an element at the end of the list. Will it it work if the list is empty?
  9. Write a function that removes the last element of the list.
  10. Write a function add(link *first, int n, int d) that inserts (not replaces) a new element at the n:th position with d as data.
  11. Write a function remove(link *first, int x) that removes the first occurrence of x
  12. A function clear(link *first) that removes all elements in the list.
  13. Suppose list refers a sorted list. Write a function link condense(link first) that returns a copy of the list but without duplicates.
  14. Write a function int equals(link f1, link f2) that returns 1 if the two lists are equal in the sense that contains the same data in the same order else 0.

A general list

We shall now go into the problem of making a list structure for storing anything. We want to create a set of declarations and functions for a list where we can store any type of data, for example a list of strings or a list of structs representing persons.

Instead of storing an int in the list nodes we will store a pointer declared void * :
typedef struct Node { void *data; struct Node *next; } Node, *link;
To C it is just a memory address and it is up to the user to keep track what type is found at the end of the pointer. Illustration:
bild
A constructor function is very similar to the one for the lists of integers:
Node *cons(void *data, Node *next) { Node *n = (Node *) malloc(sizeof(Node)); if (n == NULL) { printf("Out of memory in cons\n"); exit(1); } else { n->data = data; n->next = next; } return n;

Some of the functions in the previous section like isEmpty and size will still work but no function dealing with contents of the nodes (like print and contains) will work and can not even be written since the list designer don't know the data type.

However, the list designer can provide an iterator. An iterator is a mechanism that enables the user to access the data in the list without knowing how the list is constructed.

A very simple iterator mechanism can be implemented with the following function:

void *iterator(link *l) { if (*l == NULL) { return NULL; } else { void *result = (*l)->data; (*l) = (*l)->next; return result; } }

The function takes a pointer to a node pointer as parameter. A call to the function will return the data stored at that node and advances the pointer to the next node.

Example of usage:
int main() { link list = NULL; list = cons("Alpha", list); list = cons("Bravo", list); list = cons("Chalie", list); list = cons("Delta", list); printf("\n\nIterator:\n"); link iter = list; while(1) { char *s = (char *) iterator(&iter); if (s==NULL) { break; } printf("%s\n", s); } }
The point is that the main function can go over the contents in the list without knowing how the list is organized internally and that the list functions does not have to know what type of data the user has stored in the list.

Exercises

Copy the code shown above and use it in the following exercises
  1. Write a function int contains(link first, char *s) that uses the iterator to check if the string s is in the starting with first.
  2. Use the general list for storing structs defining persons (name, age) defined in lesson 4. Write a print function for such a list.

Valid CSS!