Free Lists
Not every algorithm and data structure that is fast and elegant has to be highly sophisticated. Sometimes, there are solutions to problems that are very simple and the hard thing is to realize of effective they despite their simplicity.
Free lists are a very good example of this. Free lists are linked lists that are stored inside your data structures that keep track of unused entries.
Let us consider a simple example to introduce them. Suppose we have a data structure that allows us to store the name and the address of a customer (hopefully correct C98):
1 struct customer_s
2 {
3 char[100] name;
4 char[200] address;
5 };
Now we want to manage an arbitrary number of instances of this data structure. So the obvious thing to do is to define an array of it:
#define MAX_CUSTOMER 1000000 struct data_s arr[] = malloc(MAX_CUSTOMER * sizeof(struct data_s));
We now have an array of our data structure and can fill it. However, as soon as we start to use this array we notice that we cannot decide whether an entry of the array is actually used. So we add a flag to the data structure that stores whether the entry is used. Of course we have to initialize our new array:
1 struct customer_s
2 {
3 int used;
4 char[100] name;
5 char[200] address;
6 };
7
8 struct data_s customers[] = malloc(MAX_CUSTOMER * sizeof(struct data_s));
9
10 for (int i = 0; i < MAX_CUSTOMER; i++) arr[i].used = FALSE;
Jolly good! We can now start to use our customer array. However, after having written our customer management application we realize that it is very slow. Search customers is slow but we expected this for they are not ordered.
However, inserting a new entry is also very slow! We run our code through a profiler and determine the root of all evil in the function that inserts a new customer:
1 /* search free array entry */
2 int next_customer = -1;
3 for (int i = 0; i < MAX_CUSTOMER; i++) {
4 if (!customers[i].used) {
5 next_customer = i;
6 break;
7 }
8 }
When we have a lot of customers in our array then inserting a new one forces us to look at each existing customer entry! We can certainly speed this up a bit. Consider the improved data structures:
1 struct customer_s
2 {
3 int used;
4 int next;
5
6 char[100] name;
7 char[200] address;
8 };
9
10 struct customer_arr_s
11 {
12 int first_free;
13
14 struct customer_s arr[];
15 }
16
17 struct customer_list_s customer_list;
18
19 /* initialization */
20 function create_customer_arr(struct customer_arr_s *customer_list)
21 {
22 customer_list->arr = malloc(MAX_CUSTOMER * sizeof(customer_s));
23
24 for (int i = 0; i < MAX_CUSTOMER; i++) {
25 customer_list->arr[i].used = FALSE;
26 customer_list->arr[i].next = i + 1;
27 }
28 customer_list_s->arr[MAX_CUSTOMER - 1].next = -1;
29
30 customer_list->first_free = 0;
31 }
32
33 int main(/* … */) {
34 struct customer_arr_s customer_arr;
35
36 create_customer_arr(&customer_arr);
37 }
What’s happening? First, look at struct customer_s. We have extended the data struct by one value: next. This value stores the index of the next element in the free list. The elements in this list are the entries of the customer array which are currently not used. The last element in the list has no next element so it’s next value will be -1.
How do we get the first element of the list? We introduce the data structure customer_arr_s which contains the actual array of customer_s elements and also stores the index of the first unused element.
Now the code to allocate the next free element in the list looks as follows:
int next_free = customer_arr.first_free; customer_arr.first_free = customer_arr.arr[customer_arr.first_free].next;
Thus, we can guarantee that allocating a new entry in the list takes constant time. Freeing the ithe element is possible in constant time, too:
customer_arr.arr[i].next = customer_arr.first_free; customer_arr.first_free = i;

Of course, free lists may seem superficial if you are used to dynamic high level like Python or Ruby since you can simply remove elements in the middle of “Array” etc.
But remember that there are places where plain C-style arrays make sense: In operating systems, embedded applications and generally all other places where memory usage and speed is essential.
February 18th, 2007 at 9:58 am
What I like about this approach is the use of an array index as a pointer. By this you can use other data structures for different purposes: e.G. you could us a binary tree for range queries, a hashmap for point queries but still a free list for insertion. You’d just have to adapt your data structure and algorithms “a bit”.
Is this a common C idiom that I passed by because I passed by C?
February 18th, 2007 at 10:01 am
By the way, do you have a good idea to avoid fragmentation due to the deletion of entries?
February 18th, 2007 at 9:12 pm
Free lists are a common idiom in data structures themselves (Cormen calls it “aumenting data structures”, for example).
The indexes are nothing more than simply pointers with &array as their base address. As far as I know, it could be considered “good C style” to drop pointers in favour of indexes when appropriate. Note, however, that using indexes only works best with data structures of static size - e.g. arrays.
When I want to insert the (MAX_CUSTOMER + 1)th customer then I have to reallocate the array since the indexes will not work correctly with “split” data structures. When I have a linked list of “customer tables” (each having MAX_CUSTOMER entries) then pointers would come in handy. Or I simply use a free list for every table.
An remember, array[index] is nothing more than just writing *(&array + index * sizeof(content_type)).
I do not know a good answer to overcome internal fragmentation. Regular compaction - “defragmentation” - would come at an actually noticeable cost. However, amortizedly analyzed, the cost should disappear.