Why you should use talloc for your next project

Memory management is hard. This is one of the first things a programmer learns (usually by trial and much error) when they leave academia and get out into the real world. It is very easy to make mistakes when managing memory, especially when a particular piece of data needs to live beyond the life of the function that created it. It can become difficult to know when the memory is safe to destroy, as well as when it is optimal to destroy it.

In standard C, a programmer would use malloc() and free() to manage their memory. The problem with this is that every section of memory is allocated independently. There are no inherent relationships between bits of data. The programmer is required to maintain any relationships between data in their own code.

Enter talloc, which is a hierarchical memory-management tool wrapped around C’s malloc(). The basics of talloc are easy to pick up. With talloc, you have the option of declaring that the memory you are allocating is a child of another piece of memory. The advantage to this approach is that calling talloc_free() on any piece of talloc-allocated memory will not only delete that memory, but will recursively descend through any children of that memory and free them first.

To provide a trivial example, suppose you wanted to create a new struct containing student data:

struct student {
    char *name;
};

In a traditional C approach, you would allocate memory for a new student in this manner:

struct student *student1 = malloc(sizeof(struct student));
student1->name = strdup("Steve");

Sometime later, you would free it with:

free(student1->name);
free(student1);

That works fine in the trivial case, but consider what happens when you have much more complicated data structures. It becomes a challenge to free all memory in the proper order without leaking anything. Traditionally, this would be done by creating a cleanup function for your structure. Internally, this cleanup function would recursively call the cleanup functions for every subordinate structure, until finally it freed the toplevel memory.

The problem with this approach is that it requires the creation and maintenance of large numbers of cleanup functions.

With talloc, the same problem becomes markedly simpler:

struct student *student1 = talloc(NULL, struct student);
student1->name = talloc_strdup(student1, "Steve");

Later, the struct can be freed with a single call:

talloc_free(student1);

Now, in the trivial case this doesn’t look terribly impressive, but consider what happens when you have nested structs, structs containing large numbers of strings, etc. talloc_free(<toplevel>) will recursively clean up all of the child memory. No need to write complicated cleanup functions to ensure that the memory is all gone.

Furthermore, talloc makes it very easy to abort the changes in a function. Suppose that, partway through a complicated function, a fatal error occurs. In a traditional model, you would now need to track down all the memory that has been allocated thus far in the function and free it. A cleanup function may not be of any help here, as it would expect a fully-constructed structure to remove. With talloc, you simply free the parent context, and everything allocated beneath it is cleaned up, regardless of its partially-constructed state.

So let’s talk about more advanced and useful applications of talloc. Consider the case of asynchronous services. A request comes in (on a pipe, a TCP connection, etc.) requesting some information. Assuming that the service is unable to return a reply without performing additional work (for example, contacting a remote server for authoritative data), the program would allocate memory to hold the data provided for the request, and then queue it up internally, to be processed when resources allow.

This request might require multiple trips to and from a remote server, it might require memory allocation and deallocation in many places, and it could fail with an error or be cancelled if the requesting process disconnects or otherwise indicates that it no longer cares about the reply.

So now we have a new concept: requests. With talloc, the way one would handle a request would be to create a request context. This request context would be a structure containing all of the data necessary to service the request. As the request is processed by the mainloop, it may have additional subrequests (such as the example remote server query) attached to it as children. If at any time the request needs to be terminated, for example because the original client has disconnected, all that is needed is to call talloc_free() on the original request, and talloc will walk through all of the memory allocated beneath it and clean it up.

Now, one thing I’ve glossed over is the case where just freeing the memory might not be enough. In the case of a request, before freeing memory it might be necessary to send a disconnect command to a remote server, or close a file descriptor. Talloc makes it easy to add a destructor to any allocated memory, such that when talloc_free() is called, it will first invoke this destructor so that cleanup can take place. So in the case described above, one might add a destructor to the remote server query sub-request that would terminate the server connection in a non-destructive manner (or cancel a transaction but leave the connection in place, etc.).

By now, I think you can begin to see the power inherent in the use of talloc over malloc. It’s five o’clock – do you know where your memory is?
