recipes : programming : Writing better code : Speeding up code with pre-allocation

Problem

How do I pre-allocate to increase speed?

SolutionPre-allocating is an important concept: properly used it can drastically increase the speed of your code. "Pre-allocation" is the technical term for telling Matlab in advance how much space your variable will occupy. Let's take an example: When you create a variable and fill it with data, Matlab must first reserve a suitably sized block of memory to hold the information. The simple command a=1 creates the variable a and assigns it the value 1. If you type whos a you will see how much space this takes in memory:

>> whos a Name Size Bytes Class Attributes a 1x1 8 double

So a is a double-precision floating point number that occupies 8 bytes in RAM. If we were to create a 10 by 10 matrix by executing the command a=ones(10) then it would occupy 800 bytes, since it contains 100 doubles. Whenever you create a variable Matlab does the following:

- Reserves a region of memory. 8 bytes in the case of a=1
- Stores the data in it
- Provides the user with access to this region in memory using a label (a in this case)

%On each pass through this for loop the vector "a" increases in length by one for ii=1:100 a(ii)=rand; end

The above is *bad* because it's slow an inefficient. Let's look at why. On the first pass through the loop, a will have a length of 1 so Matlab reserves 8 bytes of RAM and shoves a random number into this. On the second pass through the loop, Matlab looks for a new bit of RAM into which it can fit 2 doubles (16 bytes). First it copies a(1) into this and into a(2) it inserts a new random number. Then in frees up the RAM previously occupied by a(1).
So, on each pass through the loop it copies the old version of a into a block of memory that is 8 bytes larger than the previous block then adds one more random number. This is time-consuming because it means Matlab has to find and allocate RAM 100 times and it has copy stuff 99 times.

Pre-allocation is the better solution. When we pre-allocate we tell Matlab *in advance* that we want 100 random numbers. Matlab will therefore reserve an 800 byte block of memory once and then fill it as we go allong. The pre-allocated version of the above code looks like this

a=ones(1,100); %This line pre-allocates the vector "a" for ii=1:100 a(ii)=rand; end

Notice that all we've done is create a variable filled with 1s before the start of the loop. On each pass through a 1 is replaced by a random number. On my machine, the second version runs about 5 times faster than the first version. That's a fairly significant speed increase, but the speed increases are even bigger when concatenating arrays:

%We will make a matrix with 10,000 rows; each row contains the numbers % 1 to 10 % First we make the matrix by concatenating (growing it one row at a time) clear all x=1:10; MATRIX=[]; tic for ii=1:10000 MATRIX=[MATRIX;x]; end toc >> Elapsed time is 2.774238 seconds % Now with pre-allocation clear all MATRIX=ones(10000,10); %pre-allocate the matrix x=1:10; tic for ii=1:10000 MATRIX(ii,:)=x; end toc >> Elapsed time is 0.022790 seconds

In this matrix example we get a 100 x speed increase when we pre-allocate. Larger matrices involve more copying and so stuff takes even longer.

DiscussionIt's good practice to pre-allocate where possible as doing so makes a lot of difference. There may be situations where you don't know the final size of your array. If so, you can try to guess the largest size you array could be, pre-allocate based on this, then trim the unwanted data.

A word of caution, though. There are instances where pre-allocation isn't needed. Here's an example:

a=ones(100); %makes a 100 by 100 array of ones a=someFunction; %someFunction returns a 100 by 100 array