Making simple box plots

Problem

How do I make a box plot in MATLAB?

Solution

A box plot is a useful non-parametric statistical plot. In other words, it summarises the distribution of the data without assuming any particular underlying distribution. This makes box plots especially useful when data are not normally distributed. Making a box plot in MATLAB is easy with the boxplot command from the Statistics Toolbox.

data=randn(10,5);
data(:,3)=data(:,3)+2;

boxplot(data) 
[Figure: box plot of data, one box per column]

Above, we generated a 10 by 5 matrix. MATLAB treats that as 5 groups of data (the columns), each of which contains 10 observations (the rows). The boxplot command therefore draws 5 boxes. Let's say, however, that we want to customise our box plot. The plot is built up of individual line elements. If you call the command with one output argument, it returns the handles of those elements.

>> H=boxplot(data)                   

H =

  175.0013  176.0013  177.0013  178.0013  179.0013
  180.0013  181.0013  182.0013  183.0013  184.0013
  185.0013  186.0013  187.0013  188.0013  189.0013
  190.0013  191.0013  192.0013  193.0013  194.0013
  195.0013  196.0013  197.0013  198.0013  199.0013
  200.0013  201.0013  202.0013  203.0013  204.0013
  205.0013  206.0013  207.0013  208.0013  209.0013

Ouch! What do we do with all those handles? Well, there are 5 columns' worth of handles, so each column evidently relates to one box plot (we have 5 of those, too). To figure out what the individual handles in a column relate to, we can do:

>> get(H(:,3),'tag')

ans = 

    'Upper Whisker'
    'Lower Whisker'
    'Upper Adjacent Value'
    'Lower Adjacent Value'
    'Box'
    'Median'
    'Outliers'

Ah... Now it's all making sense. Each plot element has a "tag" property, and reading it with the get command tells us what each handle is for. The median is the 6th entry in the list, so let's try changing the median bar of the 3rd box plot to a thick green line:

set(H(6,3),'color','g','linewidth',2)

[Figure: the same box plot with the median of the 3rd box drawn as a thick green line]
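
Hard-coding row 6 relies on the handle order printed above. A minimal sketch that looks the row up by its tag instead, assuming the same data matrix as before and that the median line is tagged 'Median' as in the output above (the variable names tags and medianRow are made up for this example):

```matlab
data = randn(10,5);
data(:,3) = data(:,3) + 2;

H = boxplot(data);

% Read the tags for the first box; every column of H uses the same order.
tags = get(H(:,1), 'tag');

% Find which row holds the median line rather than assuming it is row 6.
medianRow = find(strcmp(tags, 'Median'));

% Restyle the median of the 3rd box.
set(H(medianRow,3), 'color', 'g', 'linewidth', 2)

% Alternatively, thicken every median line in the current axes at once.
set(findobj(gca, 'Tag', 'Median'), 'linewidth', 2)
```

The findobj call is handy when you no longer have H, for example when the plot was drawn by someone else's function.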

You now know both how to create and modify box plots in MATLAB!
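
Data does not always arrive as a tidy matrix with one column per group. As a rough sketch, boxplot also accepts a vector of observations together with a grouping variable of the same length. The data and the label names below are invented purely for illustration:

```matlab
% Hypothetical example data: 30 observations in one column vector.
x = [randn(10,1); randn(10,1) + 2; randn(10,1) - 1];

% Group label for each observation (1, 2 or 3), same length as x.
g = [ones(10,1); 2*ones(10,1); 3*ones(10,1)];

% One box per distinct value of g, with custom tick labels (made-up names).
boxplot(x, g, 'Labels', {'control', 'treated', 'washout'})
ylabel('Measurement')
```

This form also copes with groups of unequal size, which a matrix cannot hold without padding.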

Discussion

Although bar charts are often used in place of box plots, possibly because they feel less cluttered and are easier to take in quickly, box plots provide more information. For instance, they show up things such as skewness, which a conventional bar chart will not show.
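
To see the point about skewness, here is a small sketch that draws the same sample both ways. The sample (squared normal random numbers) is an assumption chosen only because it is right-skewed; any skewed data would do:

```matlab
rng(0);                  % fix the seed so the figure is repeatable
x = randn(200,1).^2;     % squared normals: a right-skewed sample

subplot(1,2,1)
bar(mean(x))             % bar chart of the mean: a single number
title('Bar chart of mean')

subplot(1,2,2)
boxplot(x)               % box, whiskers and outliers for the same sample
title('Box plot')
```

With data like this, the box plot will typically show the median sitting towards the bottom of the box with a long upper whisker and outliers above it, none of which is visible in the single bar.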

 
