Performing a t-test
Problem
How do I perform a t-test in MATLAB?
Solution
The MATLAB Statistics Toolbox contains the ttest command, which performs one-sample and paired tests. There is also the ttest2 command for performing an un-paired t-test.
Defining some terms
First of all, let's define some terms. A one-sample t-test asks
whether the mean of a distribution is significantly different from a
particular value (often zero). A two-sample t-test asks whether
the means of two different distributions are the same. A paired
test indicates that the data from the two groups are linked in
some way. For example, you've measured the weight of 10 different
peacocks on day A and again on day B, and you want to know if their
weights have altered. Since you have measured each peacock twice, you
want to do a paired test. An un-paired test is one where the
data from the two groups are not linked. For example, you've measured
10 peacocks from Italy and 10 peacocks from Mongolia and you want to
know if their average weights differ. Since you have measured
different specimens in each group, you perform an unpaired
test. Choosing the right test is important, as we will see below. The
Wikipedia page
discusses the assumptions of the test. I'm not going into them here,
beyond saying that the data need to be fairly normally distributed and
ideally have fairly similar variance (although there are versions of
the t-test which waive the equal variance assumption). At larger
sample sizes, the normality assumption becomes less and less
important.
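As a rough sketch of the unequal-variance version mentioned above: ttest2 accepts a 'Vartype','unequal' option, which requests the test that does not pool the two variances. The peacock weights below are made up purely for illustration, and the variable names are hypothetical.

```matlab
% Hypothetical data: two unrelated groups with different spreads.
rng(1);                               % fix the seed so the sketch is repeatable
italy    = 4.0 + 0.3*randn(1,10);     % made-up weights, small spread
mongolia = 4.2 + 0.9*randn(1,10);     % made-up weights, larger spread

% Default: assumes both groups share the same variance (pooled estimate).
[hPooled,pPooled] = ttest2(italy,mongolia);

% 'Vartype','unequal' drops the equal-variance assumption.
[hWelch,pWelch,~,statsWelch] = ttest2(italy,mongolia,'Vartype','unequal');

fprintf('pooled p = %.4f, unequal-variance p = %.4f (df = %.2f)\n', ...
        pPooled, pWelch, statsWelch.df);
```

With unequal variances the degrees of freedom are approximated, so stats.df is usually not a whole number and is never larger than the pooled value of n1+n2-2.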
Let's warm up by running a one-sample t-test.
data=randn(1,24)+0.5; %Generate and plot some random data.
boxplot(data)
Now we test if the mean is significantly different from zero.
[h,p,ci,stats]=ttest(data)
h =
     1
p =
    0.0246
ci =
    0.0667    0.8852
stats =
    tstat: 2.4060
       df: 23
       sd: 0.9691
Understanding the outputs
The ttest command returns a bunch of stuff in those 4 output
arguments. If you look at the help page you will get an overview of
what it all is. Let's quickly go over the output for our
test.
p is the p-value, and it means something very specific. It is the probability that a sample mean at least as far from the null value as the one observed would occur by chance, given that the null hypothesis is true and that the data conform to the assumptions of the test. This is all the p-value is. Note that the p-value does not indicate the practical significance of the effect. In other words, a tiny p-value means that data like yours would be very unlikely if the null hypothesis were true, but it doesn't say anything about how important the effect is. Neither, as is often thought, does it indicate the probability that the null hypothesis is true or false. Again, the p-value cannot tell you whether the hypothesis is true.
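To see how the four outputs relate to one another, here is a minimal sketch that rebuilds them by hand for a one-sample test against zero. It assumes the default settings (two-tailed test, 5% significance level); the names with a Hand suffix are mine. In short, h is 1 when the null hypothesis is rejected at the 5% level, ci is the 95% confidence interval for the mean, and stats holds the t statistic, the degrees of freedom and the sample standard deviation.

```matlab
rng(0);                               % repeatable random data
data = randn(1,24)+0.5;
[h,p,ci,stats] = ttest(data);

n      = numel(data);
se     = std(data)/sqrt(n);           % standard error of the mean
tHand  = (mean(data)-0)/se;           % distance of the mean from 0, in standard errors
df     = n-1;                         % degrees of freedom
pHand  = 2*tcdf(-abs(tHand),df);      % two-tailed p-value
ciHand = mean(data) + tinv([0.025 0.975],df)*se;  % 95% confidence interval
hHand  = pHand <= 0.05;               % 1 means "reject the null at the 5% level"

fprintf('t: %.4f vs %.4f, p: %.4f vs %.4f\n', tHand, stats.tstat, pHand, p);
```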
Whilst that's a lot to take in, the outputs of the test are the same for the other scenarios (e.g. two-sample, un-paired, etc). We will now look at two further things: testing different null hypotheses and why paired tests matter.
Choosing a suitable null hypothesis
There is no law in statistics that says the null hypothesis must be a
"nil hypothesis", or zero effect. Let's take the one-sample case
above: there, the null hypothesis was that the mean of the
distribution was zero. But it need not have been. We could have tested
whether the mean was 0.5, or 1.0, or any other number. First I'll tell
you why this matters, then I'll show you how to do it.
Let's take a toy example: say that you work in a lab which studies the forearm strength of the platypus by asking them to push on a lever. You have a great idea to see if platypus forearm strength is improved by homeopathy. You know that homeopathy is not taken seriously by most people so you decide to include lots of statistical tests to make your study look really scientific. You measure the ability of your platypuses to push a bar before they have been homeopathised (i.e. this is the control condition), and find that they exert an average force of 15 N. You do a t-test (as above) and show that this is "significant" because it's significantly different from zero. So what? It's a straw-man test: if the platypuses are able to perform the lever-pushing task they will exert some force on the bar. This test will always be significant. So testing a nil hypothesis isn't interesting. Instead, you should test for something meaningful. For example, why not ask whether forearm strength at the start of the study (before the homeopathy) is similar to that found by previous work? That will tell the reader that your platypuses were normal and healthy. You look in the literature and see that platypus forearm strength is on average 15.9 N. Let's test if this is significantly different from the 15 N you measured:
%Make 8 random platypuses with a mean forearm strength of 15N
platypus=randn(1,8)+15;
%Is that different from 15.9 N?
[h,p]=ttest(platypus,15.9)
h =
     0
p =
    0.1341
Nope. It turns out that your sample of 8 animals has a mean strength that's not significantly different from that found previously. That's good! It means you can go ahead with the next phase of your study: testing the effect of homeopathy.
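If it helps to see what the second argument is doing, the sketch below suggests that testing against 15.9 is the same as subtracting 15.9 from the data and testing against zero. It also checks the verdict against the confidence interval that ttest returns for the mean. The random numbers are not the ones used above, so the p-value will differ.

```matlab
rng(2);                                   % repeatable random data
platypus = randn(1,8)+15;                 % simulated strengths, as in the text

[h,pDirect,ci] = ttest(platypus,15.9);    % null hypothesis: mean is 15.9
[~,pShifted]   = ttest(platypus-15.9);    % null hypothesis: shifted mean is 0

% With the default settings, h is 0 when 15.9 lies inside the 95% interval.
insideCI = ci(1) <= 15.9 && 15.9 <= ci(2);

fprintf('p = %.4f (direct), p = %.4f (shifted)\n', pDirect, pShifted);
```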
Why paired tests matter
The platypus example is working for me, so let's carry on with
that. The 8 animals were tested before being given the homeopathy
treatment and again afterward. You want to know if there was an
effect. First you make a pair of box plots with the raw data overlaid,
then you do a t-test.
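The snippets in this section use two variables, before and after, which the text never defines. One way to make something to play with is sketched below. Every number in it (the size of the drop, the amount of noise) is an assumption of mine, so the p-values you get will not match the ones quoted here.

```matlab
rng(3);                                          % repeatable random data
nAnimals = 8;
baseline = 15 + randn(nAnimals,1);               % each animal's own typical strength
before   = baseline + 0.2*randn(nAnimals,1);     % measurement before treatment
after    = baseline - 0.3 + 0.2*randn(nAnimals,1); % same animals, assumed small drop
```

The data are stored as column vectors so that [before,after] is an 8-by-2 matrix with one column per condition. Each animal keeps its own baseline across the two conditions, which is exactly the link between groups that a paired test exploits.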
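boxplot([before(:),after(:)]) %(:) forces columns so each condition gets its own box
hold on
%Handles are called pts so they don't clash with the p-value variable p
pts(1)=plot(zeros(size(before))+1,before,'o');
pts(2)=plot(zeros(size(after))+2, after, 'o');
set(pts,'Color','r','MarkerFaceColor',[1,0.5,0.5])
hold off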
[h,p]=ttest2(before,after)
h =
     0
p =
    0.4341
Oh no! There's no effect of the homeopathy on forearm strength. Now what? Following a sleepless night, you return to the lab and realise that you did an un-paired test. In other words, a test that ignores the fact that you tested the same platypuses in the two conditions. Let's do it right. The first thing to realise is that you can make a better plot. You should link the raw data points in the two graphs:
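boxplot([before(:),after(:)]) %(:) forces columns so each condition gets its own box
hold on
%Join each animal's two measurements with a line
for ii=1:length(before)
    plot([1,2],[before(ii),after(ii)],'-or',...
        'MarkerFaceColor',[1,0.5,0.5])
end
hold off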
Hmm... Look! In all cases but one the forearm strength has gone down. Let's see what a paired t-test tells us:
[h,p]=ttest(before,after)
h =
     1
p =
    0.0422
Ah. Now the p-value falls below the conventional (but arbitrary) 5% significance threshold, indicating that the decrease in forearm strength was significant. The conclusion (based upon this small sample size) is that forearm strength goes down following homeopathic treatment. You decide to repeat the study at a higher drug dilution to see what happens. At this point we leave you to your own devices...
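One way to see why pairing makes such a difference: a paired t-test is just a one-sample t-test on the per-animal differences, so the variation between animals drops out. The sketch below uses simulated data (all numbers are my assumptions, not the values behind the results above) to show that the two calls agree.

```matlab
rng(3);                                     % repeatable random data
baseline = 15 + randn(8,1);                 % animals differ a lot from each other
before   = baseline + 0.2*randn(8,1);
after    = baseline - 0.3 + 0.2*randn(8,1); % assumed small drop in every animal

[~,pPaired]   = ttest(before,after);        % paired test
[~,pDiff]     = ttest(before-after);        % one-sample test on the differences
[~,pUnpaired] = ttest2(before,after);       % ignores which animal is which

fprintf('paired: %.4f, differences: %.4f, unpaired: %.4f\n', ...
        pPaired, pDiff, pUnpaired);
```

The unpaired test has to compare the two group means against the full spread between animals, which is why it can miss a small but consistent change.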
Discussion
It's easy to do t-tests in MATLAB and there are various options (not covered here) for tweaking the way the tests are done. See the MATLAB help pages (linked to above) for more details. Although the above is a silly example, it does hit a lot of the key points. Remember that producing the most informative plot is always a good way of guiding your analysis to the most appropriate test. Also remember that the results of the stats test should echo what you're seeing in your plot. If the two don't seem to match up, something is wrong.