Plant Biology (SC/BIOL 2010 4.0)

Statistical Analysis in Plant Biology (Chris Luszczek)

The complete tutorial is also available in PDF format.

Introduction.

This is a statistical tutorial for Plant Biology. It will provide you with the basics of various common statistical methods and examples of how to perform these tests using SPSS statistical software available in York's computer labs and accessible from home using York's remote Web-based File Access System (WebFAS).
Warning! WebFAS may involve a lengthy installation procedure and I have found it to be finicky, sometimes requiring multiple attempts at installation. Be aware of this if you are downloading the software at home... at midnight the evening before your report is due.
Using York's computer labs avoids any problems you may have using WebFAS.

Outline

  1. Hypothesis Building
    1. Null hypothesis/alternate hypothesis
  2. Hypothesis Testing
    1. Visual summary
  3. Common Statistical tests and how to run them
    1. Summary statistics
    2. T-test
  4. Setting Up a T-test
    1. Paired versus independent t-tests
    2. 1-tailed versus 2-tailed t-tests
  5. Running a T-test in SPSS
    1. Importing the data and analysis in SPSS
    2. Reporting t-test results
  6. Graphing
    • How to present your findings
    • Types of graphs and usage
    • Formatting
  7. Correlations

1. Hypothesis Building

1a. Null hypothesis/alternate hypothesis

  • Null (H0) hypothesis - 'no effect' or 'no difference' between samples or treatments
  • Alternative (HA) hypothesis - experimental treatment has a certain statistically significant effect
    • A claim for which we are trying to find evidence
Some Examples
  • H0: "Different light spectra have no effect on photosynthetic activity" (H0: x2=x1 or x2-x1=0)
  • HA: "Pollen treated with chloramphenicol grows faster than untreated pollen" (HA: x2>x1 or x2-x1>0)

2. Hypothesis Testing

2a. Visual Summary

[Figure: visual explanation of differing means and statistical significance]

3. Common Statistical Tests

3a. Summary statistics

You should already be aware of the basic summary statistics. Usually, scientific data are summarized by reporting the mean, the standard deviation and the sample size.
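
As a minimal sketch of these three summary statistics, the Python snippet below computes the mean, the sample standard deviation and the sample size. The measurements and the variable name `leaf_lengths` are made up for illustration and are not part of the course data.

```python
import statistics

# Hypothetical measurements (e.g. leaf lengths in cm); not from the tutorial.
leaf_lengths = [4.1, 3.8, 4.5, 4.0, 4.6]

n = len(leaf_lengths)                  # sample size
mean = statistics.mean(leaf_lengths)   # arithmetic mean
sd = statistics.stdev(leaf_lengths)    # sample standard deviation (n - 1 denominator)

print(f"mean = {mean:.2f}, SD = {sd:.2f}, n = {n}")
```

SPSS reports the same sample standard deviation (n - 1 denominator) in its descriptive output.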

3b. T tests

For this course you are expected to understand and use t-tests.

A t-test is used to determine whether the means of two sets of data are significantly different from each other. It assumes that the data are normally distributed and that the two samples have equal variances.

Two decisions must be made when selecting a t-test:
  • Are the samples paired or independent?
  • Is the comparison 1-tailed or 2-tailed?

4. Setting Up a T-Test

4a. Paired versus independent t-tests

  • A paired t-test (run as a one-sample t-test on the differences between partners) compares two samples in cases where each value in one sample has a natural partner in the other (the data are not independent). It is typically used for before-and-after (pre/post) comparisons. A one-sample t-test is also used to compare a sample mean to a specified value.
    One example of paired t-test analysis is comparing patient performance before and after the application of a drug. The data are paired because the same patient is compared before and after treatment.
  • A two-sample (independent) t-test compares the means for two groups of cases.
    An example of independent t-test analysis is comparing patient performance in a group receiving a drug versus a separate group receiving a trial drug.
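
To make the difference concrete, here is a rough Python sketch of how the two t statistics are calculated. The function names and the `before`/`after` numbers are hypothetical, and the independent version assumes equal variances (the pooled form). SPSS does this arithmetic for you; the sketch only shows why the choice of test matters.

```python
import math
import statistics

def paired_t(before, after):
    """t statistic and degrees of freedom for a paired t-test."""
    diffs = [a - b for a, b in zip(after, before)]   # one difference per pair
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1

def independent_t(group1, group2):
    """t statistic and df for an independent t-test, assuming equal variances."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(group1) - statistics.mean(group2)) / se
    return t, n1 + n2 - 2

# Hypothetical scores for the same five patients before and after a drug.
before = [10.0, 12.0, 9.0, 11.0, 13.0]
after = [11.0, 14.0, 10.0, 12.0, 15.0]

t_paired, df_paired = paired_t(before, after)
t_indep, df_indep = independent_t(after, before)
print(f"paired:      t = {t_paired:.2f}, df = {df_paired}")
print(f"independent: t = {t_indep:.2f}, df = {df_indep}")
```

With these made-up numbers the paired statistic is much larger than the independent one, because pairing removes the patient-to-patient variation. Treating paired data as independent can therefore hide a real effect.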

4b. 1-tailed versus 2-tailed t-tests

  • A One-tailed/sided t-test expects the effect to be in a certain direction.
    • Is the sample mean greater than μ? (μ is the population mean, the Greek letter 'mu')
    • Is the sample mean less than μ?
    H0: μ = μ0 where μ0 is known
    HA: μ > μ0 or μ < μ0
  • A Two-tailed/sided t-test tests for different means regardless of whether it is greater or smaller.
    • Is there a significant difference?
    H0: μ1 = μ2
    HA: μ1 ≠ μ2
    [Figure: visual explanation of hypothesis testing and the t-test statistic]
  • A carefully stated experimental hypothesis will indicate the type of effect you are looking for
  • For example, the hypothesis that "Coffee improves memory" suggests a paired, one-tailed test because you will repeatedly measure the same participants and expect an improvement.
  • "Men weigh a different amount from women" suggests an independent, two-tailed test as no direction is implied.

  • So remember, don't be vague with your hypothesis if you are looking for a specific effect! Be careful with the null hypothesis too - avoid "A does not affect B" if you really mean "A does not improve B".
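
The relationship between 1-tailed and 2-tailed p-values can be sketched in Python as follows. This is an illustration only, not how SPSS works internally: the helper names are hypothetical, and the tail area of the t distribution is approximated by simple numerical integration (Simpson's rule) so that no extra libraries are needed.

```python
import math

def t_pdf(x, df):
    """Probability density of Student's t distribution."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """P(T > t), via Simpson's-rule integration of the density from 0 to |t|."""
    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # area under the density between 0 and |t|
    tail = 0.5 - area                  # P(T > |t|), using symmetry about 0
    return tail if t >= 0 else 1 - tail

def p_one_tailed_greater(t, df):
    """p-value for HA: the mean is greater (effect expected in the positive direction)."""
    return upper_tail(t, df)

def p_two_tailed(t, df):
    """p-value for HA: the means differ in either direction."""
    return 2 * upper_tail(abs(t), df)

# Hypothetical t statistic and degrees of freedom.
t_stat, df = 2.0, 10
print(f"1-tailed p = {p_one_tailed_greater(t_stat, df):.3f}")
print(f"2-tailed p = {p_two_tailed(t_stat, df):.3f}")
```

When the effect is in the predicted direction, the 1-tailed p-value is half the 2-tailed one. This is why the direction must be stated in the hypothesis before looking at the data.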

5. Running a T-test in SPSS

Question: Do two photosynthetic organisms have the same oxygen evolution capability?

Null Hypothesis: H0: μ1 = μ2 (Both photosynthetic organisms produce the same amount of oxygen)

An independent 2-tailed t-test!

Alternative Hypothesis: HA: μ1 ≠ μ2 (the two photosynthetic organisms DO NOT produce equal amounts of oxygen)

5a. Importing the data and analysis in SPSS

In your browser, 'viewing image' should enlarge the screenshots
    [Screenshot: importing data into SPSS]

    Excel spreadsheets can be imported into SPSS

    1. Make sure your spreadsheet is saved on the C: drive of your computer
    2. Make sure Excel file types are selected
    After selecting a file, a window will give you the option of reading Row 1 data as column labels.

    Having difficulty importing the Excel file?

    Manually entering the data is possible. Make sure that your first column is set for labels and the second column for the data.

    [Screenshot: data columns in SPSS]

    [Screenshot: changing to data view in SPSS]

    Sometimes you will automatically see a summary of your data rather than the data -- to correct:

    1. Click 'Data view' tab rather than 'Variable view'

    [Screenshot: data layout in SPSS]

    Data view and layout in SPSS

  • Notice the variable names in the column headers
  • All raw data is listed (SPSS will calculate means for you)
  • Data is listed in one column (all with the same units) with the first column indicating the grouping
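
The layout described above (one grouping column, one data column) can be sketched in Python as below. The column names `species` and `O2evo` follow the variable names used in this tutorial's SPSS example, but the group labels and the numbers are invented for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical values; only the column names (species, O2evo) follow the tutorial.
raw = """species,O2evo
A,5.1
A,4.8
A,5.5
B,6.2
B,6.0
B,6.6
"""

# Each row holds one raw measurement plus the label of the group it belongs to.
groups = defaultdict(list)
for row in csv.DictReader(io.StringIO(raw)):
    groups[row["species"]].append(float(row["O2evo"]))

for name, values in groups.items():
    print(name, len(values), values)
```

Note that the two groups are stacked in a single data column rather than placed side by side in two columns. The grouping column is what tells the software which measurement belongs to which sample.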

    [Screenshot: analyzing data in SPSS]

    Analyzing data

    1. Select Analyze --> Compare Means --> Independent-Samples T Test
      O2evo is the test variable, species is the grouping variable
    2. Click on define groups, then
    3. type the two names used in the data view

    [Screenshot: analysis output in SPSS]

    Analysis output

    Interpreting the output in our example
    1. We first check Levene's test, which assesses whether the variances are equal
      if p > 0.05, the variances can be treated as equal and you can interpret the t results
    2. The t-test result is p = 0.014, which is below α = 0.05
      so we can reject the null hypothesis: the two photosynthetic organisms DO NOT produce equal amounts of oxygen.
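
The two-step reading of the output can be written as a small decision rule. The Python sketch below is only an illustration of that logic: the function name `interpret_t_test` is hypothetical, p = 0.014 is the t-test result from this example, and the Levene p-value of 0.30 is a made-up placeholder because the tutorial text does not state it.

```python
def interpret_t_test(levene_p, t_test_p, alpha=0.05):
    """Return a short interpretation following the two steps above."""
    # Step 1: Levene's test checks the equal-variance assumption.
    if levene_p <= alpha:
        return ("Levene's test is significant: variances differ, "
                "so do not rely on the equal-variance t result.")
    # Step 2: compare the t-test p-value with the alpha level.
    if t_test_p < alpha:
        return "Reject the null hypothesis: the group means differ."
    return "Fail to reject the null hypothesis."

# t_test_p = 0.014 is the tutorial's result; levene_p = 0.30 is a placeholder.
print(interpret_t_test(levene_p=0.30, t_test_p=0.014))
```
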

5b. Reporting t-test results

  • All performed statistics MUST be referred to in the text of your report
  • You must indicate:
    • The type of test performed
    • The data the test was performed on
    • The α (alpha) level used (0.05 is the default)
    • The p-value outcome of your t-test
    • Whether you reject or fail to reject the null hypothesis
  • Here is an example: "An independent, 2-tailed t-test was performed comparing mean O2 production from species one and species two. A significant difference (α = 0.05, p=0.014) in production was found between species and we therefore reject the null hypothesis of this experiment."
For purposes of this course you are required to take a screenshot (print screen) of your SPSS output (as in the Analysis output example above) and attach it to your report.

6. Graphing

Choosing Graphs


Scatter plots


Line graphs


Bar graphs


Histograms


7. Correlations

[Comic: a boy learns in class that correlation does not imply causation]

Correlation analysis measures the strength and direction of a linear relationship between two random variables. But be careful! Two random variables may be strongly correlated, but that does not mean the relationship is causal (as explained by Randall Munroe, xkcd.com).
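
As a rough illustration of what "strength and direction" means, the Python sketch below computes the Pearson correlation coefficient r by hand. The function name `pearson_r` and the `light`/`growth` values are hypothetical and not taken from the course.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: light level versus plant growth.
light = [1, 2, 3, 4, 5]
growth = [2.1, 3.9, 6.2, 7.8, 10.1]

print(f"r = {pearson_r(light, growth):.3f}")
```

The sign of r gives the direction of the relationship and its size (between -1 and 1) gives the strength. A value of r close to 1 or -1 still says nothing about whether one variable causes the other.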

Conclusions

This tutorial has provided you with the basic theory, mechanics and applications of common statistical tests.

You should now be able to carry out scientific reporting from hypothesis formation to statistical testing and figure formatting.