# Statistical Tests

# Continuous Data (random variables)

## Normal Distribution-Based Tests

|Action                                                       |Test                                                    |
|:------------------------------------------------------------|:-------------------------------------------------------|
|Comparing means                                              |t-test                                                  |
|Comparing paired data                                        |paired t-test                                           |
|Comparing variances of two populations                       |"F-test of equality of variances" or Bartlett's test [2]|
|Comparing means across more than two groups                  |ANOVA or "Welch method" [1]                             |
|Pairwise t-tests between multiple groups                     |"pairwise.t.test"                                       |
|Testing for normality                                        |Shapiro-Wilk test                                       |
|Testing if a data vector came from an arbitrary distribution |Kolmogorov-Smirnov test                                 |
|Testing if two data vectors came from the same distribution  |Kolmogorov-Smirnov test                                 |
|Correlation tests                                            |"cor.test"                                              |

[1] Testing equality of means across two or more groups: if the variances across groups are equal, use ANOVA (here the ANOVA "F-test", i.e. the "F-test of equality of means", is used); if not, use the Welch method ("oneway.test"). (ANOVA calculations assume the variances of the groups are equal.)

[2] "var.test" performs a different "F-test": the "F-test of equality of variances". This test is extremely sensitive to non-normality, so Bartlett's test is the better choice.

### Misc notes

#### ANOVA

http://itl.nist.gov/div898/handbook/prc/section4/prc43.htm is a good overview (sections 7.4.3.1 ~ 7.4.3.4).

Key equations:

    SS(Total) = SST + SSE
    MST = SST / (k-1)
    MSE = SSE / (N-k)
    F   = MST / MSE

where k is the number of levels of the factor and N is the total number of data points. (In the NIST example, k = 3 and N = 3 * 5 = 15.)

Both degrees of freedom, (k-1) and (N-k), are used in finding the F-test critical value. (See http://itl.nist.gov/div898/handbook/eda/section3/eda3673.htm)

ANOVA assumes that the response-variable residuals are normally distributed (or approximately normally distributed), but it is robust to violations of this assumption.
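The key ANOVA equations above can be checked numerically. A minimal sketch, assuming Python/scipy as a stand-in for the R functions these notes quote (the data below is invented for illustration, not the NIST example):

```python
import numpy as np
from scipy import stats

# k = 3 factor levels, 5 observations each, so N = 15
groups = [
    np.array([6.9, 5.4, 5.8, 4.6, 4.0]),
    np.array([8.3, 6.8, 7.8, 9.2, 6.5]),
    np.array([8.0, 10.5, 8.1, 6.9, 9.3]),
]
k = len(groups)
N = sum(len(g) for g in groups)
grand_mean = np.concatenate(groups).mean()

# SST: between-group ("treatment") sum of squares
SST = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# SSE: within-group ("error") sum of squares
SSE = sum(((g - g.mean()) ** 2).sum() for g in groups)

MST = SST / (k - 1)   # mean square, treatment; df = k-1
MSE = SSE / (N - k)   # mean square, error; df = N-k
F = MST / MSE

# scipy's one-way ANOVA computes the same F statistic directly
F_scipy, p_value = stats.f_oneway(*groups)
assert abs(F - F_scipy) < 1e-9
print(F, p_value)
```

The two degrees of freedom, (k-1) and (N-k), are exactly those used to look up the F critical value, matching the equations above.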
(See http://stats.stackexchange.com/questions/6350/anova-assumption-normality-normal-distribution-of-residuals for more details: "residuals... reflect the random part of the model [so their distribution matters for the hypothesis test]".)

When the null hypothesis of equal means is true, the two mean squares estimate the same quantity (the error variance) and should be of approximately equal magnitude; in other words, their ratio should be close to 1. If the null hypothesis is false, MST should be larger than MSE. [NIST, 7.4.3.3]

#### Correlation and R^2

Pearson's correlation coefficient squared = R^2. See:

* http://economictheoryblog.com/2014/11/05/proof/
* http://economictheoryblog.com/2014/11/05/the-coefficient-of-determination-latex-r2/

## Non-Parametric Tests

Used when the data is known to come from a distribution other than the Normal distribution, or when the distribution is unknown.

|Action                         |Test                                                   |Notes                               |
|:------------------------------|:------------------------------------------------------|:-----------------------------------|
|Comparing two means            |Wilcoxon test                                          |non-parametric equivalent to t-test |
|Comparing more than two means  |Kruskal-Wallis rank-sum test                           |non-parametric equivalent to ANOVA  |
|Comparing variances            |Fligner-Killeen (median) test                          |                                    |
|Difference in scale parameters |Ansari-Bradley two-sample test, Mood's two-sample test |                                    |

# Discrete Data (random variables)

## Proportion Tests

"prop.test"

## Binomial Tests

"binom.test"

## Tabular Data Tests

### Fisher's exact test ("fisher.test")

#### Tea example

For an explanation see http://www.coe.utah.edu/~cs3130/lectures/L15-HypothesisTests1.pdf and https://en.wikipedia.org/wiki/Fisher%27s_exact_test

"this leads under a null hypothesis of independence to a hypergeometric distribution of the numbers in the cells of the table." (The hypergeometric distribution is the same as guessing, i.e.
choosing 4 out of 8 at random, rather than her having any real information.) So the null hypothesis here is like guessing: either one is equally likely to be chosen.

#### Dieting example

With respect to the dieting example, the null hypothesis is: for a man or a woman, dieting and not dieting are equally likely. Given that, what is the probability that this distribution shows up?

#### _R in a Nutshell_ example

For one column of test-group data and one column of control-group data: what is the probability that the two sets of data came from the same population? The null hypothesis is that they did come from the same population.

### Chi-squared test ("chisq.test")

Overview: http://pages.cpsc.ucalgary.ca/~saul/wiki/uploads/CPSC681/topic-fahim-CHI-Square.pdf

On the statistical background, including the assumption of a binomial or multinomial distribution: http://hbanaszak.mjr.uw.edu.pl/TempTxt/(Wiley%20Series%20in%20Probability%20and%20Statistics)%20Alan%20Agresti-An%20Introduction%20to%20Categorical%20Data%20Analysis%20(Wiley%20Series%20in%20Probability%20and%20Statistics)-Wiley-Interscience%20(2007).pdf

Excellent explanation: http://isites.harvard.edu/fs/docs/icb.topic904001.files/Stat104_Lecture26v4_1up.pdf

### Others

* Cochran-Mantel-Haenszel test (two-way)
* McNemar's chi-squared test (symmetry in a two-dimensional contingency table)

## Non-Parametric Tabular Data Tests

Friedman rank-sum test (non-parametric counterpart to two-way ANOVA tests)

# One-way vs two-way ANOVA

(summarized from http://www.differencebetween.net/science/mathematics-statistics/difference-between-one-way-anova-and-two-way-anova/)

## One way

One independent variable with several groups, levels, or categories.

Example of one-way ANOVA: consider the food habits of the sampled people as the independent variable, with levels vegetarian, non-vegetarian, and mixed, and the number of times a person fell sick in a year as the dependent variable.
The means of the response variable for each group of N people are measured and compared.

## Two way

With two independent variables, each with multiple levels, and one dependent variable, the ANOVA becomes two-way. A two-way ANOVA shows the effect of each independent variable on the single response (outcome) variable and determines whether there is any interaction effect between the independent variables.

Example of two-way ANOVA: if, in the one-way example above, we add a second independent variable, ‘smoking status’ (with levels such as non-smoker, smoker of one pack a day, and smoker of more than one pack a day), to the existing independent variable ‘food habit’, we get a two-way ANOVA.

## More on two-way ANOVA

Each unique combination of levels of the two factors is called a treatment cell. (http://blog.excelmasterseries.com/2014/05/two-factor-anova-with-replication-in.html)

Calculations are at:

* http://itl.nist.gov/div898/handbook/prc/section4/prc437.htm
* http://itl.nist.gov/div898/handbook/prc/section4/prc438.htm

### Regarding hypotheses

(http://www.real-statistics.com/two-way-anova/two-factor-anova-without-replication/) (Rows are blends, columns are crop types.)

There are two null hypotheses: one for the rows and the other for the columns.

* The null hypothesis for the rows is H0: there is no significant difference in yield between the (population) means of the blends.
* The null hypothesis for the columns is H0: there is no significant difference in yield between the (population) means for the crop types.

# Statistics behind A/B Testing

http://statisticalconcepts.blogspot.jp/2010/03/ab-testing.html
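As a rough illustration of the statistics behind an A/B test: comparing conversion rates of two variants amounts to a two-proportion test, which can be run as a chi-squared test of independence on the 2x2 contingency table. A minimal sketch, assuming Python/scipy in place of R's "prop.test" (the counts are hypothetical):

```python
import numpy as np
from scipy import stats

# hypothetical A/B test results
#                  converted  not converted
table = np.array([[120, 1880],   # variant A (2000 visitors, 6.0% rate)
                  [145, 1855]])  # variant B (2000 visitors, 7.25% rate)

# chi-squared test of independence; like R's prop.test, scipy applies a
# continuity (Yates) correction by default for 2x2 tables
chi2, p_value, dof, expected = stats.chi2_contingency(table)
print(chi2, p_value, dof)
```

With these particular counts the p-value comes out above 0.05, so the observed difference in conversion rates would not be declared significant at the usual 5% level.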