Testing for Normality

The Shapiro-Wilk Test allows us to test the claim that a sample comes from a normally distributed population, an assumption for many hypothesis tests.  This test always has the same hypotheses:

H0: The sample comes from a normally distributed population
H1: The sample does not come from a normally distributed population

To conduct the Shapiro-Wilk test in R, we apply the command shapiro.test to a numeric data set.  R requires the sample to contain between 3 and 5000 values.

Example: The table below provides the number of cats and dogs adopted from Paws Chicago by year.  Test the claim that this sample data comes from a population that is normally distributed.  Use α = 0.05.

Year       2011  2010  2009  2008  2007  2006
Adoptions  4268  4042  3467  3015  1666   946
Source: www.pawschicago.org

We need to begin by creating a data set.

> paws = c(4268, 4042, 3467, 3015, 1666, 946)

We can now conduct the Shapiro-Wilk Test to test the claim.

> shapiro.test(paws)

        Shapiro-Wilk normality test

data:  paws 
W = 0.9131, p-value = 0.4568

Since our p-value = 0.4568 > 0.05 = α, we fail to reject H0.  There is not sufficient evidence to reject the claim that this sample comes from a normally distributed population.
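The comparison between the p-value and α can also be done in R rather than by reading the printed output.  The sketch below assumes we store the test result in a variable; the names result, p_value, alpha, and decision are our own choices for illustration and do not come from the example above.

```r
# Adoption counts from the Paws Chicago example
paws <- c(4268, 4042, 3467, 3015, 1666, 946)

# shapiro.test returns an object of class "htest";
# the test statistic and p-value are stored as list components
result <- shapiro.test(paws)
w_stat  <- result$statistic   # the W statistic
p_value <- result$p.value     # the p-value

# Significance level given in the example
alpha <- 0.05

# Decision rule: reject H0 only when the p-value is at most alpha
decision <- if (p_value <= alpha) "reject H0" else "fail to reject H0"
print(decision)
```

Storing the result this way is handy when the p-value is needed later in a script, since nothing has to be retyped from the console.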

Example: The tweets of 11 casinos were tracked during the week of June 1 - June 7, 2010.  The table below provides the number of positive interactions each casino had with customers during that week.  Test the claim that this sample data comes from a normally distributed population. Use α = 0.02.

Casino               Number of Positive Interactions
Aria                  1
Caesars Palace       12
Casino Royale         0
Excalibur            21
Hard Rock Hotel       2
Las Vegas Hilton      6
Mirage               59
Planet Hollywood     21
Station Casinos       7
Venetian              4
Wynn Las Vegas       42
Source: gaming.unlv.edu

Again, we start by creating a data set.

> pos.int = c(1, 12, 0, 21, 2, 6, 59, 21, 7, 4, 42)

We can now conduct the Shapiro-Wilk Test to test the claim.

> shapiro.test(pos.int)

        Shapiro-Wilk normality test

data:  pos.int 
W = 0.805, p-value = 0.01096

Since our p-value = 0.01096 ≤ 0.02 = α, we reject H0.  There is sufficient evidence to reject the claim that this sample comes from a normally distributed population.
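Because the hypotheses and the decision rule are the same every time, the steps can be wrapped in a small function.  This is only a sketch: normality_decision is a hypothetical helper we define ourselves, not a function that ships with R, and it assumes the data is a numeric vector that shapiro.test accepts.

```r
# Hypothetical helper (not part of R): runs the Shapiro-Wilk test
# and applies the decision rule at significance level alpha
normality_decision <- function(x, alpha = 0.05) {
  p_value <- shapiro.test(x)$p.value
  if (p_value <= alpha) "reject H0" else "fail to reject H0"
}

# Positive interactions for the 11 casinos in the example
pos.int <- c(1, 12, 0, 21, 2, 6, 59, 21, 7, 4, 42)

# The example uses alpha = 0.02
casino_decision <- normality_decision(pos.int, alpha = 0.02)
print(casino_decision)
```

Passing alpha as an argument keeps the significance level visible at the point of use, which matters here because the two examples use different values of α.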
