Estimating Population Means

We use the t.test command both to construct confidence interval estimates for population means and to test claims about means (when σ is unknown).  The two uses differ only in the arguments we supply.
Note: There is also a z.test command (to be used when σ is known) that can be added to R, but it is not included by default.  Most statistical packages do not include functions for Z tests, since the T test is usually more appropriate for real-world situations.  The syntax for the z.test command is very similar, but will not be discussed here.
To construct a confidence interval estimate for a population mean we need a data set and a confidence level (conf.level).  If conf.level is omitted, t.test uses 0.95.
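
As a minimal sketch of how the conf.level argument changes the result, the snippet below compares the default interval with a 99% interval.  The vector x is made-up illustration data and is not taken from any example in this post.

```r
# Hypothetical sample, for illustration only
x = c(12, 15, 11, 14, 13)

default.ci = t.test(x)$conf.int                   # conf.level defaults to 0.95
wide.ci = t.test(x, conf.level = 0.99)$conf.int   # same data, higher confidence

attr(default.ci, "conf.level")   # the confidence level that was actually used
diff(default.ci)                 # width of the 95% interval
diff(wide.ci)                    # width of the 99% interval (wider)
```

Raising the confidence level widens the interval, since a larger critical value multiplies the same standard error.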

1-Sample (σ unknown)

Example: The table below lists the estimated number of non-occupational, fireworks-related injuries that were treated in U.S. hospital emergency departments by year.  Construct a 99% confidence interval estimate for the mean number of people treated for non-occupational, fireworks-related injuries per year.

Year       2002    2003    2004    2005    2006    2007    2008    2009    2010    2011
Injuries   8,800   9,300   9,600   10,800  9,200   9,800   7,000   8,800   8,600   9,600
Source: www.cpsc.gov

> injuries = c(8800, 9300, 9600, 10800, 9200, 9800, 7000, 8800, 8600, 9600)
> t.test(injuries, conf.level = 0.99)

        One Sample t-test

data:  injuries 
t = 29.3537, df = 9, p-value = 3.016e-10
alternative hypothesis: true mean is not equal to 0 
99 percent confidence interval:
  8136.975 10163.025 
sample estimates:
mean of x 
     9150 

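A fair amount of information is being output, but in this instance we are only concerned with the lines that tell us the 99 percent confidence interval estimate is 8136.975 to 10163.025.  The t statistic and p-value refer to the default null hypothesis that the true mean is 0, which is not of interest here.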
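
The interval that t.test reports can also be rebuilt by hand, which may help clarify what the command is doing.  The sketch below assumes the usual formula (sample mean ± critical t value × s/√n, with n - 1 degrees of freedom); the variable names are chosen here for illustration.

```r
injuries = c(8800, 9300, 9600, 10800, 9200, 9800, 7000, 8800, 8600, 9600)

n = length(injuries)
se = sd(injuries) / sqrt(n)                 # standard error of the mean
t.crit = qt(1 - (1 - 0.99) / 2, df = n - 1) # critical value for 99% confidence
manual.ci = mean(injuries) + c(-1, 1) * t.crit * se

# The same interval, pulled directly out of the t.test result
result = t.test(injuries, conf.level = 0.99)
ci = result$conf.int

manual.ci
ci
```

Storing the result and reading result$conf.int is handy when only the interval is needed and the rest of the printed output is not.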

2-Sample (σ unknown)

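The process for a 2-sample confidence interval is similar, except that we supply two data sets.  By default t.test does not assume equal population variances, which is why the output is labeled as a Welch test.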

Example: The table below provides data on the number of video games rated T (Teen) and E 10+ (Everyone 10+) per year in the United States.  Construct a 95% confidence interval estimate for the difference in the population means.


Year    T      E 10+
2011    346    266
2010    344    295
2009    322    287
2007    313    234
2006    296    206
Source: www.esrb.org

> teen = c(346, 344, 322, 313, 296)
> e.10 = c(266, 25, 287, 234, 206)
> t.test(teen, e.10, conf.level = 0.95)

        Welch Two Sample t-test

data:  teen and e.10 
t = 2.5293, df = 4.328, p-value = 0.06008
alternative hypothesis: true difference in means is not equal to 0 
95 percent confidence interval:
  -7.92183 249.12183 
sample estimates:
mean of x mean of y 
    324.2     203.6 

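So the 95 percent confidence interval estimate for μ1 - μ2 is -7.92183 to 249.12183.  Because this interval contains 0, the output above does not rule out μ1 = μ2 at the 95% level; it does not show that the means are equal.
Note: the second value entered for e.10 above is 25, while the table lists 295 for 2010, so the output reflects a data-entry error.  With the tabled value, the mean of y is 257.6, t is about 3.48 on roughly 6.3 degrees of freedom, and the interval runs from approximately 20 to 113, which excludes 0.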
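
For the two-sample case, the Welch interval can likewise be rebuilt from the sample means and variances.  The sketch below uses the values as listed in the table (295 for 2010) and assumes the Welch-Satterthwaite approximation for the degrees of freedom; the variable names are illustrative.

```r
teen = c(346, 344, 322, 313, 296)
e.10 = c(266, 295, 287, 234, 206)   # values as listed in the table

v1 = var(teen) / length(teen)       # squared standard error, sample 1
v2 = var(e.10) / length(e.10)       # squared standard error, sample 2
se = sqrt(v1 + v2)

# Welch-Satterthwaite approximate degrees of freedom
welch.df = (v1 + v2)^2 /
  (v1^2 / (length(teen) - 1) + v2^2 / (length(e.10) - 1))

t.crit = qt(0.975, welch.df)        # critical value for 95% confidence
manual.ci = (mean(teen) - mean(e.10)) + c(-1, 1) * t.crit * se

result = t.test(teen, e.10, conf.level = 0.95)
result$parameter                    # degrees of freedom used by t.test
result$conf.int
manual.ci
```

The degrees of freedom are generally not a whole number here, which is why the t.test output shows a fractional df for the Welch test.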
