Estimating Population Means

We use the t.test command both to construct confidence interval estimates for population means and to test claims about means (when σ is unknown). What differs between the two uses is the set of arguments we supply.

Note: There is also a z.test command (to be used when σ is known), but it is not included in R by default; it has to be added through an add-on package such as BSDA. Most statistical packages do not include functions for z tests, since the t test is usually more appropriate for real-world situations. The syntax for the z.test command is very similar, but it will not be discussed here.

To construct a confidence interval estimate for a population mean we need a data set and a confidence level (conf.level). If conf.level is omitted, t.test uses 0.95.

1-Sample (σ unknown)

Example: The table below lists the estimated number of non-occupational, fireworks-related injuries that were treated in U.S. hospital emergency departments by year. Construct a 99% confidence interval estimate for the mean number of non-occupational, fireworks-related injuries per year.

2002	2003	2004	2005	2006	2007	2008	2009	2010	2011
8,800	9,300	9,600	10,800	9,200	9,800	7,000	8,800	8,600	9,600

Source: www.cpsc.gov

> injuries = c(8800, 9300, 9600, 10800, 9200, 9800, 7000, 8800, 8600, 9600)
> t.test(injuries, conf.level = 0.99)

        One Sample t-test

data:  injuries 
t = 29.3537, df = 9, p-value = 3.016e-10
alternative hypothesis: true mean is not equal to 0 
99 percent confidence interval:
  8136.975 10163.025 
sample estimates:
mean of x 
     9150

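The output contains more than we need here. The t statistic and p-value belong to a default test of the claim that the mean equals 0, which is not of interest in this example. We are only concerned with the lines reporting the 99 percent confidence interval estimate: 8136.975 to 10163.025.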
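When only the interval is needed, it can be pulled out of the object that t.test returns rather than read off the printed summary. The sketch below does that and then rebuilds the same interval from the usual formula, sample mean ± critical t × s/√n, as a check. The names result, ci, t.star, margin, and manual are illustrative and are not part of the original example.

```r
injuries = c(8800, 9300, 9600, 10800, 9200, 9800, 7000, 8800, 8600, 9600)

# Store the result instead of printing it
result = t.test(injuries, conf.level = 0.99)

# conf.int holds the lower and upper limits of the interval
ci = as.numeric(result$conf.int)

# Rebuild the interval by hand: mean +/- critical t * standard error
n = length(injuries)
t.star = qt(1 - (1 - 0.99) / 2, df = n - 1)   # two-sided 99% critical value
margin = t.star * sd(injuries) / sqrt(n)
manual = c(mean(injuries) - margin, mean(injuries) + margin)

print(ci)
print(manual)
```

The two printed vectors should agree, which confirms that the interval t.test reports is the standard one-sample t interval.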

2-Sample (σ unknown)

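The process for a 2-sample confidence interval is similar, except that we pass two data sets to t.test. By default, R does not assume the two populations have equal variances, which is why the output is labeled as a Welch test.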

Example: The table below provides data on the number of video games rated T (Teen) and E 10+ (Everyone 10+) per year in the United States. Construct a 95% confidence interval estimate for the difference in the population means, μ₁ - μ₂, where μ₁ is the mean for T and μ₂ is the mean for E 10+.

Year	T	E 10+
2011	346	266
2010	344	295
2009	322	287
2007	313	234
2006	296	206

Source: www.esrb.org

> teen = c(346, 344, 322, 313, 296)
> e.10 = c(266, 25, 287, 234, 206)
> t.test(teen, e.10, conf.level = 0.95)

        Welch Two Sample t-test

data:  teen and e.10 
t = 2.5293, df = 4.328, p-value = 0.06008
alternative hypothesis: true difference in means is not equal to 0 
95 percent confidence interval:
  -7.92183 249.12183 
sample estimates:
mean of x mean of y 
    324.2     203.6

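So the 95 percent confidence interval estimate for μ₁ - μ₂ is -7.92183 to 249.12183. Because this interval contains 0, the data do not provide sufficient evidence that μ₁ and μ₂ differ; this is not the same as showing that μ₁ = μ₂. Note, however, that the e.10 vector above was entered with 25 where the table lists 295 for 2010 (the reported mean of y, 203.6, reflects that entry), so this interval describes the data as typed rather than the data in the table.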
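As a rough check, the comparison can be rerun with the E 10+ values exactly as they appear in the table (295 for 2010), and the Welch interval rebuilt from its formula. This is only a sketch: the names e.10.table, se, welch.df, and manual are illustrative, and the degrees of freedom use the Welch-Satterthwaite approximation that t.test applies when var.equal is left at its default of FALSE.

```r
teen = c(346, 344, 322, 313, 296)
e.10.table = c(266, 295, 287, 234, 206)   # values as listed in the table

result = t.test(teen, e.10.table, conf.level = 0.95)
ci = as.numeric(result$conf.int)

# Welch standard error and Satterthwaite degrees of freedom, by hand
v1 = var(teen) / length(teen)
v2 = var(e.10.table) / length(e.10.table)
se = sqrt(v1 + v2)
welch.df = (v1 + v2)^2 /
  (v1^2 / (length(teen) - 1) + v2^2 / (length(e.10.table) - 1))

# Interval for mu1 - mu2: difference in sample means +/- critical t * se
diff.means = mean(teen) - mean(e.10.table)
margin = qt(0.975, welch.df) * se
manual = c(diff.means - margin, diff.means + margin)

print(ci)
print(manual)
```

If the two printed intervals agree, the hand calculation matches what t.test does internally; whether that interval contains 0 then determines the conclusion for the tabled data.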
