Measures of Relative Standing

We have already demonstrated how to output the numerical information for a box plot, which gives us the three quartiles. We can also output these values, and other measures of relative standing, directly. To demonstrate these calculations, we will use the data set in the example below.

Example: Listed below are the top-grossing historical fiction films from 1977-2012.

Movie                                 Adjusted Gross*
The Sound of Music                    $1,275
Titanic                               $1,016
Doctor Zhivago                        $873
The Sting                             $712
Forrest Gump                          $634
My Fair Lady                          $611
Butch Cassidy and the Sundance Kid    $568
Grease                                $564
Notorious                             $479
The Passion of the Christ             $471
*in millions
Source: www.the-numbers.com

We begin by creating a data set.

> gross = c(1275, 1016, 873, 712, 634, 611, 568, 564, 479, 471)

5 Number Summary

Using the summary command, we can output the five-number summary for any data set. The output actually contains six numbers: the five-number summary plus the mean.

> summary(gross)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  471.0   565.0   622.5   720.3   832.8  1275.0 
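
As a quick cross-check, the individual entries of this output can be reproduced with separate base R functions. The sketch below is only an illustration; the variable name s is introduced here and does not appear elsewhere in this post.

```r
gross <- c(1275, 1016, 873, 712, 634, 611, 568, 564, 479, 471)

# Each of these matches one entry of the summary(gross) output.
min(gross)     # 471
median(gross)  # 622.5
mean(gross)    # 720.3
max(gross)     # 1275

# summary() returns a named vector, so a single entry can be pulled out by name.
s <- summary(gross)   # 's' is a name chosen for this illustration
s[["Median"]]         # 622.5
```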

Quartiles

The quantile command outputs the same five-number summary, without the mean and without the rounding that summary applies when printing (832.75 here rather than 832.8). Each value is labeled with its percentile.

> quantile(gross)
     0%     25%     50%     75%    100% 
 471.00  565.00  622.50  832.75 1275.00 
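
The value 832.75 is not one of the data values, because quantile interpolates between neighboring observations. The sketch below reproduces the third quartile by hand, assuming quantile is left at its default method (type = 7 in the R documentation). The names x, n, p, h, lo, hi, and by_hand are introduced only for this illustration.

```r
gross <- c(1275, 1016, 873, 712, 634, 611, 568, 564, 479, 471)

x <- sort(gross)       # 471 479 564 568 611 634 712 873 1016 1275
n <- length(x)         # 10
p <- 0.75              # third quartile

h  <- (n - 1) * p + 1  # fractional position in the sorted data: 7.75
lo <- floor(h)         # 7
hi <- ceiling(h)       # 8

# Move 75% of the way from the 7th sorted value (712) to the 8th (873).
by_hand <- x[lo] + (h - lo) * (x[hi] - x[lo])

by_hand                             # 832.75
unname(quantile(gross, probs = p))  # 832.75
```

Other textbooks and software packages use different quartile conventions, so results computed by hand from a different rule may not match this output.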

If we only want a specific quantile we can add the probs argument to this command.  For instance, we could output just the second quartile.

> quantile(gross, probs = .5)
  50% 
622.5 

Similarly, we could choose to output the first, second, and third quartiles by combining the values into a vector with c() for the probs argument.

> quantile(gross, probs = c(.25, .5, .75))
   25%    50%    75% 
565.00 622.50 832.75 
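
Because the result is a named numeric vector, it can be stored and individual quartiles picked out by their labels. A brief sketch follows; the variable name q is chosen here only for illustration.

```r
gross <- c(1275, 1016, 873, 712, 634, 611, 568, 564, 479, 471)

q <- quantile(gross, probs = c(.25, .5, .75))

names(q)     # "25%" "50%" "75%"
q[["75%"]]   # 832.75, the third quartile as a plain number
unname(q)    # the three quartiles without the percentile labels
```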

Percentiles

In using the quantile command to calculate quartiles, we are really just calculating percentiles. The probs argument accepts any proportion between 0 and 1, so we can use the quantile command to calculate any percentile. For instance, we could calculate the 33rd percentile of our data set by passing .33.

> quantile(gross, probs = .33)
   33% 
567.88 

We can also output multiple percentiles at once, as we did above.
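
For example, the 10th, 33rd, and 90th percentiles can be requested in one call. This is a small sketch using the same data; the choice of percentiles and the name pct are arbitrary and only for illustration, and the values in the comments assume the default quantile method.

```r
gross <- c(1275, 1016, 873, 712, 634, 611, 568, 564, 479, 471)

# One call, three percentiles: pass the proportions as a vector.
pct <- quantile(gross, probs = c(.10, .33, .90))
pct
# Approximate values: 10% = 478.2, 33% = 567.88, 90% = 1041.9
```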
