Box Plots

To demonstrate creating both single and multiple box plots we will use the data in the following example.

Example: The tables below list the number of police officer deaths in the United States (as of April 2012) for Northeastern and Midwestern states.

Northeastern States          Midwestern States
Maine              85        Wisconsin     354
New Hampshire      33        Michigan      550
Vermont            22        Illinois      984
Massachusetts     320        Indiana       378
Rhode Island       43        Ohio          763
Connecticut       135        Missouri      628
New York        1,327        North Dakota   50
Pennsylvania      722        South Dakota   52
New Jersey        469        Nebraska      157
                             Kansas        231
                             Minnesota     223
                             Iowa          157
Source: www.nleomf.org

Single Box Plots

To demonstrate creating a single box plot we will use the Northeastern states data.  We'll begin by storing the values in a vector using the c function.

> northeast = c(85, 33, 22, 320, 43, 135, 1327, 722, 469)

Once the vector is created we can pass it to the boxplot command to create the box plot.

> boxplot(northeast)

By default, the boxplot command draws outliers separately from the rest of the data, marking each one with an open circle.  If we do not want outliers separated, we can add the argument range = 0, which extends the whiskers to the most extreme data values.

> boxplot(northeast, range = 0)
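
To see why 1327 is treated as an outlier under the default setting, here is a minimal sketch of the rule, assuming the default range = 1.5.  It uses the base R function fivenum to get the hinges; the variable names (five, box_width, lower_fence, upper_fence, outliers) are our own and are only for illustration.

```r
# Data from the example above
northeast <- c(85, 33, 22, 320, 43, 135, 1327, 722, 469)

# Five-number summary: minimum, lower hinge, median, upper hinge, maximum
five <- fivenum(northeast)

# Length of the box (upper hinge minus lower hinge)
box_width <- five[4] - five[2]

# Fences sit 1.5 box lengths beyond the hinges (1.5 is the default range)
lower_fence <- five[2] - 1.5 * box_width
upper_fence <- five[4] + 1.5 * box_width

# Values outside the fences are the ones drawn as open circles
outliers <- northeast[northeast < lower_fence | northeast > upper_fence]
print(outliers)
```

Changing the 1.5 in this sketch corresponds to passing a different positive value for range; range = 0 skips the fences entirely.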

We can also add the argument main to title our plot.

> boxplot(northeast, main = "Police Officer Deaths in the Northeastern United States")

To output the numerical summary behind the graph instead of drawing it, we can use the argument plot = FALSE.

> boxplot(northeast, plot = FALSE)
$stats
     [,1]
[1,]   22
[2,]   43
[3,]  135
[4,]  469
[5,]  722

$n
[1] 9

$conf
       [,1]
[1,] -89.36
[2,] 359.36

$out
[1] 1327

$group
[1] 1

$names
[1] "1"

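The $stats section tells us, row by row,
  • [1,] the minimum usual value (the end of the lower whisker)
  • [2,] the first quartile (the lower hinge)
  • [3,] the median/second quartile
  • [4,] the third quartile (the upper hinge)
  • [5,] the maximum usual value (the end of the upper whisker)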

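The $out section identifies the values of any outliers.  The $n section gives the number of observations, and the $conf section gives the lower and upper ends of the notch that is drawn around the median when the argument notch = TRUE is used.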
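
If we want to use these numbers rather than just read them, we can store the result and pull out individual pieces.  The following is a small sketch; the names b, lower_whisker, median_value and upper_whisker are our own choices for illustration.

```r
northeast <- c(85, 33, 22, 320, 43, 135, 1327, 722, 469)

# Store the summary instead of printing it; plot = FALSE means nothing is drawn
b <- boxplot(northeast, plot = FALSE)

# $stats is a one-column matrix, so we index it by row and column
lower_whisker <- b$stats[1, 1]
median_value  <- b$stats[3, 1]
upper_whisker <- b$stats[5, 1]

# $out is a plain numeric vector of the outlying values
cat("Median:", median_value, "\n")
cat("Whiskers:", lower_whisker, "to", upper_whisker, "\n")
cat("Outliers:", b$out, "\n")
```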

Multiple Box Plots

To demonstrate creating a multiple box plot we will use both data sets in the example above.  We'll begin by creating a vector for the Midwestern data.

> midwest = c(354, 550, 984, 378, 763, 628, 50, 52, 157, 231, 223, 157)

Once both vectors are created we can pass them to the boxplot command to create the multiple box plot.

> boxplot(northeast, midwest)

As with single box plots we can use the main argument to title our plot.  We can also use the names argument to label the individual box plots.  In this case we will pass a character vector containing the names "Northeastern United States" and "Midwestern United States", given in the same order as the data.

> boxplot(northeast, midwest, main = "Police Officer Deaths in the United States", names = c("Northeastern United States", "Midwestern United States"))
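
As an alternative to the names argument, one option is to put the data in a named list so that each label stays attached to its data.  The sketch below assumes shorter labels of our own choosing ("Northeast" and "Midwest") and uses plot = FALSE so that we can inspect the summary for both groups; leaving that argument out would draw the plot instead.

```r
northeast <- c(85, 33, 22, 320, 43, 135, 1327, 722, 469)
midwest <- c(354, 550, 984, 378, 763, 628, 50, 52, 157, 231, 223, 157)

# A named list: the element names become the labels of the boxes
regions <- list(Northeast = northeast, Midwest = midwest)

# One column of $stats per group, in the order the groups appear in the list
b <- boxplot(regions, plot = FALSE)
print(b$names)
print(b$stats)
```

With more than one group, the $group section tells us which box each value in $out belongs to.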
