Measures of Center

Some measures of center can be calculated with a simple command.  Others will require either a multi-step process, or a little basic work on our part.  To demonstrate these methods we will make use of the data in the example below.

Example: Listed below are the numbers (in millions) of mobile 3G subscribers in the fourth quarter of 2011.

Country          Subscribers (millions)
USA              208
Italy             44
United Kingdom    42
Brazil            41
Germany           38
Spain             33
France            30
Japan            122
China             57
South Korea       45
India             39
Indonesia         29
Source: mobithinking.com

Before we perform any calculations we will need to create a data set in R.

> mobile.3g = c(208, 44, 42, 41, 38, 33, 30, 122, 57, 45, 39, 29)

Mean

To calculate the mean of a data set we will use the command mean.

> mean(mobile.3g)
[1] 60.66667
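
As a quick check on what mean is doing, we can rebuild the same result from the definition: the sum of the values divided by the number of values. This is only a sketch for illustration; the variable name by.hand is our own choice, and mean remains the simpler option in practice.

```r
# Same data set as above, repeated so this snippet runs on its own
mobile.3g = c(208, 44, 42, 41, 38, 33, 30, 122, 57, 45, 39, 29)

# Mean = sum of the values / number of values
by.hand = sum(mobile.3g) / length(mobile.3g)
by.hand   # prints the same value as mean(mobile.3g)
```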

Median

To calculate the median of a data set we will use the command median.

> median(mobile.3g)
[1] 41.5
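
To see where 41.5 comes from, we can sort the data and look at the middle ourselves. Our data set has 12 values, an even count, so the median is the average of the 6th and 7th sorted values. The sketch below assumes an even number of values; the names sorted and n are our own.

```r
# Same data set as above, repeated so this snippet runs on its own
mobile.3g = c(208, 44, 42, 41, 38, 33, 30, 122, 57, 45, 39, 29)

sorted = sort(mobile.3g)   # values in increasing order
n = length(sorted)         # 12 values, an even count

# With an even count, average the two middle values (positions n/2 and n/2 + 1)
middle.two = sorted[c(n/2, n/2 + 1)]
mean(middle.two)           # matches median(mobile.3g)
```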

Mode

Calculating the mean and median of a data set is very straightforward.  Calculating the mode is not as automatic, but there are a few simple methods that will allow us to easily investigate our data.  Let's look at using the command table on our data set.

> table(mobile.3g)
mobile.3g
 29  30  33  38  39  41  42  44  45  57 122 208 
  1   1   1   1   1   1   1   1   1   1   1   1 

This command provides a count of how many times each number occurs in our data set.  To identify the mode of our data set we need to read through each result to identify which number occurs most frequently.  In this particular example, no number is repeated so there is no mode.

Let's consider another data set that will have a mode.

10 10 100 100 100 1,000

We can apply the table command to this data set, and consider our results.

> test.set = c(10, 10, 100, 100, 100, 1000)
> table(test.set)
test.set
  10  100 1000 
   2    3    1 

Our results indicate that the value 100 is repeated 3 times, making it the mode of our data set.
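
For larger data sets, reading through the table by eye gets tedious. One possible shortcut is to let R pick out the largest count for us. The function below is a minimal sketch, not a built-in R command: the name find.modes is our own, and it follows the convention used above that a data set with no repeated value has no mode. It returns every value tied for the highest count.

```r
# Data sets from the examples above, repeated so this snippet runs on its own
mobile.3g = c(208, 44, 42, 41, 38, 33, 30, 122, 57, 45, 39, 29)
test.set = c(10, 10, 100, 100, 100, 1000)

# Hypothetical helper (not part of base R)
find.modes = function(x) {
  counts = table(x)                      # how many times each value occurs
  if (max(counts) == 1) return(NULL)     # nothing repeats, so no mode
  is.top = as.vector(counts == max(counts))
  as.numeric(names(counts)[is.top])      # value(s) with the highest count
}

find.modes(test.set)    # [1] 100
find.modes(mobile.3g)   # NULL
```

Note that as.numeric is used because table stores the original values as text labels; this conversion only makes sense for numeric data.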

Mid-Range

To find the mid-range of a data set we need to calculate the average of the minimum and maximum values in our data set.  To identify the maximum and minimum values in a data set we can use the commands max and min, respectively.

> max(mobile.3g)
[1] 208
> min(mobile.3g)
[1] 29

Once we have the results we can use R to calculate (208 + 29)/2 directly.

> (208+29)/2
[1] 118.5

So to reach one simple result we needed 3 separate steps.  To simplify this process we can calculate the mean of a data set consisting of 2 values: the maximum and minimum values.

> mean(c(max(mobile.3g), min(mobile.3g)))
[1] 118.5
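
Another option, offered here as an alternative sketch, is the base R command range, which returns the minimum and maximum together as a vector of 2 values. Taking the mean of that vector gives the mid-range in a single short call. The name mid.range is our own.

```r
# Same data set as above, repeated so this snippet runs on its own
mobile.3g = c(208, 44, 42, 41, 38, 33, 30, 122, 57, 45, 39, 29)

range(mobile.3g)                     # [1]  29 208
mid.range = mean(range(mobile.3g))   # average of the minimum and maximum
mid.range                            # [1] 118.5
```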

