Construction of a Frequency Table

Constructing a frequency table involves sorting data into categories and recording the number of observations in each category. The categories are mutually exclusive in the sense that there is no overlap between them: each data point fits into one category and only one category. In addition, the categories are collectively exhaustive in that every data point fits into some category.
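
One way to make these two properties concrete is to check them mechanically. The short Python sketch below is only an illustration: the function names (matching_classes, check_table) and the small integer classes are made up for this example, and class limits are assumed to be inclusive on both ends. It lists any data points that fit more than one class (a failure of mutual exclusivity) and any that fit no class (a failure of collective exhaustiveness).

```python
def matching_classes(value, classes):
    """Return every (lower, upper) class whose inclusive range contains value."""
    return [(lo, hi) for lo, hi in classes if lo <= value <= hi]


def check_table(data, classes):
    """Return (overlaps, uncovered) for a proposed set of classes.

    overlaps  -> data points that fit more than one class
    uncovered -> data points that fit no class at all
    """
    overlaps = [x for x in data if len(matching_classes(x, classes)) > 1]
    uncovered = [x for x in data if not matching_classes(x, classes)]
    return overlaps, uncovered


# Hypothetical classes and data, chosen only for illustration.
good_classes = [(0, 9), (10, 19), (20, 29)]
bad_classes = [(0, 10), (10, 19)]  # 10 fits two classes; 25 fits none
data = [3, 10, 25]

print(check_table(data, good_classes))  # ([], [])
print(check_table(data, bad_classes))   # ([10], [25])
```

A usable frequency table is one where both lists come back empty.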

Steps in building a frequency table

Example Frequency Table

Consider a problem that asks you to build a frequency distribution. Let's say we have information on the weights of cans of regular Coca-Cola and we want to build a frequency table. The data (weights of Coke cans) are as follows:

Weight Regular Coke (pounds)

0.8192 0.8150 0.8163 0.8211 0.8181 0.8247
0.8062 0.8128 0.8172 0.8110 0.8251 0.8264
0.7901 0.8244 0.8073 0.8079 0.8044 0.8170
0.8161 0.8194 0.8189 0.8194 0.8176 0.8284
0.8165 0.8143 0.8229 0.8150 0.8152 0.8244
0.8207 0.8152 0.8126 0.8295 0.8161 0.8192

As a rough rule of thumb, we probably want between 5 and 15 classes. (Note: that is my view; texts and instructors vary.) Let's try 5, 10, and 15 and get a rough estimate of the class widths for those numbers of classes.
Possible Class Widths
High value - Low value   Number of classes   Suggested class width   Human adjusted class width
0.0343                   5                   0.0067                  0.0050
0.0343                   10                  0.0034                  0.0050
0.0343                   15                  0.0023                  0.0025

In this case, let's pick 0.0050 as the class width. Why did we pick that number? It fits our results, but more importantly, people can look at a table built with that class width and readily understand it. If we chose something like 0.0067 we could still build a table, but it would be a nightmare for people to comprehend. So our class width is 0.0050 because it fits our data and it's easy to grasp.
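
The rough widths in the Possible Class Widths table come from dividing the range (high value minus low value) by the number of classes. The Python sketch below repeats that arithmetic on the 36 weights listed above; the variable names are my own. Computed from the weights as listed, the range comes out to 0.0394 rather than the 0.0343 shown in the table, so the raw suggested widths are somewhat larger here. Rounding to a convenient width such as 0.0050 remains a judgment call either way.

```python
# The 36 regular Coke weights (pounds) from the data listing.
weights = [
    0.8192, 0.8150, 0.8163, 0.8211, 0.8181, 0.8247,
    0.8062, 0.8128, 0.8172, 0.8110, 0.8251, 0.8264,
    0.7901, 0.8244, 0.8073, 0.8079, 0.8044, 0.8170,
    0.8161, 0.8194, 0.8189, 0.8194, 0.8176, 0.8284,
    0.8165, 0.8143, 0.8229, 0.8150, 0.8152, 0.8244,
    0.8207, 0.8152, 0.8126, 0.8295, 0.8161, 0.8192,
]

# Range = high value - low value.
data_range = max(weights) - min(weights)

# Suggested class width = range / number of classes, rounded to 4 decimals.
suggested = {k: round(data_range / k, 4) for k in (5, 10, 15)}

print(f"range = {data_range:.4f}")
for k, width in suggested.items():
    print(f"{k:>2} classes -> suggested width {width:.4f}")
```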

The next issue is what starting value to use in our table. Our lowest value is 0.7901, so we will start our table at 0.7900. Again, we picked a number that lets us build a table that people can read and comprehend. At this point we have the first class starting at 0.7900 lb and a class width of 0.0050 lb. Given this information, the second class will start at 0.7900 + 0.0050, or 0.7950. The third class will start at 0.7950 + 0.0050, or 0.8000, and you continue this process, generating lower class limits until your classes cover all of the data.

How about upper class limits? In the example, each upper class limit is 0.0001 less than the next lower class limit (the data are recorded to four decimal places). For example, class one has an upper limit of 0.7949 and class two has a lower limit of 0.7950. The idea here is to make it clear where you should count a data point. Going through the data in the appendix one point at a time and assigning each point to a class, you come up with the result shown.
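
The process of generating class limits can be sketched in a few lines of Python. This is only an illustration: the function name class_limits and the SCALE constant are my own, and the sketch assumes the weights are recorded to four decimal places, so it does the arithmetic in whole ten-thousandths of a pound to avoid floating-point rounding surprises.

```python
SCALE = 10_000  # assumption: data recorded to four decimals (0.0001 lb)


def class_limits(start, width, highest):
    """Return (lower, upper) limits for classes covering start..highest.

    Each upper limit is 0.0001 below the next class's lower limit.
    """
    lower = round(start * SCALE)
    step = round(width * SCALE)
    top = round(highest * SCALE)
    limits = []
    while lower <= top:
        upper = lower + step - 1  # one unit (0.0001) below the next lower limit
        limits.append((lower / SCALE, upper / SCALE))
        lower += step
    return limits


# Start at 0.7900, width 0.0050, highest observed weight 0.8295.
for lo, hi in class_limits(0.7900, 0.0050, 0.8295):
    print(f"{lo:.4f}-{hi:.4f}")
```

With these inputs the loop stops after the class that starts at 0.8250, since the next lower limit (0.8300) is above the highest weight.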
Weights of Regular Coke
Class           Frequency
0.7900-0.7949   1
0.7950-0.7999   0
0.8000-0.8049   1
0.8050-0.8099   3
0.8100-0.8149   4
0.8150-0.8199   17
0.8200-0.8249   6
0.8250-0.8299   4
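
The counting step can also be done by machine. The Python sketch below tallies the 36 listed weights into the eight classes; the constant and variable names are my own, and it again assumes four-decimal data so that each weight can be converted to a whole number of ten-thousandths before its class is looked up.

```python
SCALE = 10_000   # assumption: weights recorded to 0.0001 lb
START = 7900     # first lower class limit, 0.7900 lb, in ten-thousandths
WIDTH = 50       # class width, 0.0050 lb, in ten-thousandths
N_CLASSES = 8

weights = [
    0.8192, 0.8150, 0.8163, 0.8211, 0.8181, 0.8247,
    0.8062, 0.8128, 0.8172, 0.8110, 0.8251, 0.8264,
    0.7901, 0.8244, 0.8073, 0.8079, 0.8044, 0.8170,
    0.8161, 0.8194, 0.8189, 0.8194, 0.8176, 0.8284,
    0.8165, 0.8143, 0.8229, 0.8150, 0.8152, 0.8244,
    0.8207, 0.8152, 0.8126, 0.8295, 0.8161, 0.8192,
]

counts = [0] * N_CLASSES
for w in weights:
    # Integer division maps each weight to exactly one class.
    index = (round(w * SCALE) - START) // WIDTH
    if not 0 <= index < N_CLASSES:
        raise ValueError(f"{w} does not fit any class")
    counts[index] += 1

print("Class          Frequency")
for i, count in enumerate(counts):
    lower = START + i * WIDTH
    upper = lower + WIDTH - 1
    print(f"{lower / SCALE:.4f}-{upper / SCALE:.4f}  {count:>9}")
```

Because every weight lands in exactly one class, the frequencies add up to the number of observations, 36.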

What can we tell about the data by looking at the frequency table? The most obvious thing is that almost half of the data is in the class from 0.8150 to 0.8199 lb, and virtually all of the data is between 0.8000 and 0.8299 lb. If we were comparing this data to data for Diet Coke or some version of Pepsi, we would look to see how the other soda's weight data is spread across the weight classes.
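
Statements like "almost half" are easier to compare across sodas when the counts are turned into relative frequencies (each class frequency divided by the total). The Python sketch below does that for the frequencies in the table; the variable names are my own.

```python
labels = [
    "0.7900-0.7949", "0.7950-0.7999", "0.8000-0.8049", "0.8050-0.8099",
    "0.8100-0.8149", "0.8150-0.8199", "0.8200-0.8249", "0.8250-0.8299",
]
freqs = [1, 0, 1, 3, 4, 17, 6, 4]  # frequencies from the table

total = sum(freqs)
relative = [f / total for f in freqs]

print("Class          Freq  Share")
for label, f, r in zip(labels, freqs, relative):
    print(f"{label}  {f:>4}  {r:6.1%}")

# Share of the data in the busiest class (0.8150-0.8199).
modal_share = max(freqs) / total
# Share of the data between 0.8000 and 0.8299 (third class onward).
share_8000_8299 = sum(freqs[2:]) / total

print(f"Busiest class: {modal_share:.1%}")              # 47.2%
print(f"0.8000 to 0.8299: {share_8000_8299:.1%}")       # 97.2%
```

Here 17 of 36 cans (47.2%) fall in the 0.8150 to 0.8199 class, and 35 of 36 (97.2%) fall between 0.8000 and 0.8299.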