Chapters

Measures of Central Tendency
Measures of Spread
Percentiles
How to Find Percentiles
Interpret Percentiles
When to Use Percentiles
Common Percentiles
Boxplot
Interpret Boxplots

The best Maths tutors available

Measures of Central Tendency

In probability theory, there are several ways to analyse a random variable. Recall that the branch of probability is concerned with random phenomena, which can be thought of random variables whose outcomes are unknown.

There are several ways you can analyse a random variable. However, we can split these measures into two general categories: measures of central tendency and spread.

As you can see in the image above, measures of central tendency include the mean, median and mode. As the same suggests, these measures try to capture the centre of the data. The median is especially great to use when the data is skewed, whereas the mode should be used when you’re interested in frequencies.

Measures of Spread

Measures of spread are the second class of measures in probability theory. Their name also gives us a hint as to what they’re used for: measuring the dispersion of the data set. The most common measures include: standard deviation, variance, range, interquartile range/percentiles.

To understand this concept, let’s take a look at two sets of data with the same mean but different standard deviations.

As you can see, while the centre of the data is the same for both sets of data, the spread or location around the centre looks different for both. This is why measures of central tendency and spread are usually reported together, because they put the data into context.

Percentiles

Percentiles are a powerful measure of spread. You will sometimes see percentiles called a measure of location as well. Regardless of which term you decide to use, they can be very helpful in understanding the distribution of your data, especially when you have very large data sets.

As you can see in the image above, percentiles split up the data into different intervals. You can split the data in any interval length, however there are some common percentiles you should know about.

Percentile	Description	Use
Deciles	Each interval contains 10% of the data	Want to have detailed understanding of distribution
Quartiles	Intervals contain 25% of the data	Want quartile characteristics, such as the median
20th, 40th, 60th, 80th Percentiles	Intervals contain 20% of the data	Want a broad overview, can compare top 20% to the bottom 20%

How to Find Percentiles

Finding percentiles is usually very dependent on the software you’re choosing to find it with, as most programs have different formulas or locations in the menu for it. However, if you’re dealing with a simple data set, finding the percentiles by hand can be pretty easy.

Student	Score
1	40
2	45
3	60
4	67
5	75
6	80
7	85
8	88
9	89
10	90
11	91

Say you’re interested in knowing the deciles for this data. There are three main methods you can use to calculate percentiles:

	Notation	Description	Other Names
Greater than	k > percentiles	Find the lowest value greater than k	Excluding
Equal or greater than	k >= percentile	Find the value = k or lowest value greater than k	Including
Weighted average	N/a	Find the two above and calculate the weighted average	Interpolation

Here, we use the second method, which is sometimes known as “including,” as it includes the value at k. Keep in mind that k is equal to the percentiles we want to take. Because we want to take deciles, k would be equal to every 10 values (k=10, k=20, k=30…). Because we have 11 values, it is quite easy: every other value will contain 10 percent of the data.

Percentile	k	Value
10	0.1	45
20	0.2	60
30	0.3	67
40	0.4	75
50	0.5	80
60	0.6	85
70	0.7	88
80	0.8	89
90	0.9	90
100	1	91

This was easy, because we had an odd number of values. Take a look at the image below to clarify the percentiles we found.

Interpret Percentiles

In order to interpret any percentile, let’s take a look at an example.

As you can see in the example, we are using 20th percentiles. This means that each interval, or “bucket,” contains 20% of the data. The cumulative amount that they contain can be seen at each next interval. For example, the 20th percentile contains 20% of the data.

The interval between the 20th and 40th percentile contains 4 data points, which is another 20% of the data. The interval between 0 and 40 contains 40% of the data, or 8 points. At the 80th percentile, this means that the points above it are higher than 80% of the data.

When to Use Percentiles

You should use percentiles when you’re interested in:

Knowing the distribution of your data
Comparing different groups of values within your data

Take a look at the two images below.

quantile_example quantiles_20th

While the range is the same for both sets of data, with each going from 10 to 90, the distribution of the values is very different. In the first image, the bottom 20% of the data is contained between 10 and 33 while the top 20% is contained between 79 and 90. On the other hand, the second image has the bottom 20% between 10 and 44 while the top 20% are between 86 and 90.

Here, we compared the distributions between two data sets and also compared the values between different percentiles.

Common Percentiles

The most common percentiles are often used because they tell us important information about the data. The most common ones are listed below.

	Name	Description	Use
10th	Deciles	Gives a detailed view of the data	Can group each 10th to compare top 10, bottom 30, etc.
20th	-	Useful for broad overviews of the data	Can compare top 20% and bottom 20%
25th	Quartiles	Gives us information about each quarter	Gives the bottom 25%, the median, and top 25%
30th	-	Gives information about each third	Gives a third of the data

Boxplot

Boxplots are common in descriptive analysis: they give us information about the quartiles. The image below shows an example.

The following points on the data correspond to the following quartiles.

	Quartile
A	0th
B	25th
C	50th
D	75th
E	100th

Interpret Boxplots

In order to interpret the boxplot, you should use an example.

This boxplot can be interpreted as the following.

	Quartile	Interpretation
10	0th	Minimum
15	25th	Points above are greater than 25% of the data
40	50th	Median
62	75th	Points above are greater than 75% of the data
85	100th	Maximum

Did you like this article? Rate it!

4.00 (3 rating(s))

Emma

I am passionate about travelling and currently live and work in Paris. I like to spend my time reading, gardening, running, learning languages and exploring new places.

Theory

Solved Problem of Probabilty 8

Mode

Measures of Central Tendency, Position and Dispersion

Solved Problems of Simple and Compound Probability

Solved Problems of Conditional Probability

Confidence Interval

Events

Solved Problem of Probabilty 1

Solved Problem of Probabilty 13

Solved Problem of Probabilty 15

Confidence Interval for the Mean

Contingency Tables

Multiplication Rule

Standard Normal Table

Solved Problem of Probabilty 10

Solved Problem of Probabilty 14

Solved Problem of Probabilty 6

Combinatorics and Probability

Confidence Interval for the Proportion

Normal Distribution

Percentiles

Conditional Probability Word Problems

Using the Z Table

Pie Charts

Normal Approximation

Probability Properties

Solved Problem of Probability 18

Solved Problem of Probabilty 5

Solved Problem of Probability 2

Tree Diagrams

Probability Theory

Median

Bayes’ Theorem

Conditional Probability

Standard Normal Distribution

Law of Total Probability

Solved Problems of Probability 4

Solved Problems of Probability 11

Solved Problems of Probability 17

Solved Problems of Probability 16

Solved Problems of Probability 3

Solved Problem of Probabilty 9

S1 and S2 distributions cheat sheet

Formulas

Probability Formulas

Exercises

Arithmetic Mean Worksheet

Confidence Interval Problems

Arithmetic Mean Problems

Median Worksheet

Normal Distribution Word Problems

Mode Worksheet

Standard Deviation Problems

Probability Worksheet

Probability Word Problems

Leave us a comment

Cancel reply

Dindo cańabano

March 2022

Using facebook account,conduct a survey on the number of sport related activities your friends are involvedin.construct a probability distribution andbcompute the mean variance and standard deviation.indicate the number of your friends you surveyed

Aidarous

October 2021

this page has a lot of advantage, those student who are going to be statitian

Dannah Lynch Gesto

September 2021

I’m a junior high school,500 students were randomly selected.240 liked ice cream,200 liked milk tea and 180 liked both ice cream and milktea

Mary Virnadeth Talagtag

January 2022

A box of Ping pong balls has many different colors in it. There is a 22% chance of getting a blue colored ball. What is the probability that exactly 6 balls are blue out of 15?

Dannah Lynch Gesto

September 2021

Where is the answer??

Mary Virnadeth Talagtag

January 2022

A box of Ping pong balls has many different colors in it. There is a 22% chance of getting a blue colored ball. What is the probability that exactly 6 balls are blue out of 15?

Emmanuel

September 2021

ere is a 60% chance that a final years student would throw a party before leaving school ,taken over 50 student from a total of 150 .calculate for the mean and the variance

Michael Rommen Cipcon

March 2022

There are 4 white balls and 30 blue balls in the basket. If you draw 7 balls from the basket without replacement, what is the probability that exactly 4 of the balls are white?