Population Definition

In probability theory and statistics, two terms are fundamental to understanding why many of the techniques are used: population and sample. A population is the group of people, places, or things you’re interested in studying. You can find some examples in the table below.

Image	Study Interest	Population
A	Voter preference for people in the UK	All people of voting-age in the UK
B	Trees affected by an infectious disease	All trees in the UK
C	Daily number of cups of tea drunk per person	All cups of tea drunk in the UK

As you can see, populations tend to be enormous. Take the first example, described in image A. According to the ONS, 52.7 million people in the UK were aged 18 and over in 2019. Imagine measuring the voting preference of all those people!

Definition of a Sample

Because populations tend to be enormous, we need a way to estimate the metrics we want to study without measuring every unit or individual in the population. This is where samples come in. A sample is a subset of a population that is used to estimate the true population parameters. Take a look at the table below to see how this works for the examples given above.

Image	Population	Sample
A	All people of voting-age in the UK	500 people of voting age in each region of the UK
B	All trees in the UK	50 trees in each national park
C	All cups of tea drunk in the UK	Cups of tea drunk by 1,000 people in the UK

Types of Samples

There are many different types of samples that you can take from a population. No single type is best, as the right choice depends on the population of interest and the resources available to you. The two main types are probability samples and non-probability samples.

While understanding the intricacies of sampling isn’t essential here, it’s important to know that for probability samples you can apply the inferential tools of probability theory. These tools include:

Confidence intervals
Hypothesis testing

Confidence Interval Definition

As you can see, confidence intervals are part of the inferential tools of probability theory. As discussed, samples can be used to estimate the true population parameter. To understand this, let’s revisit the tea example.

	Composition	Mean Cups per Day	Meaning
Population	All people in the UK who drink tea	3	True value, which can rarely be measured
Sample	A sample of 1,000 tea drinkers	2.2	Estimate of the population value

As you can see in the table above, the population parameter is 3 cups of tea per day per person, whereas the sample gave 2.2 cups. Because we’re estimating the true population value from the sample, we can use a confidence interval to capture the uncertainty in this estimate.

A confidence interval is defined as a range of values that’s likely to contain the true population parameter. It can be calculated for:

Mean
Proportion

Population Proportion

A population proportion is simply the true proportion measured for the population. A proportion is the ratio of a subset of a group in relation to the entire group. The table below illustrates the differences between a sample and population proportion.

	Formula	Example
Population	$p = \frac{M}{N}$	M people who voted pink out of N people in the population
Sample	$\bar{p} = \frac{m}{n}$	m people who voted pink out of n people in the sample
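
To make the two formulas concrete, here is a minimal Python sketch. The population size, the number of pink voters and the sample size are made-up values chosen only for illustration; they do not come from a real survey.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population (made-up numbers): 1 = voted pink, 0 = did not.
N = 100_000          # population size
M = 30_000           # people in the population who voted pink
population = [1] * M + [0] * (N - M)

# Population proportion: p = M / N
p = M / N

# Draw a simple random sample of n people without replacement.
n = 1_000
sample = random.sample(population, n)
m = sum(sample)      # people in the sample who voted pink

# Sample proportion: p_bar = m / n
p_bar = m / n

print(f"population proportion p = {p:.3f}")
print(f"sample proportion p_bar = {p_bar:.3f}")
```

The sample proportion will usually land close to, but not exactly on, the population proportion. That gap is what a confidence interval tries to describe.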

In practice, many people conduct studies on the same variable of interest. Continuing the example above, say five studies were conducted measuring the proportion of people who voted for pink.

The image above illustrates the distribution of these sample proportions. These proportions represent estimates of the true population proportion.
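
The idea of several studies estimating the same proportion can be sketched with a small simulation. The true proportion of 0.30 and the sample size of 1,000 are assumptions made for this example, and `sample_proportion` is a hypothetical helper name.

```python
import random


def sample_proportion(true_p, n, rng):
    """Simulate one study: n independent yes/no answers, return the share of 'yes'."""
    hits = sum(1 for _ in range(n) if rng.random() < true_p)
    return hits / n


rng = random.Random(42)   # fixed seed so the sketch is repeatable
TRUE_P = 0.30             # assumed true population proportion (illustrative only)

# Five independent studies, each surveying 1,000 people.
proportions = [sample_proportion(TRUE_P, 1_000, rng) for _ in range(5)]

for study, p_bar in enumerate(proportions, start=1):
    print(f"study {study}: sample proportion = {p_bar:.3f}")
```

Each study gives a slightly different estimate, and the estimates cluster around the true proportion.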

Confidence Interval for the Proportion

To express how uncertain our estimate of the true population proportion is, we can build a confidence interval. The formula for the confidence interval is the following.

$\text{Confidence Interval} = \bar{p} \pm z \sqrt{\frac{\bar{p}(1-\bar{p})}{n}}$

The table below gives an explanation of each of the elements in the formula.

Element	Description
$\bar{p}$	The sample proportion
z	The z-score for the chosen confidence level
n	The sample size
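
As a rough sketch, the formula can be turned into a small Python function. The function name `proportion_confidence_interval` is chosen here for illustration and is not part of any library.

```python
import math


def proportion_confidence_interval(p_bar, n, z):
    """Return (lower, upper) for p_bar +/- z * sqrt(p_bar * (1 - p_bar) / n)."""
    if not 0 <= p_bar <= 1:
        raise ValueError("p_bar must be between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    # Margin of error: z-score times the standard error of the sample proportion.
    margin = z * math.sqrt(p_bar * (1 - p_bar) / n)
    return p_bar - margin, p_bar + margin


# Illustrative call with round numbers: p_bar = 0.5, n = 100, z = 2.
lower, upper = proportion_confidence_interval(0.5, 100, 2.0)
print(f"{lower:.2f} to {upper:.2f}")
```

With these round numbers the margin is 2 × 0.05 = 0.1, so the interval runs from 0.40 to 0.60.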

This formula results in a range of values above and below the sample proportion that is likely to contain the population parameter. Take the example from before, where we were given a couple of different sample proportions.

As you can see, taking several samples gives us an idea of where the true population parameter might lie. Instead of taking many different samples, a confidence interval built from a single sample can give us a range of values that is likely to include the population proportion.

Confidence Level

The confidence level represents how much certainty you want for your confidence interval. The bigger the confidence level, the more certainty you introduce into your interval - and vice versa. Recall that z-scores are values of the standard normal distribution, which are tabulated in a z-table.

Each z-score is simply a standardized version of the normal value, which in this case would be our proportion. The height of the curve at a z-score, marked on the y-axis, tells us how likely values near that z-score are given the distribution. Each confidence level, which can be thought of as a probability, has a corresponding z-value: the value z for which the area under the standard normal curve between -z and z equals the confidence level. The most common ones are listed below.

Confidence Level	Z-Score
0.95	1.96
0.90	1.645
0.85	1.44
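
If you don’t have a z-table to hand, the z-scores above can be recovered with Python’s standard library. This is a minimal sketch, assuming Python 3.8 or later (where `statistics.NormalDist` is available); `z_for_confidence` is a hypothetical helper name.

```python
from statistics import NormalDist


def z_for_confidence(level):
    """Return the z-score that leaves (1 - level) / 2 in each tail of the standard normal."""
    if not 0 < level < 1:
        raise ValueError("level must be strictly between 0 and 1")
    # The central area equals `level`, so the area to the left of z is (1 + level) / 2.
    return NormalDist().inv_cdf((1 + level) / 2)


for level in (0.95, 0.90, 0.85):
    print(f"{level:.2f} -> z = {z_for_confidence(level):.3f}")
```

The values agree with the table once rounded: the table’s 1.96 and 1.44 are the two-decimal roundings of the z-scores for 0.95 and 0.85.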

Interpretation of Confidence Interval

Let’s continue the example from before. Say that you take a sample of 1,000 people and 320 voted for pink. To find the confidence interval, we first determine n and $\bar{p}$.

Sample size	n	1,000
Sample proportion	$\bar{p}$	320/1000 = 0.32

Next, we simply plug the values into the formula for the confidence interval. Let’s see the difference between confidence intervals at different confidence levels.

Confidence Level	Calculation	Interval	Interpretation
95% Confidence Interval	$0.32 \pm 1.96\sqrt{\frac{0.32(1-0.32)}{1000}}$	0.29, 0.35	We can be 95% confident that the true population proportion of those who voted pink lies between 0.29 and 0.35
85% Confidence Interval	$0.32 \pm 1.44\sqrt{\frac{0.32(1-0.32)}{1000}}$	0.30, 0.34	We can be 85% confident that the true population proportion of those who voted pink lies between 0.30 and 0.34

As you can see, the confidence interval is narrower at a 0.85 confidence level than at 0.95: asking for more confidence means accepting a wider range of values.
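
The worked example can be checked with a few lines of Python. This is only a sketch of the arithmetic, using the sample of 1,000 people with 320 pink voters and the z-scores 1.96 and 1.44 given earlier; the variable names are chosen for illustration.

```python
import math

n = 1_000          # sample size
m = 320            # people in the sample who voted pink
p_bar = m / n      # sample proportion, 0.32

# Standard error of the sample proportion.
standard_error = math.sqrt(p_bar * (1 - p_bar) / n)

# z-scores for the two confidence levels used in the example.
z_scores = {0.95: 1.96, 0.85: 1.44}

intervals = {}
for level, z in z_scores.items():
    margin = z * standard_error
    intervals[level] = (p_bar - margin, p_bar + margin)
    lower, upper = intervals[level]
    print(f"{level:.0%} interval: {lower:.3f} to {upper:.3f} (width {upper - lower:.3f})")
```

The 95% interval comes out at roughly 0.291 to 0.349 and the 85% interval at roughly 0.299 to 0.341, so the lower confidence level gives the narrower interval.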

Emma
