
Introduction

In this lab you will explore the binomial and Poisson distributions.

The Binomial Distribution

The binomial distribution is used to describe dichotomous data. The probability density function describes the probability of any particular value being observed in terms of density, which for a discrete distribution is simply the probability of that value. The binomial distribution is a two-parameter distribution, which means two values (the number of trials and the probability of success) determine the shape of the distribution.

Visualizing Probability Density

The data you collect is called binomial when each observation has only two possible outcomes (for example YES or NO). The number of successes in a fixed number of trials is approximated by the binomial probability density function (pdf). The pdf is \({\displaystyle P(X=x)={\binom {n}{x}}p^{x}(1-p)^{n-x}}\), where \(n\) is the number of trials, \(x\) is the number of successes, and \(p\) is the probability of success on each trial. It gives the probability of every possible value of \(x\), whether or not that value was actually observed. R has built-in functions for all of the distributions that you will learn about this semester. To learn more, check out the R tutorial page.
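
As a quick check of the formula, the sketch below computes one binomial probability term by term and compares it with R's built-in `dbinom()`. The values n = 50, x = 22, and p = 0.5 are illustrative choices, and the variable names are hypothetical.

```r
# Illustrative values (assumed for this sketch): 50 trials, 22 successes, p = 0.5
n <- 50
x <- 22
p <- 0.5

# The binomial formula written out term by term
by_hand <- choose(n, x) * p^x * (1 - p)^(n - x)

# The same probability from R's built-in function
built_in <- dbinom(x, size = n, prob = p)

print(by_hand)
print(built_in)

# The two results should agree up to floating-point rounding
print(all.equal(by_hand, built_in))
```

`choose(n, x)` is the binomial coefficient, so the first calculation follows the formula directly, while `dbinom()` does the same work in one call.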

Exercise 1: Binomial Distribution Plot

Instructions: Run the code below to see the impact different parameter values have on the shape and position of the distributions. Use the plot to answer the quiz questions.

# Plotting the binomial distribution.
# number of trials, in this case there are 50
trials<-50

# Different probabilities of success
prob1<-0.5
prob2<-0.3
prob3<-0.8


# Plotting code (you don't have to understand this but feel free to check it out)
# values that probabilities are calculated for
X<-seq(0,trials,1)
# This calculates the probability of each number of successes in the X vector
# The probabilities are the y values for our plot
# This is the distribution for X~BIN(n=50,p=0.5)
Y1<-dbinom(X,trials, prob1)
# This is the distribution for X~BIN(n=50,p=0.3)
Y2<-dbinom(X,trials, prob2)
# This is the distribution for X~BIN(n=50,p=0.8)
Y3<-dbinom(X,trials, prob3)

par(mar=c(5.1, 4.1, 4.1, 8.1), xpd=TRUE)
# Draw all three distributions on one shared y-axis so their heights are comparable
ymax<-max(Y1,Y2,Y3)
plot(X,Y1, type = "h", ann=FALSE, col="blue", ylim = c(0,ymax), bty="L")
lines(X,Y2, type = "h", col="red")
lines(X,Y3, type = "h", col="dark green")
title(main = "Binomial Probability Distributions",
      ylab = "Probability",
      xlab = paste("Number of Successes (Trials = ",paste(trials),")",sep = ""))
legend("topright", title="Probability of Success", c(paste(prob1),paste(prob2),paste(prob3)), fill = c("blue","red","dark green"), cex = 0.75, inset = c(-0.2,0))
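
The position of each distribution can also be checked numerically. The following is a minimal sketch, using the same trials and probabilities as the exercise, that finds the most likely number of successes for each probability and compares it with the mean of a binomial distribution, trials × p. The names `probs` and `peaks` are hypothetical.

```r
# Same settings as the exercise: 50 trials and three probabilities of success
trials <- 50
probs <- c(0.5, 0.3, 0.8)
X <- 0:trials

# For each probability, find the number of successes with the highest probability
peaks <- sapply(probs, function(p) X[which.max(dbinom(X, trials, p))])
names(peaks) <- paste("p =", probs)
print(peaks)

# Compare with trials * p, the mean of each distribution
print(trials * probs)
```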

Quiz: Questions 1-3

Quiz

Exercise 2: Calculating Probabilities from Binomial Distribution

Instructions: Before selecting one of the calculation options, move the sliders that change either the probability of success or the number of trials to see how the distribution changes. Then use the appropriate calculation option to answer the quiz questions. (Hint: if you are having a hard time setting specific values with the slider, click the slider and then use the arrow keys to increase or decrease the value.)

How to calculate the different probabilities:

  • To calculate \(P(X<x)\), set the lower bound to \(0\) and the upper bound to \(x-1\). For example, with \(Bin(n=50,p=0.5)\), \(P(X<22)=0.161118\): the lower bound is \(0\) and the upper bound is \(21\) (\(22-1=21\)).
  • To calculate \(P(X\leq x)\), set the lower bound to \(0\) and the upper bound to \(x\). For example, with \(Bin(n=50,p=0.5)\), \(P(X\leq 22)=0.239944\): the lower bound is \(0\) and the upper bound is \(22\).
  • To calculate \(P(X=x)\), set both the lower bound and the upper bound to \(x\). For example, with \(Bin(n=50,p=0.5)\), \(P(X=22)=0.078826\): the lower bound is \(22\) and the upper bound is \(22\).
  • To calculate \(P(x_1\leq X \leq x_2)\), set the lower bound to \(x_1\) and the upper bound to \(x_2\). For example, with \(Bin(n=50,p=0.5)\), \(P(19\leq X \leq 22)=0.20749\): the lower bound is \(19\) and the upper bound is \(22\).

If your results look wrong, review the exercise instructions. You can check that you are doing it right by reproducing the examples.
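
The same four kinds of probability can be computed directly in R. This is a minimal sketch that assumes the \(Bin(n=50,p=0.5)\) example from the list above; `pbinom()` gives cumulative probabilities \(P(X\leq x)\) and `dbinom()` gives the probability of a single value. The variable names are hypothetical.

```r
n <- 50
p <- 0.5

# P(X < 22) is the same as P(X <= 21)
p_less <- pbinom(21, size = n, prob = p)

# P(X <= 22)
p_at_most <- pbinom(22, size = n, prob = p)

# P(X = 22)
p_exact <- dbinom(22, size = n, prob = p)

# P(19 <= X <= 22): add up the probabilities of 19, 20, 21, and 22
p_between <- sum(dbinom(19:22, size = n, prob = p))

print(round(c(p_less, p_at_most, p_exact, p_between), 6))
```

The printed values should line up with the worked examples in the list, which is a handy way to confirm that the bounds were chosen correctly.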

Quiz: Questions 4-6

Instructions: Select “Calculate Probabilities” to answer questions about probability. The “Calculate Percentiles” option will be used in future labs.

Relevant information for the questions below: The prevalence of black lung disease in the general population of coal miners is p=0.17.

Quiz

The Poisson Distribution

The Poisson distribution is used to describe count data over a specific area or time period. The probability density function describes the probability of any particular value being observed in terms of density, which for a discrete distribution is simply the probability of that value. The Poisson distribution is a one-parameter distribution, which means one value, lambda (\(\lambda\)), determines the shape of the distribution. For the Poisson distribution, \(\lambda\) is the mean (a measure of center) and also the variance (a measure of spread); when \(\lambda\) is a whole number, it is also one of the two modes (another measure of center). \(\lambda\) is often described as a rate (number of occurrences per time period), so the birth rate in a hospital could be described as 1.5 per hour. That distribution would be written as \(Pois(\lambda=1.5)\).
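
The claim that \(\lambda\) is both the mean and the variance can be checked numerically. The sketch below assumes the \(Pois(\lambda=1.5)\) hospital example and sums over the counts 0 to 100, a range chosen here because larger counts have negligible probability at this rate. The variable names are hypothetical.

```r
lambda <- 1.5      # rate from the hospital example: 1.5 births per hour
x <- 0:100         # counts above 100 have negligible probability at this rate
probs <- dpois(x, lambda)

# Mean: each count weighted by its probability
pois_mean <- sum(x * probs)

# Variance: squared distance from the mean, weighted by probability
pois_var <- sum((x - pois_mean)^2 * probs)

print(pois_mean)
print(pois_var)
```

Both results should be very close to 1.5, up to floating-point rounding.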

Visualizing Probability Density

Poisson counts can only be non-negative integers (0, 1, 2, and so on) and are defined over a time period or a specific area. Examples are the number of parking citations in a month, or the number of parking citations in a month in Athens. The definition of area is not restricted to geographic definitions. The number of parasites on a single honey bee is also a Poisson count, with the area being defined as the body of a single honey bee.

The count data you collect is approximated by the Poisson probability density function (pdf). The pdf is \(P\left( x \right) = \frac{{e^{ - \lambda } \lambda ^x }}{{x!}}\), where \(x\) is the count and \(\lambda\) is the rate. It gives the probability of every possible value of \(x\), whether or not that value was actually observed. R has built-in functions for all of the distributions that you will learn about this semester.
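
To see the formula at work, the sketch below computes one Poisson probability term by term and compares it with R's built-in `dpois()`. The values \(\lambda=1.5\) and x = 2 are illustrative choices, and the variable names are hypothetical.

```r
# Illustrative values (assumed for this sketch): rate 1.5, count 2
lambda <- 1.5
x <- 2

# The Poisson formula written out term by term
by_hand <- exp(-lambda) * lambda^x / factorial(x)

# The same probability from R's built-in function
built_in <- dpois(x, lambda)

print(by_hand)
print(built_in)

# The two results should agree up to floating-point rounding
print(all.equal(by_hand, built_in))
```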

Exercise 3: Poisson Distribution Plot

Instructions: Run the code and look at the impact of different lambda values.

# Plotting the Poisson distribution.
# largest number of occurrences shown on the plot, in this case 150
k<-150

# Different rates (lambda values)
lam1<-75
lam2<-140
lam3<-20


# Plotting code (you don't have to understand this but feel free to check it out)
# values that probabilities are calculated for
X<-seq(0,k,1)
# This calculates the probability of each count in the X vector
# The probabilities are the y values for our plot
# This is the distribution for X~Pois(lambda=75)
Y1<-dpois(X,lam1)
# This is the distribution for X~Pois(lambda=140)
Y2<-dpois(X,lam2)
# This is the distribution for X~Pois(lambda=20)
Y3<-dpois(X,lam3)

par(mar=c(5.1, 4.1, 4.1, 8.1), xpd=TRUE)
# Draw all three distributions on one shared y-axis so their heights are comparable
ymax<-max(Y1,Y2,Y3)
plot(X,Y1, type = "h", ann=FALSE, col="blue", ylim = c(0,ymax), bty="L")
lines(X,Y2, type = "h", col="red")
lines(X,Y3, type = "h", col="dark green")
title(main = "Poisson Probability Distributions",
      ylab = "Probability",
      xlab = paste("Number of Occurrences (k = ",paste(k),")",sep = ""))
legend("topright", title=expression(paste(lambda," Values")), c(paste(lam1),paste(lam2),paste(lam3)), fill = c("blue","red","dark green"), cex = 0.75, inset = c(-0.2,0))
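
A Poisson count has no upper limit, so a plot over 0 to k shows only part of each distribution. The sketch below, which reuses the k and lambda values from the exercise, uses `ppois()` to estimate how much of each distribution's probability falls inside the plotted range. The names `lams` and `covered` are hypothetical.

```r
# Same settings as the exercise
k <- 150
lams <- c(75, 140, 20)

# P(X <= k) for each lambda: the share of the distribution inside the plot
covered <- ppois(k, lambda = lams)
names(covered) <- paste("lambda =", lams)
print(round(covered, 4))
```

For the two smaller rates essentially all of the probability lies inside the plot, while for \(\lambda=140\) a noticeable share lies above 150, so the right tail of that distribution is cut off.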

Quiz: Questions 7-9

Quiz

Exercise 4: Calculating Probabilities from Poisson Distribution

Instructions: Before selecting one of the calculation options, move the slider that changes lambda (\(\lambda\)) to see how the distribution changes. Then use the appropriate calculation option to answer the quiz questions. (Hint: if you are having a hard time setting specific values with the slider, click the slider and then use the arrow keys to increase or decrease the value.)

How to calculate the different probabilities:

  • To calculate \(P(X<x)\), set the lower bound to \(0\) and the upper bound to \(x-1\). For example, with \(Pois(\lambda=1.5)\), \(P(X<2)=0.557825\): the lower bound is \(0\) and the upper bound is \(1\) (\(2-1=1\)).
  • To calculate \(P(X\leq x)\), set the lower bound to \(0\) and the upper bound to \(x\). For example, with \(Pois(\lambda=1.5)\), \(P(X\leq 2)=0.808847\): the lower bound is \(0\) and the upper bound is \(2\).
  • To calculate \(P(X=x)\), set both the lower bound and the upper bound to \(x\). For example, with \(Pois(\lambda=1.5)\), \(P(X=2)=0.251021\): the lower bound is \(2\) and the upper bound is \(2\).
  • To calculate \(P(x_1\leq X \leq x_2)\), set the lower bound to \(x_1\) and the upper bound to \(x_2\). For example, with \(Pois(\lambda=1.5)\), \(P(1\leq X \leq 2)=0.585717\): the lower bound is \(1\) and the upper bound is \(2\).

If your results look wrong, review the exercise instructions. You can check that you are doing it right by reproducing the examples.
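
The same four kinds of probability can be computed directly in R. This is a minimal sketch that assumes the \(Pois(\lambda=1.5)\) example from the list above; `ppois()` gives cumulative probabilities \(P(X\leq x)\) and `dpois()` gives the probability of a single value. The variable names are hypothetical.

```r
lambda <- 1.5

# P(X < 2) is the same as P(X <= 1)
p_less <- ppois(1, lambda)

# P(X <= 2)
p_at_most <- ppois(2, lambda)

# P(X = 2)
p_exact <- dpois(2, lambda)

# P(1 <= X <= 2): add up the probabilities of 1 and 2
p_between <- sum(dpois(1:2, lambda))

print(round(c(p_less, p_at_most, p_exact, p_between), 6))
```

The printed values should line up with the worked examples in the list.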

Quiz: Questions 10-12

Instructions: Select “Calculate Probabilities” to answer questions about probability. The “Calculate Percentiles” option will be used in future labs.

Relevant information for the questions below: The average rate of car accidents in Athens is 2.6 per day.

Quiz

Summary

In this lab, you completed 4 exercises and answered 12 quiz questions.

The lab covered 2 topics:

  1. The Binomial Distribution
  2. The Poisson Distribution

You are done with the lab, so now you can binge a new show on Netflix! Don’t forget to record your answers and take the eLC quiz to get credit.

Discrete Distributions

Computer Lab 7