Probability distributions & meteor shower gazing¶

Our goals for today:

Discuss some key statistics topics: samples versus populations and empirical versus theorectical distributions
Simulate a head/tail coin toss and well drilling i.e. Binomial distribution
Simulate meteors entering Earth's atmosphere i.e. Poisson distribution
Simulate geomagnetic polarity reversals i.e. Gamma distribution
Use Gutenberg-Richter to assess earthquake probability

Setup¶

Run this cell as it is to setup your environment.

In [1]:

Copied!





import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from math import factorial
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from math import factorial

Case 1: Flipping a coin¶

Let's pretend we are flipping a coin 10 times using np.random.choice([0, 1]). How many times will be get heads? 1 is heads, 0 is tails. Let's use a for loop and get Python to simulate such a coin flip scenario for us.

What for loops do is take a chunk of code (in Python the chunk that is indented) being run multiple times. In this case, the code will get looped through 10 times -- specified by range(0,10).

In [2]:

Copied!





# np.random.seed(0)
for flip in range(0,10):
    flip_result = np.random.choice([0, 1])
    print(f"Flip {flip} result: {flip_result}")
# np.random.seed(0)
for flip in range(0,10):
    flip_result = np.random.choice([0, 1])
    print(f"Flip {flip} result: {flip_result}")

Flip 0 result: 1
Flip 1 result: 0
Flip 2 result: 0
Flip 3 result: 1
Flip 4 result: 0
Flip 5 result: 0
Flip 6 result: 1
Flip 7 result: 1
Flip 8 result: 0
Flip 9 result: 0

What does np.random.seed(0) do? Try running the code block above without it and with it. What do you notice?

Now let's record how many times the result was heads. We will make a list called flip_results and have it be blank to start. Each time we go through the code we will append the result to the list:

In [3]:

Copied!

flip_results = []

for flip in range(0,10):
    flip_result = np.random.choice([0, 1])
    flip_results.append(flip_result)
    
print(f"{flip_results = }")
flip_results = []

for flip in range(0,10):
    flip_result = np.random.choice([0, 1])
    flip_results.append(flip_result)
    
print(f"{flip_results = }")

flip_results = [0, 1, 0, 1, 0, 1, 0, 0, 1, 1]

We can calculate how many times were heads by taking the sum of the list:

In [4]:

Copied!

## use sum() to calculate the number of heads
number_heads = sum(flip_results)
print(f"{number_heads = }")
## use sum() to calculate the number of heads
number_heads = sum(flip_results)
print(f"{number_heads = }")

number_heads = 5

Now let's flip the coin 10 times and do that 10 times. Each time we flip it, let's record how many heads resulted from the flip.

In [5]:

Copied!





number_heads = []

## do 100 experiments
number_experiments = 100
number_flips = 10
for flip_experiment in range(number_experiments):

    flip_results = []
    
    ## each experiment, do 10 flips
    for flip in range(0,number_flips):
        flip_result = np.random.choice([0, 1])
        flip_results.append(flip_result)

    ## count the number of heads in the current experiment
    current_experiment = sum(flip_results)
    
    ## save the result
    number_heads.append(current_experiment)

## print the number of heads
print(f"{number_heads[:10] = }")
number_heads = []

## do 100 experiments
number_experiments = 100
number_flips = 10
for flip_experiment in range(number_experiments):

    flip_results = []
    
    ## each experiment, do 10 flips
    for flip in range(0,number_flips):
        flip_result = np.random.choice([0, 1])
        flip_results.append(flip_result)

    ## count the number of heads in the current experiment
    current_experiment = sum(flip_results)
    
    ## save the result
    number_heads.append(current_experiment)

## print the number of heads
print(f"{number_heads[:10] = }")

number_heads[:10] = [6, 7, 5, 6, 5, 4, 8, 8, 3, 5]

Let's visualize the distribution of the number of heads.

In [6]:

Copied!





plt.figure()
bins = np.arange(0, number_flips+1, 1)-0.5 # make the bins centered
plt.hist(number_heads, bins=bins, density=True, edgecolor='white')
plt.show()
plt.figure()
bins = np.arange(0, number_flips+1, 1)-0.5 # make the bins centered
plt.hist(number_heads, bins=bins, density=True, edgecolor='white')
plt.show()

No description has been provided for this image

Instead of 10 experiments, let's do 1000 experiments. Plot the histogram of the result and check how the distribution changes.

In [7]:

Copied!





number_heads = []

## do 10 experiments
number_experiments = 1000
number_flips = 10
for flip_experiment in range(number_experiments):

    flip_results = []
    
    ## each experiment, do 10 flips
    for flip in range(0,number_flips):
        flip_result = np.random.choice([0, 1])
        flip_results.append(flip_result)

    ## count the number of heads in the current experiment
    current_experiment = sum(flip_results)
    
    ## save the result
    number_heads.append(current_experiment)

## print the number of heads
print(f"{number_heads[:10] = }")
number_heads = []

## do 10 experiments
number_experiments = 1000
number_flips = 10
for flip_experiment in range(number_experiments):

    flip_results = []
    
    ## each experiment, do 10 flips
    for flip in range(0,number_flips):
        flip_result = np.random.choice([0, 1])
        flip_results.append(flip_result)

    ## count the number of heads in the current experiment
    current_experiment = sum(flip_results)
    
    ## save the result
    number_heads.append(current_experiment)

## print the number of heads
print(f"{number_heads[:10] = }")

number_heads[:10] = [6, 6, 3, 6, 6, 5, 8, 5, 8, 4]

In [8]:

Copied!





plt.figure()
bins = np.arange(0, number_flips+1, 1) - 0.5 # make the bins centered
plt.hist(number_heads, bins=bins, density=True, edgecolor='white')
plt.show()
plt.figure()
bins = np.arange(0, number_flips+1, 1) - 0.5 # make the bins centered
plt.hist(number_heads, bins=bins, density=True, edgecolor='white')
plt.show()

How does the distribution change as the number of experiments increases? Is the center of the distribution changing? Is the width of the distribution changing?

Binomial distribution:¶

Theoretical¶

A relatively straight-forward distribution is the binomial distribution which describes the probability of a particular outcome when there are only two possibilities (yes or no, heads or tails, 1 or 0).

For example, in a coin toss experiment (heads or tails), if we flip the coin $n$ times, what is the probability of getting $x$ 'heads'? We assume that the probability $p$ of a head for any given coin toss is 50%; put another way $p$ = 0.5.

The binomial distribution can be described by an equation:

$$P=f(x,p,n)= \frac{n!}{x!(n-x)!}p^x(1-p)^{n-x}$$

We can look at this kind of distribution by evaluating the probability for getting $x$ 'heads' out of $n$ attempts. We'll code the equation as a function, and calculate the probability $P$ of a particular outcome (e.g., $x$ heads in $n$ attempts).

Note that for a coin toss, $p$ is 0.5, but other yes/no questions can be investigated as well (e.g., chance of finding a fossil in a sedimentary layer; whether or not a landslide occurs following an earthquake).

In [9]:

Copied!





def binomial_probability(number_positive,positive_rate,number_attempts):
    """
    This function computes the probability of getting x particular outcomes (heads) in n attempts, where p is the 
    probability of a particular outcome (head) for any given attempt (coin toss).
    
    Parameters
    ----------
    number_positive : number of a particular outcome
    positive_rate : probability of that outcome in a given attempt
    number_attempts : number of attempts
    
    Returns
    ---------
    prob : probability of that number of the given outcome occuring in that number of attempts
    """
    x = number_positive
    p = positive_rate
    n = number_attempts

    ## compute the binomial probability following the equation above
    prob = factorial(n)/(factorial(x)*factorial(n-x))*(p**x)*((1-p)**(n-x))

    return prob
def binomial_probability(number_positive,positive_rate,number_attempts):
    """
    This function computes the probability of getting x particular outcomes (heads) in n attempts, where p is the 
    probability of a particular outcome (head) for any given attempt (coin toss).
    
    Parameters
    ----------
    number_positive : number of a particular outcome
    positive_rate : probability of that outcome in a given attempt
    number_attempts : number of attempts
    
    Returns
    ---------
    prob : probability of that number of the given outcome occuring in that number of attempts
    """
    x = number_positive
    p = positive_rate
    n = number_attempts

    ## compute the binomial probability following the equation above
    prob = factorial(n)/(factorial(x)*factorial(n-x))*(p**x)*((1-p)**(n-x))

    return prob

We can use this function to calculate the probability of getting 10 heads ($x=10$) when there are 10 coin tosses ($n=10$) given with the $p$ (probability) of 0.5.

In [10]:

Copied!

binomial_probability(10,0.5,10)
binomial_probability(10,0.5,10)

Out[10]:

0.0009765625

Let's calculate the probability of getting [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] heads.

In [11]:

Copied!

## create an array of the number of heads we want to calculate the probability for
x = np.arange(11)
print(f"{x = }")
## create an array of the number of heads we want to calculate the probability for
x = np.arange(11)
print(f"{x = }")

x = array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [12]:

Copied!





prob_heads = 0.5
number_flips = 10

probabilities = []
for number_head in x:
    prob = binomial_probability(number_head,prob_heads,number_flips)
    probabilities.append(prob)
    
    print(f"Number of heads: {number_head}, probability: {prob:.5f}")
prob_heads = 0.5
number_flips = 10

probabilities = []
for number_head in x:
    prob = binomial_probability(number_head,prob_heads,number_flips)
    probabilities.append(prob)
    
    print(f"Number of heads: {number_head}, probability: {prob:.5f}")

Number of heads: 0, probability: 0.00098
Number of heads: 1, probability: 0.00977
Number of heads: 2, probability: 0.04395
Number of heads: 3, probability: 0.11719
Number of heads: 4, probability: 0.20508
Number of heads: 5, probability: 0.24609
Number of heads: 6, probability: 0.20508
Number of heads: 7, probability: 0.11719
Number of heads: 8, probability: 0.04395
Number of heads: 9, probability: 0.00977
Number of heads: 10, probability: 0.00098

These probabilities are the theoretical probabilities of getting a particular number of heads in 10 coin flips. Let's verify that the histogram of the number of heads in 1000 coin flips matches the theoretical probabilities.

Make a plot where you both plot the histogram from 1000 coin flips (using plt.hist() with density=True) and you plot the results head_numbers probabilities (using plt.plot()).

In [13]:

Copied!





plt.figure(figsize=(10,5))

## plot the histogram from 1000 coin flips
bins = np.arange(0, number_flips+1, 1)-0.5 # make the bins centered
plt.hist(number_heads, bins=bins, density=True, edgecolor='white', label='Simulated')

## plot the theoretical probabilities calculated based on the binomial distribution
plt.plot(x, probabilities, 'o-', color='red', lw=2, label='Theoretical')

plt.xlabel(f'Number of heads out of {number_flips} attempts')
plt.ylabel('Fraction of times with this number of heads') 

plt.title(f'Coin flip results (n={number_flips})')

plt.legend()
plt.show()
plt.figure(figsize=(10,5))

## plot the histogram from 1000 coin flips
bins = np.arange(0, number_flips+1, 1)-0.5 # make the bins centered
plt.hist(number_heads, bins=bins, density=True, edgecolor='white', label='Simulated')

## plot the theoretical probabilities calculated based on the binomial distribution
plt.plot(x, probabilities, 'o-', color='red', lw=2, label='Theoretical')

plt.xlabel(f'Number of heads out of {number_flips} attempts')
plt.ylabel('Fraction of times with this number of heads') 

plt.title(f'Coin flip results (n={number_flips})')

plt.legend()
plt.show()

How does the histogram of the number of heads in 1000 coin flips match the theoretical probabilities? Does this imply that the theoretical distribution i.e., the binomial distribution, is a good model for the coin flip experiment?

Empirical¶

The type of sampling we were doing above where we were flipping coins is called a Monte Carlo simulation. We can use simulate data from all sorts of distributions. Let's keep focusing on the binomial distribution and look at using the np.random.binomial function.

In [14]:

Copied!

help(np.random.binomial)
help(np.random.binomial)

Help on built-in function binomial:

binomial(...) method of numpy.random.mtrand.RandomState instance
    binomial(n, p, size=None)
    
    Draw samples from a binomial distribution.
    
    Samples are drawn from a binomial distribution with specified
    parameters, n trials and p probability of success where
    n an integer >= 0 and p is in the interval [0,1]. (n may be
    input as a float, but it is truncated to an integer in use)
    
    .. note::
        New code should use the `~numpy.random.Generator.binomial`
        method of a `~numpy.random.Generator` instance instead;
        please see the :ref:`random-quick-start`.
    
    Parameters
    ----------
    n : int or array_like of ints
        Parameter of the distribution, >= 0. Floats are also accepted,
        but they will be truncated to integers.
    p : float or array_like of floats
        Parameter of the distribution, >= 0 and <=1.
    size : int or tuple of ints, optional
        Output shape.  If the given shape is, e.g., ``(m, n, k)``, then
        ``m * n * k`` samples are drawn.  If size is ``None`` (default),
        a single value is returned if ``n`` and ``p`` are both scalars.
        Otherwise, ``np.broadcast(n, p).size`` samples are drawn.
    
    Returns
    -------
    out : ndarray or scalar
        Drawn samples from the parameterized binomial distribution, where
        each sample is equal to the number of successes over the n trials.
    
    See Also
    --------
    scipy.stats.binom : probability density function, distribution or
        cumulative density function, etc.
    random.Generator.binomial: which should be used for new code.
    
    Notes
    -----
    The probability density for the binomial distribution is
    
    .. math:: P(N) = \binom{n}{N}p^N(1-p)^{n-N},
    
    where :math:`n` is the number of trials, :math:`p` is the probability
    of success, and :math:`N` is the number of successes.
    
    When estimating the standard error of a proportion in a population by
    using a random sample, the normal distribution works well unless the
    product p*n <=5, where p = population proportion estimate, and n =
    number of samples, in which case the binomial distribution is used
    instead. For example, a sample of 15 people shows 4 who are left
    handed, and 11 who are right handed. Then p = 4/15 = 27%. 0.27*15 = 4,
    so the binomial distribution should be used in this case.
    
    References
    ----------
    .. [1] Dalgaard, Peter, "Introductory Statistics with R",
           Springer-Verlag, 2002.
    .. [2] Glantz, Stanton A. "Primer of Biostatistics.", McGraw-Hill,
           Fifth Edition, 2002.
    .. [3] Lentner, Marvin, "Elementary Applied Statistics", Bogden
           and Quigley, 1972.
    .. [4] Weisstein, Eric W. "Binomial Distribution." From MathWorld--A
           Wolfram Web Resource.
           http://mathworld.wolfram.com/BinomialDistribution.html
    .. [5] Wikipedia, "Binomial distribution",
           https://en.wikipedia.org/wiki/Binomial_distribution
    
    Examples
    --------
    Draw samples from the distribution:
    
    >>> n, p = 10, .5  # number of trials, probability of each trial
    >>> s = np.random.binomial(n, p, 1000)
    # result of flipping a coin 10 times, tested 1000 times.
    
    A real world example. A company drills 9 wild-cat oil exploration
    wells, each with an estimated probability of success of 0.1. All nine
    wells fail. What is the probability of that happening?
    
    Let's do 20,000 trials of the model, and count the number that
    generate zero positive results.
    
    >>> sum(np.random.binomial(9, 0.1, 20000) == 0)/20000.
    # answer = 0.38885, or 38%.

np.random.binomial( ) requires 2 parameters, $n$ and $p$, with an optional keyword argument size (if size is not specified, it returns a single trial). We could have used this function earlier to get the number of heads that were flipped, but the way we did it also worked.

Let's follow the example the is given in the np.random.binomial( ) docstring.

A company drills 9 wild-cat oil exploration wells (high risk drilling in unproven areas), each with an estimated probability of success of 0.1. All nine wells fail. What is the probability of that happening?

Note that success in this context means that liquid hydocarbons came out of the well. In reality, you may not consider this a success given that the result is that more hydrocarbons will be combusted as a result, leading to higher atmospheric carbon dioxide levels and associated global warming.

If we do np.random.binomial(9, 0.1, 100) we will get a list of 100 values that represent the number of wells that yielded oil when there is a 10% (p = 0.1) chance of each individual well yielding oil.

In [15]:

Copied!

success_rate = 0.1
number_wells = 9
number_simulations = 100

number_success = np.random.binomial(number_wells, success_rate, number_simulations)

print(f"{number_success[:10] = }")
success_rate = 0.1
number_wells = 9
number_simulations = 100

number_success = np.random.binomial(number_wells, success_rate, number_simulations)

print(f"{number_success[:10] = }")

number_success[:10] = array([3, 0, 2, 0, 0, 2, 0, 3, 1, 0])

In [16]:

Copied!





## Count the number of failures
number_failures = np.sum(number_success == 0)

## Calculate the failure rate
failure_rate = number_failures/number_simulations

print(f"The number of failures is {number_failures} out of {number_simulations} simulations")
print(f"The failure rate is {failure_rate:.2f}")
## Count the number of failures
number_failures = np.sum(number_success == 0)

## Calculate the failure rate
failure_rate = number_failures/number_simulations

print(f"The number of failures is {number_failures} out of {number_simulations} simulations")
print(f"The failure rate is {failure_rate:.2f}")

The number of failures is 36 out of 100 simulations
The failure rate is 0.36

We can write a function that uses this process to simulate fraction of times that there no successful wells for a given number of wells, a given probability and a given number of simulations;

In [17]:

Copied!





def wildcat_failure_rate(number_wells,success_rate,number_simulations):
    '''
    Simulate the number of times that there are no successful wells for a given number of wells and a given probability for each well.
    
    Parameters
    ----------
    number_wells : number of wells drilled in each simulation
    success_rate : probability that each well will be successful
    number_simulations : number of times that drilling number_wells is simulated
    '''
    
    simulations = np.random.binomial(number_wells, success_rate, number_simulations)
    number_failures = np.sum(simulations == 0)
    failure_rate = number_failures/number_simulations

    return failure_rate
def wildcat_failure_rate(number_wells,success_rate,number_simulations):
    '''
    Simulate the number of times that there are no successful wells for a given number of wells and a given probability for each well.
    
    Parameters
    ----------
    number_wells : number of wells drilled in each simulation
    success_rate : probability that each well will be successful
    number_simulations : number of times that drilling number_wells is simulated
    '''
    
    simulations = np.random.binomial(number_wells, success_rate, number_simulations)
    number_failures = np.sum(simulations == 0)
    failure_rate = number_failures/number_simulations

    return failure_rate

Put the wildcat_failure_rate function to use

Use the function to simulate the failure rate for the above scenario (9 wells drilled, 0.1 probability of success for each well) and do it for 10 simulations

In [18]:

Copied!

number_wells = 9
success_rate = 0.1
number_simulations = 10

failure_rate = wildcat_failure_rate(number_wells,success_rate,number_simulations)

print(f"The failure rate is {failure_rate:.2f}")
number_wells = 9
success_rate = 0.1
number_simulations = 10

failure_rate = wildcat_failure_rate(number_wells,success_rate,number_simulations)

print(f"The failure rate is {failure_rate:.2f}")

The failure rate is 0.40

Use the function to simulate the failure rate for the same scenario for 1000 simulations

In [19]:

Copied!

number_wells = 9
success_rate = 0.1
number_simulations = 1000

failure_rate = wildcat_failure_rate(number_wells,success_rate,number_simulations)

print(f"The failure rate is {failure_rate:.2f}")
number_wells = 9
success_rate = 0.1
number_simulations = 1000

failure_rate = wildcat_failure_rate(number_wells,success_rate,number_simulations)

print(f"The failure rate is {failure_rate:.2f}")

The failure rate is 0.39

Use the function to simulate the failure rate for 100,000 simulations

In [20]:

Copied!

number_wells = 9
success_rate = 0.1
number_simulations = 100000

failure_rate = wildcat_failure_rate(number_wells,success_rate,number_simulations)

print(f"The failure rate is {failure_rate:.2f}")
number_wells = 9
success_rate = 0.1
number_simulations = 100000

failure_rate = wildcat_failure_rate(number_wells,success_rate,number_simulations)

print(f"The failure rate is {failure_rate:.2f}")

The failure rate is 0.39

Put the binomial_probability function to use

Instead of the examples above we are simulating the result, we could directly use the theoretical binomial_probability distribution to calculate the probability without doing many simulations.

In [21]:

Copied!

## calculate the probability of 0 successful wells, which is the failure rate
failure_rate = binomial_probability(number_positive=0,positive_rate=0.1,number_attempts=9)

print(f"The failure rate is {failure_rate:.2f}")
## calculate the probability of 0 successful wells, which is the failure rate
failure_rate = binomial_probability(number_positive=0,positive_rate=0.1,number_attempts=9)

print(f"The failure rate is {failure_rate:.2f}")

The failure rate is 0.39

How well does the calculated binomial_probability match the simulated wildcat_failure rates? How many times do you need to simulate the problem to get a number that matches the theoretical probability?

Poisson distribution:¶

A Poisson Distribution gives the probability of a number of events in an interval generated by a Poisson process: the average time between events is known, but the exact timing of events is random. The events must be independent and may occur only one at a time.

Within Earth and Planetary Science there are many processes that approximately meet this criteria.

Theoretical¶

The Poisson distribution gives the probability that an event (with two possible outcomes) occurs $k$ number of times in an interval of time where $\lambda$ is the expected rate of occurance. The Poisson distribution is the limit of the binomial distribution for large $n$. So if you take the limit of the binomial distribution as $n \rightarrow \infty$ you'll get the Poisson distribution:

$$P(k) = e^{-\lambda}\frac{\lambda^{k}}{k!}$$

In [22]:

Copied!





def poisson_probability(k,lamb):
    """
    This function computes the probability of getting k particular outcomes when the expected rate is lambda.
    """
    
    ## compute the poisson probability of getting k outcomes when the expected rate is lambda
    prob = (np.exp(-1*lamb))*(lamb**k)/factorial(k)
    
    return prob
def poisson_probability(k,lamb):
    """
    This function computes the probability of getting k particular outcomes when the expected rate is lambda.
    """
    
    ## compute the poisson probability of getting k outcomes when the expected rate is lambda
    prob = (np.exp(-1*lamb))*(lamb**k)/factorial(k)
    
    return prob

Observing meteors¶

From https://www.amsmeteors.org/meteor-showers/meteor-faq/:

How big are most meteoroids? How fast do they travel? The majority of visible meteors are caused by particles ranging in size from about that of a small pebble down to a grain of sand, and generally weigh less than 1-2 grams. Those of asteroid origin can be composed of dense stony or metallic material (the minority) while those of cometary origin (the majority) have low densities and are composed of a “fluffy” conglomerate of material, frequently called a “dustball.” The brilliant flash of light from a meteor is not caused so much by the meteoroid’s mass, but by its high level of kinetic energy as it collides with the atmosphere.

Meteors enter the atmosphere at speeds ranging from 11 km/sec (25,000 mph), to 72 km/sec (160,000 mph!). When the meteoroid collides with air molecules, its high level of kinetic energy rapidly ionizes and excites a long, thin column of atmospheric atoms along the meteoroid’s path, creating a flash of light visible from the ground below. This column, or meteor trail, is usually less than 1 meter in diameter, but will be tens of kilometers long.

The wide range in meteoroid speeds is caused partly by the fact that the Earth itself is traveling at about 30 km/sec (67,000 mph) as it revolves around the sun. On the evening side, or trailing edge of the Earth, meteoroids must catch up to the earth’s atmosphere to cause a meteor, and tend to be slow. On the morning side, or leading edge of the earth, meteoroids can collide head-on with the atmosphere and tend to be fast.

What is a meteor shower? Does a shower occur “all at once” or over a period of time? Most meteor showers have their origins with comets. Each time a comet swings by the sun, it produces copious amounts of meteoroid sized particles which will eventually spread out along the entire orbit of the comet to form a meteoroid “stream.” If the Earth’s orbit and the comet’s orbit intersect at some point, then the Earth will pass through this stream for a few days at roughly the same time each year, encountering a meteor shower. The only major shower clearly shown to be non-cometary is the Geminid shower, which share an orbit with the asteroid (3200 Phaethon): one that comes unusually close to the sun as well as passing through the earth’s orbit. Most shower meteoroids appear to be “fluffy”, but the Geminids are much more durable as might be expected from asteroid fragments.

Observing the Southern Taurids meteor shower¶

Let's say you are planning a camping trip to go out try to see shooting stars next Fall in a rural location. You are looking at a date in October and that there is an active shower:

Southern Taurids

Active from September 28th to December 2, 2021. The peak is November 4-5, 2021

The Southern Taurids are a long-lasting shower that reaches a barely noticeable maximum on October 9 or 10. The shower is active for more than two months but rarely produces more than five shower members per hour, even at maximum activity. The Taurids (both branches) are rich in fireballs and are often responsible for increased number of fireball reports from September through November. https://www.amsmeteors.org/meteor-showers/meteor-shower-calendar/

At a rate of 5 observed meteors per hour, what is the probability of observing 6 meteors in an hour?

We can use the Poisson probability function to answer this question:

In [23]:

Copied!





lamb = 5
k = 6
prob = poisson_probability(k,lamb)
print(f"The probability of getting {k} outcomes when the expected rate is {lamb} is {prob*100:.3f}%")
lamb = 5
k = 6
prob = poisson_probability(k,lamb)
print(f"The probability of getting {k} outcomes when the expected rate is {lamb} is {prob*100:.3f}%")

The probability of getting 6 outcomes when the expected rate is 5 is 14.622%

So that result tells us that there is a 14.6% chance of observing exactly 6, but it would be much more helpful to be able to visualize the probability distribution. So let's go through and calculate the probability of seeing any number between 0 and 10. First, we can make an array between 0 and 11:

In [24]:

Copied!





## Let's calcualte the theoretical probability of seeing different numbers of meteors when the expected rate is 5 (Southern Taurids)
taurid_meteor_rate = 5
number_meteors_seen = np.arange(0,11)
taurid_meteor_sighting_probability = []

for n in number_meteors_seen:
    prob = poisson_probability(n,taurid_meteor_rate)
    taurid_meteor_sighting_probability.append(prob)
    print(f"The probability to see {n} meteors is {prob:.3f}")
## Let's calcualte the theoretical probability of seeing different numbers of meteors when the expected rate is 5 (Southern Taurids)
taurid_meteor_rate = 5
number_meteors_seen = np.arange(0,11)
taurid_meteor_sighting_probability = []

for n in number_meteors_seen:
    prob = poisson_probability(n,taurid_meteor_rate)
    taurid_meteor_sighting_probability.append(prob)
    print(f"The probability to see {n} meteors is {prob:.3f}")

The probability to see 0 meteors is 0.007
The probability to see 1 meteors is 0.034
The probability to see 2 meteors is 0.084
The probability to see 3 meteors is 0.140
The probability to see 4 meteors is 0.175
The probability to see 5 meteors is 0.175
The probability to see 6 meteors is 0.146
The probability to see 7 meteors is 0.104
The probability to see 8 meteors is 0.065
The probability to see 9 meteors is 0.036
The probability to see 10 meteors is 0.018

Based on the probability of seeing 0 meteors, what is the theoretical probability of seeing at least 1 meteor?

In [25]:

Copied!

prob_w_meteor_sighting = 1 - taurid_meteor_sighting_probability[0]
print(f"The probability of seeing at least 1 meteor is {prob_w_meteor_sighting*100:.3f}% at the Southern Taurids rate of {taurid_meteor_rate} meteors per hour")
prob_w_meteor_sighting = 1 - taurid_meteor_sighting_probability[0]
print(f"The probability of seeing at least 1 meteor is {prob_w_meteor_sighting*100:.3f}% at the Southern Taurids rate of {taurid_meteor_rate} meteors per hour")

The probability of seeing at least 1 meteor is 99.326% at the Southern Taurids rate of 5 meteors per hour

In [26]:

Copied!





## Plot the probability distribution
plt.figure()
plt.plot(number_meteors_seen,taurid_meteor_sighting_probability,label='Southern Taurids ($\lambda = 5$)')
plt.xlabel('Number of meteors seen')
plt.ylabel('Probability')
plt.legend()
plt.show()
## Plot the probability distribution
plt.figure()
plt.plot(number_meteors_seen,taurid_meteor_sighting_probability,label='Southern Taurids ($\lambda = 5$)')
plt.xlabel('Number of meteors seen')
plt.ylabel('Probability')
plt.legend()
plt.show()

When there is not an active shower the background meteor rate is about 2 an hour (although it is variable depending on time of night and season; see more here: https://www.amsmeteors.org/meteor-showers/meteor-faq/).

Let's update our code to calculate the probability of seeing different numbers of meteors when the background rate is 2 an hour (lambda = 2).

Calculate the probability of seeing different numbers of meteors when the background rate is 2 an hour (lambda = 2).
Plot these probabilities alongside the probability of seeing those same numbers during the Southern Taurids shower.

In [27]:

Copied!

## Let's calcualte the theoretical probability of seeing different numbers of meteors when the background rate is 2 an hour

background_meteor_rate = 2
number_meteors_seen = np.arange(0,11)
background_sighting_probability = []

for n in number_meteors_seen:
    
    prob = poisson_probability(n, background_meteor_rate)
    background_sighting_probability.append(prob)

    print(f"The probability to see {n} meteors is {prob:.3f}")
## Let's calcualte the theoretical probability of seeing different numbers of meteors when the background rate is 2 an hour

background_meteor_rate = 2
number_meteors_seen = np.arange(0,11)
background_sighting_probability = []

for n in number_meteors_seen:
    
    prob = poisson_probability(n, background_meteor_rate)
    background_sighting_probability.append(prob)

    print(f"The probability to see {n} meteors is {prob:.3f}")

The probability to see 0 meteors is 0.135
The probability to see 1 meteors is 0.271
The probability to see 2 meteors is 0.271
The probability to see 3 meteors is 0.180
The probability to see 4 meteors is 0.090
The probability to see 5 meteors is 0.036
The probability to see 6 meteors is 0.012
The probability to see 7 meteors is 0.003
The probability to see 8 meteors is 0.001
The probability to see 9 meteors is 0.000
The probability to see 10 meteors is 0.000

Based on the probability of seeing 0 meteors, what is the theoretical probability of seeing at least 1 meteor?

In [28]:

Copied!

prob_w_meteor_sighting = 1 - background_sighting_probability[0]
print(f"The probability of seeing at least 1 meteor is {prob_w_meteor_sighting*100:.3f}% at the background rate of {background_meteor_rate} meteors per hour")
prob_w_meteor_sighting = 1 - background_sighting_probability[0]
print(f"The probability of seeing at least 1 meteor is {prob_w_meteor_sighting*100:.3f}% at the background rate of {background_meteor_rate} meteors per hour")

The probability of seeing at least 1 meteor is 86.466% at the background rate of 2 meteors per hour

In [29]:

Copied!





## Plot the probability distribution
plt.figure()
plt.plot(number_meteors_seen,taurid_meteor_sighting_probability,label='Southern Taurids ($\lambda = 5$)')
plt.plot(number_meteors_seen,background_sighting_probability,label='Background meteors ($\lambda = 2$)')
plt.ylabel('Probability')
plt.xlabel('Number of meteors seen')
plt.legend()
plt.show()
## Plot the probability distribution
plt.figure()
plt.plot(number_meteors_seen,taurid_meteor_sighting_probability,label='Southern Taurids ($\lambda = 5$)')
plt.plot(number_meteors_seen,background_sighting_probability,label='Background meteors ($\lambda = 2$)')
plt.ylabel('Probability')
plt.xlabel('Number of meteors seen')
plt.legend()
plt.show()

Simulate meteor observing¶

There are many cases where it can be useful to simulate data sets. In this case, one could simulate what your experience could be in terms of the number of hours you could spend looking at the night sky and seeing 1 meteor or more on a normal night vs. a night with the Southern Taurids shower ongoing.

We can use the np.random.poisson function to simulate 'realistic' data.

Recall the Poisson distribution represents the probability of observing $k$ events when the expected rate is $\lambda$.

The Poisson distribution assumes that the events are independent and occur at a constant rate. Each call to np.random.poisson( ) is doing an experiment following the Poisson distribution to see how many events occur.

Let's try it with $\lambda = 2$ (the background rate) and watch the sky for 1000 hours.

In [30]:

Copied!





lamb = 2
number_hours = 1000
number_meteors = []

for hour in np.arange(number_hours):
    number_meteor_per_hour = np.random.poisson(lamb)
    number_meteors.append(number_meteor_per_hour)

## Calculate the fraction of hours watched with at least 1 meteor sighting
number_meteors = np.array(number_meteors)
number_hours_w_meteor_sighting = np.sum(number_meteors >= 1) # at least 1 meteor sighting
prob_w_meteor_sighting = number_hours_w_meteor_sighting/number_hours

print(f"The probability of seeing at least 1 meteor per hour is {prob_w_meteor_sighting*100:.3f}% from {number_hours} hours of watching")
lamb = 2
number_hours = 1000
number_meteors = []

for hour in np.arange(number_hours):
    number_meteor_per_hour = np.random.poisson(lamb)
    number_meteors.append(number_meteor_per_hour)

## Calculate the fraction of hours watched with at least 1 meteor sighting
number_meteors = np.array(number_meteors)
number_hours_w_meteor_sighting = np.sum(number_meteors >= 1) # at least 1 meteor sighting
prob_w_meteor_sighting = number_hours_w_meteor_sighting/number_hours

print(f"The probability of seeing at least 1 meteor per hour is {prob_w_meteor_sighting*100:.3f}% from {number_hours} hours of watching")

The probability of seeing at least 1 meteor per hour is 85.300% from 1000 hours of watching

Do the same meteor watching simulation with $\lambda = 5$ (the Southern Taurids rate).

In [31]:

Copied!





lamb = 5
number_hours = 1000
number_meteors = []

for hour in np.arange(number_hours):
    number_meteor_per_hour = np.random.poisson(lamb)
    number_meteors.append(number_meteor_per_hour)

## Calculate the fraction of hours watched with at least 1 meteor sighting
number_meteors = np.array(number_meteors)
number_hours_w_meteor_sighting = np.sum(number_meteors >= 1) # at least 1 meteor sighting        
prob_w_meteor_sighting = number_hours_w_meteor_sighting/number_hours

print(f"The probability of seeing at least 1 meteor per hour is {prob_w_meteor_sighting*100:.3f}% from {number_hours} hours of watching")
lamb = 5
number_hours = 1000
number_meteors = []

for hour in np.arange(number_hours):
    number_meteor_per_hour = np.random.poisson(lamb)
    number_meteors.append(number_meteor_per_hour)

## Calculate the fraction of hours watched with at least 1 meteor sighting
number_meteors = np.array(number_meteors)
number_hours_w_meteor_sighting = np.sum(number_meteors >= 1) # at least 1 meteor sighting        
prob_w_meteor_sighting = number_hours_w_meteor_sighting/number_hours

print(f"The probability of seeing at least 1 meteor per hour is {prob_w_meteor_sighting*100:.3f}% from {number_hours} hours of watching")

The probability of seeing at least 1 meteor per hour is 99.300% from 1000 hours of watching

The above estimate is based on Monte Carlo simulation of hundreds and thousands of hours. How does it compare to the theoretical probability?

Hint: If your results are not similar, try increasing the number of hours simulated.

Extension: Earthquake Probability¶

The occurrence of earthquakes is also a Poisson process, events occur randomly in time, and the average recurrence can be determined from Gutenberg-Richter.

The Gutenberg-Richter statistic which gives the annual rate of earthquakes above a given magnitude.

Assumeing the Gutenberg-Richter relationshipfor the San Franciso Bay Area: log10(N)= 3.266 - 0.797M, for each year.

Let's apply the Poisson distribution to this problem.

So $\lambda = N * {\Delta}time$, where N is the annual rate. It is common to consider ${\Delta}time=30 yrs$.

Use the Poisson's distribution to find the probability of 0 to 9 M5 events in a 30 year period.

In [32]:

Copied!





## Calculate the expected number of M5 earthquakes in a year
M = 5
logN = 3.266 - 0.797 * M
N = 10**logN
print(f"The expected number of M5 earthquakes in 1 year is {N:.2f}")

## Calculate the expected number of M5 earthquakes in 30 years
lamb = N * 30
print(f"The expected number of M5 earthquakes in 30 years is {lamb:.2f}")

## Calculate the probability of observing 0-9 M5 earthquake in a year
number_earthquakes = np.arange(0,10)
earthquake_probability = []
for n in number_earthquakes:
    prob = poisson_probability(n,lamb)
    earthquake_probability.append(prob)
    print(f"The probability of observing {n} M5 earthquakes in a year is {prob*100:.3f}%")
## Calculate the expected number of M5 earthquakes in a year
M = 5
logN = 3.266 - 0.797 * M
N = 10**logN
print(f"The expected number of M5 earthquakes in 1 year is {N:.2f}")

## Calculate the expected number of M5 earthquakes in 30 years
lamb = N * 30
print(f"The expected number of M5 earthquakes in 30 years is {lamb:.2f}")

## Calculate the probability of observing 0-9 M5 earthquake in a year
number_earthquakes = np.arange(0,10)
earthquake_probability = []
for n in number_earthquakes:
    prob = poisson_probability(n,lamb)
    earthquake_probability.append(prob)
    print(f"The probability of observing {n} M5 earthquakes in a year is {prob*100:.3f}%")

The expected number of M5 earthquakes in 1 year is 0.19
The expected number of M5 earthquakes in 30 years is 5.73
The probability of observing 0 M5 earthquakes in a year is 0.325%
The probability of observing 1 M5 earthquakes in a year is 1.861%
The probability of observing 2 M5 earthquakes in a year is 5.332%
The probability of observing 3 M5 earthquakes in a year is 10.183%
The probability of observing 4 M5 earthquakes in a year is 14.587%
The probability of observing 5 M5 earthquakes in a year is 16.715%
The probability of observing 6 M5 earthquakes in a year is 15.962%
The probability of observing 7 M5 earthquakes in a year is 13.065%
The probability of observing 8 M5 earthquakes in a year is 9.357%
The probability of observing 9 M5 earthquakes in a year is 5.957%

In [33]:

Copied!





## Plot the probability distribution
plt.figure()
plt.plot(number_earthquakes,earthquake_probability,label='M5 earthquakes ($\lambda = 10^{3.266-0.797M}$)')
plt.ylabel('Probability')
plt.xlabel('Number of M5 earthquakes in 30 years')
plt.legend()
plt.show()
## Plot the probability distribution
plt.figure()
plt.plot(number_earthquakes,earthquake_probability,label='M5 earthquakes ($\lambda = 10^{3.266-0.797M}$)')
plt.ylabel('Probability')
plt.xlabel('Number of M5 earthquakes in 30 years')
plt.legend()
plt.show()

How does the probability change with the number of events? How does that reconcile with the rate of M5 earthquakes in 30 years?

Poisson Probability of 1 or more earthquakes¶

The Poisson probability of zero events has an interesting use in characterizing earthquake hazard.

$P(k=0)=e^{-\lambda}$

The complement of the zero event probability is the probability of 1 or more earthquakes occuring in the period of time. It is this probability that is used in earthquake forecast reports. The probability of one or more events is written as;

$P(k >= 1) = 1 - e^{-\lambda}$

Determine the probability of 1 or more M4, M5, M6 and M7 in a 30 year period.

In [34]:

Copied!





## Calculate the probability of observing at least one M4, M5, M6 and M7 earthquake in 30 years
magnitudes = [4, 5, 6, 7]
number_years = 30
for M in magnitudes:
    logN = 3.266 - 0.797 * M
    N = 10**logN
    lamb = N * number_years
    prob0 = poisson_probability(0,lamb)

    prob = 1 - prob0
    print(f"The probability of observing at least one M{M} earthquake in {number_years} years is {prob*100:.3f}%") 
## Calculate the probability of observing at least one M4, M5, M6 and M7 earthquake in 30 years
magnitudes = [4, 5, 6, 7]
number_years = 30
for M in magnitudes:
    logN = 3.266 - 0.797 * M
    N = 10**logN
    lamb = N * number_years
    prob0 = poisson_probability(0,lamb)

    prob = 1 - prob0
    print(f"The probability of observing at least one M{M} earthquake in {number_years} years is {prob*100:.3f}%") 

The probability of observing at least one M4 earthquake in 30 years is 100.000%
The probability of observing at least one M5 earthquake in 30 years is 99.675%
The probability of observing at least one M6 earthquake in 30 years is 59.923%
The probability of observing at least one M7 earthquake in 30 years is 13.578%

How do the probabilities change if a 10 year period is considered?

In [35]:

Copied!





## Calculate the probability of observing at least one M4, M5, M6 and M7 earthquake in 10 years
magnitudes = [4, 5, 6, 7]
number_years = 10
for M in magnitudes:
    logN = 3.266 - 0.797 * M
    N = 10**logN
    lamb = N * number_years
    prob0 = poisson_probability(0,lamb)
    prob = 1 - prob0
    print(f"The probability of observing at least one M{M} earthquake in {number_years} years is {prob*100:.3f}%") 
## Calculate the probability of observing at least one M4, M5, M6 and M7 earthquake in 10 years
magnitudes = [4, 5, 6, 7]
number_years = 10
for M in magnitudes:
    logN = 3.266 - 0.797 * M
    N = 10**logN
    lamb = N * number_years
    prob0 = poisson_probability(0,lamb)
    prob = 1 - prob0
    print(f"The probability of observing at least one M{M} earthquake in {number_years} years is {prob*100:.3f}%") 

The probability of observing at least one M4 earthquake in 10 years is 99.999%
The probability of observing at least one M5 earthquake in 10 years is 85.190%
The probability of observing at least one M6 earthquake in 10 years is 26.272%
The probability of observing at least one M7 earthquake in 10 years is 4.748%