Tutorial

How to import the library

beanPy can be imported by putting beanPy in your workspace and running

>>> import beanPy as Beans

For all of the guides it is implied that you have aready imported the library into your workspace

How to define a distribution

beanPy has an assortment of different distributions to use such as

  • Normal

This defines a normal distribution given a defined mean_value and variance_value

our_normal_distribution = Beans.NormalDistribution(mean_value, variance_value)
  • Exponential

This defines an exponential distribution given a defined lambda_value

our_exponential_distribution = Beans.ExponentialDistribution(lambda_value)
  • Poisson

This defines an poisson distribution given a defined lambda_value

our_poisson_distribution = Beans.PoissonDistribution(lambda_value)
  • Continuous uniform

This defines a continous uniform distribution given a defined min_value, max_value

our_continous_uniform_distribution = Beans.ContinuousUniformDistribution(min_value, max_value)
  • Discrete uniform

This defines a discrete uniform distribution given a defined min_value, max_value and an optional step_value

our_discrete_uniform_distribution = Beans.DiscreteUniformDistribution(min_value, max_value, [step_value])

Note: The default value of step_value is 1, so if omitted it will default to step_value = 1

The min_value is always included in the distribution however if you give a max_value and the step_value cannot step towards it it will reduce the max_value such that the largest number of steps are included


For example:

>>> mean_value = 0
>>> variance_value = 1
>>> our_normal_distribution = Beans.NormalDistribution(mean_value, variance_value)

This will define a standardized normal distribution, of course you could always parse the mean and variance directly into our function

How to take samples with a chosen distribution

The main use of our library is to take samples of numbers with a given distribution. To do so you must define a distribution and define number_of_samples and seed then parse

our_chosen_distribution.TakeSample(seed)

Or in order to take multiple samples at once you parse

our_chosen_distribution.TakeMultipleSamples(number_of_samples, seed)

The seed can be omitted in both of these cases


For example:

First we define the distribution, in this case using standardized normal so that you can check it against other values online.

>>> mean_value = 0
>>> variance_value = 1
>>> our_normal_distribution = Beans.NormalDistribution(mean_value, variance_value)

Now you generate a sample of numbers in preportion to the normal distribution

>>> number_of_samples = 1000
>>> seed = 0
>>> our_normal_distribution.TakeMultipleSamples(number_of_samples, seed)

This will output 1000 samples of numbers in the preportion of the normal distribution

To test if this has given us a suitable output, test the samples mean (\(\mu\)) and variance (\(\sigma^2\))

>>> output = our_normal_distribution.TakeMultipleSamples(number_of_samples, seed)
>>> sample_mean = sum(i for i in output) / number_of_samples
>>> #using the equation of variance
>>> sample_variance = sum(i ** 2 for i in output) / number_of_samples - sample_mean
>>> print(f"μ = {sample_mean}, σ² = {sample_variance}")

Giving us

\[\mu = 0.04904221840834106, \sigma^2 = 0.9342589470547096\]

This has generated a list of numbers with, approximatley, a mean 0 and a variance 1. Obviously with more samples it will be closer to the desired mean and variance since the sample size would be larger

How to find quantiles of a defined distribution

For any of the distributions \(X\) you can find their quantiles, which is to find the value of \(x\) at which the probability of something happening is less than or equal to a probability you parse into it. In other words to find \(x\) given \(p\)

\[P(X ≤ x) = p\ \text{for}\ 0 < p < 1 \]

To do this you must define a distribution and parse

probability_value = p
our_chosen_distribution.FindQuantile(probability_value)

For example:

Using the standardized normal distribution so that we can check we get the correct value of the quartile

>>> mean_value = 0
>>> variance_value = 1
>>> our_normal_distribution = Beans.NormalDistribution(mean_value, variance_value)

Now use the FindQuantile function on the our_normal_distribution

>>> our_probability = 0.95
>>> our_normal_distribution.FindQuantile(our_probability)
\[x = 1.16308715367667 \sqrt{2}\]

beanPy uses sympy to compute its functions so it is returned as much of an exact value as it can, obviously when you use this code you may want to just see the decimal value, hence we return the value as a float

>>> float(our_normal_distribution.FindQuantile(our_probability))
\[x = 1.6448536269514724\]

We can see if this is true by looking at a table of standardized normal values you can see that

table
\[P(X > 1.6449) = 0.05\ \text{and hence}\ P(X ≤ 1.6449) = 0.95\]

And hence have found the quantile of our defined distribution

How to plot a PDF of a distribution with given parameters

For all of the supported distributions of beanPy you can plot them on the graph to have a visualization of how the numbers generated are spread, the library supports being able to change the number of points to draw the graph with - in order to get the required level of accuracy with a speed. To do this you must define a distribution and parse.

number_of_points = 100
our_chosen_distribution.PlotPDF(number_of_points)

You can omit number_of_points and it will default to 50 points


For example:

Using the exponential distribution for this example

>>> lambda_value = 3
>>> our_exponential_distribution = Beans.ExponentialDistribution(lambda_value)

For any of the distributions 1000 points is a good balance of speed to accuracy of graph, so we will be using that for ours but it defaults to 50 if nothing given

>>> number_of_points = 1000
>>> our_exponential_distribution.DrawPDF(number_of_points)

And hence have our PDF curve

PDF