Appendix N — Overview of Project

For the project deliverable in this class, you will create a web app using Streamlit.

The purpose of the app will be to provide a tool for fitting probability distributions to statistical data. The interface should include:

Figure N.1: Distributions are everywhere

N.1 Using scipy.stats Objects

The scipy.stats module contains many distributions, each of which can be used either to generate data that follows the distribution or to fit the distribution to given data. In this project we’ll use the latter approach. It is straightforward, as demonstrated below.

First let’s generate some data using a gamma distribution with a shape parameter of 5, a location of 1, and a scale of 1.

import numpy as np
from scipy.stats import gamma
import matplotlib.pyplot as plt

# Shape a=5, loc=1, scale=1; no seed is set, so the values change on every run
data = gamma.rvs(5, loc=1, scale=1, size=1000)
fig, ax = plt.subplots(figsize=[5, 5])
ax.plot(data, 'k.')
ax.set_xlabel('Measurement Number')
ax.set_ylabel('Value')
ax.set_ylim([0, 25])
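
To make the three parameters concrete, the short sketch below freezes the same gamma distribution and queries a few of its properties. It assumes the usual scipy.stats convention that the first argument is the shape, followed by loc and scale; the variable names are only illustrative.

```python
from scipy.stats import gamma

# Freeze a gamma distribution with shape a=5, loc=1, scale=1
true_dist = gamma(5, loc=1, scale=1)

# For a gamma distribution: mean = loc + a * scale, variance = a * scale**2
mean = true_dist.mean()
var = true_dist.var()

# loc shifts the distribution, so no value can fall below it
lower, upper = true_dist.support()

print(mean, var, lower, upper)
```

The mean of 6 and the lower bound of 1 are consistent with the scatter of points in the plot above, which sits above 1 and clusters around 6.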

Now we can fit the gamma distribution to the data. Because the sample is finite and random, the fit should return parameters that are close to, but not exactly the same as, the ones we used to generate the data.

params = gamma.fit(data)
print(params)
(np.float64(4.74709076626095), np.float64(1.132154490735322), np.float64(1.0403836667561777))
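
If one of the parameters is already known, it can be held fixed during the fit. The sketch below is one possible way to do this, assuming the location is known to be 1: the floc keyword of fit pins loc, so only the shape and scale are estimated. The seed and the name sample are arbitrary choices made here for reproducibility.

```python
import numpy as np
from scipy.stats import gamma

# Seeded generator so the example gives the same sample on every run
rng = np.random.default_rng(0)
sample = gamma.rvs(5, loc=1, scale=1, size=1000, random_state=rng)

# Fit all three parameters (shape, loc, scale)
free_fit = gamma.fit(sample)

# Fit only shape and scale, holding loc fixed at its known value
fixed_fit = gamma.fit(sample, floc=1)

print("all free: ", free_fit)
print("loc fixed:", fixed_fit)
```

Fixing a known parameter usually gives steadier estimates of the others, because the shape and location of a gamma distribution can partly compensate for each other when both are free.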

Now let’s evaluate the probability density function of the gamma distribution with the fitted parameters and overlay it on a histogram of the original data.

dist = gamma(*params)
x = np.linspace(0, 25, 100)
fit = dist.pdf(x)

fig, ax = plt.subplots(figsize=[5, 5])
ax.plot(x, fit)
ax.hist(data, bins=25, density=True);
# In a notebook, the trailing semicolon suppresses the printed return value of ax.hist
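
Since scipy.stats distributions share the same fit and cdf methods, the same steps can be repeated for several candidate distributions and the fits compared numerically as well as visually. The following is a minimal sketch, not a prescribed method for the project: the list of candidates is arbitrary, and the Kolmogorov-Smirnov statistic from scipy.stats.kstest is used here only as one convenient measure of how far each fitted CDF lies from the data.

```python
import numpy as np
from scipy import stats

# Seeded sample from the same gamma distribution used above
rng = np.random.default_rng(42)
data = stats.gamma.rvs(5, loc=1, scale=1, size=1000, random_state=rng)

# Candidate distributions, looked up by name in scipy.stats (arbitrary choice)
candidates = ["gamma", "norm", "lognorm"]

results = {}
for name in candidates:
    family = getattr(stats, name)
    params = family.fit(data)            # maximum likelihood fit
    fitted = family(*params)             # frozen distribution with fitted parameters
    ks = stats.kstest(data, fitted.cdf)  # distance between empirical and fitted CDF
    results[name] = (params, ks.statistic)

# Smaller KS statistic means the fitted CDF tracks the data more closely
for name, (params, stat) in sorted(results.items(), key=lambda kv: kv[1][1]):
    print(f"{name:8s} KS statistic = {stat:.4f}")
```

Because the parameters are estimated from the same data that is being tested, the p-value reported by kstest is optimistic, so the statistic is better treated as a relative ranking than as a formal test.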