Create a cumulative histogram in Matplotlib
Last Updated: 29 Aug, 2022
A histogram is a graphical representation of the distribution of data. Any kind of numeric data can be shown as a histogram. In this article, we will see how to create a cumulative histogram in Matplotlib.
Cumulative frequency: The cumulative frequency of a value is its own frequency plus the frequencies of all values that come before it in the frequency distribution. Cumulative frequency analysis studies how often values fall below a given reference value.
Example:
If the frequencies x are [1, 2, 3, 4, 5], then the cumulative frequencies for x are [1, 3, 6, 10, 15].
Explanation:
[1, 1+2, 1+2+3, 1+2+3+4, 1+2+3+4+5]
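The running total above can be reproduced with NumPy's cumsum(). This is only a small sketch of the arithmetic; the variable names freq and cum_freq are chosen here for illustration.

```python
import numpy as np

# frequencies from the example above
freq = np.array([1, 2, 3, 4, 5])

# each entry is the sum of all frequencies up to and including that position
cum_freq = np.cumsum(freq)

print(cum_freq.tolist())  # [1, 3, 6, 10, 15]
```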
In Python, we can draw an ordinary histogram with Matplotlib's hist() (or with pandas' DataFrame.hist()), and compute the cumulative frequencies for a cumulative histogram with scipy.stats.cumfreq().
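Before plotting, it may help to look at what stats.cumfreq() returns. The sketch below, using a small hand-picked list (the name values is illustrative), prints the fields of the result: cumcount, lowerlimit, binsize and extrapoints. The bin limits are left at SciPy's defaults here, which is an assumption of this sketch.

```python
from scipy import stats

# a small list of values, binned into 4 equal-width bins
values = [10, 40, 20, 10, 30, 10, 56, 45]
res = stats.cumfreq(values, numbins=4)

# running total of the counts, one entry per bin
print("cumulative counts:", res.cumcount)
# lower edge of the first bin
print("lower limit:", res.lowerlimit)
# width of each bin
print("bin size:", res.binsize)
# number of points that fell outside the bin limits
print("extra points:", res.extrapoints)

# the last cumulative count equals the number of values
total = int(res.cumcount[-1])
print("total:", total)
```

Because the counts are accumulated bin by bin, the last entry of cumcount always equals the number of values that fall inside the limits.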
Example 1:
# importing pyplot for getting graph
import matplotlib.pyplot as plt
# importing numpy for getting array
import numpy as np
# importing scientific python
from scipy import stats
# random generator with a fixed seed, so the figure is reproducible
rng = np.random.RandomState(seed=12345)
# drawing 1000 samples from a standard normal distribution
samples = stats.norm.rvs(size=1000,
                         random_state=rng)
# cumulative frequency of the samples over 25 bins
res = stats.cumfreq(samples,
                    numbins=25)
# x positions of the bars, starting at the lower limit of the first bin
x = res.lowerlimit + np.linspace(0, res.binsize*res.cumcount.size,
                                 res.cumcount.size)
# specifying figure size
fig = plt.figure(figsize=(10, 4))
# adding sub plots
ax1 = fig.add_subplot(1, 2, 1)
# adding sub plots
ax2 = fig.add_subplot(1, 2, 2)
# getting histogram using hist function
ax1.hist(samples, bins=25,
         color="green")
# setting up the title
ax1.set_title('Histogram')
# cumulative graph: one bar per bin, each as wide as a bin
ax2.bar(x, res.cumcount, width=res.binsize, color="blue")
# setting up the title
ax2.set_title('Cumulative histogram')
ax2.set_xlim([x.min(), x.max()])
# display the figure(histogram)
plt.show()
Output:
Example 2:
# importing pyplot for getting graph
import matplotlib.pyplot as plt
# importing numpy for getting array
import numpy as np
# importing scientific python
from scipy import stats
# random generator with a fixed seed, so the figure is reproducible
rng = np.random.RandomState(seed=12345)
# drawing 1000 samples from a standard normal distribution
samples = stats.norm.rvs(size=1000,
                         random_state=rng)
# cumulative frequency of the samples over 25 bins
res = stats.cumfreq(samples,
                    numbins=25)
# x positions of the bars, starting at the lower limit of the first bin
x = res.lowerlimit + np.linspace(0, res.binsize*res.cumcount.size,
                                 res.cumcount.size)
fig = plt.figure(figsize=(10, 4))
ax1 = fig.add_subplot(1, 2, 1)
ax2 = fig.add_subplot(1, 2, 2)
ax1.hist(samples, bins=25, color="green")
ax1.set_title('Histogram')
# plot the cumulative counts (not x against itself), one bar per bin
ax2.bar(x, res.cumcount, width=res.binsize, color="blue")
ax2.set_title('Cumulative histogram')
ax2.set_xlim([x.min(), x.max()])
plt.show()
Output:
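As an alternative to computing the counts with SciPy first, Matplotlib's hist() accepts a cumulative=True argument and accumulates the bin counts itself. The following is a minimal sketch under the same assumptions as the examples above (1000 normally distributed samples, 25 bins); the names counts and edges are chosen here for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# same kind of data as in the examples above
rng = np.random.RandomState(seed=12345)
samples = rng.normal(size=1000)

fig, ax = plt.subplots(figsize=(5, 4))

# cumulative=True makes each bar show the count of its own bin
# plus the counts of all bins to its left
counts, edges, _ = ax.hist(samples, bins=25,
                           cumulative=True, color="blue")
ax.set_title('Cumulative histogram')

# the last bar reaches the total number of samples
print(counts[-1])

# plt.show()  # uncomment to display the figure interactively
```

This keeps the binning and the plotting in a single call, while stats.cumfreq() is convenient when the cumulative counts themselves are needed for further calculation.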