Introduction to Python Statistics Module
The statistics module in Python provides functions for calculating mathematical statistics of numeric data. Some of the most commonly used functions defined in this module are described here. They are straightforward to use.
To use them, import the statistics module. It covers simple calculations such as mean, median, mode, variance, and standard deviation on numeric (real-valued) data. Unless explicitly noted, these functions support int, float, Fraction, and Decimal.
Why is Statistics Used in Python?
R is a language rich in features dedicated to statistics, whereas Python is a general-purpose language that also offers statistical modules. Compared with R, Python has fewer built-in features for statistical analysis. Nonetheless, Python is an invaluable asset when statistics has to be combined with building complex analysis pipelines. The statistics module is not available in Python 2.7; it was added to the standard library in Python 3.4, so you need Python 3 to use it. For heavier work there are third-party libraries such as NumPy and SciPy, as well as proprietary packages aimed at professional statisticians such as MATLAB, SAS, and Minitab. These offer high-end graphing and scientific computation.
Modules of Python Statistics
Let us go through a few of the functions that the statistics module provides, one by one.
mean(data): Returns the arithmetic mean of a sample or population, where data is a sequence or an iterable. If data is empty, StatisticsError is raised.
fmean(data): Converts data to floats and computes the arithmetic mean. This function always returns a float; data can be a sequence or an iterable. If data is empty, StatisticsError is raised.
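As a minimal sketch of how mean() and fmean() differ in the type they return (the sample values are made up for illustration):

```python
from fractions import Fraction
from statistics import StatisticsError, fmean, mean

int_data = [1, 2, 3, 4, 4]
frac_data = [Fraction(1, 2), Fraction(3, 4)]

# mean() tries to preserve the input type where it can
int_mean = mean(int_data)       # 2.8 (a float, since 14/5 is not an integer)
frac_mean = mean(frac_data)     # Fraction(5, 8)

# fmean() converts the values to float and always returns a float
float_mean = fmean(frac_data)   # 0.625

# Both functions raise StatisticsError when the data is empty
try:
    mean([])
    empty_raised = False
except StatisticsError:
    empty_raised = True
```

Note that fmean() requires Python 3.8 or later.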
geometric_mean(data): Converts data to floats and computes the geometric mean. Like the arithmetic mean, it indicates the central tendency or typical value of the data. If data is empty, StatisticsError is raised.
Example: round(geometric_mean([36, 24, 54]), 1) returns 36.0.
median(data): Returns the median (middle value) of data. When the number of data points is even, the median is interpolated by taking the average of the two middle values. If data is empty, StatisticsError is raised.
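The odd and even cases can be sketched as follows (the input lists are illustrative only):

```python
from statistics import median

# Odd number of points: the middle value is returned as-is
odd_median = median([1, 3, 5])        # 3

# Even number of points: the two middle values (3 and 5) are averaged
even_median = median([1, 3, 5, 7])    # 4.0

# The input does not need to be sorted beforehand
unsorted_median = median([5, 1, 3])   # 3
```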
multimode(data): Returns a list of the most frequently occurring values. More than one result is returned if there are multiple modes. Unlike the functions above, it does not raise StatisticsError for empty data; an empty list is returned instead.
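A short sketch of the three cases, assuming Python 3.8 or later (the inputs are illustrative):

```python
from statistics import multimode

# One value is most common: a single-element list
single = multimode([1, 2, 2, 3])               # [2]

# Several values tie for most common: all are returned,
# in the order they were first encountered
several = multimode("aabbbbccddddeeffffgg")    # ['b', 'd', 'f']

# Empty input gives an empty list rather than an exception
empty = multimode([])                          # []
```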
Note that fmean(), geometric_mean(), and multimode() are new in Python 3.8; harmonic_mean() was added in Python 3.6.
harmonic_mean(data): Returns the harmonic mean of data. The harmonic mean, also called the subcontrary mean, is the reciprocal of the arithmetic mean of the reciprocals of the data. If data is empty, StatisticsError is raised.
Example: harmonic_mean([40, 60]) returns 48.0.
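To make the definition concrete, here is a minimal sketch that computes the harmonic mean by hand and compares it with the library function. The variable names are illustrative only.

```python
from math import isclose
from statistics import harmonic_mean, mean

speeds = [40, 60]

# Reciprocal of the arithmetic mean of the reciprocals
by_hand = 1 / mean(1 / s for s in speeds)

# The library function gives the same result
built_in = harmonic_mean(speeds)

# Compare with a tolerance, since the by-hand version uses float division
same = isclose(by_hand, built_in)
```

A comparison with a tolerance is used here because the two routes may round differently in the last digit.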
stdev(data): Returns the sample standard deviation (the square root of the sample variance).
variance(data): Returns the sample variance of data, an iterable of at least two real-valued numbers. A large variance indicates that the data is spread out; a small variance indicates that it is clustered closely around the mean.
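The relationship between the two can be checked with a small sketch (the data is made up for illustration):

```python
from math import isclose, sqrt
from statistics import StatisticsError, stdev, variance

data = [1, 2, 3, 4, 5]

# Sum of squared deviations from the mean (3) is 10; divided by n - 1 = 4
sample_var = variance(data)   # 2.5
sample_sd = stdev(data)

# The standard deviation is the square root of the variance
matches = isclose(sample_sd, sqrt(sample_var))

# Fewer than two data points raises StatisticsError
try:
    variance([1])
    too_few_raised = False
except StatisticsError:
    too_few_raised = True
```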
Averages and Measures of Central Location
The averages and measures of central location functions calculate an average or typical value from a population or sample. Measures of spread, in contrast, calculate how much the population or sample tends to deviate from the typical or average values.
- mean function: Average or mean of data.
- fmean function: Fast, floating-point arithmetic mean.
- geometric_mean function: Geometric mean of data.
- harmonic_mean function: Harmonic mean of data.
- median function: Middle value or median of data.
- median_low function: Low median of data.
- median_high function: High median of data.
- median_grouped function: 50th percentile of grouped data.
- mode function: The most common value of nominal or discrete data. (single mode)
- multimode function: The most common values of nominal or discrete data. (list of modes)
- quantiles function: Data divided into intervals with equal probability.
- stdev function: Standard deviation of sample data.
- pstdev function: Standard deviation of population data.
- pvariance function: Variance of population data.
- variance function: Variance of sample data.
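A few of the less familiar entries in this list can be sketched as follows, assuming Python 3.8 or later for quantiles(). The data is illustrative.

```python
from statistics import median_high, median_low, pvariance, quantiles, variance

data = [1, 3, 5, 7]

# With an even number of points, median_low and median_high return
# the lower and upper of the two middle values instead of averaging them
low = median_low(data)     # 3
high = median_high(data)   # 5

# Population variance divides by n, sample variance by n - 1,
# so the sample variance is the larger of the two
pop_var = pvariance(data)  # 20 / 4 = 5
samp_var = variance(data)  # 20 / 3

# n=4 asks for quartiles: three cut points dividing the data into four groups
quartiles = quantiles(data, n=4)
```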
Example of Python Statistics Module
Some of the examples are given below:
import statistics

list_example = [6, 7, 2, 6, 3, 5, 5, 5, 2, 5, 6, 1, 2]
a = statistics.mean(list_example)
print(a)
b = statistics.median(list_example)
print(b)
c = statistics.mode(list_example)
print(c)
d = statistics.stdev(list_example)
print(d)
e = statistics.variance(list_example)
print(e)
- The first line of output displays the mean. That is the average of the given list.
- The second line of output displays the median of the given list.
- The third line of output displays the mode of the given list. That is the value that occurs most frequently in the given list.
- The fourth line of output displays the standard deviation of the given list. That is the amount of variation or dispersion among the set of values in the given list.
- The fifth line of output displays the variance. That is, how far the values are spread out from the average value.
Exceptions and Errors in Python Statistics Module
Exceptions and errors in the Python statistics module are given below:
StatisticsError is a subclass of ValueError that is raised for statistics-related exceptions. For example, NormalDist raises StatisticsError if sigma is negative. Behaviour with a mixed collection of types is not well defined, so when your input consists of mixed types you can use the map function to ensure a consistent result.
For example, map(float, data_input).
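The points above can be sketched as follows. The helper safe_mean is hypothetical and exists only for this illustration; it is not part of the statistics module.

```python
from decimal import Decimal
from statistics import NormalDist, StatisticsError, mean

# StatisticsError is a ValueError, so either name can be caught
is_value_error = issubclass(StatisticsError, ValueError)

def safe_mean(values):
    """Hypothetical helper: coerce values to float, return None if empty."""
    try:
        return mean(map(float, values))
    except StatisticsError:
        return None

# Mixed int, float, and Decimal input is converted to float first
mixed_mean = safe_mean([1, 2.5, Decimal("3.5")])   # 7.0 / 3
empty_mean = safe_mean([])                         # None

# A negative sigma is rejected when constructing a NormalDist
try:
    NormalDist(mu=0, sigma=-1)
    sigma_raised = False
except StatisticsError:
    sigma_raised = True
```

Returning None for empty input is just one possible design choice; re-raising or supplying a default value may suit other callers better.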
# For mean:
import statistics

data = [1, 5, 6, 7, 4, 5, 6]
a = statistics.mean(data)
print("Mean of the given list : ", a)

# For median:
import statistics

data = [-8, -5, -4, -9, 0, 2, 5, 6, 1]
a = statistics.median(data)
print("Median of the given list is : ", a)
This is a guide to the Python statistics module. Here we discussed the basic concept, averages and measures of central location, and the functions the module provides.