Sunday, April 3, 2022

Distributions

 

Patterns

And the light from a street lamp
Paints a pattern on my wall,
Like the pieces of a puzzle
Or a child's uneven scrawl

Finding the pattern,  (theory) can help you solve a puzzle.

Ruh-Roh.  Be warned.  This blog posting is especially nerdy.

Statistics is a descriptive science. It is NOT a predictive science. While statistics can help develop probabilistic models, Probability and Statistics alone can not develop models that predict with a  certainty. (see Pynchon’s  Gravity’s Rainbow or Stoppard’s Rosencrantz and Guildenstern Are Dead.)  It is possible to observe relationships that can assist in developing a theory, which is then used in predictive models, but first there has to be a theory. Observations can support a theory. But observations alone do not make a theory.

A normal statistical distribution has to allow the possibility of negative values. If the mean is significantly greater than zero, negative values are improbable BUT they are not impossible. For example, a normal distribution might have a mean of 100, a variance of 10, (the phrase “flatten the curve” that become popular during the COVID-19 pandemic would be translated to “increase the variance” by a statistician), a skew of 0, and a kurtosis of 0. If the kurtosis is 0, then the observed data has no outliers. If the skew is 0, then the function is uniform, and the mean is equal to the median which is equal to the mode. At a value of negative 100, this uniform normal probability function still says that -100 could be observed 2.2529*10-262  % of the time. This is NOT zero. but it is a vanishingly small number and is often treated as if it WERE zero. 

There is a difference between a negative which is reported and a real negative. When you say it is  ‑10 degrees Fahrenheit outside, it does not mean that there is negative energy. Just that a value of zero on the Fahrenheit scale,  0 °F, was established as the freezing temperature of a solution of brine made from a mixture of water, ice, and ammonium chloride. The other limit established was a best estimate of the average human body temperature, which at the time the Fahrenheit scale was established was 96 °F. The Centigrade, or Celsius,  scale of temperatures defines values of 0 °C as the freezing point of water and of 100 °C  as the boiling point of water. At -40 degrees, the Fahrenheit and Celsius system are the same. However both systems are merely a translation of temperature from a temperature of absolute zero.

If negative numbers are not allowed, statistics suggests the use of an exponential distribution. While an exponential  distribution does not allow negative numbers, it is a smooth continuous function. That is a probability function of 1.5 will have a value, even if only integer values should be observed. It may be convenient to state the mean family size of as 3.14, and the median family size as 3.12, but families don’t come that way. Each family member will only be an integer value. The mode is most probably 3, and thus family size is  skewed, even if it is a very small skew.

A uniform normal distribution requires that the mean, median and mode are equal. In an exponential distribution the mode  is zero; the demand is equal to 1/λ, where λ is the coefficient of the exponential in the probability function of the distribution, λe-λx; and the median is ln 2/λ , the mode is 0, and the skew is 2. A distribution that combines the features of a normal distribution with the features of an exponential distribution is an exponentially modified Gaussian (normal) distribution. Then the mode does not have to be equal to zero, and the median can be adjusted to be close to the mean without being the equal to the mean.

 

 

 

No comments:

Post a Comment