Ain’t We Got Fun
There's nothing surer
The rich get rich and the poor get poorer
In the meantime, in between time
Don't we have fun?
But is it normal for the rich to get richer?
It is proposed that the Cumulative Distribution Function (CDF) of an
exponential distribution with rate parameter λ, which is 1-e^(-λx), can be
approximated by a coordinate translation of the random normal logistic
distribution, also known as the hyperbolic secant squared distribution, whose
CDF is ½*tanh((x-µ)/(2*s))+½. The translation moves the origin from (0, 0) to
(µ, 0.5), and the logistic CDF is also scaled by 2. This means that the scale
parameter, s, of the logistic distribution can be approximated by
1/(2*λ*ln(2)). While the exponential distribution is traditionally only
defined for x>0, it can be translated to begin at any location, µ,
if the exponential distribution is also defined for x>µ>0.
Because the logistic function is already defined for all values of x, the
exponential distribution, whose CDF is also known as the exponential
association, can also be defined for all values of x, including x<µ, if the
logistic parameter s is expressed as a function of λ. This means that there is
no need for a combination of the exponential distribution and a random normal
function, either as an Exponentially Modified Gaussian distribution as
proposed by Grushka [1], or as an Exponentially Modified Logistic distribution
as proposed by Reyes et al. [2].
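The proposal can be explored numerically. The following is a minimal Python sketch, not the original analysis: it evaluates the exponential CDF next to the logistic CDF re-expressed about the origin (µ, 0.5) and scaled by 2, with s = 1/(2*λ*ln(2)). The function names, the choice λ = 1/ln(2), and µ = 0 are assumptions made only for illustration.

```python
import math


def exponential_cdf(x, lam, mu=0.0):
    """Exponential CDF 1 - exp(-lam*(x - mu)), shifted to start at mu."""
    return 1.0 - math.exp(-lam * (x - mu))


def logistic_cdf(x, mu, s):
    """Logistic CDF in tanh form: 0.5*tanh((x - mu)/(2*s)) + 0.5."""
    return 0.5 * math.tanh((x - mu) / (2.0 * s)) + 0.5


def scaled_shifted_logistic(x, mu, s):
    """Logistic CDF measured from the origin (mu, 0.5) and scaled by 2."""
    return 2.0 * (logistic_cdf(x, mu, s) - 0.5)


def scale_from_rate(lam):
    """Scale parameter s = 1/(2*lam*ln(2)), the relation proposed in the text."""
    return 1.0 / (2.0 * lam * math.log(2.0))


if __name__ == "__main__":
    lam = 1.0 / math.log(2.0)  # illustrative rate, an assumption
    mu = 0.0                   # illustrative location, an assumption
    s = scale_from_rate(lam)
    for x in (0.0, 0.5, 1.0, 2.0, 3.0):
        e = exponential_cdf(x, lam, mu)
        g = scaled_shifted_logistic(x, mu, s)
        print(f"x={x:.1f}  exponential={e:.4f}  scaled logistic={g:.4f}")
```

Printing the two columns side by side gives a quick feel for how closely the curves track each other for x>µ before any formal fit is attempted.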
The figure below shows the CDF of a logistic distribution
(blue), which does not look like the CDF of the exponential distribution (red).
Also shown as a dashed red curve is what the CDF for the exponential distribution
would be for x<0. Doubling the logistic function and shifting the origin along
the y-axis from (0, 0) to (0, 0.5) does look like
the exponential distribution for x>0 (green).
As shown below, if the curves are shifted on the x-axis
to both cross at µ, by shifting the exponential distribution from an origin of (0, 0)
to an origin of (µ, 0), then the two curves look more similar for x>µ.
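The doubling and the two shifts can be written out explicitly. As a worked sketch using only the two CDFs already given, with F_L denoting the logistic CDF and F_E the exponential CDF shifted to µ (both labels are introduced here for illustration only):

```latex
\begin{align*}
F_L(x) &= \tfrac{1}{2}\tanh\!\left(\frac{x-\mu}{2s}\right) + \tfrac{1}{2} \\
2\left(F_L(x) - \tfrac{1}{2}\right) &= \tanh\!\left(\frac{x-\mu}{2s}\right) \\
F_E(x) &= 1 - e^{-\lambda (x-\mu)}, \qquad x > \mu
\end{align*}
```

Both transformed curves equal 0 at x = µ and approach 1 as x grows, which is why they can be compared directly for x>µ.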
By setting the two curves equal at a common location, µ,
it is possible to solve for s, the scale parameter of the logistic distribution,
in terms of λ, the rate parameter of the exponential distribution. This
function is s=1/(2*ln(2)*λ). If the variance is equal to 1.0,
then the relationship between s and the variance, σ² = s²π²/3,
can be used to compute that s=0.55. At that value of s, the correlation
between the two curves from µ to µ+3σ is almost perfect at 0.9967. However,
if the difference between the scaled logistic distribution above the median
and the exponential distribution is minimized, the values
become λ=1/ln(2)=1.44 and s=0.5. The variance thus becomes 0.822,
and the correlation between the exponential curve and the scaled and
shifted logistic curve increases to 0.9982.
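The arithmetic in this paragraph can be checked with a short script. The Python sketch below inverts the relation variance = s²π²/3 for s, evaluates the variance and standard deviation at s = 0.5, and computes a Pearson correlation between the exponential curve and the scaled, shifted logistic curve from µ to µ+3σ. The evenly spaced 301-point grid, µ = 0, and all function names are assumptions; the text does not say how the curves were sampled, so the printed correlation need not match the reported values exactly.

```python
import math


def logistic_scale_from_variance(variance):
    """Invert variance = s^2 * pi^2 / 3 to get the logistic scale s."""
    return math.sqrt(3.0 * variance) / math.pi


def logistic_variance(s):
    """Variance of a logistic distribution with scale s."""
    return (s * math.pi) ** 2 / 3.0


def rate_from_scale(s):
    """Invert s = 1/(2*ln(2)*lam) to get the exponential rate lam."""
    return 1.0 / (2.0 * math.log(2.0) * s)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


def curve_correlation(lam, s, points=301):
    """Correlate both curves on an even grid from mu to mu + 3*sigma (mu = 0)."""
    sigma = math.sqrt(logistic_variance(s))
    grid = [3.0 * sigma * i / (points - 1) for i in range(points)]
    exponential = [1.0 - math.exp(-lam * x) for x in grid]
    logistic = [math.tanh(x / (2.0 * s)) for x in grid]  # equals 2*CDF - 1
    return pearson(exponential, logistic)


if __name__ == "__main__":
    print(f"s for variance 1.0:   {logistic_scale_from_variance(1.0):.4f}")
    print(f"variance for s=0.5:   {logistic_variance(0.5):.4f}")
    print(f"std. dev. for s=0.5:  {math.sqrt(logistic_variance(0.5)):.4f}")
    lam = rate_from_scale(0.5)
    print(f"lambda for s=0.5:     {lam:.4f}")
    print(f"correlation (s=0.5):  {curve_correlation(lam, 0.5):.4f}")
```

The rate and scale are passed to `curve_correlation` separately so that other (λ, s) pairs, including ones that do not satisfy s=1/(2*ln(2)*λ), can be tried.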
It is thus proposed that there is no need to develop a new
distribution combining the exponential and a random normal distribution. The exponential
distribution with a constraint of x>µ appears to be merely the upper half of a
normal logistic distribution, the half beginning at the median. It is also
suggested that the lowest variance for a normal distribution should be 0.822,
the lowest standard deviation should be 0.9069, s should be 0.5, and that
the rate parameter of the exponential distribution is related to the difference
between the mean and median of any distribution.
Thus if the mean household income in 2021 is $66,018 and the
median household income is $58,153 according to the U.S. Census, and income
follows an exponential distribution, the curve would be as shown below, which
also shows the reported mean household income by the mid‑point of a
decile, as well as the reported mean income limit of the highest 5%. This suggests
that only when zero represents an absolute value, e.g. as the vector distance
from an object, or an empty condition, where the mean and the median of the distribution
are the same, will this be a true exponential distribution. It will be skewed by
definition and is not normal. However, if the median and the mean are appreciably
different, then the distribution may only appear to follow an exponential distribution,
but the distribution is in fact normal and its appearance as a skewed exponential
distribution is because only the portion above the median is being used. Or as Garrison
Keillor ironically puts it in his tales from Lake Wobegon, “All the children are
above average.”
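One way to make the link between the mean, the median, and the rate parameter concrete is sketched below in Python. It is an assumption, not a relation stated in the text: it treats income as a standard shifted exponential with location µ, for which mean = µ + 1/λ and median = µ + ln(2)/λ, so that mean - median = (1 - ln(2))/λ. The function names are hypothetical, and the two income figures are the ones quoted above.

```python
import math


def shifted_exponential_from_mean_median(mean, median):
    """Return (lam, mu) for a shifted exponential with the given mean and median.

    Assumes mean = mu + 1/lam and median = mu + ln(2)/lam, so that
    mean - median = (1 - ln(2)) / lam.
    """
    if mean <= median:
        raise ValueError("a shifted exponential requires mean > median")
    lam = (1.0 - math.log(2.0)) / (mean - median)
    mu = mean - 1.0 / lam
    return lam, mu


def shifted_exponential_cdf(x, lam, mu):
    """CDF of the shifted exponential; zero at or below the location mu."""
    return 0.0 if x <= mu else 1.0 - math.exp(-lam * (x - mu))


if __name__ == "__main__":
    mean_income, median_income = 66_018.0, 58_153.0  # figures quoted in the text
    lam, mu = shifted_exponential_from_mean_median(mean_income, median_income)
    print(f"rate lam = {lam:.3e} per dollar, location mu = {mu:,.0f} dollars")
    print(f"CDF at the median: {shifted_exponential_cdf(median_income, lam, mu):.3f}")
```

Under this assumption the fitted curve passes through 0.5 at the median by construction. A different relation between λ and the mean-median gap would give a different curve.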
The chart above has been adjusted for inflation, i.e. all
incomes are in 2021 US Dollars. Both the 1968 and
the 2021 distributions have the same total income for society but only vary in
how it is distributed to individual households.
It suggests that the income distribution in 1968 was less skewed, and
that if it was viewed as a normal distribution for all incomes, including
subsidies and transfers, i.e. negative incomes, the lower income range would be
between $0 and $100,645 instead of the current range of $0 to $163,547, and the income
required to be wealthy would be $301,934 instead of $490,642. The 1968 distribution was less normal, with a lower coefficient of determination, r², to the random distribution, but was more
equitable, with a lower variance. The 2021 distribution was more normal but less equitable. The challenge is to distribute incomes in a manner that is both normal and equitable.
[1] Grushka, E. (1972). Characteristics of Exponentially Modified Gaussian Peaks in Chromatography. Analytical Chemistry, Vol. 44, pp. 1733-1738.
[2] Reyes, J., Venegas, O., & Gómez, H. W. (2018). Exponentially-Modified Logistic Distribution with Application to Mining and Nutrition Data. Applied Mathematics & Information Sciences, Vol. 12, No. 6, pp. 1109-1116.