Monday, March 13, 2023

Hyperbolic Statistics

 

Natural Gift

You don't have to be a genius to find
All the hidden potential deep in your mind
You don't have to know about nuclear physics
Know all the formulas and vital statistics
You don't have to be an intellectual, you don't have to be a scientist
To use your natural gifts, you got natural gifts, yeh
Use your natural gifts, you got natural gifts

Coming soon to a math journal near you.  Hyperbolic statistics!

The first moment of observations about the origin is:

$m_1 = \frac{1}{n} \sum_{i=1}^{k} f_i x_i$

where i indexes the grouped observations, $f_i$ is the frequency of the ith group (e.g. if there are 3 observations of 2, then 3 is the frequency of the group of observations equal to 2), k is the number of groups, and n is the number of observations. This is also the formula for the arithmetic, or computed, mean, $\bar{x}$, often called the average. The average is more properly defined as the centrality of the normal. The median is the centrality, but in a normal distribution the mean is also equal to the median. Therefore saying the average is the mean is only true when the observations are also normally distributed.

The second moment about the computed mean, often called the variance, is:

$m_2 = \frac{1}{n} \sum_{i=1}^{k} f_i (x_i - \bar{x})^2$

The square of the Standard Deviation is also often defined as the variance. In flat Euclidean space, the Standard Deviation, S.D., is: 

$\text{Euclidean S.D.} = \sqrt{\frac{\sum_{i=1}^{k} f_i (x_i - \bar{x})^2}{n-1}}$
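The grouped formulas above can be sketched in a few lines of Python. This is only an illustration: the function name `grouped_moments` and the sample data are assumptions, not anything taken from a library or from the figures in this post.

```python
from math import sqrt

def grouped_moments(values, freqs):
    """Return (first moment, second moment about the mean, Euclidean S.D.).

    values: the distinct observed values x_i (hypothetical input)
    freqs:  the frequency f_i of each value
    """
    n = sum(freqs)  # total number of observations
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    sum_sq = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs))
    m2 = sum_sq / n              # second moment about the computed mean
    sd = sqrt(sum_sq / (n - 1))  # S.D. with the n-1 (Bessel) divisor
    return mean, m2, sd

# Illustrative data: one observation of 1, three observations of 2, one of 3.
values = [1, 2, 3]
freqs = [1, 3, 1]
mean, m2, sd = grouped_moments(values, freqs)
print(mean, m2, sd)
```

Note that the second moment divides by n while the S.D. divides by n-1, so the square of this S.D. is not exactly the second moment for small n.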

However, this is only true in Euclidean space, since it relies on Pythagoras’ Theorem for a hypotenuse. On a non-Euclidean surface, this is not the correct formula for the sum of squares. For example, the shortest distance between two points, a and b, on the spherical surface of the Earth is the Great Circle Distance. According to Pythagoras’ theorem for a spherical surface, where R is the radius of the surface (e.g. of the Earth), this is $R \cos^{-1}(\cos(a/R) \cos(b/R))$. When the distance between points a and b is very small compared to the radius of the surface, this is virtually indistinguishable from the traditional Pythagorean theorem.
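A quick numerical check of that last claim, as a sketch only. It assumes a and b are read as the two legs of a right triangle on the sphere, and it uses an approximate Earth radius of 6371 km; the function name is hypothetical.

```python
from math import acos, cos, hypot

EARTH_RADIUS_KM = 6371.0  # assumed approximate mean radius

def spherical_hypotenuse(a, b, radius):
    """Spherical Pythagoras: hypotenuse for legs a and b on a sphere."""
    return radius * acos(cos(a / radius) * cos(b / radius))

# Legs that are tiny compared to the radius: close to the flat answer.
small_sphere = spherical_hypotenuse(3.0, 4.0, EARTH_RADIUS_KM)
small_flat = hypot(3.0, 4.0)

# Legs that are a large fraction of the radius: the two answers separate.
large_sphere = spherical_hypotenuse(3000.0, 4000.0, EARTH_RADIUS_KM)
large_flat = hypot(3000.0, 4000.0)

print(small_sphere, small_flat)
print(large_sphere, large_flat)
```

For the 3 km and 4 km legs the two results agree to well within a metre, while for the 3000 km and 4000 km legs the spherical hypotenuse is noticeably shorter than the flat 5000 km.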

If the surface is hyperbolic, not spherical or flat, then the shortest distance between two points is $\cosh^{-1}(\cosh(a) \cosh(b))$. If a is defined as the summation of the deviations about the mean, and b is defined as 0, then the Hyperbolic Standard Deviation is

$\left( \cosh^{-1}\left( \cosh\left( \frac{1}{n} \sum_{i=1}^{k} f_i (x_i - \bar{x}) \right) \right) \right)^{1/2}$

while the Euclidean, flat, S.D is as defined before.

When the number of observations, n, is very large and the sum in the second moment is not zero, there is virtually no difference between the hyperbolic S.D. and the flat, Euclidean, traditional S.D. This does not mean that the difference is not real, just that in many applications there is no observable difference between the two. Even when the sum of the squares of the differences between the observations and the computed mean is virtually indistinguishable from zero, the traditional and hyperbolic Standard Deviations are virtually indistinguishable for large n, as shown in the figure below.


With a roll of a traditional six-sided die, there are six possible outcomes, 1 through 6, which if the die is not loaded should follow a normal distribution. The mean outcome is 3.5. The median outcome is 3.5. The Euclidean Standard Deviation is 1.7. According to the 68/95/99.7 rule for normal distributions, 99.7% of the outcomes should fall between the mean minus 3 SD and the mean plus 3 SD. According to the traditional SD, this requires that 99.7% of the die roll outcomes should fall between -1.6 and 8.6. While this is true, a more useful metric might say that 100% of the outcomes fall between 1 and 6. This requires the variance, σ², to be .694 and the square root of the variance, σ, to be .833. Then 100% of the observations fall between the mean ± 3σ. The Hyperbolic Standard Deviation of a six-sided die roll is .34. According to the Hyperbolic SD, this requires that 99.7% of the die roll outcomes should fall between 2.48 and 4.52, which also implies an incorrect variance.
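The die arithmetic can be reproduced with Python's standard `statistics` module. This is a sketch under two assumptions: the six faces are treated as the whole population (so `pstdev`, which divides by n, is used), and the range-based σ is taken as the distance from the mean to the extreme face divided by 3. The variable names are illustrative.

```python
from statistics import mean, median, pstdev

faces = [1, 2, 3, 4, 5, 6]

mu = mean(faces)     # mean outcome
med = median(faces)  # median outcome
sd = pstdev(faces)   # population S.D. of the six faces (assumed divisor n)

# Interval covered by the mean plus or minus three traditional S.D.
three_sd_range = (mu - 3 * sd, mu + 3 * sd)

# Range-based alternative: choose sigma so that mean +/- 3 sigma spans 1..6.
sigma_from_range = (max(faces) - mu) / 3
variance_from_range = sigma_from_range ** 2
range_interval = (mu - 3 * sigma_from_range, mu + 3 * sigma_from_range)

print(mu, med, round(sd, 3))
print(three_sd_range)
print(round(sigma_from_range, 3), round(variance_from_range, 3), range_interval)
```

The traditional interval extends past both ends of the possible outcomes, while the range-based interval runs from the lowest face to the highest.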

The reason that the square of a Standard Deviation might not be σ², the true variance, is because of error. The true mean is not necessarily the computed mean because the computed mean can contain error.

$\bar{x} = \mu + \bar{\varepsilon}$

$\bar{x}$ = the computed mean, $\frac{1}{n} \sum_{i=1}^{k} f_i x_i$;

$\mu$ = the true mean;

$\bar{\varepsilon}$ = the mean error, $\frac{1}{n} \sum_{i=1}^{k} \varepsilon_i$.

The moments about the computed mean will only have non-zero values if the computed mean is NOT equal to the true mean. This is because $\frac{1}{n} \sum_{i=1}^{k} f_i (x_i - \bar{x})^r$ is equal to zero for every moment r when there is no error. If there is no error, then the square root of the second moment about the mean should not be solved using Euclidean mathematics. The computed Euclidean Standard Deviation adds the Bessel adjustment, n-1, so that the square of the Euclidean Standard Deviation is closer to the true variance. The Bessel adjustment is only necessary if the Standard Deviation is computed using Euclidean geometry. It should be computed using non-Euclidean, hyperbolic geometry. If the computed mean is the true mean, then the variance should be other than the square of the Standard Deviation.

It is suggested that the Standard Deviation for a normal distribution where the mean error is zero appears to be 0. This is also not the variance, σ², but it is the limit of the Euclidean Standard Deviation when the number of observations approaches infinity. This suggests that 100% of the values of a traditional six-sided die, where this hyperbolic SD is 1.90, occur between -2.2 and 9.2. Thus it is suggested that the square of the Standard Deviation is NOT the variance. It is suggested instead that the high observation and the low observation be identified. If the distribution is negatively skewed, then the computed mean minus the lowest value minus .003, divided by 3, is the square root of the variance, σ. If the distribution is positively skewed, then the highest value minus the computed mean minus .003, divided by 3, is the square root of the variance. If the skew cannot be determined, then the maximum of these values should be taken as the square root of the variance, σ.

A no/yes, heads/tails, off/on transition which occurs at μ is a normal distribution. The transition is from whatever choice is assigned a value of zero to whatever choice is assigned a value of 1. Then the median choice is 0.5, the mean choice is 0.5, the true variance σ² is 0.028, and the square root of that variance, σ, is 0.167. According to the rule of normal distributions this requires that:

  0.3% of the transitions from 0 to 1 will be made by μ-3σ, or μ-0.500;

  5%  of the transitions from 0 to 1 will be made by μ-2σ, or μ-0.333;

32%  of the transitions from 0 to 1 will be made by μ-σ, or μ-0.167;

50%  of the transitions from 0 to 1 will be made by μ;

68%  of the transitions from 0 to 1 will be made by μ+σ, or μ+0.167;

95%  of the transitions from 0 to 1 will be made by μ+2σ, or μ+0.333;

99.7% of the transitions from 0 to 1 will be made by μ+3σ, or μ+0.500.
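The σ and offset figures in the list above follow from simple arithmetic, sketched here in Python. It assumes σ is taken as one third of the distance from the mean to either extreme (0 or 1), as in the die example; the variable names are illustrative.

```python
low, high = 0.0, 1.0  # the two choices, coded as 0 and 1

mu = (low + high) / 2    # mean (and median) choice
sigma = (high - mu) / 3  # assumed: three sigma reaches the extreme
variance = sigma ** 2

# Offset from mu at each whole multiple of sigma, from -3 to +3.
offsets = {k: round(k * sigma, 3) for k in range(-3, 4)}

print(round(sigma, 3), round(variance, 3))
for k, off in offsets.items():
    print(f"mu {k:+d} sigma = mu {off:+.3f}")
```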

Mathematically there is no difference between a transition that happens at μ+3σ and one that happens at μ-3σ. Or, Biblically, see the Parable of the Workers in the Vineyard (Matthew 20:1–16). It is wrong to say that there is no variance in choices, or transitions, even when there is no error. Mathematically there is a definite non-zero variance with every choice.


Sunday, March 12, 2023

Intolerance

 

Humpty Dumpty

And all the king’s horsemen and all the king’s men
Couldn’t put Humpty together again

Wait, that’s not right, is it?

My wife and her sister have a small business on Etsy. One of the items that they sell is a cloth book of the children’s nursery rhyme Humpty Dumpty. My wife’s sister designs the fabric, and my wife sews the fabric. The line above is how it appears in their fabric book. A disgruntled customer was upset because in her opinion it should have been “all the king’s horses”, not “all the king’s horsemen.” She was offered a refund including shipping both ways if she returned the book. She refused. She wanted the fabric to be reprinted and the book to be changed. This was not possible. She wanted the book to be no longer sold to others because it contained an “error” and left a bad review because of that. In other words, she wanted her opinion to be imposed on others.

This sounds like someone who is concerned with the letter of the law and not justice, the spirit of the law. I bet that customer also thinks she is being a good Christian, which makes her better than other religions, and non-Americans who don’t speak English and are not as white as her. Who is going to tell her that Jesus Christ was a Jew, didn’t speak English, didn’t live in the United States and was not as white as her?

She also should realize that her preferred wording is not the only way that the nursery rhyme has ever appeared. According to Wikipedia, the first appearance of the poem in 1797 was

Four-score Men and Four-score more,
Could not make Humpty Dumpty where he was before.

In 1810 it appeared as

Threescore men and threescore more,
Cannot place Humpty dumpty as he was before.

Elsewhere in that poem, “Humpty” was spelled “Humpti” and “sat” was spelled as “sate” but that is another issue.

The current version did not appear until 1882.

The poem was thought to be a riddle, where Humpty Dumpty is an egg, which is speculated to be Dutch slang for egg. Which reminds me of an English coworker who triggered a spit take by me when he made the comment that he spent the weekend "knocking up" an old girlfriend. He meant that he “was visiting”, not what I thought he meant. Two great civilizations separated by a common language indeed! The motto for the state of my birth is “Hope”, which came from a misunderstanding by the early English settlers that the nearby Native American village was Mount Hope. The natives were saying “mantoup” in their language.

From Wikipedia

In 1996, the website of the Colchester tourist board attributed the origin of the rhyme to a cannon recorded as used from the church of St Mary-at-the-Wall by the Royalist defenders in the siege of 1648. In 1648, Colchester was a walled town with a castle and several churches and was protected by the city wall. The story given was that a large cannon, which the website claimed was colloquially called Humpty Dumpty, was strategically placed on the wall. A shot from a Parliamentary cannon succeeded in damaging the wall beneath Humpty Dumpty, which caused the cannon to tumble to the ground. The Royalists (or Cavaliers, "all the King's men") attempted to raise Humpty Dumpty on to another part of the wall, but the cannon was so heavy that "All the King's horses and all the King's men couldn't put Humpty together again".

The poem has also been advanced as a story based on the Laws of Thermodynamics because, once broken, an egg, Humpty Dumpty, could not be put together again because its entropy has increased.

In any event, on this night when the 95th Oscars are to be awarded, it is only a story anyway. If someone objects to how you are telling a story and calls you illiterate, they are probably projecting and confirming that they are illiterate, not you. Besides, to be “woke”, shouldn’t it be "horsepersons" anyway? :)

Friday, March 10, 2023

None of the Above

 

Complicated

Why'd you have to go and make things so complicated?
I see the way you're acting like you're somebody else
Gets me frustrated
Life's like this, you fall
And you crawl, and you break
And you take what you get, and you turn it into
Honesty and promise me I'm never gonna find you faking
No, no, no

A yes or no response isn’t enough

Facebook, I am told, allows you to set your relationship status as single, in a relationship, engaged, married, in a civil partnership, in a domestic partnership, in an open relationship, it's complicated, separated, divorced, and widowed. In this there are more than two options, but even when there are only two responses, there should always be a third, It’s Complicated/Other. Thus a True/False quiz should be True/False/It’s Complicated. Otherwise you can be asked misleading “Have you stopped beating your wife?” questions, where you are damned if you answer yes (implying that you used to beat your wife but have stopped) or no (admitting that you are currently beating your wife). There had better always be three options, or the responses can’t tell you anything.

This goes for any response/choice. If you are asked to choose between good and evil, you might choose to be 100% good or 100% evil, but you are not the only one making that choice. Each individual’s choice can be binary, but the sum of all choices is where we live.

Thursday, March 9, 2023

AI

 

Dawn (Go Away)

Think (think) What a big man he'll be Think Of the places you'll see Now think what the future would be with a poor boy like me Dawn go away

Thinking is intelligence, not inference

The topic du jour seems to be AI, Artificial Intelligence. This is, IMHO, an oxymoron. It really should be Artificial Inference. Computers (the Artificial part) are building inferences based on the data that they examine. However, inferences are NOT always intelligent. Before Copernicus, the inference was that the sun moved around the earth. Before modern geology, the inference was that the Earth was only a few thousand years old. Before Michelson, the inference was that light moved through a luminiferous aether that permeated empty space. Before Einstein’s Theory of Relativity, it was inferred that there was an absolute frame of reference.

The best that can be hoped from computers is that they will make inferences, that is, discover a pattern in the data. It will be up to someone else to use intelligence to say what that pattern means.

Before you acknowledge that there is a pattern, make sure that you know what data has been examined. The Literary Digest famously predicted that Alf Landon would defeat Roosevelt in the Presidential election of 1936 because they had only polled their readers. If the data that AI is using is not inclusive, any patterns from that data will reflect its exclusions. You can’t make any inferences from data that you don’t have.

MAGA? II

 

Glory Days

Yeah, just sitting back, trying to recapture
A little of the glory, yeah
Well time slips away and leaves you with nothing, mister
But boring stories of

Glory days, yeah they'll pass you by
Glory days, in the wink of a young girl's eye
Glory days, glory days
Glory days, yeah they'll pass you by

Make America Great Again?

Make.  Good, an action verb.  Making things is progress. 

America.  The group to which you belong.  I’m with you. 

Great. What you aspire to. Now, that is inspirational. 

Again.  Uh.  You lost me there sport. 

Silly Rabbit. You can’t go home again. You can’t recapture the glory days. What has happened, has happened. You can’t make the future look like the past. Anyone who tries to convince you otherwise is only trying to distract you. MAG is fine. MAGA? Not so much.

Thursday, March 2, 2023

The Right Reasons

 

A Kiss From A Rose

Baby, I compare you to a kiss from a rose on the grey
Ooh, the more I get of you, stranger it feels, yeah
And now that your rose is in bloom
A light hits the gloom on the grey.

Before I give you my rose, are you here for the right reasons?

Here are some rules of thumb which I generally follow when I enter the voting booth, if I do not know the candidate. They are actually identical to the rules of thumb if I do know the candidate. A candidate for office is supposed to represent me to the group. I want someone who is for truth, justice, and the American Way. By that I mean someone who acknowledges the truth; acts for justice for the group, not for himself; and promotes inclusion in the group (America is known for being a melting pot).

A plus for being a woman and/or a member of a traditionally excluded group.

Women and others have too long been excluded from the group. Being a woman or a minority member does not mean that you aren’t in favor of exclusion (e.g. Ann Coulter, Kimberly Guilfoyle, Peter Thiel, Clarence Thomas, and Tim Scott) but hopefully you will be more sensitive to those who have traditionally been excluded.

A plus for being rich.

I do not think that the rich are smarter than others in the group. I just think that they have a higher price than anyone else, i.e. they have more to lose if they act for themselves and not for the group, so that only when the bribe is enormous will they act for themselves and not the group. The self-made rich usually think first of themselves, but this may not be true of their descendants. That means I will give points to the descendants of the Kennedys, Roosevelts, and Rockefellers. E.g. Joseph Kennedy, Senior? No. John F. Kennedy? Yes.

A plus for being a veteran, a member of a non-profit, a teacher, etc.

I want someone who will act for the group, not for themselves. Veterans, Doctors Without Borders, Peace Corps volunteers, teachers, etc. have demonstrated that they place the interests of the group over themselves.

A minus for being a celebrity

I expect the Peter Principle to apply, that being good in one job does not mean that you will be good in another job. Being a great football coach does not mean that you will be a great senator. (E.g. Tommy Tuberville)

A minus for being in any form of show business.

I have to trust the positions that are, and will be, supported by the candidate. The art of show business is learning the art of distraction, the razzle dazzle, which means I can’t trust the stated position. That means I won’t vote for a Jane Fonda or an Al Franken, but I also won’t vote for a Ronald Reagan, Doctor Oz, or Donald Trump.

A minus for attending a top school.

This sounds counterintuitive. I attended a top school. Shouldn’t I want someone like me to represent me? Don’t I trust myself? (Uh…not really!). Attending a top school might mean that the person wanted to get the best instruction, but it also could mean that the person is merely doing resumé padding. I attended classes at the Wharton School while at UPenn, but Donald Trump, Donald Trump Junior, Ivanka Trump, and Elon Musk also all graduated from Wharton.

A minus for being divorced.

Yes, I realize that there are people in unsuccessful marriages and those people are perfectly justified in getting a divorce. However when they married, they took an oath that they would be in that marriage forever. A successful candidate will also be asked to take an oath to support the group. That is an oath that I don’t want them to break.

The candidate who has a higher point total than their opponents is the one who will probably get my vote. To use a line from one of my wife’s favorite shows, I want to give my rose to a person who is there for the right reasons.

Wednesday, March 1, 2023

Unconscious Bias

 I Don't Know Why (I Just Do)

I don't know why I love you like I do
I don't know why, but I do
I don't know why you thrill me like you do
I don't know why, but you do.

Shouldn’t you want to know why?

Unconscious biases can be long lasting and may be …doh…. unconscious, something of which you aren’t even aware. Case in point. I attended Brown University where the Computer Science Building was funded by one of its alumni, Thomas Watson, Jr, a former CEO and son of the founder of IBM.

In the 1980s, IBM was introducing Personal Computers, PCs. They used a subcontractor to work on the operating system, and that subcontractor had a competing version of the BASIC computer language, which ran on those PCs. That version of BASIC was obviously not to be taken as seriously as IBM's, since that competing version of BASIC went by the name GW (Golly Whiz) BASIC. So clearly there was no serious reason to invest in that subcontractor when they had their Initial Public Offering. That subcontractor was, at the time, a little firm called Microsoft. Being biased in favor of IBM did not work out so well for me.

When search engines were being developed, Scientific American did a review of two competing, innovative search engines. One was IBM's CLEVER. The name of the other search engine is the punch line. Needless to say, I was much more impressed with IBM’s search engine and saw no reason to invest in its no-name, fly-by-night competitor. The name of that competing Search Engine... GOOGLE. Another case of where being biased in favor of IBM led me to back the wrong horse.

As to how long-lasting unconscious bias can be, my father’s parents saw two of their children die in the Spanish Flu Pandemic, before my father was even born. My father had an abiding fear of public spaces, anything shared, that I am fairly sure that he learned from his parents, but he never acknowledged why. When I reflect on my response to the COVID pandemic, which was influenced by my own upbringing by my father, I had to reflect on this incident that happened before my father was even born.

A coincidence. The firm at which I have worked since 1998 had its HQ less than a mile from where my Uncle and Aunt died and were buried. I do not know, but suspect, that their deaths were why my grandparents moved from Cambridge, MA to Providence, RI, where both my father and I were born.