The History of Bayes
Instead of publishing his work on inverse probability or sending it to the Royal Society, Bayes let it lie among his mathematical papers.
Bayes’ rule is left for dead (for the first time)
After his death, Bayes’ relatives asked his young friend Richard Price to examine his mathematical papers. Price became interested in the inverse probability work, polished it, and sent it to the Royal Society for publication. However, Price’s paper was largely ignored.
Bayes’ rule is left for dead (for the second time)
Laplace discovered Bayes’ rule on his own and his mémoire on the probability of causes was published in 1774.
1810: “Laplace announced the central limit theorem. It asserts that, with some exceptions, any average of a large number of similar terms will have a normal, bell-shaped distribution. Laplace’s probability of causes had limited him to binomial problems, but his final proof of the central limit theorem let him deal with almost any kind of data.” [McGrayne, 2011]
“In providing the mathematical justification for taking the mean of many data points, the central limit theorem had a profound effect on the future of Bayes’ rule. At the age of 62, Laplace, its chief creator and proponent, made a remarkable about-face. He switched allegiances to an alternate, frequency-based approach he had also developed. From 1811 until his death 16 years later Laplace relied primarily on this approach, which twentieth-century theoreticians would use to almost obliterate Bayes’ rule.” [McGrayne, 2011]
“Laplace made the change because he realized that where large amounts of data were concerned, both approaches generally produce much the same results. The probability of causes was still useful in particularly uncertain cases because it was more powerful than frequentism. But science matured during Laplace’s lifetime. By the 1800s mathematicians had much more reliable data than they had had in his youth and dealing with trustworthy data was easier with frequentism.” (Mathematicians did not learn until the midtwentieth century that, even with great amounts of data, the two methods can sometimes seriously disagree.) [McGrayne, 2011]
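A small numerical sketch can make the large-data claim concrete. It assumes a simple binomial (success/failure) problem and a uniform Beta(1, 1) prior; neither assumption comes from McGrayne's text, and the function names are illustrative only. The frequentist estimate is the observed proportion, and the Bayesian estimate is the posterior mean.

```python
def mle(successes, trials):
    """Frequentist point estimate: the observed proportion."""
    return successes / trials


def posterior_mean(successes, trials, prior_a=1.0, prior_b=1.0):
    """Bayesian point estimate under an assumed Beta(prior_a, prior_b) prior."""
    return (prior_a + successes) / (prior_a + prior_b + trials)


# Hypothetical data sets with the same 75% success rate but growing size.
cases = [(3, 4), (750, 1000), (750_000, 1_000_000)]

for successes, trials in cases:
    freq = mle(successes, trials)
    bayes = posterior_mean(successes, trials)
    print(f"n={trials:>9}: MLE={freq:.6f}  posterior mean={bayes:.6f}  "
          f"gap={abs(freq - bayes):.6f}")
```

With four observations the prior visibly pulls the Bayesian estimate away from the raw proportion; with a million observations the two estimates are almost identical. This toy case does not show the rarer situations, noted in the parenthetical above, in which the two methods disagree even with large data.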
Bayes’ rule is left for dead (for the third time)
Supposed dead years for Bayes’ philosophy.
However, during the same period, Bayes’ theory was quietly being used in real-life applications, with great success. French army and Russian artillery officers used it to aim their weapons, since firing artillery involved many uncertain factors. An engineer at Bell Labs created a cost-effective way of dealing with uncertainty in telecommunications based on Bayes’ principles. The US insurance industry used Bayes’ principles to price insurance premiums effectively. [Krishna, 2011]
In 1925, Ronald Fisher published a manual of new techniques, Statistical Methods for Research Workers. A cookbook of statistical procedures for nonstatisticians — making stats available to scientists and researchers who did not have the time or did not possess skills in statistics — it turned frequency into the de facto statistical method.
“Fisher redefined most uncertainties not by their relative probabilities but by their relative frequencies. He brought to fruition Laplace’s frequency-based theories, the methods Laplace himself preferred toward the end of his life.” [McGrayne, 2011]
While Fisher was developing frequentist methods, Egon Pearson and Jerzy Neyman introduced hypothesis testing, a framework that let laboratory researchers decide whether to reject a null hypothesis in favor of an alternative, and in which the basic communication of a test result was through p-values.
Fisher and Neyman became fervent anti-Bayesians who limited themselves to events that could theoretically be repeated many times; regarded samples as their only source of information; and viewed each new set of data as a separate problem, to be used if the data were powerful enough to provide statistically significant conclusions and discarded if not. They banned subjective priors. Neyman, for example, denounced Bayes’ equal-prior shortcut as “illegitimate.” [McGrayne, 2011]
It was a geophysicist, Harold Jeffreys, who almost singlehandedly kept Bayes alive during the anti-Bayesian onslaught of the 1930s and 1940s. Jeffreys was completely against p-values and argued against anyone using them. Jeffreys and Fisher embarked on a two-year debate in the Royal Society proceedings in the 1930s. The debate ended inconclusively and frequentism totally eclipsed Bayes. [McGrayne, 2011]
Bayes’ rule is left for dead (for the fourth time)
“Alan Turing’s work during World War II at Bletchley Park along with other cryptographers validated Bayesian theory. Though Turing and others did not use the word Bayesian, almost all of their work was Bayesian in spirit. However, this code-breaking information became classified and nobody could reveal the Bayesian principles at work.” [Krishna, 2011]
Bayes’ rule is left for dead (for the fifth time)
“Bayes’ stood poised for another of its periodic rebirths as three mathematicians Jack Good, Leonard Jimmie Savage and Dennis V Lindley tackled the job of turning Bayes’ rule into a respectable form of mathematics and a logical coherent methodology.” [McGrayne, 2011]
“Jack Good being Turing’s wartime assistant knew the power of Bayes’ and hence started publishing and making Bayes’ theory known to a variety of people in Academia. However his work was still classified info. Hampered by governmental secrecy and his inability to explain his work, Good remained an independent voice in the Bayesian community.” [Krishna, 2011]
“Savage on the other hand was instrumental in spreading Bayesian stats and making a legitimate mathematical framework for analyzing small data events. He also wrote books with precise mathematical notion and symbols thus formalizing Bayes’ theory. However the books did not become famous or were not widely adopted as computing machinery to implement ideas were not available.” [Krishna, 2011]
“Dennis Lindley on the other hand pulled off something remarkable. In Britain he started forming small group of Bayesian circles and started pushing for Bayesian appointments in stats department. This feat in itself is something for which Lindley needs to be given enormous credit as he was taking on the mighty Fisher, in his own home turf, Britain.” [Krishna, 2011]
“Thanks to Lindley in Britain and Savage in US, Bayesian theory came of age in 1960s. The philosophical rationale of using Bayesian methods had been largely settled. It was becoming the only mathematics of uncertainty with an explicit, powerful and secure foundation in logic. “How to apply it?”, though remained a controversial question.” [Krishna, 2011]
“The extraordinary fact about the glorious Bayesian revival of the 1950s and 1960s is how few people in any field publicly applied Bayesian theory to real-world problems. As a result, much of the speculation about Bayes’ rule was moot. Until they could prove in public that their method was superior, Bayesians were stymied.” [Krishna, 2011]
“These two Harvard professors set out on a journey to apply Bayes’ to business statistics. Robert Schlaifer was a faculty member in the accounting and business department. He was randomly assigned to teach a statistics course. Knowing nothing about it, Schlaifer crammed frequentist stats and subsequently wondered about its utility in real life. Slowly he realized that in business one always deals with a prior and develops better probabilities based on the data one gets to see in the real world. This logically brought him to Bayes’ world. The math was demanding and Schlaifer immersed himself in the math to understand every bit of it. He also came to know about a young prof at Columbia, Howard Raiffa. He pitched to Harvard to recruit Raiffa and subsequently brought him over to Harvard. For about seven years, both of them created a lot of practical applications of Bayesian methods. The professors realized that there was no mathematics toolbox for Bayesians to work with and hence went on developing a lot of concepts which could make Bayesian computations easy. They also published books on applications of Bayesian statistics to management, detailed notes on Markov chains etc. They also tried to introduce Bayesian material into the curriculum. Despite these efforts, Bayes could not really permeate through academia for various reasons that are mentioned in this chapter. It is also mentioned that some students burnt the lecture notes in front of the professor’s office to vent their frustration.” [Krishna, 2011]
“Amid this mathematical fervor, a few practical types sat down in the 1960s to build the kind of institutional support that frequentists had long enjoyed: annual seminars, journals, funding sources, and textbooks. Morris H. DeGroot wrote the first internationally known text on Bayesian decision theory, the mathematical analysis of decision making (1970). Arnold Zellner at the University of Chicago raised money, founded a conference series, and began testing standard economics problems one by one, solving them from both Bayesian and non-Bayesian points of view. Thanks to Zellner’s influence, Savage’s subjective probability would have one of its biggest impacts in economics. The building process took decades.” [McGrayne, 2011]
“In what could have been a computational breakthrough, Lindley and his student Adrian F. M. Smith showed Bayesians how to develop models by breaking complex scientific processes into stages called hierarchies (1972). The system would later become a Bayesian workhorse, but at the time it fell flat on its face. The models were too specialized and stylized for many scientific applications. It would be another 20 years before Bayesian textbooks taught hierarchical models. Mainstream statisticians and scientists simply did not believe that Bayes could ever be practical. Indicative of their attitude is the fact that while Thomas Bayes’ clerical ancestors were listed in Britain’s Dictionary of National Biography he himself was not.” [McGrayne, 2011]
“While Stone was writing his book, the United States agreed to help Egypt clear the Suez Canal of unexploded ammunition from the Yom Kippur war with Israel in 1973. The explosives made dredging dangerous. Using the SEPs developed in Palomares, it was possible to measure the search effectiveness to get the probability that, if a bomb had been there, it would have been spotted. But how could anyone estimate the number of bombs remaining in the canal when no one knew how many were there to begin with? Wagner, Associates chose three priors with different probability distributions to express high, middle, and low numbers. Next, using the handy system of conjugate priors described by Raiffa and Schlaifer in 1961, they declared that each prior would have a posterior with the same class of probability distributions. This produced three tractable distributions (Poisson, binomial, and negative binomial) complete with those statistical desiderata, mean values and standard deviations. Computing became “a piece of cake,” Richardson reported, but it proved impossible to explain the system to hardened ordnance-disposal specialists with missing fingers. In the end, no one talked about Bayes at Suez.” [McGrayne, 2011]
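The appeal of a conjugate prior in a search problem like this can be sketched for the Poisson case. The model below is an assumption made for illustration, not a reconstruction of what Wagner, Associates actually computed: the unknown total number of bombs has a Poisson prior with mean `lam`, each bomb is found independently with probability `p_detect`, and all numbers are invented. Under those assumptions the number of bombs still remaining has a Poisson posterior with mean `lam * (1 - p_detect)`, which the code checks against a brute-force application of Bayes' rule.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of k events, computed in log space for stability."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def binom_pmf(k, n, p):
    """Binomial probability of k detections among n bombs."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def posterior_remaining(found, lam, p_detect, max_total=200):
    """Brute-force posterior over the number of bombs NOT yet found."""
    weights = {}
    for total in range(found, max_total + 1):
        prior = poisson_pmf(total, lam)                 # belief before the search
        likelihood = binom_pmf(found, total, p_detect)  # chance of finding `found`
        weights[total - found] = prior * likelihood
    norm = sum(weights.values())
    return {remaining: w / norm for remaining, w in weights.items()}


# Hypothetical inputs: prior mean of 20 bombs, 80% search effectiveness, 15 found.
lam, p_detect, found = 20.0, 0.8, 15
post = posterior_remaining(found, lam, p_detect)
post_mean = sum(r * w for r, w in post.items())

# Conjugate shortcut under the same assumptions: remaining ~ Poisson(lam * (1 - p_detect)).
shortcut_mean = lam * (1 - p_detect)
print(f"brute-force posterior mean: {post_mean:.4f}")
print(f"closed-form posterior mean: {shortcut_mean:.4f}")
```

The brute-force loop and the one-line shortcut agree, which is the practical point of conjugacy: the posterior stays in a known family, so means and standard deviations can be read off directly instead of being summed or integrated numerically.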
“When Box, J. Stuart Hunter, and William G. Hunter wrote Statistics for Experimenters in 1978, they intentionally omitted any reference to Bayes’ rule: too controversial to sell. Shorn of the big bad word, the book was a bestseller. Ironically, an Oxford philosopher, Richard Swinburne, felt no such compunctions a year later: he inserted personal opinions into both the prior hunch and the supposedly objective data of Bayes’ theorem to conclude that God was more than 50% likely to exist; later Swinburne would figure the probability of Jesus’ resurrection at “something like 97 percent.” These were calculations that neither the Reverend Thomas Bayes nor the Reverend Richard Price had cared to make, and even many nonstatisticians regarded Swinburne’s lack of careful measurement as a black mark against Bayes itself.” [McGrayne, 2011]
“A loss of leadership, a series of career changes, and geographical moves contributed to the gloom. Jimmie Savage, chief U.S. spokesman for Bayes as a logical and comprehensive system, died of a heart attack in 1971. After Fermi’s death, Harold Jeffreys and American physicist Edwin T. Jaynes campaigned in vain for Bayes in the physical sciences; Jaynes, who said he always checked to see what Laplace had done before tackling an applied problem, turned off many colleagues with his Bayesian fervor. Dennis Lindley was slowly building Bayesian statistics departments in the United Kingdom but quit administration in 1977 to do solo research. Jack Good moved from the supersecret coding and decoding agencies of Britain to academia at Virginia Tech. Albert Madansky, who liked any technique that worked, switched from RAND to private business and later to the University of Chicago Business School, where he claimed to find more applications than in statistics departments. George Box became interested in quality control in manufacturing and, with W. Edwards Deming and others, advised Japan’s automotive industry. Howard Raiffa also shifted gears to negotiate public policy, while Robert Schlaifer, the nonmathematical Bayesian, tried to program computers. When James O. Berger became a Bayesian in the 1970s, the community was still so small he could track virtually all of its activity. The first international conference on Bayes’ rule was held in 1979, in Valencia, Spain, and almost every well-known Bayesian showed up—perhaps 100 in all.” [McGrayne, 2011]
“Not until 1981 did two industry-supported studies finally employ Bayes’ theorem—and admit it. Analysts used it to combine the probabilities of equipment failures with specific information from two particular power plants: Zion Nuclear Power Station north of Chicago and Indian Point reactor on the Hudson River, 24 miles north of New York City. Since then, quantitative risk analysis methods and probabilistic safety studies have used both frequentist and Bayesian methods to analyze safety in the chemical industry, nuclear power plants, hazardous waste repositories, the release of radioactive material from nuclear power plants, the contamination of Mars by terrestrial microorganisms, the destruction of bridges, and exploration for mineral deposits. To industry’s relief, risk analysis is also now identifying so-called unuseful safety regulations that can presumably be abandoned. Subjective judgment still bothers many physical scientists and engineers who dislike mixing objective and subjective information in science. Avoiding the word “Bayes,” however, is no longer necessary—or an option.” [McGrayne, 2011]
“Bayesians were still a small and beleaguered band of a hundred or more in the early 1980s. Computations took forever, so most researchers were still limited to “toy” problems and trivialities. Models were not complex enough. The title of a meeting held in 1982, “Practical Bayesian Statistics,” was a laughable oxymoron. One of Lindley’s students, A. Philip Dawid of University College London, organized the session but admitted that “Bayesian computation of any complexity was still essentially impossible. . . . Whatever its philosophical credentials, a common and valid criticism of Bayesianism in those days was its sheer impracticability.”” [McGrayne, 2011]
“Next, Smith and three others—Lindley, José M. Bernardo, and Morris DeGroot—organized an international conference series for Bayesians in Valencia, Spain. It has been held regularly since 1979. Smith expected “the usual criticism from non-Bayesians in reaction to whatever I say.” Sure enough, frequentists accused Bayesians of sectarian habits, meetings in remote locations, and mock cabarets featuring skits and songs with Bayesian themes. Other disciplines have done the same, of course. The conferences played a vital role in helping to build camaraderie in a small field under attack. In 1984 Smith issued a manifesto—and italicized it for emphasis: “Efficient numerical integration procedures are the key to more widespread use of Bayesian methods.” With computerized data collection and storage, hand analyses were becoming impossible. When microcomputers appeared, attached to fast networks with graphics and vast storage capabilities, data analysts could finally hope to improvise as easily as they had with pencil and paper. With characteristic practicality, Smith set his University of Nottingham students to work developing efficient, user-friendly software for Bayesian problems in spatial statistics and epidemiology.” [McGrayne, 2011]
“While lung cancer researchers explored Bayes, Adrian Raftery was working at Trinity College in Dublin on a well-known set of statistics about fatal coal-dust explosions in nineteenth-century British mines. Previous researchers had used frequency techniques to show that coal mining accident rates had changed over time. They assumed, however, that the change had been gradual. Raftery wanted to check whether it had been gradual or abrupt. First, he developed some heavy frequentist mathematics for analyzing the data. Then, out of curiosity, he experimented with Bayes’ rule, comparing a variety of theoretical models to see which had the highest probability of determining when the accident rates actually changed (1986). “I found it very easy. I just solved it very, very quickly,” Raftery recalled. And in doing so he discovered a remarkable, hitherto unknown event in British history. Raftery’s Bayesian analysis revealed that accident rates plummeted suddenly in the late 1880s or early 1890s. A historian friend suggested why. In 1889, British miners had formed the militant Miners’ Federation (which later became the National Union of Mine Workers). Safety was their number one issue. Almost overnight, coal mines got safer. “It was a Eureka moment,” Raftery said. “It was quite a thrill. And without Bayesian statistics, it would have been much harder to do a test of this hypothesis.” Frequency-based statistics worked well when one hypothesis was a special case of the other and both assumed gradual behavior. But when hypotheses were competing and neither was a special case of the other, frequentism was not as helpful, especially with data involving abrupt changes—like the formation of a militant union.” [McGrayne, 2011]
“Yet another disagreement between Bayesians and anti-Bayesians surfaced in 1957, when Lindley, elaborating on a point made by Jeffreys, highlighted a theoretical situation when the two approaches produce diametrically opposite results. Lindley’s Paradox occurs when a precise hypothesis is tested with vast amounts of data. In 1987, Princeton University aeronautics engineering professor Robert G. Jahn conducted a large study which he concluded supported the existence of psychokinetic powers. He reported that a random event generator had produced 104,490,000 trials testing the hypothesis that someone on a couch eight feet away cannot influence their results any more than random chance would. Jahn reported that the random event generator produced 18,471 more examples (0.018%) of human influence on his sensitive microelectronic equipment than could be expected with chance alone. Even with a p-value as small as 0.00015, the frequentist would reject the hypothesis (and conclude in favor of psychokinetic powers) while the same evidence convinces a Bayesian that the hypothesis against spiritualism is almost certainly true.” [McGrayne, 2011]
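The disagreement can be reproduced roughly from the two figures quoted above. The sketch below rests on assumptions that are not in the quoted text: each trial is treated as a fair coin flip under the null hypothesis, the 18,471 is read as the excess over half the trials, the alternative hypothesis gives the success probability a uniform prior on [0, 1], and the two hypotheses start with equal prior odds. A different prior would give a different Bayes factor.

```python
import math

n = 104_490_000          # trials reported in the quoted passage
excess = 18_471          # assumed to be the excess over the chance expectation n / 2
x = n // 2 + excess      # implied number of "hits"

# Frequentist side: normal approximation to the binomial under H0 (theta = 0.5).
z = excess / math.sqrt(n / 4)
p_one_sided = 0.5 * math.erfc(z / math.sqrt(2))

# Bayesian side: Bayes factor for H0 (theta = 0.5) against H1 (theta ~ Uniform(0, 1)).
log_binom = math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
log_like_h0 = log_binom + n * math.log(0.5)   # log P(x | H0)
log_marginal_h1 = -math.log(n + 1)            # log P(x | H1) = log(1 / (n + 1))
bf01 = math.exp(log_like_h0 - log_marginal_h1)

# Posterior probability of H0, assuming equal prior odds.
post_h0 = bf01 / (1 + bf01)

print(f"z = {z:.3f}, one-sided p-value = {p_one_sided:.5f}")
print(f"Bayes factor in favor of H0 = {bf01:.1f}, P(H0 | data) = {post_h0:.3f}")
```

Under these assumptions the p-value comes out near the 0.00015 quoted in the passage, while the Bayes factor favors the chance hypothesis by roughly twelve to one. The same data reject the null in one framework and support it in the other, which is the paradox Lindley described.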
“With the introduction of high-performance workstations in the 1980s it became possible to use Bayesian networks to handle medicine’s many interdependent variables, such as the fact that a patient with a high temperature will usually also have an elevated white blood count. Bayesian networks are graphs of nodes with links revealing cause-and-effect relationships. The “nets” search for particular patterns, assign probabilities to parts of the pattern, and update those probabilities using Bayes’ theorem. A number of people helped develop Bayesian networks, which were popularized in 1988 in a book by Judea Pearl, a computer scientist at UCLA. By treating cause and effect as a quantifiable Bayesian belief, Pearl helped revive the field of artificial intelligence.” [McGrayne, 2011]
“When Smith spoke at a workshop in Quebec in June 1989, he showed that Markov chain Monte Carlo could be applied to almost any statistical problem. It was a revelation. Bayesians went into “shock induced by the sheer breadth of the method.” By replacing integration with Markov chains, they could finally, after 250 years, calculate realistic priors and likelihood functions and do the difficult calculations needed to get posterior probabilities. To outsiders, one of the amazing aspects of Bayes’ history is that physicists and statisticians had known about Markov chains for decades. To illustrate this puzzling lapse, some flashbacks are required. Monte Carlo began in 1906, when Andrei Andreyevich Markov, a Russian mathematician, invented Markov chains of variables. The calculations took so long, though, that Markov himself applied his chains only to the vowels and consonants in a Pushkin poem.” [McGrayne, 2011]
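The idea of replacing integration with a Markov chain can be illustrated with a very small sampler. This is a random-walk Metropolis sketch, not the Gibbs sampler associated with Gelfand and Smith, and the problem (estimating a coin's bias from 7 heads and 3 tails with a uniform prior) is invented for illustration. It was chosen because the exact posterior mean, 8/12, is known and can be used to check the chain.

```python
import math
import random


def log_posterior(theta, heads, tails):
    """Unnormalized log posterior for a coin bias under a uniform prior."""
    if not 0.0 < theta < 1.0:
        return -math.inf
    return heads * math.log(theta) + tails * math.log(1.0 - theta)


def metropolis(heads, tails, n_samples=20_000, step=0.1, seed=0):
    """Random-walk Metropolis: the chain's draws approximate the posterior."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_posterior(theta, heads, tails)
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, heads, tails)
        # Accept with probability min(1, posterior ratio); otherwise stay put.
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        samples.append(theta)
    return samples


samples = metropolis(heads=7, tails=3)
kept = samples[1000:]                 # drop early draws as burn-in
estimate = sum(kept) / len(kept)      # Monte Carlo estimate of the posterior mean
exact = 8 / 12                        # mean of the exact Beta(8, 4) posterior
print(f"MCMC estimate: {estimate:.3f}, exact: {exact:.3f}")
```

No integral is evaluated anywhere: the sampler only needs the posterior up to a constant, and an average over the chain stands in for the integral. In this toy case the exact answer is available, but the same loop works when it is not.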
“Yet Smith and Gelfand still thought of Monte Carlo as a last resort to be used in desperation for complicated cases. They wrote diffidently, careful to use the B-word only five times in 12 pages (1990). “There was always some concern about using the B-word, a natural defensiveness on the part of Bayesians in terms of rocking the boat,” Gelfand said. “We were always an oppressed minority, trying to get some recognition. And even if we thought we were doing things the right way, we were only a small component of the statistical community and we didn’t have much outreach into the scientific community.” The Gelfand–Smith paper was an “epiphany in the world of statistics,” as Bayesians Christian P. Robert and George Casella reported. And just in case anyone missed their point, they added: “Definition: epiphany n. A spiritual event . . . a sudden flash of recognition.” Years later, they still described its impact in terms of “sparks,” “flash,” “shock,” “impact,” and “explosion.”” [McGrayne, 2011]
“Six years later, Jimmie Savage, Harold Lindman, and Ward Edwards at the University of Michigan showed that results using Bayes and the frequentist’s p-values could differ by significant amounts even with everyday-sized data samples; for instance, a Bayesian with any sensible prior and a sample of only 20 would get an answer ten times or more larger than the p-value.” [McGrayne, 2011]
The International Society for Bayesian Analysis (1992) and the Bayesian section of the American Statistical Association were not formed until the early 1990s.
“Outside of diagnostic and medical device testing, Bayes’ mathematical procedures have had little impact on basic clinical research or practice. Working doctors have always practiced an intuitive, nonmathematical form of Bayes for diagnosing patients. The biggest unknown in medicine, after all, is the question, What is causing the patient’s symptoms? But traditional textbooks were organized by disease. They said that someone with, for instance, measles probably has red spots. But the doctor with a speckled patient wanted to know the inverse: the probability that the patient with red spots has measles. Simple Bayesian problems—for example, What is the probability that an exercise echocardiogram will predict heart disease?—started appearing on physicians’ licensing examinations in 1992.” [McGrayne, 2011]
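The inversion the doctor wants is a direct application of Bayes' rule. The sketch below uses invented numbers, not clinical data: a 1% prior probability of measles, a 95% chance of red spots given measles, and a 5% chance of red spots from other causes. The function and parameter names are illustrative.

```python
def posterior(prior, p_sign_given_disease, p_sign_given_no_disease):
    """P(disease | sign) from P(sign | disease) via Bayes' rule."""
    # Total probability of seeing the sign, with or without the disease.
    evidence = (p_sign_given_disease * prior
                + p_sign_given_no_disease * (1.0 - prior))
    return p_sign_given_disease * prior / evidence


# Hypothetical inputs, chosen only to illustrate the calculation.
p_measles_given_spots = posterior(
    prior=0.01,                    # P(measles) before examining the patient
    p_sign_given_disease=0.95,     # P(red spots | measles), the textbook direction
    p_sign_given_no_disease=0.05,  # P(red spots | no measles)
)
print(f"P(measles | red spots) = {p_measles_given_spots:.3f}")
```

With these made-up inputs the answer is about 16%, far below the 95% that the disease-organized textbook statement might suggest. The gap comes from the low prior, which is why the two conditional probabilities should not be confused.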
“Schlaifer, farseeing to the last, spent the remaining years of his life trying to write computer software for practitioners, even though teams of mathematically sophisticated programmers were already taking over the field. In 1994, at the age of 79, Schlaifer died of lung cancer. After his death, Raiffa and Pratt finished the trio’s 30-year-old opus, Introduction to Statistical Decision Theory. Dedicating it to their former colleague, Pratt and Raiffa hailed Schlaifer as “an original, deep, creative, indefatigable, persistent, versatile, demanding, sometimes irascible scholar, who was an inspiration to us both.”” [McGrayne, 2011]
“In economics and finance Bayes appears at multiple levels, ranging from theoretical mathematics and philosophy to nitty-gritty money making. The method figured prominently in three Nobel Prizes awarded for theoretical economics, in 1990, 1994, and 2004. The first Nobel involved the Italian Bayesian de Finetti, who anticipated the Nobel Prize–winning work of Harry Markowitz by more than a decade. Mathematical game theorists John C. Harsanyi and John Nash (the latter the subject of a book and movie, A Beautiful Mind) shared a Bayesian Nobel in 1994. Harsanyi often used Bayes to study competitive situations where people have incomplete or uncertain information about each other or about the rules. Harsanyi also showed that Nash’s equilibrium for games with incomplete or imperfect information was a form of Bayes’ rule.” [McGrayne, 2011]
“Smith became the first Bayesian president of the Royal Statistical Society in 1995. Three years later he stunned his friends by quitting statistics to become an administrator of the University of London. A proponent of evidence-based medicine, he wanted to help develop evidence-based public policy too. Dismayed colleagues chastised him for abandoning Bayes’ rule. But Smith told Lindley that all the problems of statistics had been solved. We have the paradigm, he said, and with MCMC we know how to implement it. He told Diaconis that there was nothing else to do with statistical problems but to plug them into a computer and turn the Bayesian crank.” [McGrayne, 2011]
“Medical research and diagnostic testing were among the earliest beneficiaries of Bayes’ new popularity. Just as the MCMC frenzy appeared to be moderating, Peter Green of Bristol University showed Bayesians how to compare the elaborate hypotheses that scientists call models. Before 1996 anyone making a prediction about the risk of a stroke had to focus on one model at a time. Green showed how to jump between models without spending an infinite time on each. Previous studies had identified 10 possible factors involved in strokes. Green identified the top four: systolic blood pressure, exercise, diabetes, and daily aspirin.” [McGrayne, 2011]
“Bayes made headlines in 2000 by augmenting DNA evidence with statistical data to conclude that Thomas Jefferson had almost certainly fathered six children by his slave Sally Hemings.” [McGrayne, 2011]
“In 2002 Bayes won perhaps not an entire Nobel Prize but certainly part of one. Psychologists Amos Tversky, who died before the prize was awarded, and Daniel Kahneman showed that people do not make decisions according to rational Bayesian procedures. People answer survey questions depending on their phrasing, and physicians choose surgery or radiation for cancer patients depending on whether the treatments are described in terms of mortality or survival rates. Although Tversky was widely regarded as a philosophical Bayesian, he reported his results using frequentist methods. When James O. Berger of Duke asked him why, Tversky said it was simply a matter of expedience. During the 1970s it was more difficult to publish Bayesian research. “He just took the easy way out,” Berger said.” [McGrayne, 2011]
“Spiegelhalter spent more than 10 years trying to sell the medical community on BUGS (short for Bayesian Statistics Using Gibbs Sampling) as the mathematical way to learn from experience. He argued that “advances in health-care typically happen through incremental gains in knowledge rather than paradigm-shifting breakthroughs, and so this domain appears particularly amenable to a Bayesian perspective.” He contended (2004) that “standard statistical methods are designed for summarizing the evidence from single studies or pooling evidence from similar studies, and have difficulties dealing with the pervading complexity of multiple sources of evidence.” While frequentists can ask only certain questions, a Bayesian can frame any question.” [McGrayne, 2011]
“Economists were still catching their breaths when Martin Feldstein, professor of economics at Harvard, stood up at the same meeting and delivered a crash course in Bayesian theory. Feldstein had been President Ronald Reagan’s chief economic advisor and was president of the National Bureau of Economic Research, a leading research organization. He learned Bayesian theory at the Howard Raiffa–Robert Schlaifer seminars at Harvard Business School in the 1960s. Feldstein explained that Bayes lets the Federal Reserve weigh a low-probability risk of disaster more heavily than a higher-probability risk that would cause little damage. And he likened Bayes to a man who has to decide whether to carry an umbrella even when the probability of rain is low. If he carries an umbrella but it does not rain, he is inconvenienced. But if he does not carry an umbrella and it pours, he will be drenched. “A good Bayesian,” Feldstein concluded, “finds himself carrying an umbrella on many days when it does not rain.”” [McGrayne, 2011]
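Feldstein's umbrella example is an expected-loss calculation, which can be sketched in a few lines. The loss values are invented for illustration (carrying the umbrella costs 1 unit of inconvenience, getting drenched costs 20), and the function names are hypothetical.

```python
def expected_loss(action, p_rain, cost_carry=1.0, cost_drenched=20.0):
    """Expected loss of an action, given the probability of rain."""
    if action == "carry":
        return cost_carry              # inconvenienced whether or not it rains
    return p_rain * cost_drenched      # drenched only if it rains


def best_action(p_rain, cost_carry=1.0, cost_drenched=20.0):
    """Choose the action with the smaller expected loss."""
    return min(
        ("carry", "leave"),
        key=lambda action: expected_loss(action, p_rain, cost_carry, cost_drenched),
    )


for p_rain in (0.02, 0.10, 0.50):
    print(f"P(rain) = {p_rain:.2f} -> {best_action(p_rain)} the umbrella")
```

With these assumed losses the umbrella is worth carrying whenever the chance of rain exceeds 1 in 20, so it is carried on many days when no rain falls. The decision depends on the size of the loss as well as its probability, which is the point Feldstein was making about low-probability disasters.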
“When Nate Silver at FiveThirtyEight.com used hierarchical Bayes during the presidential race in November 2008, he combined information from outside areas to strengthen small samples from low-population areas and from exit polls with low response rates. He weighted the results of other pollsters according to their track records and sample size and how up to date their data were. He also combined them with historical polling data. That month Silver correctly predicted the winner in 49 states, a record unmatched by any other pollster. Had Tukey publicized the Bayesian methods used for NBC, the history of political polling and even American politics might have been different.” [McGrayne, 2011]
“Four years later rain flooded the financial markets and banking. Greenspan, who by then had retired from the Federal Reserve, told Congress he had not foreseen the collapse of the real-estate lending bubble in 2008. He did not blame the theory he used but his economic data, which “generally covered only the past two decades, a period of euphoria . . . [instead of] historic periods of stress.” But did Greenspan actually employ Bayesian statistics to quantify empirical economic data? Or were Bayesian concepts about uncertainty only a handy metaphor? Former Reserve Board governor Alan S. Blinder of Princeton thought the latter, and when he said so during a talk, Greenspan was in the audience and did not object.” [McGrayne, 2011]
“A $1-million contest sponsored by Netflix.com illustrates the prominent role of Bayesian concepts in modern e-commerce and learning theory. In 2006 the online film-rental company launched a search for the best recommender system to improve its own algorithm. More than 50,000 contestants from 186 countries vied over the four years of the competition. The AT&T Labs team organized around Yehuda Koren, Christopher T. Volinsky, and Robert M. Bell won the prize in September 2009.” [McGrayne, 2011]