Silver 2012 Penguin Press
|Silver N (2012) The signal and the noise. The art and science of prediction. Penguin Press:534 pp.|
Abstract: The philosophy of this book is that prediction is as much a means as an end. Prediction serves a very central role in hypothesis testing, .. and therefore in all of science. This book thinks of the signal as an indication of the underlying truth behind a statistical or predictive problem. .. I tend to use noise to mean random patterns that might easily be mistaken for signals. The signal is the truth. The noise is what distracts us from the truth.
- Bold face added
- The story the data tell us is often the one we'd like to hear.
- We focus on those signals that tell a story about the world as we would like it to be, not how it really is.
- Information becomes knowledge only when it’s placed into context.
- We face a danger whenever information growth outpaces our understanding of how to process it. The last forty years of human history imply that it can still take a long time to translate information into useful knowledge.
- .. if the quantity of information is increasing by 2.5 quintillion bytes per day, the amount of ‘’useful information’’ almost certainly isn’t. Most of it is just noise, and the noise is increasing faster than the signal.
- .. a certain amount of immersion in a topic will provide disproportionately more insight than an executive summary.
- We think we want information when we really want knowledge.
- The more interviews that an expert had done with the press, Tetlock found, the ‘’worse’’ his predictions tended to be.
- Our brains, wired to detect patterns, are always looking for a signal, when instead we should appreciate how noisy the data is.
- Wherever there is human judgement there is the potential for bias. The way to become more objective is to recognize the influence that our assumptions play in our forecasts and to question ourselves about them.
- Good innovators typically think very big ‘’and’’ they think very small.
- - a visual inspection of a graphic showing the interaction between two variables is often a quicker and more reliable way to detect outliers in your data than a statistical test.
- One of the most important tests of a forecast - - is called calibration.
- When catastrophe strikes, we look for a signal in the noise.
- If you’re speaking with a seismologist:
- A ‘’’prediction’’’ is a definitive and specific statement about when and where an earthquake will strike ..
- Whereas a ‘’’forecast’’’ is a probabilistic statement, usually over a longer time scale ..
- The USGS’s official position is that earthquakes cannot be predicted. They can, however, be ‘’forecasted’’.
- What happens in systems with noisy data and underdeveloped theory - - is a two-step process. First, people start to mistake the noise for a signal. Second, this noise pollutes journals, blogs, and news accounts with false alarms, undermining good science and setting back our ability to understand how the system really works.
- In statistics, the name given to the act of mistaking noise for a signal is ‘’overfitting’’.
- Overfitting represents a double whammy: it makes our model look ‘’better’’ on paper but perform ‘’worse’’ in the real world. .. This may make it easier to get the model published in an academic journal or to sell to a client, crowding out more honest models from the marketplace. But if the model is fitting noise, it has the potential to hurt the science.
- As Hatzius sees it, economic forecasters face three fundamental challenges. First, it is very hard to determine cause and effect from economic statistics alone. Second, the economy is always changing, so explanations of economic behavior that hold in one business cycle may not apply to future ones. And third, as bad as their forecasts have been, the data that economists have to work with isn’t much good either.
- Forecasts of everything from hurricane trajectories to daytime high temperatures have gotten much better than they were even ten or twenty years ago, thanks to a ‘’’combination of improved computer power, better data-collection methods, and old-fashioned hard work’’’.
- .. improved technology did not cover for the lack of theoretical understanding about the economy; it only gave economists faster and more elaborate ways to mistake noise for a signal.
- .. real management is mostly about managing coalitions, maintaining support for a project so it doesn’t evaporate.
- George E.P. Box: “All models are wrong, but some models are useful.”
- The argument made by Bayes and Price is not that the world is intrinsically probabilistic or uncertain. .. It is, rather, a statement - expressed both mathematically and philosophically - about how we learn about the universe: that we learn about it through approximation, getting closer and closer to the truth as we gather more evidence.
- Admitting to our own imperfection is a necessary step on the way to redemption.
- Ioannidis: “I’m not saying that we haven’t made any progress. Taking into account that there are a couple of million papers, it would be a shame if there wasn’t. But there are obviously not a couple of million discoveries. Most are not really contributing much to generating knowledge.”
- Most of the data is just noise, as most of the universe is filled with empty space.
- .. the negative findings are probably kept in a file drawer rather than being published (about 90 percent of the papers published in academic journals today document positive findings rather than negative ones.) However, that does not mask the problem of false positives in the findings that do make it to publication.
- If you’re using a biased instrument, it doesn’t matter how many measurements you take – you’re aiming at the wrong target.
- .. the frequentist approach toward statistics (“Fisherian”) seeks to wash its hands of the reason that predictions most often go wrong: human error. It views uncertainty as something intrinsic to the experiment rather than something intrinsic to our ability to understand the real world. .. These methods discourage the researcher from considering the underlying context or plausibility of his hypothesis.
- There is an unhealthy obsession with the term consensus as it is applied to global warming. .. In formal usage, consensus is not synonymous with unanimity - nor with having achieved a simple majority. Instead, consensus connotes broad agreement after a process of deliberation, during which time most members of a group coalesce around a particular idea or alternative. .. A consensus-driven process, in fact, often represents an alternative to voting. .. But this introduces the possibility of groupthink and herding. Some members of a group may be more influential because of their charisma or status and not necessarily because they have the better idea. Empirical studies of consensus-driven predictions have found mixed results, in contrast to a process where individual members of a group submit independent forecasts and those are averaged or aggregated together, which can almost always be counted on to improve predictive accuracy.
- And there is a whole nomenclature that the IPCC authors have developed to convey how much agreement or certainty there is about a finding. For instance, the phrase “likely” taken alone is meant to imply at least a 66 percent chance of a prediction occurring when it appears in an IPCC report, while the phrase “virtually certain” implies 99 percent confidence or more. .. Although climatologists might think carefully about uncertainty, ‘’there is uncertainty about how much uncertainty there is’’.
- Uncertainty in forecasts is not necessarily a reason not to act – the Yale economist William Nordhaus has argued instead that it is precisely the uncertainty in climate forecasts that compels action, since the high-warming scenarios could be quite bad. Meanwhile, our government spends hundreds of billions toward economic stimulus programs, or initiates wars in the Middle East, under the pretense of what are probably far more speculative forecasts than are pertinent in climate science.
- .. there is another reason to quantify the uncertainty carefully and explicitly. It is essential to scientific progress, especially under Bayes’s theorem.
- In science, one rarely sees ‘’all’’ the data point toward one precise conclusion. Real data is noisy – even if the theory is perfect, the strength of the signal will vary. And under Bayes’s theorem, no theory is perfect. Rather, it is a work in progress, always subject to further refinement and testing. This is what scientific skepticism is all about.
- In politics, a domain in which the truth enjoys no privileged status, it’s anybody’s guess. The dysfunctional state of the American political system is the best reason to be pessimistic about our country’s future. Our scientific and technological prowess is the best reason to be optimistic. We are an inventive people.
- Whatever range of abilities we have acquired, there will always be tasks sitting right at the edge of them. If we judge ourselves by what is hardest for us, we may take for granted those things that we do easily and routinely.
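
The quotes above name calibration as a test of a forecast without spelling out how it is checked. As a reader's illustration, not taken from the book, the usual check can be sketched as follows: group the forecasts by the probability that was stated, then compare each stated probability with how often the event actually happened. The function name `calibration_table`, the rounding to the nearest 10 percent, and the sample data are all assumptions made for this sketch.

```python
from collections import defaultdict


def calibration_table(forecasts, outcomes):
    """Compare stated probabilities with observed frequencies.

    forecasts: stated probabilities between 0.0 and 1.0
    outcomes:  booleans, True if the forecast event happened
    Returns {stated probability: (number of forecasts, observed frequency)}.
    """
    buckets = defaultdict(list)
    for p, happened in zip(forecasts, outcomes):
        # Assumption: forecasts are grouped to the nearest 10 percent.
        buckets[round(p, 1)].append(1 if happened else 0)
    return {
        p: (len(hits), sum(hits) / len(hits))
        for p, hits in sorted(buckets.items())
    }


# Hypothetical data: ten "20% chance" forecasts and ten "70% chance" forecasts.
forecasts = [0.2] * 10 + [0.7] * 10
outcomes = [True] * 2 + [False] * 8 + [True] * 7 + [False] * 3

table = calibration_table(forecasts, outcomes)
for stated, (n, observed) in table.items():
    print(f"stated {stated:.0%}  observed {observed:.0%}  (n={n})")
```

In this made-up sample the forecaster is well calibrated: events given a 20 percent chance happened 2 times in 10, and events given a 70 percent chance happened 7 times in 10. A large gap between the stated and observed columns would indicate poor calibration, although with only a few forecasts per group such a gap could itself be noise.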
A brief reading list