The Art of Statistics: Learning from Data
by David Spiegelhalter
"The Art of Statistics is a really accessible and comprehensive introduction to statistics. It’s not only about the tools of statistics; it also goes through lots of interesting and relevant areas where statistics can help us make important decisions. For example, Spiegelhalter looks at whether Harold Shipman (a British doctor who murdered more than 200 of his patients) could have been stopped sooner. He also tries to pique people’s interest by looking at, say, the results of the national sex survey, or the variations in the numbers of deaths at different hospitals. He looks at how to predict who survived the sinking of the Titanic, given incomplete data. He makes statistics accessible by providing real-world examples. I found that really engaging, and I learned a lot about statistics at the same time.
I absolutely do. We’re living in the age of big data. There’s more and more data being collected. It’s not like situations in the past, where collecting data was very difficult to do and you had to make the best you could with the limited data that you had. Nowadays, we’re almost overwhelmed with data, and as he discusses in the book, it’s important to be able to pick out the signal from all the noise. What are just fluctuations around the true signal? How do we assess what the real signal is? That’s a really, really important problem. The other thing that he emphasizes is that data doesn’t speak for itself. We need to speak for the data. We need to interpret the data, and it’s the way that we interpret it that tells the story.
To some extent. Spiegelhalter recognizes that part of the limitation of statistics is that the choice of which tests to apply to a given dataset is subjective. He also emphasizes that when you’re trying to prove something, you set out a null hypothesis. Then you check to see whether your data differs from the null hypothesis. But you can never confirm the null hypothesis; you can never say the null hypothesis is true. You can only say, ‘My data is consistent with this null hypothesis’ or ‘This data doesn’t support rejecting it. This is more evidence towards its truth.’ And that’s the way science works in general. Science has theories that have not been proved to be false. It’s not that they ever get proved to be true; they just accrue more and more evidence, more and more weight attached to them. That’s quite different from math. Mathematics has theorems that are proved to be true based on fundamental axioms. You work all the way up from the bottom using deductive reasoning, while science uses induction. It’s important to note that statistics, although it’s about mathematics, is actually more of a science. He’s very honest about the potential limitations of statistics. If you don’t have enough data, then you can’t draw a reliable conclusion. And that’s something important to acknowledge more generally as a scientist: that science and math don’t always have all the answers.
Yes, it gets deep in some places. What I liked about the way he wrote The Art of Statistics was that he signposted when it was getting deep. He says, ‘This is going to be the hardest chapter that you have to read. If you can get through this, then you’ll be fine.’ Then, at the end of the chapter, he says, ‘Even if you didn’t get this, it’s completely fine. Lots of students of statistics don’t understand it.’ It’s as if he’s writing for a student audience, and I can imagine him giving the book to his undergraduates and saying, ‘Read this first and then we’ll start talking.’ I can also see places where he’s clearly taken examples from his teaching. At one point, he’s talking about whether people fold their arms left over right or right over left. He’s actually taken the data from a group of his students who did this as an exercise. I have to say, if David Spiegelhalter were my statistics teacher, I maybe would have done more statistics in my life.
I’d be absolutely delighted. With a lot of the ideas he introduces, he does talk about the different people who have been involved, like Karl Pearson, Ronald Fisher, Thomas Bayes. He mentions them not only in terms of their scientific contribution, but also in terms of their personality. Spiegelhalter basically says that Fisher was a fantastic scientist, but morally dubious because he believed in eugenics, had ties to the tobacco industry, and tried to deny that lung cancer was linked to smoking. What’s interesting is that when Spiegelhalter talks about these historical figures, and why they came up with these statistical tests or statistical measures while giving context about who they were personally, he divorces the two things. It’s quite a nice idea that you can be revered as a good scientist at the same time as being condemned for being a horrible human being. Francis Galton is another one. He was an absolutely fascinating mathematical character and had so many brilliant ideas, but also founded the concept of eugenics."
The Best Math Books of 2019 · fivebooks.com