Vicki Boykis's Reading List
Vicki Boykis graduated from Penn State University in Economics, before deciding to become a data scientist. She also holds an MBA from Temple University. She is now a consultant in data science for CapTech, helping companies across a broad range of industries by building and analyzing data using Python and machine learning.
Learning Python and Data Science (2019)
Scraped from fivebooks.com (2019-07-15).
Source: fivebooks.com
Zed A. Shaw
"Zed Shaw has written a lot of Learn X the Hard Way books. Initially he put them on his website for free, which is how a lot of people found out about them. He’s experimented with many ways to monetize his work, and I believe the only way to get the books now is to pay for them, but it’s completely worth it! I chose this book because a lot of other Python books talk about theory, and Learn Python the Hard Way is first and foremost about building things. His approach is to say: ‘Coding is going to suck at first; you’re not going to understand anything. But just do this stuff that I tell you to do, and eventually it’s all going to make sense.’ He goes through all of the building blocks that you need to master Python. It’s been updated for Python 3, which is very important as well. It’s very practical and down to earth, with about 50 to 60 exercises, and it’s written in a way that doesn’t feel overwhelming and that really allows you to go through all of them."
Peter Seibel
"This is more of a ‘cultural’ book about programming, in which you won’t get a lot of specific, technical advice on how to program in Python, C or Java. What you’ll get is an introduction to the industry by the people who founded it; people like Brendan Eich, who wrote JavaScript, or Joshua Bloch, who was one of the main contributors to Java. It’s about how those people got into programming and how they think about it. It’s a very conversational book that really helps you to learn the culture of this industry you’re coming into, and some of its terminology. And it’s a much lighter read than Learn Python the Hard Way, of course. It’s extremely important. You’ll be working with people who have been writing code since they were 10 years old. They’re very much immersed in this world, and as a beginner, it can be intimidating to ask certain things, because it gives away your ‘status’ and the fact that you don’t know as much. To balance that, this book will, for example, give you a good overview of the history of computer development, what kind of programming languages there are and how they’ve come about: all these things that you wouldn’t necessarily get from online courses and technical books. In spite of my answer to the last question, I still think it’s one of the most welcoming industries that you can get into. I worked hard but I do consider that I had some luck by stumbling into data just as it was getting big, around 2011-2012. None of my friends, be they nurses, doctors, lawyers, actuaries, could have found their job without a lot of advanced studies and official certifications. For all the problems that the computer science industry has, it’s probably one of the most egalitarian, at least on the particular issue of qualifications."
Darrell Huff
"Indeed it was written in the 50s; Darrell Huff was not a statistician himself. He was a journalist, who tried to introduce how you can lie with statistics to people who may not be aware of this problem. It covers things like the typical ‘correlation doesn’t imply causation’, but also how random sampling works, how pie charts and bar charts can be misleading, and so on. It’s a must-read for anyone who works in business or in this industry in general. You can very easily lie with data. Oftentimes, companies will say: ‘this is data, therefore it’s the truth’. But data can be manipulated and changed in so many ways. This book will teach you how to think honestly about the data that you are analyzing and presenting to different people, and how to be a more critical consumer of data as well, something that is essential in today’s world. If you’re doing something very advanced like artificial intelligence, or if you want to work at Google Brain, for example, you probably do need to take formal courses. But for most data science projects, you really don’t. An interesting thing that’s happening is that the models that data scientists have put together are starting to become commoditized. Software products by Amazon, Google or Microsoft let you construct models and train them in the cloud with tremendous computing power, without worrying too much about the mathematical implementation behind it. What matters more is knowing which algorithm to pick, how to tune the parameters of the model, and how to interpret the results in the right way. It’s become easier to create models, but it’s really the interpretation of those models that matters now."
David A. Patterson & John L. Hennessy
"This is a textbook that covers how computers work from the ground up. It includes hardware, software, and operating systems. It’s a really thick book, but also a really good one! What I’ve found during my move to data science is that the more I work in this field, the more I need to understand how to optimize my code, how the instructions move through the various big data systems that I mentioned earlier, and how all of this can impact performance. It’s really important for me, for example, to understand how data moves through a network and can be slowed down by bandwidth limitations. This is the book I’d recommend reading if you missed out on a formal computer science education. Some data scientists tend to forget about these aspects because the structures they use, such as R data tables or Python data frames, are abstracting away what’s going on under the hood. You don’t need to know everything from end to end, but having a general idea of why things can be under-optimized will be very helpful. Something that’s really motivated me in my learning was to have concrete projects to work on. The first time I really witnessed the power of computer programming was when I was doing some data cleaning at work, and made the effort of automating this cleaning; by the end I was blown away by how fast the process had become. This idea of making the computer work for me was very powerful. It’s only later that you start taking an interest in understanding the inner workings of computer hardware, once you stumble upon a few problems in your programming. So yes, it does make sense to read this book a bit later on, once you’ve mastered the basics of whatever language you are interested in. It’s easy to get overwhelmed when you go down this path. People always say to start with projects, but it can be hard at the beginning to even think of what to do.
Finding a mentor on specialized websites or at data science meetups can be a great way of solving this issue, by having an experienced person telling you where to start. For people who are already working at a job that revolves around data, I’d recommend focusing on your personal ‘pain points’ and trying to solve them with programming; for me, this was automating things I was doing with Excel spreadsheets. There’s a very good book dedicated to these kinds of very actionable tasks, called Automate the Boring Stuff with Python; this would be another great book to read for those moving from typical office applications to programming."