![](https://images.hive.blog/768x0/https://img.inleo.io/DQmXWvP3NRTMupKWhUqt5qiq9C4dqyoSqrhmxDCEKGXgCzn/image.png)
Data Science for Business
Today I'll review the book "Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking" by Foster Provost and Tom Fawcett.
Data Science for Business was one of the best books I read last year - maybe ever - and now, considering the context we are living in with the rise of AI, is a great time to talk about it.
Is this book for you?
If you are a manager in charge of deciding whether your company, division or sector should invest in acquiring data, and in the resources needed to turn that data into insights, then this book is definitely for you.
If you are an enthusiast who would like to dig a little deeper into what's "under the hood" of AI but you don't have a deep technical background, you can still benefit from reading this book.
However, if you are a data scientist or someone involved with AI in a technical capacity, this book might be a bit too shallow for you, so you might want to check out a summary online before deciding whether to read it.
My review
I was slightly misled by the title of the book. I was expecting something a lot less technical. For a book that's supposedly targeting business managers rather than technical specialists, it is quite heavy on maths and statistics. In the end that was a good thing, as it gave me a good understanding of some of the most important building blocks of data mining.
However, if math is not your strong suit, don't worry too much. The authors themselves mention in the book that while they encourage readers to try and understand the mathematics behind the concepts, it's not a requirement to grasp the central ideas.
The book was originally published in 2013, way before the rise of AI as we know it, but it remains very current and accurate because, even though the technology itself has evolved immensely, the foundations of Artificial Intelligence remain largely the same.
Without further ado, let's get to it.
Data is the new oil. You have probably heard this before and, with the rise of AI, it has never been more true.
AI needs data. A lot of it. And good data too. One recent study on medical language models reported that poisoning as little as 0.001 percent of the training data was enough to make a model noticeably more likely to produce harmful output.
Many other factors affect the performance of an AI model, such as the algorithm and the validation methods used. And while a model trained on good data is not guaranteed to produce good results, one trained on bad data is doomed to fail.
The authors start by highlighting the importance of data in today's business context and how companies that are serious about advancing data-driven decision-making must treat data as an asset and, therefore, the acquisition of data as an investment.
They proceed to introduce the basics of data mining and analysis and, right off the bat, they answer one of the main questions I had regarding predictive data mining.
I have always wondered how data scientists picked how many and which attributes of a dataset they should use to build their models, and the answer is quite interesting.
There is some trial and error involved, but it's not just guessing. There are two important concepts that guide this exploration: entropy and information gain. I will not expand on these at this time, but they are some of the building blocks of predictive models.
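For readers who want a quick taste anyway, here is a minimal sketch of the idea in Python. It is my own illustration, not code from the book (which contains none): the `customers` data is invented, and the helper names `entropy` and `information_gain` are just labels I picked. Entropy measures how mixed a set of outcomes is, and information gain measures how much splitting on an attribute reduces that mix.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in entropy of `target` after splitting `rows` on `attribute`."""
    parent = entropy([row[target] for row in rows])
    # Group the target values by each distinct value of the attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    # Entropy of each group, weighted by the share of rows it holds.
    weighted_children = sum(
        len(group) / len(rows) * entropy(group) for group in groups.values()
    )
    return parent - weighted_children


# Invented toy data, for illustration only.
customers = [
    {"contract": "monthly", "uses_app": "yes", "churned": "yes"},
    {"contract": "monthly", "uses_app": "yes", "churned": "yes"},
    {"contract": "monthly", "uses_app": "no", "churned": "yes"},
    {"contract": "monthly", "uses_app": "no", "churned": "yes"},
    {"contract": "yearly", "uses_app": "yes", "churned": "no"},
    {"contract": "yearly", "uses_app": "yes", "churned": "no"},
    {"contract": "yearly", "uses_app": "no", "churned": "no"},
    {"contract": "yearly", "uses_app": "no", "churned": "no"},
]

for attribute in ("contract", "uses_app"):
    gain = information_gain(customers, attribute, "churned")
    print(f"{attribute}: {gain:.2f} bits")
```

In this made-up dataset, `contract` separates churners from non-churners perfectly, so it gets the full 1 bit of information gain, while `uses_app` tells us nothing and gets 0. Real data is never this clean, but the ranking logic is the same: attributes with higher gain are better candidates for the model.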
Speaking of predictive models, that's the focus of the book. The authors briefly discuss other applications of data mining as well but, since this is a business book at heart, it's only natural that it focuses on the use case that is most aligned with business needs.
There are entire sections dedicated to the different types of algorithms that can be used to build predictive models and, while the authors do not go over actual code, they provide a very detailed analysis of how the algorithms are built and how they work from a statistical point of view.
Some other key concepts they teach are how to go about training a model, how to evaluate the performance of a model, what overfitting is and how to avoid it, and much more.
Evaluating the performance of models was one of my favourite sections because it was completely different from what I imagined it would be. I used to think that you could evaluate a model simply by using test data and measuring how many predictions the model got right compared to the size of the dataset, but I learned that such a simplistic measure of accuracy will not work in most real scenarios.
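To see why, here is a small sketch of my own (again, not an example taken from the book) with invented numbers: 1,000 customers, of whom only 10 actually churn, and a "lazy" model that predicts that nobody ever churns. The function names are hypothetical helpers, not a library API.

```python
def accuracy(actual, predicted):
    """Share of predictions that match the actual outcome."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def recall(actual, predicted, positive):
    """Share of the actual `positive` cases that the model caught."""
    caught = sum(a == positive and p == positive for a, p in zip(actual, predicted))
    total_positive = sum(a == positive for a in actual)
    return caught / total_positive


# Invented, heavily imbalanced data: 10 churners out of 1,000 customers.
actual = ["churn"] * 10 + ["stay"] * 990

# A "lazy" model that always predicts the majority class.
lazy_predictions = ["stay"] * len(actual)

print(f"accuracy: {accuracy(actual, lazy_predictions):.1%}")         # 99.0%
print(f"recall:   {recall(actual, lazy_predictions, 'churn'):.1%}")  # 0.0%
```

The lazy model scores 99% on the naive measure while catching none of the customers the business actually cares about, which is roughly the trap I had been falling into.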
Of course, the book also teaches how to properly structure a business problem, how to start looking at data to address it and how to develop a data-driven strategic mindset.
One thing I should say is that it took me longer than usual to finish this one because I often found myself going back to parts I had already read to refresh some concepts or to try and apply them practically during my work or one of my many side hustles. I feel like this is not a book you read once and then shelve, and I will most likely be coming back to it in the near future.
Final thoughts
Data Science for Business is one of the best books I read last year and easily in my all-time top 5. It's a very dense book and not easy to read, so take your time and be prepared to go over the same page a few times before you are confident enough to proceed.
Don't feel intimidated if you are interested in data science, AI, or related areas. Even if you don't have a strong math or statistics background, you can still learn a lot from this book.
I always liked math. It was my 2nd favorite class in school, after band. The book looks interesting. 😉😎🤙
It changed my perspective on data!
Enjoyed the review, I’m looking for something to read on data mining so may give this a look, thanks !BBH
I'm glad you liked it! Thank you!
This book won't go as deep as showing how to set up or code a data mining rig, but it's a great way to understand it conceptually.
Seems like a strong book with valuable info
It is definitely very informative!