# Reply to discussion 2 (250 words)

Reply to the post below in a format of 250 words.

Summary

The chapters assigned to us this week dig into how predictive analytics can help companies by capturing and analyzing past data to anticipate future behavior. Chapter 9 centers on three points: the data, the statistics, and the assumptions. On data, the textbook, “HBR Guide to Data Analytics Basics for Managers,” explains that a lack of good data is usually the first obstacle when a company tries to implement a predictive analytics approach. Data matters because a prediction can only be built from data that already exists. This reminded me of the principle that matter is neither created nor destroyed; it only changes form. I feel the same idea applies to data: if there is no data for the analysis to build on, it can’t predict future data. The second point is statistics, since predictive analytics is nothing without them. According to the textbook, regression analysis is the primary tool analysts use for predictive analytics. The analyst hypothesizes a correlation between a handful of independent variables, such as gender, age, and wages, and the purchase of a product or service in a sample of customers. The regression then tests whether those variables are in fact correlated with the purchase. If a relationship is found, the analyst can use it to create a score that predicts the likelihood of a future purchase. Lastly, Chapter 9 covers assumptions. Every model rests on assumptions, and each one has to be tested to see whether it holds.

Chapter 10, by contrast, focuses on regression analysis itself. It describes regression analysis as a way of mathematically sorting out which variables have an impact. It prompts us to ask which factors matter most, which matter least, how they relate to one another, and so on. The dependent variable is the factor you are trying to predict, while the independent variables are the factors you think have an impact on the dependent variable. The chapter goes on to explain how regression analysis works, how to read the resulting graph, how companies use the analysis to their benefit, and what correlation is and how it can mislead people into thinking one factor causes another. Regression analysis is not easy, which is why the textbook includes a section on the mistakes people tend to make when working with it.
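To make the dependent and independent variable idea concrete, here is a minimal Python sketch of a one-variable regression. The ages and purchase counts are made-up numbers for illustration, not data from the textbook, and `fit_line` and `predict` are hypothetical helper names. A real purchase-likelihood score would probably use more variables and a different model, such as logistic regression; this only shows the simplest case.

```python
def fit_line(xs, ys):
    """Least-squares fit with one independent variable.

    Returns (slope, intercept) for the line y = intercept + slope * x.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of x and y, and variation of x, around their means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predicted value of the dependent variable for a new x."""
    return intercept + slope * x


# Hypothetical sample: customer age (independent variable) and
# purchases per month (dependent variable).
ages = [22, 28, 35, 41, 50, 58]
purchases = [2, 3, 3, 5, 6, 7]

slope, intercept = fit_line(ages, purchases)
print(f"Each extra year of age changes predicted purchases by {slope:.3f}")
print(f"Predicted purchases for a 45-year-old: {predict(slope, intercept, 45):.2f}")
```

The slope is the part a manager would read: it says how much the dependent variable is expected to move when the independent variable changes by one unit, within the range of the sample.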

Humans are always changing and evolving. Even though humans do establish strong patterns of behavior over time, behaviors can still change. The problem with change is that models that were used to predict a behavior 10 years ago may no longer be valid.
Just like the textbook says, “The greater the elapsed time, the more likely it is that customer behavior has changed”.

Another critical issue raised in these chapters is that if an analyst leaves an important variable out of the model, the model is no longer valid.
This shows why companies should invest in really good analysts. However, that does not change the fact that humans are not perfect and can still make mistakes.
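A small sketch can show why a missing variable matters. The numbers below are constructed by hand, not taken from the textbook: sales are built so that each unit of ad spend adds exactly 1.0 and each unit of foot traffic adds exactly 2.0, and foot traffic tends to rise with ad spend. The variable names and the `slope` helper are hypothetical.

```python
def slope(xs, ys):
    """Least-squares slope of ys on xs (one independent variable)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical data, constructed so the true effects are known.
ad_spend = [1, 2, 3, 4, 5, 6]
wiggle = [1, -1, 0, 0, -1, 1]  # variation in traffic unrelated to ad spend
foot_traffic = [0.5 * a + w for a, w in zip(ad_spend, wiggle)]

TRUE_AD_EFFECT = 1.0
TRUE_TRAFFIC_EFFECT = 2.0
sales = [TRUE_AD_EFFECT * a + TRUE_TRAFFIC_EFFECT * t
         for a, t in zip(ad_spend, foot_traffic)]

# The analyst forgets foot traffic and regresses sales on ad spend alone.
naive_slope = slope(ad_spend, sales)
print(f"True effect of ad spend:          {TRUE_AD_EFFECT}")
print(f"Estimated effect, traffic omitted: {naive_slope}")
```

With foot traffic left out, the model credits ad spend with the part of sales that foot traffic actually drove, so the estimated effect comes out larger than the true one used to build the data.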

It’s easy to make a mistake when you run a regression analysis and find a correlation, but we have to remember that correlation does not always mean causation. You can’t jump to conclusions just because you see a correlation in an analysis. When this happens, you have to go see for yourself and assess the situation.
The question you have to answer is, “What’s the physical mechanism that’s causing the relationship?” You can do this by observing from far away, or even interacting with the customer and asking them questions.

According to the textbook, the biggest assumption in predictive analytics is that the future will continue to be like the past. In the book “The Power of Habit,” Charles Duhigg makes the point that people establish habits that they keep over time.
However, there are instances in which they change those behaviors, which means the models once used to predict them are no longer usable. For example, I am highly addicted to sugar. I buy a lot of Coca-Cola every week, so consuming Coca-Cola products is a habit of mine. However, if I suddenly decided to quit sugar, I would stop purchasing Coca-Cola products completely. Humans change over time, so you can’t expect people to keep the same habits forever. Still, while habits do change as time passes, it is reasonable to assume that people keep the same habit over a shorter span, such as a month to a year.

I think it is good practice to always have multiple analysts check a predictive model’s assumptions and the variables that go into it. If a key variable is missing, the whole model is useless, and wrong assumptions can cause a lot of damage.
The textbook also offers many ways in which you can help your analysts not make those mistakes. One of those ways is to ask them what the key assumptions are and what would have to happen for them to no longer be valid.

Lastly, one of the lessons learned in the last chapter is that “correlation is not causation”. By this, the textbook means that just because two things correlate, it does not mean that one was caused by the other.
For example, the claim that cardio will help me lose body fat could be true. However, there are more factors that go into it. If I don’t lower my caloric intake, I will continue to have belly fat despite doing cardio.
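Here is a minimal Python sketch of how two things can correlate strongly without one causing the other. The example and its numbers are made up for illustration and are not from the textbook: ice cream sales and sunburn cases both rise with temperature, so they move together even though neither causes the other. `pearson` is a hypothetical helper name for the standard Pearson correlation coefficient.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical daily observations. Temperature drives both other series.
temperature = [20, 22, 25, 27, 30, 33, 35]
ice_cream_sales = [61, 65, 77, 79, 91, 98, 105]
sunburn_cases = [5, 8, 9, 13, 15, 17, 20]

r = pearson(ice_cream_sales, sunburn_cases)
print(f"Correlation between ice cream sales and sunburns: {r:.2f}")
print(f"Correlation of each with temperature: "
      f"{pearson(temperature, ice_cream_sales):.2f}, "
      f"{pearson(temperature, sunburn_cases):.2f}")
```

The correlation between the two series is high, but the mechanism runs through the third variable. That is the kind of thing you only find by asking what is physically causing the relationship.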

The textbook mentions that a good amount of helpful data can improve the predictive analytics and work in the favor of companies. Having a general idea of the data that you want to predict is not enough, which is why the company needs to gather a lot of data.
Just like the textbook explained, if you are trying to find out what customers will buy in the future, you will have to gather information about what the customers are buying at the moment. Not only that, but you will have to look into information such as age, gender, and location of every purchase made.

When you find a correlation in a regression analysis, don’t automatically assume that one factor is caused by the other. The textbook states that making assumptions without trying to find out more is a big mistake, and a lazy one.
The best way to go about this is to go out into the real world, observe, and assess. You can even talk to the customers who are buying the product. The point is to find the reason for their purchase and gather more data.

When speaking to your analyst, do not just go haywire and ask them what is affecting your business’s sales. A question that broad usually means the manager doesn’t know where to focus, and the analyst will end up with too many variables to dig into.
When doing the analysis, you should also consider whether you have any control over the independent variables. If you don’t, then find ways around it and be creative. If that variable will continue to be there regardless of what you do, what will you do to increase your sales?

In one of our earlier discussions we emphasized that companies should put real money into hiring really good analysts. Chapter 9 focuses heavily on predictive analytics, and I feel that without investing in a good analyst you won’t be able to predict well, and therefore your company won’t have a competitive advantage.

Chapter 10, however, focuses more on the importance of regression analysis and on how gathering data correctly is significant to the final model. This is substantially similar to what we have learned in past discussions, in which we covered how data, and big data, can help our businesses.

The amount of data available has become enormous, and companies have a plethora of ways to use it to their advantage. In these chapters we learned about predictive analytics and how it is used for the benefit of corporations.

Reference:

HBR guide to data analytics basics for managers. (2018). Boston, MA: Harvard Business Review Press.