A Study on the Professional Golfers’ Association (PGA): Regression Analysis of the FedEx Cup Using Performance Statistics

in Deep Dives, 5 months ago

Introduction
Over the years, performance statistics have become more prevalent in the game of golf. With advancements in technology, the PGA Tour has opened up to a statistical approach to analyzing the game and individual progress. The PGA Tour is the world’s premier membership organization for touring professional golfers, co-sanctioning more than 130 tournaments (PGA Tour, 2020). Established in 2007, the FedEx Cup is a season-long points competition offering unprecedented bonus money and culminating in the FedEx Cup Playoffs, a series of three events that determines the FedEx Cup champion (PGA Tour, 2020). Statistics are used not only to determine the FedEx Cup standings but also to make predictions and set betting odds. The top 125 players in the FedEx Cup standings are eligible for the playoffs, which feature progressive cuts over three tournaments (125, 70, 30). Once the cuts are made, the FedEx Cup Playoffs finale, the Tour Championship, is played. The Tour Championship features starting strokes: a staggered, stroke-based system that reflects the FedEx Cup standings, recognizing players for their season performance as well as their play in the first two FedEx Cup Playoff events (PGA Tour, 2020). The total bonus pool is $60 million, with $15 million going to the FedEx Cup champion. In addition to winning arguably the most important tournament of the year (the Tour Championship), the winner also receives a three-year exemption for all events on the PGA Tour. FedEx Cup points, standings and other statistics are also closely analyzed when deciding whether to renew or grant a PGA Tour membership.

This study examines the impact of different golf statistics on FedEx Cup points and standings. The objective of this paper is to determine which explanatory variables are correlated with FedEx Cup points and standings. I take a deeper dive into the statistics of golf to make sense of specific correlations between an individual player’s performance and their overall standing in the FedEx Cup. The general theory in golf is that performing well usually results in cuts made, and making cuts gives a player a better shot at winning tournaments. I will explain the points-based system as well as the literature that you should be aware of when analyzing the data I am about to present. The empirical methodology will help identify which variables are correlated with results and how significant they are, and I will also point out variables that may introduce bias or are less important.

Literature Review
The FedEx Cup regular season runs over the course of the year and features 47 official events. Only Tour members are eligible to earn FedEx Cup points. Points are earned based on how well a player performs in each of these tournaments, with an emphasis on both tournament wins and high finishes, more specifically top-10 finishes. Official events each award 500 points to the winner; however, in a few tournaments the stakes are higher and more points are awarded. These tournaments are The Players Championship, the Masters Tournament, the PGA Championship, the U.S. Open Championship and The Open Championship. Because these tournaments are harder to win, their winners are awarded an extra 100 points, for a total of 600 points towards the FedEx Cup standings. Each of the four World Golf Championship events awards 550 points to the winner, and additional events award only 300 FedEx Cup points to the winner. Now that you have some background on how points are awarded, I think it is necessary to include some graphs to provide a visual.
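
The winner’s point values described above can also be written as a small lookup table. This is only a sketch: the category labels and the function name are hypothetical, chosen for illustration, and only first-place points are covered because those are the only values given here.

```python
# Winner's FedEx Cup points by event type, taken from the paragraph above.
# The category labels are hypothetical names used only in this sketch.
WINNER_POINTS = {
    "standard": 500,    # regular official events
    "major": 600,       # The Players, Masters, PGA Championship, U.S. Open, The Open
    "wgc": 550,         # World Golf Championship events
    "additional": 300,  # additional events
}


def points_from_wins(wins):
    """Sum the winner's points for a list of event-type labels."""
    return sum(WINNER_POINTS[event_type] for event_type in wins)


# A made-up season with three wins: 500 + 600 + 550.
season_wins = ["standard", "major", "wgc"]
print(points_from_wins(season_wins))  # 1650
```

Points for finishes below first place would need the full distribution table, which is not reproduced in this post.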

As mentioned earlier, there is a stroke-based system that recognizes players for their respective standings. The following graph illustrates this. As you can see, the leader starts the final playoff tournament at -10, which means that the first seed begins the tournament with a 10-stroke lead over the players ranked 26th-30th in the FedEx Cup standings. This is a pretty substantial lead and has a huge impact on the tournament, especially when you look at the drop-off in the money distribution. The next graph illustrates how the FedEx Cup money is distributed among the field.

Data
Throughout this paper I discuss a number of statistics that serve as the explanatory variables (X, independent) when relating the data to my response variable (Y, dependent), FedEx Cup points. It is important to understand what each statistic is and what it means, because later I will explain how each variable bears on my hypothesis and how the variables relate to one another. This study focuses on the following key explanatory variables: Earnings, AVG PUTT22, TOT PUTT, PUTT%10-15, PUTT%20-25, 3PUTT>25, DRV AVG, SCRBLE FRNG, SCRBLE%10-20 and SCRBLE%20-30. All the variables used in this study come from the 2019 season only. Driving distance is measured in yards and putting distance in feet. Unlike in other sports, a golfer’s earnings depend solely on their ability to make cuts. For this study, Earnings is each player’s total earnings for the 2019 season. AVG PUTT22 refers to the average putting distance that resulted in a two-putt. TOT PUTT refers to the total number of putts each golfer recorded in the 2019 season. PUTT%10-15 refers to the percentage of putts a player holes from 10-15 feet, and PUTT%20-25 is the same percentage from 20-25 feet. 3PUTT>25 refers to the number of times a player needed three putts to hole out from more than 25 feet. DRV AVG is the average driving distance, in yards, over the course of the season. SCRBLE FRNG refers to a player’s ability to scramble for par or better after missing a green in regulation. SCRBLE%10-20 refers to the percentage of the time a golfer saves par from a greenside shot 10-20 yards from the hole, and SCRBLE%20-30 is the same percentage from 20-30 yards.

The data I use are from PGA.com, which is a credible source. These statistics are solely from the 2019 season and cover all the tournaments each player participated in. I started with a population of 300 players and then took a random sample of 160 players, because some players’ data were incomplete. To be clear, I did not cut the bottom half or the top half of the population; the 160 players were sampled at random. I would have liked to examine the full population, but a sample covering roughly half of the population is still unusually large. Because the sample is random, the sample mean is an unbiased estimator of the population mean. If I were to use only the top 20 golfers in the world, the sample mean would be very different from the population mean; it would be heavily biased and a poor estimator of the population.
ADD Charts/ Graphs
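
The sampling argument above can be checked with a small simulation. This is a sketch under stated assumptions: the population is a made-up list of 300 point totals (the integers 1 to 300), not real PGA Tour data, and the seed is arbitrary.

```python
import random
from statistics import mean

# Hypothetical stand-in for the population: 300 players with made-up
# season point totals 1..300 (not real PGA Tour data).
population = list(range(1, 301))

rng = random.Random(2019)                        # fixed seed so the sketch is repeatable
random_sample = rng.sample(population, 160)      # simple random sample, no replacement
top_20 = sorted(population, reverse=True)[:20]   # only the 20 highest totals

pop_mean = mean(population)
random_error = abs(mean(random_sample) - pop_mean)
top_20_error = abs(mean(top_20) - pop_mean)

print(f"population mean:     {pop_mean}")
print(f"random-sample error: {random_error:.2f}")
print(f"top-20 error:        {top_20_error:.2f}")
```

With these made-up values the top-20 mean misses the population mean by exactly 140, while a random sample of 160 can never be off by more than 70, which is the contrast the paragraph describes.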

Theories and Working Hypothesis
In most sports, the general theory is that if you perform well you give yourself the best shot at winning; however, this isn’t always the case in golf. In golf you are the only one who can dictate the outcome, and sometimes you have an amazing round but one hole or one shot changes everything. This quote from Tiger Woods puts things into perspective: “This sport is awfully lonely sometimes.” “You have to fight it. No one is going to bring you off the mound or call in a sub. You have to fight through it. That’s what makes this game so unique and so difficult mentally.” Although all of this is true, I aim to show that certain performance statistics have an impact not only on FedEx Cup points but also on how successful a golfer can be. FedEx Cup points accumulated and standing over the year are a good indicator when ranking golfers in the world. Statistics like average driving distance and putting average add up over a season, and the results reflect it. Generally, the better a golfer’s performance statistics, the better the golfer does, and ultimately the more FedEx Cup points are accumulated.

I believe a higher average driving distance tends to give a player an advantage in total strokes gained, and I have found some evidence to support this claim. “It might be the case that being 10 yards above average in driving distance is now “worth” more (in terms of total strokes-gained) than it was in the past. This could be true due to changing course setups: as courses get longer, players are forced to hit driver off every tee, which makes having 10 extra yards more useful than when players were sometimes choosing shorter clubs off the tee for strategic purposes” (Data Golf, 2020). As courses get longer, players who drive the ball farther may have an advantage because they are left with a shorter distance into the green, which results in more strokes gained. More strokes gained ultimately results in players being more successful, making cuts and earning FedEx Cup points. A second finding also supports my claim: “Players who hit the ball above-average distances on the PGA Tour are also above-average approach players. Therefore, the simple correlation between a golfer’s average driving distance and their performance in part captures the fact that longer players are, on average, better approach players” (Data Golf, 2020).

Although it is very satisfying to hit the ball far and to have great ball-striking ability, “Putting is often talked about as being the most important aspect of golf” (Noah’s Ark Golf Centre, 2018). If a golfer were to average two fewer putts a round, then, all else equal, their score would improve by two strokes. Likewise, if a golfer holes a higher percentage of putts within 15 feet than another golfer, that golfer will have an advantage and will typically have a better score at the end of the round. A better score results in a higher finish, which generally means that golfer will earn more points under the FedEx Cup point distribution system. A study published by the Harvard Sports Analysis Collective stated that “An extra putt per round increased scoring average by .69 shots” (HARVARDSPORTS, 2019). Things don’t always add up, and there are hundreds of different variables in golf to account for, but statistically speaking, holding everything else constant, the better your putting percentage or the fewer putts you take in a round, the better your score will be, which ultimately reflects your success.
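
To put the quoted putting figure in tournament terms, here is a back-of-the-envelope sketch. It assumes the quoted 0.69 shots per extra putt applies linearly and that everything else is held constant; the function name and the four-round example are hypothetical.

```python
SHOTS_PER_EXTRA_PUTT = 0.69  # figure quoted above (HARVARDSPORTS, 2019)


def scoring_change(putts_change_per_round, rounds=4):
    """Estimated change in total strokes over `rounds` rounds.

    A negative input (fewer putts per round) gives a negative, i.e. better,
    result. Assumes the quoted per-round effect scales linearly and that
    all other parts of the game are unchanged.
    """
    return SHOTS_PER_EXTRA_PUTT * putts_change_per_round * rounds


# One fewer putt per round over a four-round tournament (hypothetical case).
print(round(scoring_change(-1), 2))  # -2.76
```

Under these assumptions, one fewer putt per round is worth a little under three strokes across four rounds.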

Using these statistics and the regression model I have set up, I will be able to determine which statistics are correlated with earning FedEx Cup points, and how statistically significant each one is.

Empirical Methodology

  1. POINTSi = 0 + 1SCOREi + 2DRIVEi + 3FWi + 4GRNi + 5PUTTi + 6TPi + 7CMi + Ui
    Fill in the rest with your variables….
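
As a rough illustration of how a model of this form could be estimated, the sketch below fits ordinary least squares using the normal equations in plain Python. Everything in it is an assumption made for illustration: the data are synthetic, only two hypothetical regressors (a driving average and putts per round) are used, and the "true" coefficients 100, 5 and -40 are invented so that the fit can be checked. It is not the estimation actually run for this paper.

```python
import random


def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    X is a list of rows without the intercept column; y is a list of targets.
    Returns [b0, b1, ..., bk], where b0 is the intercept.
    """
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


def simulate(n, noise_sd, seed=0):
    """Generate synthetic (not real) data from a known linear model."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        drive = rng.uniform(280, 320)  # hypothetical driving average, yards
        putt = rng.uniform(28, 30)     # hypothetical putts per round
        points = 100 + 5 * drive - 40 * putt + rng.gauss(0, noise_sd)
        X.append([drive, putt])
        y.append(points)
    return X, y


# Fit on 160 synthetic players; estimates should land near 100, 5 and -40.
X, y = simulate(n=160, noise_sd=25)
b0, b_drive, b_putt = ols(X, y)
print(f"intercept={b0:.1f}, DRIVE={b_drive:.2f}, PUTT={b_putt:.2f}")
```

In practice a statistics package would be used instead, since it also reports standard errors and p-values, which are needed to judge the significance of each coefficient.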

Results and Discussion
But you explain like “based on the regression, every extra inch the player has to putt on average means they should expect to lose (if the sign is negative) 0.567 FedEx Cup points…”

Conclusion

References

http://www.espn.com/golf/statistics/_/year/2020/count/241

https://datagolf.com/importance-of-driving-distance/

https://www.noahsarkgolf.com/single-post/2018/02/07/How-Important-Are-Putting-statistics

http://harvardsportsanalysis.org/2009/11/predictors-of-pga-tour-scoring-average-does-the-driver-matter/

