Saturday, March 12, 2011

Analytics and Communication

In his excellent book on the research around the theory of Deliberate Practice, Geoff Colvin distills volumes of often complicated academic research into clear, understandable prose that is easily accessible to a wide audience. Colvin uses a series of carefully chosen anecdotes to explain the various dimensions of this complicated theory (yes, there is more to it than 10,000 hours, by the way). What is remarkable about the book is that it accurately reflects the messages and strength of the research that has been done. There are varying levels of evidence from the research for different aspects of the theory, and those points are clearly made. The goal of this post is not to sell more copies of Talent is Overrated (though I have provided a link in case you are interested), but rather to make the point that clearly communicating research is at least as important as the work itself. This is just as true for statistical analysis as it is for scholarly research.

At the 2010 Sloan Sports Analytics Conference, I was sitting next to a high-level NBA executive at a research presentation. The work being presented was interesting, if not revolutionary. When the presentation was over, I went to the front of the room to ask a few questions of the presenter, but was beaten to the punch by the exec I was sitting with, who said (and I paraphrase here a bit) "oh my God, you can talk". And it was true: the presenter had distilled some very complicated analysis down to the core message and accurately conveyed the strengths, potential, and limits of the work in such a way that the audience could clearly understand it. If the presenter had not been able to communicate his research to a non-geek audience (OK, it was SSAC 2010, so non-geek is overstating it), he would not have been speaking to a full room by the end of his presentation, never mind having extended conversations with NBA execs about it.

Communicating statistical analysis is a careful balance between the strength and limits of the analysis. The story of the analysis has to be conveyed in such a way that a non-geek user of the information can use the analysis properly. Your projection may show that a player is going to improve their rebounding by 20% over the next 3 years, but you also need to convey the risk associated with that analysis. What is the range of likely outcomes? What are the risks?
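One way to make the "range of likely outcomes" concrete is to report an interval alongside the point projection. The sketch below is only an illustration: it assumes the projection error is roughly normal, and both the 10-percentage-point standard deviation and the function name `projection_interval` are hypothetical, not taken from any real projection system.

```python
from statistics import NormalDist


def projection_interval(point_estimate, std_dev, coverage=0.80):
    """Return (low, high) bounds covering `coverage` of a normal distribution.

    Assumes projection error is approximately normal, which is an
    illustrative simplification rather than a property of any real model.
    """
    tail = (1.0 - coverage) / 2.0
    dist = NormalDist(mu=point_estimate, sigma=std_dev)
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


# Hypothetical inputs: a projected 20% rebounding improvement with an
# assumed standard deviation of 10 percentage points.
low, high = projection_interval(0.20, 0.10, coverage=0.80)
print(f"Projected improvement: 20% (80% interval: {low:.0%} to {high:.0%})")
```

Presenting the interval next to the headline number gives the decision maker both the story and the risk in a single line.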

It is tempting when working with team executives to make it all too simple and speak in absolutes, especially when others are making similar statements about their point of view. It is incumbent on the analyst not to fall into this trap, though, because our analysis does contain variance and we will be wrong. When we are wrong, it becomes easy to dismiss the analysis if we have spoken in absolutes. If instead we have strongly communicated (and accepted ourselves) what our research actually says, then, while we may not win every argument, we will win more over the long term.

It is also possible to believe that we are so clever in the techniques we have used to solve a problem that we lose sight of the problem we were trying to solve. It is rare that you will run into an exec who understands or truly cares about how cutting edge the techniques are. They want to know that they are getting good information that they can have confidence in, not that you used some slick new neural network algorithm in R to get the slickest results. This is one of the reasons the communication piece can be so tricky. We have confidence in our results because of the techniques used, and while you may want to have the one-sentence description of what you actually did ready in case they ask, management will only gain confidence by seeing the results.

After spending hours and hours carefully constructing your analysis, be sure to put a significant amount of time into deciding how to present it. Think like your audience, and what will help them use the analysis properly. If you don't communicate the analysis effectively, then the analysis will be wasted.

Wednesday, March 9, 2011

Players: One Definition



Ask a businessperson to define a customer, and their definition will depend largely on their function within the business. If they are in sales, they will talk about points of contact and sales histories, but if they are in product development, they will talk about demand for new features and usability. Finance, marketing, and HR would also likely have different definitions for a customer. It is not hard to see, however, how a business could benefit from having one comprehensive definition of what a customer is that brings all of that information together in one place. R&D could then look at sales records to see if the new features under development match the needs and wants of the most profitable customers.

A sports team is no different. Every function within the sport side of a professional team has its own definition for a player. Coaches are often focused on current performance, while the personnel side is often focused on scouting information, and trainers are often focused on health-related information. While each of these groups has an interest in the information that the other groups have, that is not their focus, and it is often difficult for the various types of information to be synthesized.
Coaches collect, process, and analyze a wide variety of information. Game data includes quantitative data such as performance metrics related to play calls, video, and qualitative grades. Practice data often includes video, specific measurements, and observations. Classroom information includes information on preparedness, and the ability to understand and process game plans.

The personnel operation collects information from a variety of sources. Intelligence is generally qualitative information on a player's background and any current personal issues. Personality information may include quantitative or qualitative psychological assessments as well as anecdotal information gathered from other players/coaches/friends. Specific skill data can be quantitative or qualitative information on a player's strengths and weaknesses.

The trainers and medical staff focus on a rich set of qualitative and quantitative information regarding a player's injury history, including type, treatment, and recovery times. They often use information on how players train and the frequency with which they train, as well as their pre- and post-training routines. They also monitor nutrition and hydration.

These are all, of course, gross simplifications of the broad classes of information utilized by different functions on a team. What is important is not the specific types of information, but rather the synergistic value of keeping one definition of a player that is updated and analyzed by all functions within an organization.


Once all the types of raw data and information on a player are consistently collected in one place, the coach who is wondering how to better motivate a player can see from intelligence information what other coaches have done in the past, and the general manager who is wondering why a high-potential prospect is not developing can see that the prospect is not processing information in the classroom well and has poor post-training habits.

While all of this already occurs, it usually occurs when a decision maker in one function asks for information from another function. The response may take 5 minutes, or it may take 3 days, by which time the original thought that led to the request is gone. An organization that has one definition of a player has a system that allows thorough and creative analysis to flow freely and not be constrained by the response time of other members of the organization.
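As a rough sketch of what "one definition of a player" could look like in practice, the record below keeps coaching, personnel, and medical information on a single object that any function can query. Every name in it (`PlayerRecord`, `development_flags`, the dictionary keys and grades) is hypothetical and chosen only for illustration; a real system would need a far richer schema.

```python
from dataclasses import dataclass, field


@dataclass
class PlayerRecord:
    """One shared definition of a player; all field names are hypothetical."""
    name: str
    coaching: dict = field(default_factory=dict)   # game, practice, classroom
    personnel: dict = field(default_factory=dict)  # intelligence, personality, skills
    medical: dict = field(default_factory=dict)    # injuries, training, nutrition

    def development_flags(self):
        """Collect notes from every function that might explain slow development."""
        flags = []
        if self.coaching.get("classroom_grade") == "poor":
            flags.append("struggles to process information in the classroom")
        if self.medical.get("post_training_routine") == "poor":
            flags.append("poor post-training habits")
        return flags


# A made-up prospect whose coaching and medical notes live on the same record.
prospect = PlayerRecord(
    name="Example Prospect",
    coaching={"classroom_grade": "poor"},
    medical={"post_training_routine": "poor"},
)
print(prospect.development_flags())
```

Because the coaching and medical notes sit on the same record, the general manager's question can be answered without waiting on a request to another department.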

The bottom line is that having one definition of what a player is focuses an organization on what it believes is most important about a player, and drives the resources, strategic thinking, and tactical analysis of the organization through that definition. That process gives the organization's long-term strategy the best probability of success.

Tuesday, March 8, 2011

Why Sports Analytics?

The "Why" of sports analytics is actually fairly obvious: good sports analytics can help a team win more frequently and more consistently. The real question with sports analytics is not "why?", but "how?" By how I do not mean how do we set up a good sports analytics program - this is a vital question, but not what I am tackling today. Before you determine how to implement a sports analytics program, you first have to understand how sports analytics can help your team win.

Sports analytics can help a team win more frequently and consistently through three general functions: New Information, Synthesis of Information, and Efficient Information. Each team will emphasize a different function of sports analytics based on their core competencies, but a comprehensive sports analytics program will involve all three.
Functions of Sports Analytics
New Information
The classic use of sports analytics is to provide new information - usually through statistical analysis of performance or game data. This function uses raw, quantitative data to create new information. Teams in all sports have done this privately, creating metrics and projections that help them better measure current and future performance of their players as well as how to best utilize the players they currently have. The power of new information is obvious when compared to the investment community. Investors that have information that other investors do not about the future earning power of a company have a competitive advantage, which is why there are laws governing insider trading (hello Martha Stewart).

New Information & the Draft
In sports, if the draft is akin to acquiring an asset in the investment market, then knowing more than other teams about the future potential of the draft-eligible players puts your team at a competitive advantage and makes it more likely to outperform its draft slot - that is, you are more likely to draft a player who will perform above the average player selected at that position. One example of this is Chase Budinger and the Houston Rockets. The Rockets are known to have the most advanced analytics program in the NBA, and in the 2009 draft they put that program to use. The Rockets traded a future 2nd round pick to the Detroit Pistons for the 44th pick in the draft and selected Chase Budinger. In less than two seasons, Chase has logged over 2700 minutes, hits 35% of his 3s, and has become a regular starter. For an idea of how significantly Chase has outperformed the 44th pick, previous players drafted 44th overall include: Reyshawn Terry (yes, that Reyshawn Terry), Tim Pickett, and Lonny Baxter. Houston used their analytic draft system based on college data to provide them with better information on Chase's probable impact in the NBA.

Synthesis of Information
Sports teams, like all organizations, are filled with many different types of information. There are vast amounts of quantitative data such as player and team performance metrics, qualitative information such as scouting reports, and hybrid information such as medical reports. Typically these all reside in various hard drives, spreadsheets, disjoint databases, and Word documents. Synergies between the various types of data often exist. That is, putting financial information together with performance data does more than just bring the two types of information together; it creates a new type of information on the value of the player. Good analytic systems bring these various types of information together not just to house them in one place, but to combine them and extract all of the synergistic value out of them.
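To illustrate how combining two sources creates a new type of information, the sketch below joins a performance table with a salary table to derive a cost per win. The players, the "wins added" metric, the salary figures, and the function name `cost_per_win` are all made up for the example.

```python
# Hypothetical data that would normally live in separate silos.
performance = {"Player A": 8.0, "Player B": 5.0}            # wins added (made-up metric)
salaries = {"Player A": 10_000_000, "Player B": 2_500_000}  # annual salary in dollars


def cost_per_win(performance, salaries):
    """Join the two sources by player and derive dollars paid per win added."""
    combined = {}
    # Only players present in both sources can be valued.
    for player in performance.keys() & salaries.keys():
        wins = performance[player]
        if wins > 0:  # skip players with no positive contribution to avoid dividing by zero
            combined[player] = salaries[player] / wins
    return combined


value = cost_per_win(performance, salaries)
for player, dollars in sorted(value.items(), key=lambda item: item[1]):
    print(f"{player}: ${dollars:,.0f} per win added")
```

Neither table alone says anything about value; the combined figure only exists once the two are brought together.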


Efficient Information
Top decision makers with pro sports teams may actually be the busiest people on earth. They are constantly being bombarded with information and requests for their time. Agents, players, fans, owners, and other related parties are always looking to get some time with the top guys, and each game, workout, practice, news event, etc. creates more information for them to consider. As Dean Oliver said, "stats can see all of the games", which means that decision makers, if they have the time, can consider all of the games. These executives have deep knowledge about their sport, and no one is in a better position to leverage the information that exists into a competitive edge than they are. The limiting factor on them is generally not resources or information, but time. Good information systems give them back some of their time by creating a more efficient process for them to interact with the information. If an efficient analytic system can easily save a decision maker 10 to 20 minutes a day, that translates into 5 to 10 extra hours a month. That is extra time to spend watching more film, talking through more trade options, exploring additional strategies, and processing even more information. These decision makers will gain a competitive advantage by being able to accomplish more than their adversaries in the same amount of time.
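The time-savings arithmetic can be checked in a couple of lines. The sketch assumes a 30-day month with savings accruing every day, which is the assumption under which 10 to 20 minutes a day works out to 5 to 10 hours; the function name is hypothetical.

```python
def monthly_hours_saved(minutes_per_day, days_per_month=30):
    """Convert a daily time saving in minutes into hours per month.

    The 30-day default is an assumption; with roughly 22 working days the
    totals would be proportionally smaller.
    """
    return minutes_per_day * days_per_month / 60


for minutes in (10, 20):
    print(f"{minutes} minutes a day -> {monthly_hours_saved(minutes):.0f} hours a month")
```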

New, better, and efficient information are paths by which a careful investment in sports analytics can yield more wins. It has been estimated for non-sports teams that the return on an investment in an analytic system with a statistical component is approximately 145% (Davenport & Harris, 2006). While there is no comparable estimate for sports teams, it is not hard to see how analytic systems can help create more wins and keep winning teams on the winning path.

Saturday, March 5, 2011

Advice for the Aspiring Sports Analytic Professional

After two very interesting days at the Sloan Sports Analytics Conference, I am more convinced than ever that the most significant improvement teams can make, the area in which they can gain a huge competitive advantage, is in data management. Let's be clear: data management is not exciting. Creating great information systems is not going to get you in the draft room arguing about who to take with the 4th pick in the draft. It is, however, the skill that is most likely to get an analytic-minded person employed with a sports franchise.

Students often ask what they can do to get a job with a team, and more and more, my answer is to acquire good data management skills. Front offices do not lack for people offering them the next great statistic that will tell them who they should be adding to their team. Some of these metrics may even be useful, but teams already have plenty of people pitching them, so they probably won't know if yours is actually informative. They are also already bombarded with so much information that integrating one more type is painful to think about.

And that is exactly why data management is the key to the kingdom. Right now sports teams have more data than they know what to do with. The data comes in many forms: scouting reports, contract parameters, performance data, practice data, injury reports, the list goes on and on... The problem teams have is not that they don't have enough data to consider, but that they don't have a mechanism to efficiently consider all of the data that they do have.

For most of the teams that I have spoken with, data sits in a variety of places. This may include a centralized database that has even a good percentage of the team's data in it, but there are still multiple data silos within the organization. They have their geek box, which holds reams of useful data in Excel spreadsheets, they have a trainer who has a ton of health/training related data on their computer, a psychologist who has a host of data (both quantitative and qualitative) on their computer, and a cap manager who may have their detailed league-wide cap model on their computer, and none of this data is collected and processed together. A decision maker who wants to truly consider all of the different sources of data that they ALREADY USE, even if they have access to it all from their computer, would likely have to have seven or eight windows open, and then scroll from window to window trying to connect the dots between the various data sources. This is a major time waster for people that place a premium on time.

Good data management, which includes data structuring, processing, and front end information systems, could easily save a group of top level decision makers for a sports team multiple hours a day. Any analytical professional that can demonstrate to a team that they can pull all the various pieces of information within an organization together, and turn it into information that a decision maker can access from one open window, or one app on their phone, would enable that decision maker to spend more time actually strategizing, processing information, and making the team better, instead of having to find and synthesize the data.

So all of you brilliant students desperate to work in sports, give up on finding the next PER or adjusted +/-, and get the skills necessary to help give decision makers back some of what they value the most: time.

Thursday, March 3, 2011

Get Your Geek Out of the Box


Organizations that have taken up statistical analysis often do so with a “toe in the water” approach. They hire a well regarded blogger to do a project to see if the information might prove useful. This usually takes the form of some sort of player projection system. The front office then huddles around the results and decides if the analysis points out anything useful to them. If so, they may even bring the analyst in on a full time basis to provide them with information. The analyst then gets a desk in the team offices and, equipped with a high powered computer and the newest version of Excel (and maybe R or Stata), gets to work mining data for useful insights.

This is often the end of the story, though. The analyst works diligently and adds value to the organization, providing useful information, and everyone is happy. But the organization is not maximizing the value of the analyst, because all the useful insights into player value and game strategy that the analyst may bring to the table are put into a special geek box. All of the other coaches and personnel folks open the box on occasion, look at the information, and even utilize it, but then they put it back in the box and do not see it again until the next time they choose to open the geek box.

This leads to a situation in which, when the non-analyst wants some information, they either have to go find the geek box or find an alternative geek box. It is often easier, particularly for coaches out on the road, to find information that looks like the information from the geek box online from sources like ESPN or HoopData. So when a coach wants to know something about their opponent, they may turn to alternative sources outside of the organization.

The reason the geek box is a problem is that it creates multiple versions of the truth within the organization. Consider the situation in which an analyst has, through careful analysis, developed a metric that isolates an RB's contribution from their offensive line’s efforts. Occasionally the general manager opens the geek box and sees that, while his team has a strong running attack, their RB ranks near the bottom of the advanced metric. The head coach has been too busy to open the geek box and instead saw quickly on ESPN that his RB is in the top five in the league in yards per rush. There are now two very different impressions of the team’s RB floating around. Now, instead of the GM and coach meeting to discuss how to solve their RB problem, they are headed for conflict as to whether to offer a big extension to the RB or not.

The solution to the geek box problem is an enterprise wide approach to analytics. This is the opposite of the toe in the water strategy; it is the cannonball off the high dive approach. It puts all of the various forms of information that sit within the organization – scouting, statistical, medical, etc. – in one place and integrates it all for anyone within the organization to easily access and utilize. Now, when a scout is on the road and wants to have a more complete view of the player they are about to watch, they instantly have all of the relevant information on the player, and it is the same information that the GM and coach are looking at, so when the three individuals are doing their analysis, they are starting from the same place.

Making information more accessible to all, and keeping everyone on the same page, has significant benefits to the organization. My bias is of course to point out the benefit of having the statistical information easily available to all, but the truth is it goes both ways. As analysts, we look to inform the rest of the organization, but we can also learn a great deal from seeing the information that everyone else in the organization is looking at, so when we have conversations, we are starting from the same place as well.

It is time to take the geek out of the box and have a full enterprise wide approach to information in sports organizations.

NFLPA's Secret Leverage


Judge Doty’s recent decision to relieve the NFL owners of their $4 billion extortion fee from the networks garnered the predictable reaction from the NFL that the decision would have no impact on the ongoing labor negotiations. That reaction is obviously as phony as a tightrope walker suggesting that removing his net would have no impact on his decision to have a shot and a beer before attempting a high wire walk. The question is not whether this decision will have an impact on negotiations, but rather how significant the impact will be.

Mark May of ESPN was recently quoted by Wilbon on PTI that while the owners would certainly like to have the cash, their pockets are so deep that it really does not affect their calculus on the labor front dramatically. This comment is born of May’s personal experience as a player negotiating against many of these owners. May’s experience though was at a different time, and the NFL and the US economy have gone through some major transformations since the last time there was a work stoppage in the NFL.

During the first six or seven years of the last decade, while the NFL continued to solidify its position as the most dominant force in professional sports, there was a burst of economic growth and activity across virtually every industry in the US. CEOs across the country were pushing to realize higher and higher rates of return for their investors, and many leveraged their assets to produce those returns. The debt that many of these companies took on finally proved to be too extensive as the economic crisis of 2008 hit. Many previously very successful businesses were forced either into bankruptcy or to sell out at pennies on the dollar to rivals.

NFL owners were operating in the same economic world as Washington Mutual, General Motors, and hundreds of small businesses that no longer exist. It occurs to me that it may be a bit naïve to believe that none of the NFL owners got themselves overextended and perhaps are carrying far more debt than their fellow owners realize. If any owners are privately facing this situation, they have been able to delay the reckoning that other businesses have gone through because the money tree that is NFL ownership has continued to flower during the economic downturn.

Without the $4 billion from the networks, though, for the first time there may be some owners out there that are looking at the money tree and wondering if getting a labor deal done quickly isn’t the best way to get the tree to flower. The alternative may be rather uncomfortable for some of them.

Wednesday, March 2, 2011

Sports Analytics is Not a Strategy

As I travel towards the Sloan Sports Analytics Conference, I am pondering what sports analytics actually is. Part of that thinking is spurred by the book Competing on Analytics by Davenport and Harris. One of the first points that Davenport and Harris make is that analytics is not a strategy in and of itself, but rather a tool that should be used to support the "core competency" of an organization. This point is, I believe, at the heart of one of the core misunderstandings of what it means for a team to start using analytics. There seems to be an impression that statistical analysis (one tool of sports analytics) is a strategy that dictates a certain style of play and/or a certain type of player.

The legendary book Moneyball by Michael Lewis served sports analytics extremely well in detailing how a team that competes with analytics can prosper. The book also, however, reinforced the impression that analytics had a unique purpose - unearth undervalued players and find areas of in-game strategy for which the accepted wisdom was not entirely correct. While this is certainly a viable use of analytics, it is hardly the unique or, for some teams, even the most productive use. The A's employed analytics in this manner because it fit with their core competency: winning with a low payroll. It was the strategic goal of the organization that determined how to best use analytics, instead of analytics dictating a strategy to the team.

Teams have their own unique core competencies and areas that they choose to have as strengths. In the corporate world, Walmart competes on price, so they utilize analytics to better manage their inventory. In the NFL, the Ravens have traditionally competed on elite level defense and evaluation of collegiate talent. Should the Ravens choose to use analytics to help provide a competitive edge, then their initial focus should principally be in support of these functions, which is fundamentally different from how the Saints or Colts might choose to best deploy their analytic assets.

At this stage in the growth of sports analytics, it is incumbent on the analysts to communicate to team executives, that analytics is not about identifying the "best" strategy or players, but rather to work within the core competency of the team and help push that forward. For the most part however, we as analysts have done little to demonstrate the value of analytics beyond identifying new metrics to rank teams and/or players. While I am certainly guilty of this, and to a large degree it has been necessary, I believe that we need to start pushing the discipline beyond this and demonstrate to teams how analytics can be used to help make better decisions, develop a culture of evidence and uniquely support the core competency of each team.

Analysts working for teams that place a high value on their player development function should have a very different focus than those working for teams that focus on finding value in free agent markets. Having had the privilege to work with a variety of talented executives, I believe every analyst should understand that these leaders understand their sport at a truly elite level and have developed long term strategies that they believe in. When an analyst can demonstrate how a strong analytics program can function as a tool to ensure that the tactics used to support those strategies are the ones with the highest probability of success, then they will maximize their value to the team.