INFORMS Open Forum

  • 1.  Can analytics predict the Super Bowl?

    Posted 02-01-2016 17:11
    Edited by Mary Leszczynski 02-01-2016 17:19

    In Laura McLay's latest blog post in Punk Rock Operations Research, she references the latest Editor's Cut: Analytics in Sports (INFORMS PubsOnline).  This collection of articles, podcasts and videos highlights how much the prevalence of analytics in sports has grown, with articles spanning from the 1950s to current times. I particularly like the video from HLN about the football coach who never punts (http://pubsonline.informs.org/editorscut/sports/videos).  He speaks about the need to convince players and other coaches about the results from the math models, reiterating that analytics and change management cannot be decoupled!

    With Super Bowl 50 just around the corner, is anyone using predictive or prescriptive analytics to determine if the Broncos or the Panthers will prevail? 

    ------------------------------
    Anne G. Robinson
    Executive Director, Supply Chain Strategy and Forward Operations
    Verizon Wireless, Basking Ridge NJ
    ------------------------------



  • 2.  RE: Can analytics predict the Super Bowl?

    Posted 02-02-2016 07:13
    Edited by Scott Nestler 02-02-2016 11:22

    First, as the Editor of the "Analytics in Sports" volume, I would like to thank Laura McLay for calling it out in her most excellent blog, Punk Rock OR, which I have read for several years now.  Also, thank you to Anne Robinson, the Series Editor, for asking me to put together this collection. This was a fun project because there is such a variety of quality material to choose from in the stable of INFORMS journals.  

     
    Now on to some specific comments about articles / podcasts / videos that Laura mentioned, and some others that interest me personally.  Although one of my goals in putting together this volume was to focus attention on recent articles (from the past 2-4 years), I chose to include the 1971 article by Carter and Machol, and also the 1953 Letter to the Editor (of OR) by Mottley precisely because I wanted to show that people in the OR field have been "doing sports analytics" for decades.
     
    The majority of sports-related articles appear in either Management Science (one of two flagship publications, the other being Operations Research) or else Interfaces, which has a more practical and applied bent to it than most journals. The article by Willoughby and Kostuk on strategic decision-making in curling (in Decision Analysis Journal) and the 2013 article by Doug Chung (in Marketing Science) are noteworthy in that they show diversity across a number of different INFORMS publications (which total 14, for those who didn't know).  With regard to curling, I have never played but always enjoyed watching it as a kid when we went to the local country club with family friends and when it is on during the Winter Olympics.  Perhaps I like it because it is similar to "table (bar) shuffleboard."
     
    The article on "Scheduling Baseball Umpires ..." by Mike Trick, Hakan Yildiz, and Tallys Yunes is one of a number of classic applications of optimization that date back a number of years.  I included this one, over the seminal 1998 OR article by Mike Trick & George Nemhauser, "Scheduling a Major College Basketball Conference," primarily due to the "recency bias" in the intent of Editor's Cut.  But, as this series is a living collection that can be updated, rather than a static list, we have the ability to add items to the volume.  So, I'm putting that article in the collection too; look for it soon.
     
    Bukiet, Harold, and Palacios' article on using Markov Chains to develop an optimal batting order in baseball has fascinated me for over a decade now.  I started working on an improvement to it (with an expanded state space) with some friends when I was in graduate school at Maryland.  But, it wasn't computationally tenable.  I think we may be there now, both in terms of hardware and algorithms.  If anyone out there is interested in collaborating on this, please send me a private note.
     
    As a teaching faculty member, I have used some of the suggestions by Paul Kvam and Joel Sokol in "Teaching Statistics Using Sports Examples" with my students. Using sports as a "gateway drug" to math, statistics, and OR is a most excellent idea.  Also, I'm currently teaching a Sports Analytics elective course here at Notre Dame ("Go Irish!"), to both MBAs and undergraduates, and have found the entire volume to be a useful tool for stimulating discussions in a few of my class sessions. I hope that you find it both entertaining and useful too.
     
    Now, with regards to Anne Robinson's question about Super Bowl 50 (since they assume Americans can't read Roman numerals anymore?)...  Like Yogi Berra, or somebody once upon a time said, "prediction is hard, especially when it is about the future."  My corollary to that would be that prediction is especially difficult when people are involved.  So, while analytics can inform us about the likelihood of future events, randomness due to injury, weather (even in CA), and just plain old luck often seem to rule the day.  
     
    In the chapter "How Competitive are Competitive Sports?" of their book "Scorecasting," Moskowitz and Wertheim talk about differences between various professional sports leagues (e.g. NFL, NBA, NHL, MLB).  While there are similarities between these leagues (which are businesses), there are also differences (like the number or percentage of teams that make the playoffs, whether or not they have a salary cap, etc.).  They close the chapter with the following.  "Trying to predict who will win the next Super Bowl is a fool's errand, but trying to predict who will win the next World Series is far easier.  Though you may not be right, you can limit your potential candidates to a handful of teams even before the season begins.  Funny thing about sports:  Distilled to their essence, they're all about competition.  But as an industry, some are more competitive than others."
     
    So what do others out there in the broader INFORMS community think?  Will the Panthers or the Broncos prevail?  (And was that end zone painting incident an honest mistake or not?)
     

    ------------------------------
    Scott Nestler, PhD, CAP
    Associate Teaching Professor
    Mendoza College of Business
    University of Notre Dame
    South Bend, IN



  • 3.  RE: Can analytics predict the Super Bowl?

    Posted 02-05-2016 09:49

    The  Editor's Cut: Analytics in Sports (INFORMS PubsOnline) is a fantastic way of organizing the work that goes on in operations research and sports.  While the papers are great, I particularly liked all the other material (videos, podcasts, and so on).  Academics like me can get caught up in the "paper chase" and forget that there are lots of other ways to get information and research out to the world.  

    Laura's blog Punk Rock Operations Research is a great example of an influential alternative outlet.  I was very happy to see her provide a summary of the Editor's Cut: kind of a Blogger's Cut of an Editor's Cut.   Of course, I am a bit biased: both she and Scott Nestler (in this discussion) said nice things about a paper that Hakan Yildiz, Tallys Yunes and I wrote about scheduling Major League Baseball umpires.

    Maybe an update on that work is in order.  Since that paper, I've continued scheduling the umpires through the company The Sports Scheduling Group, making the 2016 schedule the tenth schedule of ours that they have used.  The problem has changed quite a bit.  Perhaps the biggest change was a couple of years ago when MLB introduced video replay.  That was a very good year to be a minor-league umpire. Two new MLB crews were created, increasing the number of crews from 17 to 19:  eight minor-league umpires got promoted!  That change also made the schedule more complicated, since it was necessary to have two crews in New York at all times to handle the video reviews.  

    But complicated is good for optimization:  our systems were very useful in determining the impact of video replay on the umpires.  And we have ended up with a schedule structure that the crews seem to like, with crews handling video replays shortly before their vacations.  Most stay on the east coast, and so have a relatively easy travel schedule leading up to their off weeks.

    The methods have been updated since that paper, with more of an emphasis on large neighborhood search rather than the simple simulated annealing in the paper.  But the problem still remains resistant to solution by exact methods.  The Traveling Umpire Problem (with a new website at TUP Benchmarks | Home) shows that 14-team tournaments are now schedulable, up from the 10 or so in our early work, but the 30-team MLB problem remains well out of reach.  Still, the heuristics consistently get good, usable schedules.

    Operations research and analytics continue to have an impact on umpire scheduling, and an even stronger impact on league scheduling.

    As for whether analytics can predict the Super Bowl?  Maybe not the winner, but it does have a big effect on the schedule that leads there!

    ------------------------------
    Michael Trick
    Professor
    Carnegie Mellon University
    Pittsburgh PA



  • 4.  RE: Can analytics predict the Super Bowl?

    Posted 02-06-2016 09:14

    Thanks for a fantastic compilation of resources, and the summary by Laura and Mike. I wanted to share info about another compilation of similar resources but with a greater focus on descriptive and predictive analytics. Teradata University Network, a community supported by Teradata to help academics teach data warehousing, analytics, BI concepts, etc., has recently added an asset library developed by Dr Dave Schrader (a Purdue CS PhD who was working with Teradata until recently and is a TUN Academic Board member) that summarizes many sports analytics applications.  Scott and Dave are now in touch and I am certain we will see more updates on their collaboration. To see Dave's resources, please visit http://www.teradatauniversitynetwork.com/Sports-Analytics/Welcome/.  This resource is free to faculty and students, but it does require registration and approval for faculty (because they get access to teaching notes etc.). I will host a session in Nashville about TUN, but please join TUN, and see the excellent collection posted by Dave Schrader.  You will also find many other teaching resources – cases, software demos, access to SAS Visual Analytics, etc. on the site.

     

    Dave also created a number of BSI Videos to illustrate potential applications of analytics. One that relates to sports analytics can be found here. This link does not require registration yet: http://www.teradatauniversitynetwork.com/About-Us/Whats-New/BSI--Sports-Analytics---Precision-Football/

     

    Thanks – Ramesh

     

    Ramesh Sharda, PhD

    Vice Dean, Watson Graduate School of Management

    Regents Prof. and Watson/ConocoPhillips Chair of Mgmt. Sc. & Info Systems

    Spears School of Business/Oklahoma State University

     






  • 5.  RE: Can analytics predict the Super Bowl?

    Posted 02-07-2016 22:27

    I'm not sure if anyone predicted the outcome of the Super Bowl correctly or not; it was a somewhat different result than I expected. But, maybe it should have been expected with two very good defensive teams.  I'd like to thank those who have commented on this thread during the past week, as well as for the individual, off-line responses I received.

    It was interesting to hear some of the updates about referee scheduling from Mike Trick, and Ramesh Sharda is spot on regarding the Teradata University Network (TUN) and "Dr. Dave" Schrader.  The resources there are definitely worth checking out.  As you see new articles, videos, or podcasts that you think should be included in the future, please let me know and we will consider them for inclusion.  

    Please join those of us in the INFORMS SpORts section at a couple of upcoming conferences where we have a role:  (1) the INFORMS Conference on Business Analytics and Operations Research in Orlando, FL, Apr. 10-12, where we have a track dedicated to "Sports & Entertainment Analytics" and (2) the Carolinas Sports Analytics Meeting (CSAM) in Furman, SC (just before INFORMS on Apr. 9th).  Also, if anyone will be in Boston next month for the MIT Sloan Sports Analytics Conference (SSAC), I will be there.  Please introduce yourself should you see me.

    ------------------------------
    Scott Nestler
    Associate Teaching Professor
    University of Notre Dame
    Granger IN



  • 6.  RE: Can analytics predict the Super Bowl?

    Posted 02-08-2016 06:34
    Here's a short piece on predicting the Super Bowl from Jeff Strickland. Besides being humorous, he points out some of the pitfalls in selecting data (e.g. look-ahead bias) and making assumptions for predictive modeling. https://www.linkedin.com/pulse/power-prediction-jeffrey-strickland-ph-d-cmsp

    --

    SCOTT NESTLER, PhD, CAP, PStat

    Associate Teaching Professor

    Department of Management

    MENDOZA COLLEGE OF BUSINESS

    UNIVERSITY OF NOTRE DAME

    383 Mendoza College of Business

    Notre Dame, IN 46556

    p:  (574) 631-8117

    e:  snestler@nd.edu

     










  • 7.  RE: Can analytics predict the Super Bowl?

    Posted 02-10-2016 07:19

    I've been following this discussion thread with interest (thank you, Laura McLay, for the initial blog post, and Anne Robinson for starting this thread on INFORMS Connect), and have decided to add my 2 cents, as well as ask a few questions of my own.

    First, I want to thank INFORMS (specifically, Scott Nestler) for including my blog post on building fantasy football teams as part of the Editor's Cut on Analytics in Sports. I feel particularly honored to appear a second time in this same issue through the umpire scheduling article I co-authored with Mike Trick and Hakan Yildiz. I particularly like the fact that an Editor's Cut issue is a live entity that keeps getting updated and improved.

    Scott mentioned an interest in studying baseball batting order. It turns out that two of my colleagues in the Management Science department here at the University of Miami (Anuj Mehrotra and Paul Sugrue) published a paper on this topic in 2007 entitled "An optimization model to determine batting order in baseball." (not published in an INFORMS journal, though; the horror!)

    Using sports to teach OR is indeed a great idea. I recall being a TA for an elective MBA class at the Tepper School called OR Applications, with Mike as the instructor (and from whom I learned a lot about how to teach well). One of the papers we asked the students to read was the Nemhauser and Trick (1998) paper on scheduling the ACC basketball tournament. The students loved it! I talk about our umpire scheduling paper to my MBA students and they always pay close attention. I use it to exemplify soft constraints and the fact that companies, not having a full understanding of the intricacies of a complex problem, sometimes give you a collection of requirements that are infeasible. An OR model is a great tool to convince them that what they want is impossible, as opposed to them thinking "it's just that you're not competent enough to do it."

    I enjoyed learning about the changes that had to be made to the way umpires get scheduled; I now have more details to tell my students. One question for Mike: you said that "most stay on the east coast", but I'm assuming this is just for the period when they're about to handle the video reviews followed by vacation, right? Because I recall one of the requirements was for them to (ideally) visit the home venues of every team.

    I now have a few general questions for the readers of the forum: my blog post on picking a fantasy football team is an example of a case where doing good predictive analytics is *the* crucial step because the prescriptive model that follows can be quite simple. From my anecdotal evidence (a friend from graduate school who worked for a hedge fund, and a boutique portfolio optimization firm who showed me their code), this also appears to happen in portfolio optimization: it is difficult and laborious to get good data (predictions of stock performance), but the optimization model that follows is usually a simple extension of a Markowitz model. Is this a general consensus? I recall a 2013 TutORial by John Beasley, in which he presents several different portfolio optimization models. How often are the fancier ones used in practice? Can you think of other application areas where this also happens? As someone who teaches optimization to future managers, these kinds of examples are quite useful because I can show them easy-to-understand optimization models that are useful in practice as long as the input data is good enough (at which point I wash my hands of the issue and say "and to learn how to get good data, talk to your stats professor"). So convenient! :-)
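
    To illustrate how small the prescriptive step can be once the predictions exist, here is a minimal sketch of a basic Markowitz mean-variance model. It is not taken from any of the firms or papers mentioned above. The function name, the expected returns, and the covariance matrix are all made up for illustration, and the model allows short positions; practical versions typically add constraints such as no shorting or position limits.

```python
import numpy as np

def markowitz_weights(mu, sigma, target_return):
    """Minimum-variance weights with sum(w) = 1 and mu @ w = target_return.

    Solves the KKT linear system of the equality-constrained problem
    minimize 0.5 * w' sigma w. Short positions are allowed (an assumption).
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    n = len(mu)
    ones = np.ones(n)

    # KKT matrix: stationarity rows on top, the two constraints below.
    kkt = np.zeros((n + 2, n + 2))
    kkt[:n, :n] = sigma
    kkt[:n, n] = ones
    kkt[:n, n + 1] = mu
    kkt[n, :n] = ones
    kkt[n + 1, :n] = mu

    rhs = np.zeros(n + 2)
    rhs[n] = 1.0              # budget constraint: weights sum to 1
    rhs[n + 1] = target_return

    return np.linalg.solve(kkt, rhs)[:n]

if __name__ == "__main__":
    # Hypothetical inputs: these are the hard-to-get "predictions".
    mu = [0.05, 0.08, 0.12]
    sigma = [[0.04, 0.01, 0.00],
             [0.01, 0.09, 0.02],
             [0.00, 0.02, 0.16]]
    print(markowitz_weights(mu, sigma, target_return=0.09))
```

    Everything that matters in practice sits in mu and sigma, which is the point of the question above: the optimization is a single linear solve, while producing good inputs is the laborious part.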

    I'm going to close by describing my failed attempt at getting what, in my opinion, would be the coolest OR-in-sports job ever. Defective Brazilian that I am (I don't like soccer and can't dance samba), my true love is basketball. Upon learning that the NBA's game schedule creator, Matt Winick, who had had that job for 30 years, was stepping down, I emailed Thomas Carelli, SVP Broadcasting, whose team would be in charge of the schedule going forward. I told him I had a PhD student who was very interested in the problem and, if he could provide us with the data and requirements, the student's thesis would be about creating a good NBA schedule. In return, we'd provide them with all of our findings and computer code completely free of charge. The response I got? A polite "Thanks, but no thanks." At least he emailed me back (and it didn't hurt to ask).

    ------------------------------
    Tallys Yunes
    Associate Professor
    University of Miami
    Coral Gables FL



  • 8.  RE: Can analytics predict the Super Bowl?

    Posted 02-04-2016 02:32

    In response to Anne Robinson's question, let me mention my unpublished efforts in NFL football prediction.  I've developed a normal conjugate Bayesian algorithm that predicts game scores. Hidden parameters are, for each of the 32 NFL teams, an offensive skill parameter f, a defensive skill parameter d, and a home advantage parameter h (so 3 x 32 = 96 parameters total).  The likelihood function assumes that the observed scores (x1, x0) when away team 1 plays home team 0 are given by x1 = f1 - d0 + error, and x0 = f0 + h0 - d1 + error.  My algorithm uses Bayes' rule to perform a conjugate update on a normal prior on the 96 parameters p = ((f1,d1,h1), ... , (f32,d32,h32)) and produce a 96-dimensional normal posterior on p.  The algorithm next assumes p is perturbed by normally distributed random error during the following week, and the resulting distribution for p serves as the prior for the following Sunday. I use the mean of p to make predictions.  I estimated the error terms mentioned above empirically.   
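
    For readers who want to experiment, the update described above has the same form as a linear-Gaussian (Kalman-style) filter step. The following is only a rough sketch under stated assumptions, not the poster's actual code: the function names are invented, score errors are assumed independent with one common variance, the weekly perturbation is assumed independent across parameters, and the numeric values in the demo are placeholders rather than empirically estimated quantities.

```python
import numpy as np

N_TEAMS = 32
DIM = 3 * N_TEAMS  # one (f, d, h) triple per team

def idx(team, kind):
    """Index into p; kind 0 = offense f, 1 = defense d, 2 = home advantage h."""
    return 3 * team + kind

def game_rows(away, home):
    """Two design-matrix rows for one game: away score row, then home score row."""
    H = np.zeros((2, DIM))
    H[0, idx(away, 0)] = 1.0    # x1 = f1 - d0
    H[0, idx(home, 1)] = -1.0
    H[1, idx(home, 0)] = 1.0    # x0 = f0 + h0 - d1
    H[1, idx(home, 2)] = 1.0
    H[1, idx(away, 1)] = -1.0
    return H

def bayes_update(mean, cov, games, scores, score_var):
    """Conjugate normal update of (mean, cov) from one week of results.

    games: list of (away, home) team indices.
    scores: list of (away_score, home_score), in the same order.
    score_var: assumed common variance of the score error terms.
    """
    H = np.vstack([game_rows(a, h) for a, h in games])
    x = np.asarray(scores, dtype=float).reshape(-1)
    S = H @ cov @ H.T + score_var * np.eye(len(x))
    K = np.linalg.solve(S, H @ cov).T      # gain: cov H' S^-1
    new_mean = mean + K @ (x - H @ mean)
    new_cov = cov - K @ H @ cov
    return new_mean, new_cov

def weekly_drift(mean, cov, drift_var):
    """Perturb p by independent normal noise; the result is next week's prior."""
    return mean, cov + drift_var * np.eye(DIM)

def predict(mean, away, home):
    """Predicted mean (away_score, home_score) using the mean of p."""
    return game_rows(away, home) @ mean

if __name__ == "__main__":
    # Placeholder prior: f = 22, d = 0, h = 2 for every team, variance 25.
    mean = np.tile([22.0, 0.0, 2.0], N_TEAMS)
    cov = 25.0 * np.eye(DIM)
    mean, cov = bayes_update(mean, cov, [(0, 1)], [(35, 10)], score_var=100.0)
    mean, cov = weekly_drift(mean, cov, drift_var=1.0)
    print(predict(mean, away=0, home=1))
```

    Skipping a game in the update, as described for teams resting starters, simply means leaving that game out of the games and scores lists.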

    The algorithm only uses home and away game scores for each of the 14-16 games played each week.  No other data is used.  In particular, no injury data is collected.  Not accounting for injuries can lead to "naive" predictions from the algorithm, for example, when a second-string quarterback must play instead of the injured starter. A team that rests starting players when it has clinched a post-season berth also cannot be accounted for. At my discretion, I will sometimes not include scores involving such teams in the Bayesian update.

    The algorithm does pretty well in spite of these data limitations, correctly predicting winners in 60-69% of games each season (average 64.9% over nine seasons).  That is enough to usually keep me in the top 5 out of 20-25 participants in my local football pool.  Occasionally, I win some money.

    The algorithm currently has predicted mean score Carolina 27.08, Denver 20.53 for the Super Bowl. 

    ------------------------------
    Gordon Hazen
    Professor
    Northwestern University
    Evanston IL



  • 9.  RE: Can analytics predict the Super Bowl?

    Posted 02-15-2016 17:32
    Edited by Walt DeGrange 02-15-2016 18:16

    I was surprised nothing came of the Microsoft Surface 3 outage during the Denver Broncos and New England Patriots AFC Championship Game. In the second quarter, the Broncos rolled over the Patriot defense and scored a touchdown quickly to regain the lead. Evidently the Patriots' Microsoft Surface 3s were not functional during the drive. The Patriot defensive coaches were flying blind without near real-time video feedback. At the time, I thought this event might climb to the heights of Deflategate. The outage was blamed on a network problem that was resolved before the end of the second quarter and nothing much was made of the incident.

    I am not going to theorize on any wrongdoing by the Broncos in this situation but this event does bring up some interesting questions when it comes to technology and analytics in the future of sports. We have already seen a breach of an online player evaluation database by the St Louis Cardinals in Major League Baseball in 2015. Is this the first of many events that will have teams hacking other organizations for data? Networks that teams rely on for real-time video and data could also be vulnerable. Then there is the fact that with the increase in the use of technology something is bound to go haywire.

    Rules have been created to deal with technology in the past. An example of this is the radio receiver in the NFL quarterback's and defensive captain's helmets. If one team loses the ability to use these communications then the other team must also give up its ability for coaches to talk directly to the key players on the field.

    It should be interesting to see how league officials deal with the fairness of using all of the sensors and technology in the near future and if teams are willing to take chances to push (or cross) the boundaries of the rules. This is especially important with analytics playing a larger role in on-the-field decision making.

    ------------------------------
    Walter DeGrange
    Principal Operations Research Analyst
    CANA Advisors
    Chapel Hill NC