what I learned from preparing for a semi-plenary talk

I recently blogged about a semi-plenary talk I gave at the German OR Society Conference. This post is about the process of preparing for that presentation.

First I thought about the story I wanted to tell. I’ve given a lot of research talks before. I understand the general plot of a research talk, but a semi-plenary was not a regular research talk. I wasn’t initially sure how to tell a story in a new way. I asked a wise colleague for advice, which was excellent:

  1. Think about your favorite plenary talks. Model your talk after that (including the amount of math to include in the talk).
  2. Think of the talk as a series of 30 second elevator talks. Let those messages structure your story.
  3. Your audience will want to feel that they’ve learned something. What are the takeaways?

I found that creating an initial set of slides wasn’t so bad once I decided on the story I wanted to tell. I have given so many talks before that I had a huge set of slides that I could pull from. However, I had too many slides to fit into the time slot, and editing and pruning my slides was pure torture.

A few months ago, I read a post by an academic blogger who had recently given a plenary talk. I can’t find the post now but I remember that it took about 40 hours to create a one hour talk. This reminded me of an earlier post on teaching MOOCs (How college is like choosing between going to the movies and Netflix), where an enormous amount of time goes into a single lecture.

Here is why it took so long. I noticed that every time I removed a slide or combined a few slides into a single slide, it affected the story narrative in a major way. In a regular research talk, I find it easy to pick a few details to leave out. Not the case this time. Rather than condense the story, I eventually left some topics out altogether or turned the insights from a paper into a couple of bullet points on a slide. Finding the right balance of detail and insight was a constant challenge.

I ended up having almost no math in my talk. I decided that insights were more important than going through technical details.

I recreated almost all of the visuals from my slides in previous talks. It’s not that my visuals were total crap, it’s just that there was too much detail and notation in the figures I had made for research talks. I didn’t want confusing visuals getting in the way of the story. Sometimes I added a picture to illustrate an idea or insight that was technical in nature rather than launching into a long narrative to explain a simple point. Here is an example of a new visual explaining the concept of ambulance response times and coverage:

Example of a conceptual slide I used in my talk.

Other times I just needed to make a simpler version of a figure or table that allowed me to look at a single curve or to compare two things, instead of a busier figure that works in a regular research talk. At one point, I changed a figure with four subfigures into a single figure by omitting the other three subfigures. I make nearly all of my figures with Matlab and save my code so that I can easily recreate figures for presentations or paper revisions. Remaking figures wasn’t too taxing, but remaking a lot of figures took some time.

Finally, I learned so much about my research when giving this talk. The end of my talk answered two questions:

  1. Where is emergency medical service research in OR going?
  2. Where does emergency medical service research in OR need to go?

I think about high level issues all the time (after all, I frequently write proposals!). But this was different: I was talking about places where this entire line of research is going, not just mine. While making my slides and answering the question “Where does emergency medical service research in OR need to go?”, I learned that my research had already made progress in the right direction. Not all of my ideas are in line with where this line of research needs to go, and it was worthwhile to realign my priorities.

 

Related posts:

  1. Do you have a 30 second elevator talk about your research?
  2. The most important 30 seconds of your dissertation defense

 


underpowered statistical tests and the myth of the myth of the hot hand

In grad school, I learned about the hot hand fallacy in basketball. The so-called “hot hand” is the person whose scoring success probability is temporarily increased and therefore should shoot the ball more often (in the basketball context). I thought the myth of the hot hand effect was an amazing result: there is no such thing as a hot hand in sports, it’s just that humans are not good at evaluating streaks of successes (hot hand) or failures (slumps).

Flash forward years later. I read a headline about how hand sanitizer doesn’t “work” in terms of preventing illness. I looked at the abstract and read off the numbers. The group that used hand sanitizer (in addition to hand washing) got sick 15-20% less often than the control group that only washed hands. The 15-20% difference wasn’t statistically significant so it was impossible to conclude that hand sanitizing helped, but it represented a lot of illnesses averted. I wondered if this difference would have been statistically significant if the number of participants had been just a bit larger.

It turns out that I was onto something.

The hot hand fallacy is like the hand sanitizer study: the study design was underpowered, meaning that the test had little chance of rejecting the null hypothesis and drawing the “correct” conclusion even if the hot hand effect or the hand sanitizer effect is real. In the case of the hand sanitizer, the number of participants needed to be large enough to detect a 15-20% improvement in the number of illnesses acquired. Undergraduates do this in probability and statistics courses where they estimate the sample size needed. But researchers sometimes forget to design an experiment in a way that can detect real differences.
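Here is a rough sketch of the textbook sample size calculation for comparing two proportions, using only the Python standard library. The 50% baseline illness rate, the 5% significance level, and the 80% power target are assumptions made for illustration; none of them come from the hand sanitizer study.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_two_proportions(p_control, relative_reduction,
                                alpha=0.05, power=0.80):
    """Participants needed per group to detect a relative reduction in the
    illness rate with a two-sided test of two proportions."""
    p_treated = p_control * (1 - relative_reduction)
    p_bar = (p_control + p_treated) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_power * sqrt(p_control * (1 - p_control)
                         + p_treated * (1 - p_treated))
    ) ** 2
    return ceil(numerator / (p_control - p_treated) ** 2)


if __name__ == "__main__":
    # Hypothetical baseline: half of the control group gets sick.
    for reduction in (0.15, 0.20):
        n = sample_size_two_proportions(0.50, reduction)
        print(f"{reduction:.0%} reduction: about {n} participants per group")
```

The smaller the effect you want to detect, the more participants you need, which is why a real 15-20% difference can fail to reach significance in a small study.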

My UW-Madison colleague Jordan Ellenberg has a fantastic article about the myth of the myth of the hot hand on Deadspin. He has more in his book How Not to Be Wrong, which I highly recommend. He introduced me to a research paper by Kevin Korb and Michael Stillwell that compared statistical tests used to test for the hot hand effect on simulated data that did indeed have a hot hand. The “hot” data alternated between streaks with success probabilities of 50% and 90%. They demonstrated that the serial correlation and runs tests used in the early “hot hand fallacy” paper were unable to identify a real hot hand, and therefore, these tests were underpowered and unable to reject the null hypothesis when it was indeed false. This is poor test design. If you want to answer a question using any kind of statistical test, it’s important to collect enough data and use the right tools so you can find the signal in the noise (if there is one) and reject the null hypothesis if it is false.
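To get a feel for what “underpowered” means here, below is a small simulation sketch in Python. It is not Korb and Stillwell’s experiment: the streak length, the number of shots, and the number of trials are assumptions of mine, and only the 50% and 90% success probabilities come from the description above. It generates shots that really do have a hot hand, runs a standard Wald-Wolfowitz runs test on each sequence, and reports how often the test rejects the null hypothesis.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate_shots(n_shots, streak_length, p_cold=0.5, p_hot=0.9, rng=random):
    """Shots that alternate between cold and hot streaks of a fixed length."""
    shots = []
    for i in range(n_shots):
        hot = (i // streak_length) % 2 == 1
        p = p_hot if hot else p_cold
        shots.append(1 if rng.random() < p else 0)
    return shots


def runs_test_p_value(shots):
    """Two-sided runs test p-value using the normal approximation."""
    n1 = sum(shots)
    n0 = len(shots) - n1
    if n0 == 0 or n1 == 0:
        return 1.0  # all makes or all misses: the test has nothing to say
    n = n0 + n1
    runs = 1 + sum(a != b for a, b in zip(shots, shots[1:]))
    mean = 2 * n0 * n1 / n + 1
    var = 2 * n0 * n1 * (2 * n0 * n1 - n) / (n ** 2 * (n - 1))
    if var <= 0:
        return 1.0
    z = (runs - mean) / sqrt(var)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def estimated_power(n_shots, streak_length, alpha=0.05, trials=2000, seed=0):
    """Fraction of simulated hot-hand sequences where the test rejects."""
    rng = random.Random(seed)
    rejections = sum(
        runs_test_p_value(simulate_shots(n_shots, streak_length, rng=rng)) < alpha
        for _ in range(trials)
    )
    return rejections / trials


if __name__ == "__main__":
    for n_shots in (50, 100, 1000):
        print(n_shots, "shots:", estimated_power(n_shots, streak_length=10))
```

Comparing the estimated power across sequence lengths shows how much data a test needs before it can reliably pick up an effect that is really there.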

I learned that there appears to be no hot hand in sports where a defense can easily adapt to put greater defensive pressure on the “hot” player, like basketball and football. So the player may be hot but it doesn’t show up in the statistics only because the hot player is, say, double teamed. The hot hand is more apparent and measurable in sports where defenses are not flexible enough to put more pressure on the hot player, like in baseball and volleyball.

 

 


land O links

Here are a few links for your enjoyment:

  1. Does a five year old need to learn how to code?
  2. A mathematician uses statistics to predict the next Game of Thrones death.
  3. Why academics stink at writing.
  4. An operations researcher argues that airports should screen for Ebola the same way they screen for terrorists (nice job Sheldon Jacobson!). He was also interviewed on MSNBC.
  5. How diversity makes us smarter
  6. Article on why women should learn to love criticism. HT @katemath. “76 percent of the negative feedback given to women included some kind of personality criticism, such as comments that the woman was ‘abrasive,’ ‘judgmental’ or ‘strident.’ Only 2 percent of men’s critical reviews included negative personality comments.” Discuss.
  7. Are you a satisficer or a maximizer?

in defense of model complexity

Recently I wrote a post in defense of model simplicity. I liked a lot of things about that post, but it wasn’t the entire picture. Much of what we do in operations research deals with solving complex problems, and often we can’t settle for anything simple. Simple models can be incredibly useful, but they are generally useful when we are looking at a piece of a system without so many moving parts. Do we make a credit card offer to person X? Yes or no? An educated guess will suffice. A model (simple or complicated) that replaces that educated guess can be a big improvement. But the decision context is inherently simple: we need a model that tells us yes or no.

Operations research does so well when we need an answer to a complex problem with many interconnected parts. Case in point: it’s hard to find a feasible solution in many optimization models (capacitated facility location, scheduling models, vehicle routing problem with time windows, etc.). It’s trivial to find a feasible solution to a yes-or-no problem.

The last two semesters, I’ve team taught an introductory course for engineering freshmen on engineering grand challenges. Nearly all Wisconsin engineering students are admitted to the College of Engineering without a major (they are in a “general engineering” curriculum for the first year), although this is starting to change this year. One of the goals of this course is to introduce the students to different majors. I am in charge of Industrial and Systems Engineering. I have to talk about operations research, manufacturing, and human factors (confession: I really struggle with human factors). I’ve gotten better at telling 18 year olds about why industrial engineering is so cool. I’ve found that using a few examples is the best way to make this point.

(1) My favorite example is explaining why Major League Baseball scheduling is so hard (thanks again Mike Trick! Read more here and here). This example is so intuitive to so many students because they understand the many constraints:

  • 30 teams with 162 games each
  • half home games, half away games
  • each team must play each other a given number of times
  • a team cannot play too many away games in a row
  • travel distances matter: a team can’t fly across the country all the time
  • television revenue makes some schedules more attractive than others
  • teams play each other, so you can’t fix a part of the schedule in isolation: everything affects everything else
  • you finally get a schedule you like, and then the Pope asks to visit New York and needs to be in a baseball stadium the day a game is scheduled, forcing you to reschedule the season.

(2) Scheduling people for work shifts is another problem that needs complex models. Decisions are discrete (you work the 8AM shift or you don’t), and there are many constraints, such as hour limits per week, block structure schedules, consecutive scheduling problems, union rules. Paul Rubin has a great blog post on scheduling instability:

The multiperiod nature of scheduling models tends to make them a bit chewy, especially when you need to coordinate schedules of multiple individuals (or machines, or venues) across blocks of time while dealing with multiple constraints and possibly multiple conflicting criteria. Not all complex models are optimization models.

(3) A while back, I blogged about an article in Nautilus about optimization and trucking [Link and Link] all about the difficulty in coming up with useful models for effectively delivering goods by truck. For the models to be useful, they need to account for many rules (like breaks) and the human behavior of the truck drivers. The resulting models are pretty complex.

There were union rules, there was industry practice. Tractors can be stored anywhere, humans like to go home at night. “I said we’re going to need a file with 2,000 rules. Trucks are simple; drivers are complicated.” At UPS, a program could come up with a great route, but if it violated, say, the Teamsters Union rules, it was worthless. For instance, time windows need to be built in for driver’s breaks and lunches.

(4) Thus far, I’ve only used scheduling and routing examples with a lot of interconnected parts. But those aren’t the only models that require complexity. My interest in public sector operations research has led me to appreciate so-called “wicked problems” (as opposed to “tame” problems). Wicked problems often address the soft side of operations research and make the case for model complexity. Due to the social component of the problem, there are many stakeholders with contradictory needs. A problem that is wicked quickly unravels due to the connections it has to other issues that are also social, and so on. Russell Ackoff summed this up nicely:

“Every problem interacts with other problems and is therefore part of a set of interrelated problems, a system of problems…. I choose to call such a system a mess.”

I recommend C. West Churchman’s guest editorial in Management Science in 1967, where the term “wicked problems” was coined [pdf: Wicked Problems Churchman 1967] and this nice article on “wicked” problems by John Mingers in OR/MS Today.

Do you have a favorite complex model for a wicked problem?

 


a journey to the German OR Society Conference

Earlier in September, I gave a semi-plenary at the 2014 German OR Conference in Aachen, Germany. It was a wonderful conference and experience that will inspire at least another blog post or two. The German OR Society and Marco Lübbecke were wonderful hosts and conference organizers. There were more than 850 attendees, 500 talks, and an impressive group of plenary and semi-plenary talks.

Earlier I blogged about Mike Trick’s plenary talk on Major League Baseball scheduling and analytics that opened up the conference. I’m finally getting around to blogging about my talk on emergency medical services. For another take, see Mike Trick’s blog post about my talk. I learned a lot by giving the talk and talking to German researchers. Emergency medical services are operated in different ways in different parts of the world. It was refreshing to talk to other researchers who are looking at healthcare delivery issues from a different perspective than we have in the United States. It was also fun to catch up with two of my favorite bloggers (Mike Trick and Marc-Andre Carle) at social events and meet some Punk Rock OR readers from across the pond.

I posted the slides to my talk below.

I took a few pictures from the conference and from Aachen that capture some of the highlights of the trip.

At the reception with Mike Trick and Marc-Andre Carle.

At the reception.

At the conference.

There were pretzels at almost every meal and snack break.

A statue in a square in Aachen.

A square in Aachen.

What looks like a panther statue.

I ran into Belgium and the Netherlands and found the Dreiländerpunkt [three-country point].

The conference bags were pretty snazzy.

A snapshot of me blogging about Mike Trick’s keynote as taken by my laptop. I didn’t realize how serious I look–blogging is a lot of fun, I swear!

The German OR society has a great mascot: a GORilla!


land O links

Here are a few links for your weekend reading.

  1. Wealthy Los Angeles K12 school vaccination rates are as low as Sudan’s. We have a paradox: higher overall vaccination rates and higher vulnerability due to risk caused by social networks choosing not to vaccinate.
  2. A few OR/Stat bloggers have written about FiveThirtyEight and data journalism recently. I like Michael Lopez’s (@StatsByLopez) blog post on where FiveThirtyEight stands after six months and Nathan Brixius’s take on FiveThirtyEight’s burrito challenge.
  3. The sad, gradual decline of the fade-out in popular music.
  4. Athene Donald on imposter syndrome and everyone feeling like an imposter sometime.
  5. I want to try code reviews in lab meetings.

 


Major League Baseball scheduling at the German OR Society Conference

Mike Trick talked about his experience setting the Major League Baseball (MLB) schedule at the 2014 German OR Conference in Aachen, Germany. Mike’s plenary talk had two major themes:

  1. Getting the job with the MLB
  2. Keeping the job with the MLB

The “getting the job” section summarized advances in computing power and integer programming solvers that have made solving large-scale integer programming (IP) models a reality. Mike talked about how he used to generate cuts for his models, but now the solvers (like CPLEX or Gurobi) add a lot of the cuts automatically as part of pre-processing. Over time, Mike’s approach has become popping his models into CPLEX and then figuring out what the solver is doing so he can exploit the tools that already exist.

Side note: I am amazed at how good the integer programming solvers have become. I recently worked on a variation to the set covering model for which a greedy approximation algorithm exists. The time complexity of the greedy algorithm isn’t great in theory. In practice, the greedy algorithm is slower than the solver (Gurobi, I think) and doesn’t guarantee optimality. I can’t believe we’ve come this far.
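For readers who haven’t seen it, here is a minimal Python sketch of the classic greedy heuristic for the standard set covering problem: keep picking the set that covers the most still-uncovered elements. This is the textbook model, not the variation mentioned above, and the station sites and demand zones in the example are made up.

```python
def greedy_set_cover(universe, subsets):
    """Greedy heuristic: repeatedly pick the subset covering the most
    uncovered elements. Returns the names of the chosen subsets."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(subsets, key=lambda name: len(subsets[name] & uncovered))
        if not subsets[best] & uncovered:
            raise ValueError("some elements cannot be covered by any subset")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen


if __name__ == "__main__":
    # Hypothetical instance: which demand zones each candidate site can cover.
    zones = {1, 2, 3, 4, 5, 6}
    sites = {"A": {1, 2, 3, 4}, "B": {1, 2, 5}, "C": {3, 4, 6}}
    print(greedy_set_cover(zones, sites))
```

In this small instance the greedy heuristic grabs site A first because it covers the most zones and then still needs B and C, even though B and C alone cover everything. An integer programming solver would find the two-site solution and prove it is optimal.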

Mike also stressed the importance of finding better ways to formulate the problem to create a better structure for the IP solver. Better formulations can be more complicated and less intuitive, but they can lead to markedly better linear programming bounds. Mike achieved this by replacing a model whose binary variables correspond to team-to-team games (does team i play team j on day t?) with another model whose variables correspond to series (a series is usually 3 games played between teams on consecutive days). Good bounds from the linear programming relaxations help the IP solver find an optimal solution much more quickly. Another innovation focused on improving the schedule by “throwing away” much of the schedule (usually about a month) after making needed changes and re-solving. Again, this is something that is possible due to advances in computing.

The “keeping the job” section addressed business analytics and its role in optimization. Mike defined business analytics as using data to make better decisions, something that OR has always done. What is new is using the power of data analytics and predictive modeling to guide prescriptive integer programming models in a meaningful way. The old way was to use point estimates in integer programming models; the new way uses more information (such as the output of a logistic regression) to guide optimization models. The application Mike used was estimating the value of scheduling home games at different times (day vs. night) and days of the week. When embedded in the optimization modeling framework, the end result was that creating a schedule using business analytics could add about $50M to MLB in revenue.

Mike summed up his talk by talking about how educating the marketing folks is part of the job now. Marketing likes to measure “success” as the number of games that sell out. Operations researchers recognize that sold out games are lost revenue, so the goal has become to schedule games such that games are almost sold out, and to make sure that marketing understands this approach.

Related post:

the craft of scheduling Major League Baseball games

