great mathy popular science books

At the end of the semester, I often recommend fun popular science books to my students about how to approach problems and make better decisions using math, operations research, and critical and quantitative reasoning. My list is growing. Here it is, in no particular order.

~~~

Moneyball: The Art of Winning an Unfair Game by Michael Lewis. This is a great introduction to building models, collecting data, finding “bargains” in the market, drawing conclusions from the models, and differentiating between a good process and good outcomes. Everyone should read Moneyball.

Scorecasting: The Hidden Influences Behind How Sports are Played and Games are Won by Tobias Moskowitz and Jon Wertheim. This is a more mathy book than Moneyball that is hard to put down. Tobias and Jon dissect many papers that quantify sports and sports decision making. They address home field advantage, umpire bias, metrics that are not useful (such as blocked shots in basketball), and elasticity in fandom.

In Pursuit of the Traveling Salesman by Bill Cook. This is the most specialized book on the list, but do not be intimidated. It is highly accessible and is well worth your while. Bill does a wonderful job explaining optimization concepts (often using pictures) and introducing you to the people who made scientific breakthroughs. I thought that maybe one chapter might be tough for someone who is unfamiliar with optimization, but even in that chapter Bill does a superb job of walking the reader through the steps of various algorithms. I recommend the print version for following along with the many figures and pictures in the book. Read more in my review here.

How Not to be Wrong: The Power of Mathematical Thinking by Jordan Ellenberg. This was my favorite book from 2014. This book is a joy to read for anyone who is even remotely mathematically literate. Jordan’s writing is fun to read and his examples are very relevant. I loved the parts about how not all lines are straight lines – some are curves. This book was well-reviewed by various newspapers. I like Scientific American blogger Evelyn Lamb’s more mathy review.

The Signal and the Noise: Why So Many Predictions Fail — But Some Don’t by Nate Silver. This is a popular science book about prediction and science communication. There are many good takeaways about election forecasting, accuracy of weather predictions, Sabermetrics (Moneyball!), and online poker. It’s a fun read, but I found the mathy parts to be somewhat shallow. I prefer How Not to be Wrong, where Jordan did a better job digging into the math while also remaining accessible.

The Black Swan: The Impact of the Highly Improbable by Nassim Nicholas Taleb. I’ll be honest, I have a love-hate relationship with this one (my review simply stated “it was OK”) but it’s worth a read. The discussion of extreme risks is very good, but Taleb is too critical of other people’s models. He loses sight of the fact that all models are wrong, but some are useful.

Traffic: Why We Drive the Way We Do (and What It Says About Us) by Tom Vanderbilt. This is a delightful book about networks and the psychology of driving that I suspect will appeal to my blog readers. Tom is a journalist, and is really good at writing about science. I reviewed the book here and have another post here. Traffic contains many interesting tidbits of knowledge that make for good chit chat during awkward party conversations (I’m not always the life of the party, but Traffic helps!).

~~~

I’m probably leaving something out. What are your favorite popular science/math/operations research books?


some students don’t learn a whole lot in college

A few years ago, researchers Richard Arum and Josipa Roksa released a book called “Academically Adrift” that claims that many students don’t leave college with new knowledge and new skills [Link to an article in the Chronicle]. Here is what they found:

Growing numbers of students are sent to college at increasingly higher costs, but for a large proportion of them the gains in critical thinking, complex reasoning, and written communication are either exceedingly small or empirically nonexistent. At least 45 percent of students in our sample did not demonstrate any statistically significant improvement in Collegiate Learning Assessment [CLA] performance during the first two years of college. [Further study has indicated that 36 percent of students did not show any significant improvement over four years.]

The CLA is a proxy measure for what students learned during college. This suggests that more than a third of college students do not demonstrate any improvement in critical thinking during college. This is a tragedy. College is expensive.

Now there are a few things to note. Most obviously, these results are averaged across all students in all majors at all universities. Your mileage may vary. I share this information with my students on the first day of class. I challenge my students, but I think they will get their money’s worth from my class and will leave with tangible improvements in critical thinking and complex reasoning (and sometimes written communication, but I could do more with writing).

Students who did the best didn’t always go to the best universities (but that helps). Students with high levels of learning:

  • studied alone (yay for introverts!)
  • had professors with high academic expectations
  • studied traditional liberal arts and sciences (as compared to business, education and communications).

This suggests the only thing I can do as a professor is to have high expectations for students (and to give assignments that raise these expectations).

I realize that in the big picture, some programs are quite competitive and attract the types of students who like being challenged and, as a result, are challenged. But I realize it’s more complicated than this: there is a push and pull between professors and students about expectations. See this article about a teaching assistant at Columbia who inflated grades because so many students complained. It’s also been widely reported that college students study much less than they used to. In general, these researchers found that professors do not expect much of the students and assign almost no homework.

A follow-up report is out [Link] called “Aspiring Adults Adrift.” The authors found that the same students who didn’t learn much in college continue to struggle with employment afterward. What they find is really interesting. The same students that didn’t do well on the CLA were more likely to be unemployed, underemployed, employed in a job with low skill requirements, and laid off. In other words, employers are good at recognizing who developed more skills in college and who didn’t.

The research suggests that some students don’t want to be challenged or to learn; they just want a degree. It’s not fun to “teach” students who don’t want to learn anything.

Interestingly, the students themselves cannot tell if they’ve learned a lot in college. They all assume they’ve learned a lot! This is not good. It implies that students are not good consumers when it comes to investing in their educations, and don’t see the implications of taking blow-off courses or choosing easy programs. (Side note: this is a reason why students should not estimate how much they’ve learned in end-of-semester teaching evaluations.) The article ends with an important point:

Yet those same students continue to believe they got a great education, even after two years of struggle [after graduation]. This suggests a fundamental failure in the higher education market — while employers can tell the difference between those who learned in college and those who were left academically adrift, the students themselves cannot.

Finally, a correction at the end of the NY Times article made me cringe:

“An earlier version of this article incorrectly used a male courtesy title for Josipa Roksa. She is a woman.”

I am curious about how you challenge students in tough classes. I’ve been given a lot of teaching advice over the years, and most of it hasn’t been very useful or practical (“Just be an extroverted man with CEO hair and you’ll do great!”). Teaching is definitely all about managing expectations, and I’d like to do that without caving and giving everyone an A (I don’t!). I’m sure I have a lot to learn from my readers who I know teach a lot of “hard” courses.


5 links about women and girls in STEM

  1. A new report called Women in Academic Science: A Changing Landscape suggests that women who persevere and stay in a STEM field will find a level playing field.
  2. Why do women undergraduate students change major after taking Calculus I? Not for the same reasons as the men.
  3. The share of women in computer science started falling at roughly the same moment when personal computers featuring games marketed almost entirely to men and boys started showing up in U.S. homes in significant numbers.
  4. The number of high schools offering AP computer science courses has fallen 35% in recent years.
  5. The princess trap: Our daughter is getting into dolls and dress-up. What are programmer parents to do? A somewhat controversial take on raising daughters. My take on this article is in my tweet below:


Thinking about getting a PhD? Here are some good resources.

Part of my job is to help students figure out if grad school is for them. Over the years, I’ve accumulated a few great resources for students thinking about a PhD. Here are some of my favorites.

Here is a link to the first of a series of ten blog posts by Jean Yang, who decided to leave Google to attend grad school in CS at MIT.

Philip Guo released an e-book on the PhD experience called The PhD Grind and has a 45 minute lecture on why to consider a PhD in CS. He has a bunch of other good posts about academic life as both a student and an assistant professor. He has a post specifically on applying to grad school.

Tim Hopper has a great series about whether one should consider a PhD in a technical field. I like this series because a broad range of people participated and not everyone encourages a PhD.

Chris Chambers’ blog post called “Tough Love: an insensitive guide to thriving in your PhD” has a lot of frank advice for determining whether a PhD is for you.

Below is a slideshare presentation I put together for the VCU math club a few years ago, since several seniors were planning to apply to graduate programs.

This is a woefully incomplete list, but I don’t want to delay posting it any longer. What other useful resources are out there?


how to seat guests at a wedding

SAS started an operations research blog [Link]. Matthew Galati’s first entry is how to optimally seat people at a wedding given assignment preferences. He provides a model that maximizes the total happiness of his guests. His blog post has code, data, and pictures of a quirky family member or two. It’s a great post worth checking out.

I’ve written about optimization for weddings before. I blogged about a paper entitled “Finding an Optimal Seating Chart” in the Annals of Improbable Research by Meghan L. Bellows and J. D. Luc Peterson, which shows how to use integer programming to optimally seat guests at a wedding [blog post & paper]. The reader comments on this post are really interesting: many people have used a similar modeling approach.
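To make the "maximize total happiness" idea concrete, here is a tiny Python sketch. It is not the integer programming model from either post: it simply tries every seating of a made-up four-person guest list and keeps the best one, which only works for toy sizes. The guest names, the pairwise happiness scores, and the table size are all invented for illustration.

```python
from itertools import permutations

# Hypothetical guests and pairwise "happiness" scores (invented for illustration).
guests = ["Ann", "Bob", "Cat", "Dan"]
happiness = {
    ("Ann", "Bob"): 5,
    ("Ann", "Cat"): -2,
    ("Ann", "Dan"): 1,
    ("Bob", "Cat"): 0,
    ("Bob", "Dan"): -4,
    ("Cat", "Dan"): 3,
}
TABLE_SIZE = 2  # seats per table (assumed)


def pair_score(a, b):
    """Happiness of seating a and b together; the order of the pair does not matter."""
    return happiness.get((a, b), happiness.get((b, a), 0))


def table_score(table):
    """Sum of pairwise happiness over everyone seated at one table."""
    return sum(pair_score(a, b) for i, a in enumerate(table) for b in table[i + 1:])


def best_seating(guests, table_size):
    """Brute force: try every ordering, cut it into tables, keep the best total."""
    best_total, best_tables = float("-inf"), None
    for order in permutations(guests):
        tables = [order[i:i + table_size] for i in range(0, len(order), table_size)]
        total = sum(table_score(t) for t in tables)
        if total > best_total:
            best_total, best_tables = total, tables
    return best_total, best_tables


total, tables = best_seating(guests, TABLE_SIZE)
print(total, tables)
```

The number of orderings grows factorially with the guest list, which is exactly why a real wedding calls for an integer programming solver rather than enumeration.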

Geoffrey De Smet provided a link to a wedding planner on github.


how to forecast an election using simulation: a case study for teaching operations research

After extensively blogging about the 2012 Presidential election and analytical models used to forecast the election (go here for links to some of these old posts), I decided to create a case study on Presidential election forecasting using polling data. This blog post is about this case study. I originally developed the case study for an undergraduate course on math modeling that used Palisade Decision Tools like @RISK. I retooled the spreadsheet for my undergraduate course in simulation in Spring 2014 to not rely on @RISK. All materials are available in the Files section below.

The basic idea is that there are a number of mathematical models for predicting who will win the Presidential Election. The most accurate (and the most popular) use simulation to forecast the state-level outcomes based on state polls. The most sophisticated models like Nate Silver’s 538 model incorporate things such as poll biases, economic data, and momentum. I wanted to incorporate poll biases.

For this case study, we will look at state-level poll data from the 2012 Presidential election. The spreadsheet contains realistic polling data from before the election. Simulation is a useful tool for translating the uncertainty in the polls to potential election outcomes.  There are 538 electoral votes: whoever gets 270 or more votes wins.

Assumptions:

  1. Everyone votes for one of two candidates (i.e., no third party candidates – every vote that is not for Obama is for Romney).
  2. The proportion of votes that go to a candidate is normally distributed according to a known mean and standard deviation in every state. We will track Obama’s proportion of the votes since he was the incumbent in 2012.
  3. Whoever gets more than 50% of the votes in a state wins all of the state’s electoral votes. [Note: most but not all states do this].
  4. The votes cast in each state are independent, i.e., the outcome in one state does not affect the outcomes in another.
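For readers who would rather see the assumptions as code than as spreadsheet formulas, here is a minimal Python sketch of a single simulated election. The three states and their numbers are placeholders I made up, not the data in the case study spreadsheet.

```python
import random

# Hypothetical inputs: state -> (Obama poll mean, poll std dev, electoral votes).
# These are placeholders, not the values in the case study spreadsheet.
STATES = {
    "State A": (0.55, 0.03, 10),
    "State B": (0.49, 0.03, 20),
    "State C": (0.51, 0.02, 15),
}


def simulate_once(states, rng):
    """Return Obama's electoral votes for one simulated election."""
    obama_ev = 0
    for mean, sd, ev in states.values():
        # Assumption 2: Obama's vote share is normally distributed.
        # Assumption 4: each state is drawn independently of the others.
        share = rng.gauss(mean, sd)
        # Assumptions 1 and 3: two candidates, and the winner takes all.
        if share > 0.5:
            obama_ev += ev
    return obama_ev


rng = random.Random(1)  # fixed seed so the run is repeatable
print(simulate_once(STATES, rng))
```

Each call to `simulate_once` is one replication; the case study repeats this many times and summarizes the results.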

There is some concern that the polls are biased in four of the key swing states (Florida, Pennsylvania, Virginia, Wisconsin). A bias means that the poll average for Obama is too high. Let’s consider biases of 0%, 0.5%, 1%, 1.5%, and 2%, with all four states affected by the same bias level at the same time. For example, the poll mean for Wisconsin is 52%, so the mean used in the simulation would range from 52% (no bias) down to 50% (2% bias). Side note: Obama was such an overwhelming favorite that it only makes sense to look at biases that work in his favor.
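The bias adjustment itself is just a subtraction applied to the four swing states. Here is a small Python sketch, working in percentage points; the function name and the "Some other state" entry are mine, while the Wisconsin mean and the bias levels come from the text above.

```python
BIAS_LEVELS = [0.0, 0.5, 1.0, 1.5, 2.0]  # percentage points, from the text
BIASED_STATES = {"Florida", "Pennsylvania", "Virginia", "Wisconsin"}


def adjusted_mean(state, poll_mean_pct, bias_pct):
    """Subtract the bias from Obama's poll mean, but only in the four swing states."""
    if state in BIASED_STATES:
        return poll_mean_pct - bias_pct
    return poll_mean_pct


for bias in BIAS_LEVELS:
    # Wisconsin's 52% poll mean is taken from the text.
    print(bias, adjusted_mean("Wisconsin", 52.0, bias))

# A state outside the four swing states keeps its poll mean (hypothetical example).
print(adjusted_mean("Some other state", 55.0, 2.0))
```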

It is very difficult to find polls that are unbiased. Nate Silver of FiveThirtyEight wrote about this issue in “Registered voter polls will (usually) overrate Democrats”: http://fivethirtyeight.com/features/registered-voter-polls-will-usually-overrate-democrats/

Inputs:

  1. The poll statistics of the mean and standard deviation for each state.
  2. The number of electoral votes for each state.

Outputs:

  1. The total number of electoral votes for Obama
  2. An indicator variable to capture whether Obama won the election.

Tasks:

(1) Using the spreadsheet, simulate the proportion of votes in each state that are for Obama for each of the 5 bias scenarios. Run 200 replications for each scenario. For each replication, determine the number of electoral votes in each state that go to Obama and Romney and who won.

(2) Paste the model outputs (the average and standard deviation of the number of electoral votes for Obama and the probability that Obama wins) for each of the five bias scenarios into a table.

(3) What is the probability of a tie (exactly 269 votes)?
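If you want to try the three tasks outside of Excel, the following Python sketch runs 200 replications for each of the five bias levels and tabulates the outputs. Treat the inputs as placeholders: only Wisconsin's 52% mean comes from this post. The other swing-state means, all of the standard deviations, and the two lumped "rest of the country" blocks are invented so that the electoral votes add up to 538. Lumping states into blocks is a shortcut for the sketch only; the spreadsheet models every state separately.

```python
import random
import statistics

# Placeholder inputs: (Obama poll mean %, std dev %, electoral votes).
# Only Wisconsin's 52% mean is from the post; everything else is invented.
STATES = {
    "Florida": (50.5, 2.0, 29),
    "Pennsylvania": (52.5, 2.0, 20),
    "Virginia": (51.0, 2.0, 13),
    "Wisconsin": (52.0, 2.0, 10),
    "Rest, leaning Obama": (56.0, 2.0, 240),   # lumped block (hypothetical)
    "Rest, leaning Romney": (44.0, 2.0, 226),  # lumped block (hypothetical)
}
SWING = {"Florida", "Pennsylvania", "Virginia", "Wisconsin"}
BIAS_LEVELS = [0.0, 0.5, 1.0, 1.5, 2.0]
REPLICATIONS = 200


def obama_votes(bias, rng):
    """One replication: Obama's electoral vote total under a given poll bias."""
    total = 0
    for state, (mean, sd, ev) in STATES.items():
        if state in SWING:
            mean -= bias  # polls overstate Obama in the swing states
        if rng.gauss(mean, sd) > 50.0:  # winner takes all
            total += ev
    return total


def run_scenario(bias, rng, n=REPLICATIONS):
    """Tasks 1-3: replicate n times and summarize the outputs."""
    totals = [obama_votes(bias, rng) for _ in range(n)]
    return {
        "bias": bias,
        "mean_ev": statistics.mean(totals),
        "sd_ev": statistics.stdev(totals),
        "p_win": sum(t >= 270 for t in totals) / n,
        "p_tie": sum(t == 269 for t in totals) / n,
    }


rng = random.Random(2012)  # fixed seed so the table is repeatable
results = [run_scenario(b, rng) for b in BIAS_LEVELS]
for row in results:
    print(row)
```

With only 200 replications the estimates are noisy, so expect the win probabilities to wobble from seed to seed. That noise is itself a nice discussion point about how many replications are enough.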

Modeling questions to think about:

  1. Obama took 332 electoral votes compared to Romney’s 206. Do you think that this outcome was well-characterized in the model or was it an unexpected outcome?
  2. Look at the frequency plot of the number of electoral votes for Obama (choose any of the simulations). Why do some electoral vote totals like 307, 313, and 332 occur more frequently than the others?
  3. Why do you think a tiny bias in 4 states would disproportionately affect the election outcomes?
  4. How do you think the simplifying assumptions affected the model outputs?
  5. No model is perfect, but an imperfect model can still be useful. Do you think this simulation model was useful?

RESULTS

I don’t give the results to my students ahead of time, but here is a figure of the results using @RISK. The students can see how small changes in poll bias can drastically affect the outcomes. With no bias, Obama has a 98.3% chance of winning and with a 2% bias in a mere four swing states, Obama’s chances go down to 79.3%.

@RISK output for the election model. The histogram shows the distribution of electoral votes for the unbiased results. The table below tabulates the results for different levels of bias.

Files.

Here are the instructions, the Excel spreadsheet for Monte Carlo simulation, and the Excel spreadsheet that can be used with @RISK.


election analytics roundup

Here are a few election related links:

I’ve blogged about elections a lot before. Here are some of my favorites:

