Monthly Archives: March 2011

land O links

I haven’t had time to read the news much in the last few weeks, but I found a few good links to share this week.

  • Worrying is good for your health. A massive longitudinal study on longevity finds some counter-intuitive (causal?) relationships between what we do and how long we live.
  • Can cities be described by a set of equations? The answer to this ridiculous-sounding question is actually yes. This Wired article qualitatively describes what these equations mean.  Both the good stuff in cities (productivity, income, intelligence, innovation) and the bad stuff in cities (violent crime, drug consumption, disease outbreaks, shoplifting) scale superlinearly with city size, with an exponent of ~1.15.  The NY Times magazine also featured an article about this story.
  • Fellow operations researcher Sanjay Saigal is a guest blogger at the Atlantic this week. His first post was about operations research (woo hoo!).  I enjoyed his post on Indian English.  Check this page for Sanjay’s latest posts.
  • The illustrated guide to the PhD is an amusing visual comic about how getting a PhD makes a small dent in human knowledge.
  • This visual made my day:  If Monet painted Darth Vader (Thanks John D. Cook for the link!)

on sharing code

I was recently asked for some of the code that I used in a paper.  First of all, I should firmly state that people should share code. Openly sharing ideas is, after all, the hallmark of academic research.

I am not often asked for code, and I had a few reactions to a recent request, which was made by a student.  My first reaction was that of a professor: should I give the student a fish or teach the student how to fish?  Usually when I am asked for code, the request is extremely short with no context given, meaning that it is hard for me to gauge how hard the student has tried to get the program to work.  Was there a typo in the paper that is causing a problem? Where are they stuck?  What types of programming errors are they getting?  Are they simply being lazy?  I just don’t know.

This particular request was for code that was about 20 lines long.   Most of my projects are longer, usually comprising hundreds or thousands of lines of code (I don’t really count, but they can get hairy).  I found myself wondering if the requester–a graduate student–really had thought about how to write the code or was just emailing me.  I exchanged a few emails with the requester before sending my code to make sure that I wasn’t doing someone’s homework for them.

Another recent request asked for code in a paper that I did the computational work for when I was in graduate school.  I looked at the pseudo-code in the paper draft and wrote the code.  It took me a day or so to get my code working, but it wasn’t particularly painful.  In this case, I was confident that the paper was clear and unambiguous about how to write the code.

My second concern about sharing code is–and I’m being honest here–that my code is one giant hack.  I am not a software programmer.  I know what good, elegant code looks like, and it’s not mine.  I often have to cobble together multiple programs to solve a problem from beginning to end.  I often write a script to run many copies of the same program with different inputs.  I always write code for analyzing the solutions and creating figures.  Over the years, I have gotten good at making my code readable to me, so that I can come back to it after months or years and figure out what I did.  But that’s not the same thing as being readable to someone else.  This is a long way of saying that I’m a little embarrassed about sharing my code with others.  Maybe I’m just prudent and am being too hard on myself.  But I am married to a software programmer, so I am very aware of how high the bar really is for “good” code.

Having someone look at my code is like inviting someone into my house without straightening up first. It’s one thing to show my messy code to a collaborator but it’s another thing to show my messy code to a stranger.  Sharing papers and tech reports is different–they are polished so they are OK to share.  This can be somewhat addressed by commenting code better.  I always start off commenting code well, but during the fog of debugging, my code usually gets a little out of control, and it’s hard to rein in after a while.  (I’ve seen other people’s code.  I have some good programming habits–my code could be much, much worse).

However, as I am learning in my discrete optimization course this semester, even simple programming assignments such as implementing the Secretary Problem Markov decision process model can be incredibly difficult for PhD students.  They can benefit from looking at my code.  My homework solution code isn’t as wild and unruly as my research code.  I’m getting used to sharing my code for the homework solutions.

On a related note, this post by Panos Ipeirotis reflects on how to make code more robust to changes, since old code often does not run if it relies on old libraries.  Dr. Ipeirotis is a computer scientist, and it sounds like he writes more elegant code than I do.  I’m still at square one, meaning that I am still trying to make my code readable to someone else.

How self-conscious are you about sharing your code?

 


the first supercomputer was powered by women

I stumbled across an article on ENIAC, the world’s first supercomputer, which was built by the Army and unveiled in 1946.  It summarizes twelve factoids about ENIAC. I found these two the most interesting:

5. The original programmers of ENIAC computer were women. The most famous of the group was Jean Jennings Bartik (originally Betty Jennings). The other five women were Kay McNulty, Betty Snyder, Marlyn Wescoff, Fran Bilas, and Ruth Lichterman. All six have been inducted into the Women in Technology International Hall of Fame. When the U.S. Army introduced the ENIAC to the public, it introduced the inventors (Dr. John Mauchly and J. Presper Eckert), but it never introduced the female programmers.

6. Jean Bartik went on to become an editor for Auerbach Publishers, and eventually worked for Data Decisions, which was funded by Ziff-Davis Publishing. She has a museum in her name at Northwest Missouri State University in Maryville, Missouri.

Kudos to the women of ENIAC and other women of supercomputing fame!

The women of ENIAC


love & operations research

My February blog post is a bit belated and less technical than it should be.  It has been a busy February for me as I prepare for the arrival of my third child.  I am sure that I will have a few blog posts about love after my little bundle of joy arrives some time during March.

I thoroughly enjoyed Anna Nagurney’s photo essay on why she loves OR as well as Mike Trick’s personal account of finding (and not finding) love via the Secretary problem.  Mike Trick’s blog post reminded me of my father’s criteria for finding a spouse:  he made a list and searched until he met someone who met all of his criteria.  He was chided by many for making a list.  How could anyone meet all of his criteria?  My father is apparently a satisficer, not an optimizer.  It turned out that his problem was not over-constrained and infeasible: he met and married my mother, and they will be celebrating their 40th wedding anniversary later this year.

I tell this story about my father because I did not use operations research in choosing a spouse.  Maybe I didn’t need to–my father set such a wonderful example about laying out the right criteria that I think I had a good road map to start with.

I tend to use operations research to run my household so that I can optimally balance work and family life (I will blog about that in the near future).  I have been known to mention critical paths, bottlenecks, and value-focused thinking during breakfast conversations about planning a hectic day with my husband.  He doesn’t always appreciate it when I try to optimize him, but as a nerd, he always knows that my intentions are good.  And frequently my ideas about scheduling help us budget our time on a busy day.  However, my INFORMS blog challenge post about love can be summarized in a single sentence: Do not try to optimize your spouse.

On a somewhat related note, I taught the Secretary problem to my discrete optimization course in February.  As mentioned by Mike Trick, the Secretary problem can be used to model finding a spouse (although we’d both agree that it’s not a perfect model for finding love).  The optimal solution to the Secretary problem involves rejecting roughly the first n*exp(-1) candidates (a proportion exp(-1) ~ 0.368 of the total number of candidates n), meaning that the probability of rejecting the love of your life is ~0.368!  However, this is an asymptotic result in terms of the number of love interests.  Most of us mere mortals will end up dating a finite number of people, so we will fare somewhat better (unless we choose a suboptimal strategy).
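For readers who want to see where the ~0.368 comes from, here is a short worked version of the standard textbook argument. It is not taken from my course notes; the symbols r (the number of candidates rejected up front) and P_n(r) (the probability of ending up with the best of n candidates) are notation introduced just for this sketch.

```latex
P_n(r) \;=\; \sum_{k=r+1}^{n} \frac{1}{n}\cdot\frac{r}{k-1}
       \;=\; \frac{r}{n}\sum_{k=r+1}^{n}\frac{1}{k-1},
       \qquad 1 \le r < n .
```

Candidate k is the best overall with probability 1/n, and the rule actually reaches candidate k only if the best of the first k-1 candidates was among the r rejected ones, which has probability r/(k-1). Letting r/n tend to x as n grows, the sum tends to -x ln x, which is maximized at x = exp(-1) with value exp(-1) ~ 0.368. For small n the best achievable probability is higher: for example, P_3(1) = 1/2 and P_4(1) = 11/24 ~ 0.458.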

I required my students to program the Markov decision process model for the Secretary Problem as a homework assignment.  This meant that I had to do the same to be able to go over the solutions.  This homework problem turned out to be really challenging for them, since it was their first time writing a recursion.  I hadn’t written a recursion in a while, but got my code to work after a few minutes.  Just for kicks, here are the optimal solution value, the optimal threshold, and a simulation over 10,000 replications illustrating the results.

The optimal policy for the secretary problem/spouse finding problem. This shows the number of candidates to reject up front. After that, the optimal policy is to hire/marry the next candidate that is the best seen so far (the blue line shows the first candidate that could be hired/married).

The probability of hiring the best secretary/choosing the right spouse as a function of the number of candidates available (n)
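For anyone who wants to reproduce the optimal threshold and the optimal solution value, here is a minimal sketch of one way to write the backward recursion in Python. It is not my homework solution code; the function name `solve_secretary` and the array `w` are made up for illustration, and I use exact fractions so that small cases can be checked by hand.

```python
from fractions import Fraction


def solve_secretary(n):
    """Backward recursion for the secretary problem with n candidates.

    w[k] is the optimal probability of ending up with the best candidate,
    given that the first k candidates have been seen and rejected.
    Returns (optimal value, number of candidates to reject up front).
    """
    w = [Fraction(0)] * (n + 1)      # w[n] = 0: nobody is left to hire
    n_reject = 0
    for k in range(n - 1, -1, -1):
        j = k + 1                    # position of the next candidate
        accept = Fraction(j, n)      # P(a best-so-far at position j is best overall)
        reject = w[j]                # value of passing on candidate j
        if reject > accept:
            n_reject += 1            # pass on j even if j is the best so far
        p_best_so_far = Fraction(1, j)
        w[k] = p_best_so_far * max(accept, reject) + (1 - p_best_so_far) * reject
    return w[0], n_reject


if __name__ == "__main__":
    for n in (3, 4, 10, 100):
        value, n_reject = solve_secretary(n)
        print(n, n_reject, float(value))
```

The loop runs from the last candidate back to the first, which is the same recursion written without recursive function calls. Ties between accepting and rejecting are broken in favor of accepting, an arbitrary choice that does not change the optimal value.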

The probability of hiring a secretary/finding a spouse and the probability of hiring the *best* secretary/choosing the right spouse as a function of the number of candidates (n) simulated over 10,000 replications
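A simulation along these lines can be sketched in a few lines of Python. This is an illustrative version rather than the code behind the figure; the function name `simulate_threshold_rule`, its parameters, and the choice of seed are assumptions made for the example.

```python
import random


def simulate_threshold_rule(n, r, reps=10_000, seed=0):
    """Simulate 'reject the first r, then hire the next best-so-far'.

    Returns (fraction of runs in which someone is hired,
             fraction of runs in which the best candidate is hired).
    """
    rng = random.Random(seed)
    hired = 0
    hired_best = 0
    for _ in range(reps):
        ranks = list(range(n))                   # rank 0 is the best candidate
        rng.shuffle(ranks)                       # random arrival order
        best_seen = min(ranks[:r], default=n)    # best rank among the rejected
        for rank in ranks[r:]:
            if rank < best_seen:                 # first candidate to beat everyone so far
                hired += 1
                hired_best += (rank == 0)
                break
    return hired / reps, hired_best / reps


if __name__ == "__main__":
    p_hire, p_best = simulate_threshold_rule(n=20, r=7)
    print(p_hire, p_best)
```

With r >= 1, nobody is hired exactly when the best candidate falls among the first r, so the hiring probability should be close to 1 - r/n, while the probability of hiring the best candidate should be close to (r/n) * (1/r + ... + 1/(n-1)).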

 

