big data and operations research
Posted by Laura McLay on January 16, 2013
Sheldon Jacobson and Edwin Romeijn, the OR and SES/MES program directors at NSF, respectively, talked about the role of operations research in the bigger picture of scientific research at the INFORMS Computing Society Conference in Santa Fe last week. Quite often, program managers at funding agencies dole out advice on how to get funded. This is useful, but it doesn’t answer the more fundamental question of why they can only fund so many projects.
Sheldon and Edwin answered this question by noting that OR competes for scientific research dollars with every other scientific discipline. One way both to improve our funding rates and to give back to our field is to make the case that operations research should get a bigger slice of the research funding pie.
Sheldon specifically mentioned OR’s role in “big data.” Most of us work or do research where data plays an integral role, and it seems like this is a great opportunity for our field. I’ve been thinking about the difference between “data” and “big data” in terms of operations research. Big data was a popular term in 2012, even though there is no good definition of how “big” or diverse data must be before it becomes “big data.” NSF had a call for proposals on core techniques for big data. The call summarized how NSF defines big data:
The phrase “big data” in this solicitation refers to large, diverse, complex, longitudinal, and/or distributed data sets generated from instruments, sensors, Internet transactions, email, video, click streams, and/or all other digital sources available today and in the future.
I like this definition of big data, since it acknowledges that the challenges do not lie only in the size of the data; complex data in multiple formats and data that changes rapidly are also included.
I ultimately decided not to write a proposal for this solicitation, but I did earmark it as something to think about for the future. This call required that the innovation be on the big data side, meaning that projects that only use big data in new applications would not be funded. Certainly, OR benefits from a data-rich environment, since it leads to new OR models and methods. Here, data is mainly used as a starting point from which to explore new areas. But this means that there is no innovation on the big data side. Instead, the innovation will be on the OR side. Does big data in OR mean that we will continue to do what we have been doing well, just with bigger data?
This is an open question for our field: how will big data fundamentally change what we do in operations research?
My previous post on whether analytics is necessarily data-driven and whether analytics includes optimization can be viewed as a step towards an answer to this question, but I’m not close to having one. Please let me know what you think.