Today I’ve been doing some work on a project for a Data Assimilation class: an implementation of an ensemble Kalman filter that uses an SEIR model and Google Flu Trends data from 2003 to track flu incidence and model parameters in a big coupled model of the 10 Health and Human Services surveillance regions. When I first put everything together I assumed that each region held just a tenth of the total US population, because it was simpler than trying to track down actual population data. I know this is a pretty bad assumption, and I think it has been causing some inference quirks, like concluding that the outbreak was a complete pandemic in every region (this, plus living in Boulder and having just visited Las Vegas, has caused “The Stand” to loom in the back of my mind a lot over the last week).
Anyways, I’ve been trying to do better with my population estimates. Unfortunately the HHS website was a total bust for easy-to-locate population numbers. Various abuses of Google’s fancy search bar such as “population MA+NY” also turned up bupkis. “What I really want,” I thought to myself, “is software that can interpret my mangled, semi-symbolic queries, search a giant database, and then return the queried value to me. It’d be something like…a…computational knowledge…engine…” Cue flashback to freshman year of undergrad: the “MyMathLab” homework website open in one tab and WolframAlpha in the other, feverishly copy+pasting problems 10 minutes before midnight.
I was actually a pretty big fan of WolframAlpha for my entire undergrad career. At the time I was totally unfamiliar with Mathematica, so having another tool for troubleshooting or double-checking my calculus (especially one that could accept pretty mangled or gnarly input) was invaluable in some of my upper-level physics courses. I even went so far as to buy the phone app; it was only $2.99, but I still think that indicates a certain amount of affection for and loyalty to the software. Iron Man has JARVIS, Holmes has Watson, and I have Stephen Wolfram (apologies to Dr. Wolfram if you are, for some reason, reading this).
Back to the present: I took my search efforts over to WolframAlpha and beheld glorious success. The website can actually accept a query of the form “(Arkansas+Louisiana+New Mexico+Oklahoma,+Texas population in 2003)/(population of United States in 2003)” and return a value (that I’m just going to assume is accurate. Error bars would be mindblowing, but beggars can’t be choosers). That’s more or less the point of this post: WolframAlpha (and Mathematica) really is an amazing product. I’m not sure what the upper bound of sophistication would be if you were to try to fully integrate it into your inference procedures, but even at this level it saves a lot of digging. And now they also make iPhone apps that serve as references for various specialized topics like cat breeds.
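For the curious, here is a minimal sketch of what I do with those numbers once I have them: turn per-state populations into per-region shares that replace the "everyone gets a tenth" assumption. The function name `region_shares`, the toy state labels, and the toy populations are all made up for illustration (they are not real data and not the project's actual code), and I'm assuming shares are normalized by the total across the regions in the model rather than by an independent US total.

```python
def region_shares(state_pop, regions):
    """Return each region's share of the total population across all regions.

    state_pop: dict mapping a state label to its population
    regions:   dict mapping a region name to the list of state labels in it
    """
    # Sum the member states' populations for every region.
    totals = {
        name: sum(state_pop[state] for state in states)
        for name, states in regions.items()
    }
    # Normalize by the combined population of all regions in the model.
    grand_total = sum(totals.values())
    return {name: total / grand_total for name, total in totals.items()}


# Toy inputs: placeholder labels and populations, NOT real census figures.
toy_state_pop = {"state_a": 1.0, "state_b": 2.0, "state_c": 5.0}
toy_regions = {"region_1": ["state_a", "state_b"], "region_2": ["state_c"]}

toy_shares = region_shares(toy_state_pop, toy_regions)
for name, share in toy_shares.items():
    print(f"{name}: {share:.3f}")
```

With real state populations pasted in, multiplying each share by the total population gives the per-region population for the SEIR model, instead of the flat one-tenth I started with.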