What is the Spatiotemporal Epidemiological Modeler (STEM)?
The Spatiotemporal Epidemiological Modeler (STEM) tool is designed to help scientists and public health officials create and use spatial and temporal models of emerging infectious diseases. These models can aid in understanding and potentially preventing the spread of such diseases.
Policymakers responsible for strategies to contain disease and prevent epidemics need an accurate understanding of disease dynamics and the likely outcomes of preventive actions. In an increasingly connected world with extremely efficient global transportation links, the vectors of infection can be quite complex. STEM facilitates the development of advanced mathematical models, the creation of flexible models involving multiple populations (species) and interactions between diseases, and a better understanding of epidemiology.
How does it work?
The STEM application has built-in Geographic Information System (GIS) data for almost every country in the world. It comes with data about country borders, populations, shared borders (neighbors), interstate highways, state highways, and airports. This data comes from various public sources.
STEM is designed to make it easy for developers and researchers to plug in their choice of models. It comes with spatiotemporal Susceptible/Infectious/Recovered (SIR) and Susceptible/Exposed/Infectious/Recovered (SEIR) models pre-coded with both deterministic and stochastic engines.
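To make the distinction between the deterministic and stochastic engines concrete, here is a minimal, non-spatial SIR sketch in Python. It is not STEM code: the function names, the parameters `beta` (transmission rate) and `gamma` (recovery rate), and the chain-binomial form of the stochastic step are all assumptions chosen for illustration.

```python
import math
import random


def sir_deterministic_step(s, i, r, beta, gamma, dt=1.0):
    """Advance S, I, R by one forward-difference step of the SIR equations."""
    n = s + i + r
    new_infections = beta * s * i / n * dt
    new_recoveries = gamma * i * dt
    return (s - new_infections,
            i + new_infections - new_recoveries,
            r + new_recoveries)


def sir_stochastic_step(s, i, r, beta, gamma, rng, dt=1.0):
    """Advance integer S, I, R by one chain-binomial step (illustrative choice)."""
    n = s + i + r
    # Per-individual probabilities of infection and recovery during dt.
    p_infect = 1.0 - math.exp(-beta * i / n * dt)
    p_recover = 1.0 - math.exp(-gamma * dt)
    # One Bernoulli draw per susceptible and per infectious individual.
    new_infections = sum(rng.random() < p_infect for _ in range(s))
    new_recoveries = sum(rng.random() < p_recover for _ in range(i))
    return (s - new_infections,
            i + new_infections - new_recoveries,
            r + new_recoveries)


def run(step, state, steps, **kwargs):
    """Apply `step` repeatedly and return the list of visited states."""
    history = [state]
    for _ in range(steps):
        state = step(*state, **kwargs)
        history.append(state)
    return history


# Hypothetical parameter values, chosen only for the demonstration.
deterministic = run(sir_deterministic_step, (990.0, 10.0, 0.0), 50,
                    beta=0.5, gamma=0.1)
stochastic = run(sir_stochastic_step, (990, 10, 0), 50,
                 beta=0.5, gamma=0.1, rng=random.Random(1))
print("deterministic final state:", deterministic[-1])
print("stochastic final state:   ", stochastic[-1])
```

The deterministic run always produces the same trajectory, while the stochastic run varies with the random seed; both keep the total population constant because this sketch has no births or deaths.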
The parameters in any model are specified in XML configuration files. Users can easily change the weight or significance of various disease vectors (such as highways, shared borders, and airports). Users can also define their own disease vectors. Further details are available in the user manual and design documentation.
The original version of STEM was available for download on IBM's alphaWorks. It contained easy-to-follow instructions and many examples (various diseases and maps of the world).
New developers who want to work on STEM can find useful tools, conventions, and design information in the Welcome STEM Developers article.
The STEM source code is hosted in the Eclipse Technology code repository.
More About STEM
STEM takes advantage of Equinox (the Eclipse implementation of OSGi) to make all of its components available as Eclipse plug-ins. This means that both the models and the data that come with STEM are reusable, exchangeable, extendable, and, if you don't like them, replaceable.
The disease models in STEM are based on standard compartment models at the level of Anderson and May. Today we have both stochastic and deterministic implementations of SI, SIR, and SEIR models. We are also developing models for specific diseases, including a global model for seasonal influenza. Any model in STEM requires a solver to integrate the differential equations. Today we have both Finite Difference and Runge-Kutta mathematical solvers for every model. The solvers themselves are provided as plug-ins.
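The idea of a solver that can be swapped independently of the model can be sketched as follows. This is an illustration only, not the STEM plug-in interface: the function names are hypothetical, the finite difference solver is assumed here to be a forward (Euler) difference, and the Runge-Kutta solver is the classical fourth-order scheme.

```python
import math


def sir_derivatives(state, beta, gamma):
    """Right-hand side of the SIR equations for state = (S, I, R)."""
    s, i, r = state
    n = s + i + r
    infection = beta * s * i / n
    recovery = gamma * i
    return (-infection, infection - recovery, recovery)


def euler_step(f, state, dt):
    """Forward finite-difference step: y + dt * f(y)."""
    k1 = f(state)
    return tuple(x + dt * k for x, k in zip(state, k1))


def rk4_step(f, state, dt):
    """Classical fourth-order Runge-Kutta step."""
    def shifted(k, scale):
        return tuple(x + scale * dk for x, dk in zip(state, k))

    k1 = f(state)
    k2 = f(shifted(k1, dt / 2))
    k3 = f(shifted(k2, dt / 2))
    k4 = f(shifted(k3, dt))
    return tuple(x + dt / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate(step, f, state, dt, steps):
    """Integrate dy/dt = f(y) using whichever step function is plugged in."""
    for _ in range(steps):
        state = step(f, state, dt)
    return state


# Hypothetical parameters; the same model is run with either solver.
def model(state):
    return sir_derivatives(state, beta=0.5, gamma=0.1)


start = (990.0, 10.0, 0.0)
print("finite difference:", integrate(euler_step, model, start, 1.0, 30))
print("Runge-Kutta:      ", integrate(rk4_step, model, start, 1.0, 30))

# Accuracy check on dy/dt = -y, whose exact solution at t = 1 is exp(-1).
def decay(y):
    return (-y[0],)


exact = math.exp(-1.0)
print("Euler error:", abs(integrate(euler_step, decay, (1.0,), 0.1, 10)[0] - exact))
print("RK4 error:  ", abs(integrate(rk4_step, decay, (1.0,), 0.1, 10)[0] - exact))
```

Because the model only exposes its derivatives, either step function can be used without changing the model, which mirrors the separation between models and solvers described above.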
Regardless of the computational approach used to predict the future state of a disease, any simulation or model requires "denominator" data. The numerator data usually comes from public health data on specific reportable conditions. You can think of numerator data as an "initial condition" or input into a model. Historic (numerator) data can be used to validate a model. However, every model also needs a large variety of denominator data. This includes information on the population itself as well as information on the vectors of disease transmission. Depending on the disease of interest, these vectors might include a transportation model or even a model of a migrating wild animal population.
The Eclipse framework gives users a simple "plug and play" software architecture with a drag-and-drop interface, so when composing a new model or scenario, users can drag specific data of interest into their model and literally "compose" a new scenario.
The data sets distributed with STEM include geography, transportation systems (e.g., roads, air travel), and population for the 244 countries and dependent areas defined by the International Organization for Standardization (ISO). We make no guarantees about the accuracy of the data provided, and users may certainly choose to plug in their own data, replacing what comes with STEM. The data provided comes from open public sources including:
- The CIA World Factbook
- United Nations Environment Programme (UNEP)
- The U.S. Census Bureau TIGER files
- National population census facts, which can also be found through http://www.census.gov/aboutus/stat_int.html
The population in STEM is based on census data for the administrative divisions that correspond to the polygons visible on the STEM maps. Where possible, administrative data was included down to ISO-3166 administrative level 2 (the equivalent of counties in the U.S.). In some cases we were only able to obtain data at administrative level 0 (country-level data). For those instances where we do have admin 2 data, the relative population distributions between subdivisions within the countries were validated against the ORNL LandScan 2007™/UT-Battelle, LLC data set. The LandScan 2007™ High Resolution Global Population Data Set is copyrighted by UT-Battelle, LLC, operator of Oak Ridge National Laboratory under Contract No. DE-AC05-00OR22725 with the United States Department of Energy. The United States Government has certain rights in this Data Set. NEITHER UT-BATTELLE, LLC NOR THE UNITED STATES DEPARTMENT OF ENERGY, NOR ANY OF THEIR EMPLOYEES, MAKES ANY WARRANTY, EXPRESS OR IMPLIED, OR ASSUMES ANY LEGAL LIABILITY OR RESPONSIBILITY FOR THE ACCURACY, COMPLETENESS, OR USEFULNESS OF THE DATA SET.
STEM itself does not include the LandScan™ data. In fact, in all cases the total population at the national level is based on national census facts by country and not on LandScan™. Almost all the census facts are referenced to (or scaled to) the year 2006. For users interested in creating, on their own, a new STEM population set referenced to LandScan™ itself, LandScan™ Dataset licenses are available free of charge for U.S. Federal Government use, for United Nations humanitarian efforts, and for educational research use. A user-defined data set could, if desired, also replace the administrative divisions in STEM with a user-defined grid of arbitrary resolution.
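For readers building their own population set, one simple way to keep the national total tied to a census figure while taking the relative distribution from another source is proportional allocation. The sketch below is an assumption about how this could be done, not a description of how the STEM data was produced; the function name, region names, and numbers are hypothetical.

```python
def distribute_population(national_total, weights):
    """Split a national total across regions in proportion to their weights.

    `weights` maps a region (an administrative division or a grid cell)
    to any non-negative relative measure of its population.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return {region: national_total * weight / total_weight
            for region, weight in weights.items()}


# Hypothetical example: three subdivisions and a census total of 1000.
shares = distribute_population(1000, {"A": 2, "B": 1, "C": 1})
for region, population in shares.items():
    print(region, population)
```

The same function works whether the regions are administrative divisions or cells of a user-defined grid, since it only depends on the relative weights.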
Development of STEM was supported in part by the U.S. Air Force Surgeon General's Office (USAF/SG) and administered by the Air Force District of Washington (AFDW) under Contract Number FA7014-07-C-0004. Neither Eclipse, the United States Air Force, IBM, nor any of their employees, nor any contributors to STEM, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of the STEM data sets. The Air Force has not accepted the products depicted and issuance of a contract does not constitute Federal endorsement of the IBM Almaden Research Center.
Even though the population data is based on a particular set of facts derived or interpolated from a census in some year, you may want to do a simulation in a different year. Even if you replace the data included in STEM with your own data set, you may want to study years other than the year in which that data set was "considered valid." For this reason, population in STEM is not represented as a static label value. Population is, instead, represented by a population model. The purpose of a population model is to handle general effects on a population that are not caused by a specific disease outbreak. Right now, there is only one Population Model available that allows you to define a background birth and death rate for your scenario. Some things to note:
1. Background birth rate and death rate are no longer available within the standard disease models. They have been moved to the Population Model. For old scenarios in which you had a birth/death rate defined, you need to create a new population model, specify your birth/death rate, and drag it into your scenario.
2. To create a population model, go to the menu (New -> Population Model).
3. A population model is just like a disease model (it ends up under decorators in the project explorer) and it will store its own log files under its name in the log folder.
4. Disease models and population models are designed to work together: at each iteration they synchronize background births and deaths and disease deaths with one another. If you don't have a population model in your scenario, the background birth/death rate is 0. When two or more diseases are running simultaneously, they will incorporate each other's disease deaths into their calculations.
For more information on several of the available population models please see STEM Population Models.
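The role of a population model with a background birth and death rate can be sketched as follows. This is a simplified illustration, not the STEM implementation: the function names are hypothetical, rates are assumed to be per capita per year, and continuous compounding is assumed for the projection.

```python
import math


def project_population(reference_population, birth_rate, death_rate, years):
    """Project a census population to a year `years` away from the reference.

    Assumes constant per-capita rates; `years` may be negative to look back.
    """
    return reference_population * math.exp((birth_rate - death_rate) * years)


def step_population(population, birth_rate, death_rate, disease_deaths, dt=1.0):
    """Advance the population by one iteration of length `dt` (in years).

    `disease_deaths` lists the deaths reported by each disease model during
    this iteration, so that several diseases share one population count.
    """
    background_change = (birth_rate - death_rate) * population * dt
    return population + background_change - sum(disease_deaths)


# Hypothetical values: a 2006 reference population projected four years ahead.
print(project_population(1_000_000, birth_rate=0.02, death_rate=0.01, years=4))

# One daily iteration with two diseases reporting deaths. With both rates
# set to 0 (no population model), only disease deaths change the population.
print(step_population(1_000_000, 0.0, 0.0, disease_deaths=[3, 2], dt=1 / 365))
```

Keeping births and background deaths in one place, separate from the disease models, is what lets several diseases run against the same population without double counting.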
User-designed scenarios include selected denominator data, initial conditions, geographic data (a region of the world or the entire world), mathematical disease models, a solver, start and stop dates, etc. These scenarios are themselves represented in the Eclipse framework, so they too can be built upon and exchanged. STEM is thus intended to provide a common collaborative platform that enables sharing: models can be imported and exported so that they can be easily exchanged among researchers. A researcher who has developed a detailed country model that includes population demographics can import a component with specialized disease mathematics from another researcher and combine the two. The new combination can then be re-exported (with descriptive metadata) for others to use.
The design goal of STEM is for the country sub-model to be a standardized community resource maintained and refined by many different contributors. Over time, the country model will become more accurate, detailed, and valuable. With data available as plug-ins, researchers will be free to contribute data in their field of expertise. Thanks to Eclipse, researchers will find it easier to compare and share different models because their underlying components will be the same. On the STEM homepage you can find links to some scenarios that you can import, run, and modify in this way.