STEM

Revision as of 13:42, 16 July 2010 by Judyvdouglas.verizon.net (Talk | contribs) (Getting Started)

The Spatio-Temporal Epidemiological Modeler (STEM) is a tool designed to help scientists and public health officials create and use models of emerging infectious diseases. STEM uses mathematical models of diseases (based on differential equations) to simulate the development or evolution of a disease (e.g., avian flu or salmonella) in space and time. These models could aid in understanding, and potentially preventing, the spread of such diseases. STEM also comes pre-configured with a vast amount of reference or denominator data for the entire world. By using and extending the data and models in STEM, it is possible to rapidly prototype and test models for emerging infectious diseases. STEM also provides tools to help you compare and validate your models. As an open source project, the ultimate goal of STEM is to support and encourage a community of scientists who not only use STEM as a tool but also contribute back to it. STEM is designed so that models and scenarios can be easily shared, extended, and built upon.

Links

  • Watch full-length STEM tutorials on YouTube™
    1. In English
    2. In Hebrew
    3. In Japanese
    4. In Spanish


Getting Started

Welcome STEM Developers

Contents

   * 1 Introduction
         o 1.1 Step-by-step for STEM developers
   * 2 Software Engineering
         o 2.1 Coding Conventions and Guidelines
               + 2.1.1 Copyright Statement
               + 2.1.2 National Language Support
         o 2.2 Bug Reporting
         o 2.3 STEM Mailing List
         o 2.4 STEM Newsgroup
         o 2.5 STEM IRC Chat
         o 2.6 Submitting code to the STEM project
         o 2.7 Creating a new standalone STEM application
         o 2.8 Software Engineering Documentation To Do's
               + 2.8.1 Building The System
   * 3 STEM subprojects
         o 3.1 STEM - GoogleEarth Interface
               + 3.1.1 Prerequisites
               + 3.1.2 Running STEM and GoogleEarth
               + 3.1.3 STEM-GoogleEarth Preferences
         o 3.2 STEM Reports SubProject
   * 4 STEM Data
         o 4.1 Definitions
         o 4.2 An Introduction to STEM's Properties Files
         o 4.3 Administrative Levels and Properties Files
         o 4.4 Updating values in a property file
         o 4.5 An Introduction to STEM Map Files
         o 4.6 Data Location and Usage
   * 5 Other Information
         o 5.1 Tools
         o 5.2 Optional Eclipse Features and Plug-ins
         o 5.3 Reference Sources
         o 5.4 Books

Introduction

This article is intended to be the starting point for developers working on the Spatio-Temporal Epidemiological Modeler Project (STEM) code base. It contains a detailed description of how the project is organized, how to find all of the resources associated with the project, and how to install, configure and use the necessary development environment.

One advantage of this document is that it gets newcomers “up and running” faster. It also lets them contribute to the project immediately, by correcting any inaccuracies or omissions it contains or by clarifying its descriptions. As a newcomer to the project you have a unique and extremely valuable perspective: if you can't find something, or it doesn't make sense to you, then you're probably not alone. When you find the answer to your quandary, please record it here so that those who come after you can “stand on your shoulders”. Welcome aboard!

The STEM code base is written in Java™ Version 5 and is organized as a set of well-defined components using the Eclipse (www.eclipse.org) plug-in tool framework. The components are integrated through a well-defined “extension point” mechanism that makes the entire code base highly extensible. The use of the Eclipse framework also provides multi-platform portability.

Interestingly, Eclipse is also the STEM project's Java™ development environment. If this seems a bit strange at first, don't worry; as you understand more about Eclipse and the project, this will make more sense. The project is organized and managed as an open source project. Much of the inspiration, philosophy and techniques for its management are taken directly from the book Producing Open Source Software by Karl Fogel (see the references later in this document).

Step-by-step for STEM developers

   * Read this article.
   * When you find missing, confusing, or erroneous information, please contribute by fixing the problem. This Wiki allows any Bugzilla registered user to update content. Please do!
   * Visit the STEM project page. Bookmark it.
   * Obtain a Bugzilla account. [1]
   * Subscribe to the project mailing list. The mailing list is used for all internal developer communication.
   * Subscribe to the Newsgroup. [2]
   * Install Java 5.0 and Eclipse 3.3 [3]
   * Obtain the source code for the STEM project from the code repository. [4] 

You can also review the article on Eclipse Development Resources: http://wiki.eclipse.org/index.php/Development_Resources

Software Engineering

Coding Conventions and Guidelines

As with any product being built by a team, there are various areas where standards, conventions, and other guidelines can play a role in helping to ensure that the resulting product presents to developers and customers as a unified whole rather than as a loose collection of parts worked on by a variety of individuals each with their own styles and ways of working.

The STEM project will use the conventions and guidelines used by the Eclipse project. See the Eclipse Development Conventions and Guidelines.

The most important guidelines for STEM code are that it have good JavaDoc documentation and not generate compiler warnings.

The Eclipse compiler verifies the code against a preference list and generates warning messages for code that does not follow the rules specified in the compiler preferences. For example, it checks for "unchecked generic type operations" and gives a warning. We ask that you remove all of the Eclipse compiler warnings before submitting code. In addition, the preferences for the following Javadoc options should be changed to generate warnings, and the warnings should then be removed by fixing the code and/or creating accurate and informative Javadoc comments:

window->preferences->java->compiler->javadoc:
   Process Javadoc comments
     Malformed Javadoc Comments:         warning
     Missing Javadoc tags(public)        warning
     Missing Javadoc comments(Public)    warning
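
As a hedged illustration of these two guidelines, the following minimal sketch shows a small class whose public members all carry Javadoc comments and whose collection is parameterized, so that no "unchecked" warning is raised. The class name InfectionCounts and its methods are hypothetical and are not part of the STEM code base.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical example (not part of STEM): records one infection count per
 * simulation cycle.
 */
class InfectionCounts {

    /** Counts in cycle order. The element type is declared, so no raw types are used. */
    private final List<Integer> counts = new ArrayList<Integer>();

    /**
     * Appends the count for the next cycle.
     *
     * @param count the number of infections observed; must not be negative
     * @throws IllegalArgumentException if <code>count</code> is negative
     */
    public void add(final int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count must not be negative: " + count);
        }
        counts.add(Integer.valueOf(count));
    }

    /**
     * Returns the recorded counts.
     *
     * @return an unmodifiable view of the counts, in cycle order
     */
    public List<Integer> getCounts() {
        return Collections.unmodifiableList(counts);
    }

    /**
     * Returns the sum of all recorded counts.
     *
     * @return the total number of infections over all cycles
     */
    public int total() {
        int sum = 0;
        for (final Integer c : counts) {
            sum += c.intValue();
        }
        return sum;
    }

    /**
     * Small demonstration entry point.
     *
     * @param args ignored
     */
    public static void main(final String[] args) {
        final InfectionCounts demo = new InfectionCounts();
        demo.add(3);
        demo.add(5);
        System.out.println(demo.getCounts() + " total=" + demo.total());
    }
}
```

Declaring the field as a raw List instead of List<Integer> would typically trigger the unchecked warning mentioned above, and removing any of the @param or @return tags would trigger the "Missing Javadoc tags" warning once that preference is enabled.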

If you are the owner or creator of a STEM subproject, you can create more restrictive standards by checking in a set of project preferences. However, please do not remove warnings by checking in project preferences that ignore bad coding practices.


Copyright Statement

A copyright statement should appear in every human-created text document, including HTML, Java, properties, XML, and XSD files. Auto-generated (machine-made) text files do not need a copyright statement. Such files are compiled from a human-created file and it is not practical to add a copyright statement to them (just as we would not add one to class files). To be specific, we are referring to auto-generated JavaDocs (HTML) and EMF models (Java). For .java files the copyright statement should follow the package statement and precede the import statements. The Copyright statement is further described here


    package org.eclipse.stem;

    /*******************************************************************************
     * Copyright (c) 2006,2007 IBM Corporation and others.
     * All rights reserved. This program and the accompanying materials
     * are made available under the terms of the Eclipse Public License v1.0
     * which accompanies this distribution, and is available at
     * http://www.eclipse.org/legal/epl-v10.html
     *
     * Contributors:
     *     IBM Corporation - initial API and implementation
     *******************************************************************************/

    import ... ;


To have the above automatically inserted in new code, do the following:

   * select the above comment (from /* to */)
   * select Window->preferences
   * select Java->Code Style->CodeTemplates->Types
   * select edit
   * paste the copyright statement in place of the existing "Author" statement.
   * select OK 

National Language Support

STEM supports languages other than English. The basic process is to provide properly named properties files that mirror the "native" English properties files. These files are grouped into a "plug-in fragment" which acts as an "add on" to a plug-in and adds the files of the fragment to the plug-in as if they were there originally. We separate them so that additional languages can be added without changing the original plug-in.

If a native file is named "messages.properties", then the corresponding file with Spanish translations for the messages would be named "messages_es.properties". It is also possible to provide translations that are specific to a particular country, for instance, for Canadian English, the corresponding file would be "messages_en_CA.properties". For Californian English it would be "messages_en_US_CA.properties". This follows the regular Java conventions.
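
The naming convention above can be sketched with the standard java.util.ResourceBundle lookup rules. This is only an illustration, under the assumption that the plain Java fallback chain applies; STEM and Eclipse may load their messages through a different mechanism. The class name BundleNames is hypothetical, and ResourceBundle.Control requires Java 6 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.ResourceBundle;

/** Hypothetical helper (not part of STEM) that lists candidate properties files. */
class BundleNames {

    /**
     * Returns the properties file names that the standard Java lookup would
     * consider for the given locale, from most specific to least specific.
     */
    static List<String> candidateFiles(String baseName, Locale locale) {
        ResourceBundle.Control control =
                ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_PROPERTIES);
        List<String> files = new ArrayList<String>();
        for (Locale candidate : control.getCandidateLocales(baseName, locale)) {
            files.add(control.toBundleName(baseName, candidate) + ".properties");
        }
        return files;
    }

    public static void main(String[] args) {
        // "Californian English": language en, country US, variant CA
        System.out.println(candidateFiles("messages", new Locale("en", "US", "CA")));
        // Spanish, with no country or variant
        System.out.println(candidateFiles("messages", new Locale("es")));
    }
}
```

The most specific file that exists wins, and the "native" messages.properties is the last candidate in each list, which is why an untranslated key can still fall back to the English text.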

To get us started, I've created one new plug-in fragment called "org.eclipse.ohf.stem.ui.nl1" which contains the properties files for the main UI component of STEM in the plug-in "org.eclipse.ohf.stem.ui". Each of the other plug-ins with translatable strings will require its own (yet to be created) plug-in fragment. I also created properly named, untranslated properties files for Californian English, Canadian English, Spanish, Hebrew, Tamil and Chinese in the new fragment.

To start STEM (Eclipse) in a language different from the working language of the operating system, you need to use the "-nl" command line parameter. For example, use "-nl en_US_CA" to start STEM in Californian English, or "-nl es" to start it in Spanish. You might find it useful to create a new launch configuration in Eclipse with the parameter specified. I have one for each language.

   1) Use the "Run..." menu item to open the "Create, manage, and run configurations" dialogue.
   2) Duplicate the Eclipse application you use to launch STEM and rename the new one to reflect the language (e.g., "STEM Californian").
   3) Select the "Arguments" tab.
   4) In the "Program arguments" text box enter the command line parameter (e.g., "-nl en_US_CA" without the quotes, for Californian English).
   5) Apply.
   6) Run.

Open the Active Simulations View and you should see a different title than normal.


Here are some good resources for more detailed information:

http://www.eclipse.org/articles/Article-Internationalization/how2I18n.html

http://www.eclipse.org/tptp/home/documents/process/development/translation_rules_of_thumb.html

"Building Commercial Quality Plug-ins", Clayberg and Rubel

"The Java Developer's Guide to ECLIPSE", D'Anjou, et al.

Bug Reporting

Eclipse uses the Bugzilla system for reporting and processing bug reports and enhancement requests for the STEM project. Once a problem or enhancement has been submitted to Bugzilla, the Bugzilla entry should be used for subsequent discussion of the issue.

   * The following page is used for creation of a Bugzilla account. [5]
   * The following page is used to set the user email preferences. [6]
   * The bug reporting page is [7]. Specify STEM as the component.
   * The following page can be used to find bugs in STEM https://bugs.eclipse.org/bugs/query.cgi
         o Enter Technology as the Classification
         o Enter STEM as the Product 

The following article shows the life cycle of a bug from entering the system to being closed. http://www.eclipse.org/projects/dev_process/bugzilla-use.php

Before reporting a bug, please read the Eclipse bug writing guidelines and please search to see if the bug has already been reported.

There are some important considerations when you submit a Bugzilla report. If you are reporting a problem, the promptness of a resolution depends very much on the completeness of the report. In most cases, a problem needs to be reproduced in order to fix it. Before you press submit, read your report over and ask yourself whether, if you were the STEM developer who receives it, you would have the information you need to reproduce the problem.

An important help in problem determination is the Error Log view on the STEM window. If any exceptions or serious errors occurred they will show up there. These entries can be exported to a file and attached to the bugzilla entry.

Most of the development of STEM was done on Windows with Eclipse 3.2 and lately 3.3. If you are using a different platform, be sure to mention this just in case this is a platform problem.

If you are submitting a suggestion for a new feature or enhancement, include some justification and enough details that it can be evaluated. Since others may want to contribute to a discussion of the issue, it would be a good idea to post a pointer to the Bugzilla entry on the STEM newsgroup.

STEM Mailing List

Send stem-dev mailing list submissions to <stem-dev@eclipse.org>

To subscribe or unsubscribe via the World Wide Web, visit

   https://dev.eclipse.org/mailman/listinfo/stem-dev 

or, via email, send a message with subject or body 'help' to

    stem-dev-request@eclipse.org

You can reach the person managing the list at

   stem-dev-owner@eclipse.org

When replying, please edit your Subject line so it is more specific than "Re: Contents of stem-dev digest..."


The developer's mailing list should be used to communicate with other STEM developers. It is preferred over private emails.

STEM Newsgroup

The Newsgroup for STEM is news://news.eclipse.org/eclipse.technology

To access the newsgroup you will need to go to the following page and request a userid and password.

   http://www.eclipse.org/newsgroups/ 

A special newsreader userid and password will be emailed to you. This userid and password are used to subscribe to the Newsgroup using whatever news reader you choose.


The newsgroup is for all of the components of Eclipse, not just STEM. We suggest that when you post to the newsgroup, you include STEM in the title to ensure that it is read by the STEM community.

More information about access to the newsgroup is here: http://www.eclipse.org/newsgroup.php


For those who use Firefox and Thunderbird as their web browser and email client, the following may help in setting up Thunderbird to read the STEM newsgroup.

   * From menubar, select Tools->Account Settings
   * Select Add Account
   * Select Newsgroup Account then Next
   * Enter your name and email address then Next
   * Enter news.eclipse.org for Newsgroup Server then Next
   * Enter the userid and password that were sent to you when you signed up.
   * Enter the account name. The default of News.Eclipse.org is OK.
   * You should now have "News.Eclipse.Org" listed in your "Folders"
   * Select the News.Eclipse.org folder and then select Manage NewsGroup Subscriptions
   * Expand the list of newsgroups until you find Eclipse.technology and select it.
   * You should now have the list of previous postings to the newsgroup and it will be updated every time you get your mail from the server. 

STEM IRC Chat

The IRC Chat channel for OHF STEM developers is on the freenode server with the other Eclipse chat channels (see http://wiki.eclipse.org/index.php/IRC ).

The URL is: irc://irc.freenode.net/#eclipse-stem

You can access it either through a browser with a chat extension (like "ChatZilla" for Firefox) or through the IRC client available in the Eclipse Communications Framework (ECF). This page will probably be useful: http://freenode.net/faq.shtml#nicksetup

Using ECF to access the chat

You'll find the ECF IRC client in the "Communications Perspective". There is a drop down menu with an icon of a human with a voice bubble to the left. Select IRC from that menu. When connecting to the irc server use "yourname@irc.freenode.net" where "yourname" will be your login name otherwise known as your "nick" in IRC speak.

Follow the directions on this page http://freenode.net/faq.shtml#nicksetup

and then "join" the channel with "/join #eclipse-stem"

Using Chatzilla to access the chat

Chatzilla is a plugin for the Firefox browser. To install it go to https://addons.mozilla.org/en-US/firefox/addon/16 and click on the Install Now button. After restarting Firefox, click on Tools->Chatzilla to start Chatzilla.

   * To connect to the eclipse-stem channel, click on:
     irc://irc.freenode.net/#eclipse-stem
   * To join the channel automatically on startup, right-click on the channel tab and click Open This Channel at Startup
   * To register your nickname, issue the following commands in the Message input area:
     /msg nickserv register <PASSWORD> 

Under the Chatzilla menubar entry is a Preferences item that will allow you to customize Chatzilla to make IRC chatting easier.

   * To set it up so that you’re automatically identified when you join the network, use these instructions:
         o ChatZilla → Preferences. On the side menu, you’ll see a list of networks to which you’ve connected. Select irc.freenode.net.
         o Click the Lists tab. You’ll see an area where you can set commands to auto-perform.
         o Hit Add.
         o Enter the following. Do not include the leading slash, and replace <PASSWORD> with your actual password:
           msg nickserv identify <PASSWORD> 

More hints on using Chatzilla are available at http://chatzilla.hacksrus.com/faq/

Submitting code to the STEM project

If you are not a project member with full committer access to SVN, the best way to contribute source code changes is to follow these steps:

   * Obtain the source code with anonymous access to SVN.
   * Make your changes using Eclipse.
   * Test the changes using the most recent version of STEM.
   * Prepare a patch as described below.
   * If there is a Bugzilla item that describes the need for the change that you are making, then post a description of the patch to the Bugzilla entry and attach the patch.
   * If there is no Bugzilla entry, please create one and attach your patch to it. It is much preferred that all patches be related to an appropriate bugzilla entry.
   * If for some reason a bugzilla entry is not appropriate, email your patch to a member of the STEM team who has committer authority, along with a note describing the patch. 

Preparing a patch (Eclipse users)

Eclipse can create and apply patches in the unified diff format. Assuming you are working out of an Eclipse SVN project created from the STEM SVN repository, you can create a (possibly multi-file) patch by selecting the desired resource scope in the package explorer and choosing "Team/Create Patch...". Because the patch must be applied to the same Eclipse resource it was generated from, it is probably safest to select the project itself.

Notification of code changes

You can be notified when anyone commits code to the STEM project. This can be useful if you are actively working on STEM coding and want to know when changes are made that might affect you. And if you are not a committer, you might want to know when a committer has committed your code. The instructions for subscribing to the cvs-commit mailing list are at the following URL:

   * https://dev.eclipse.org/mailman/listinfo/cvs-commit 

Creating a new standalone STEM application

At some point you may want to create a new version of the standalone STEM application. More likely, you may need to verify that changes you have made will work in the standalone version. To create a new temporary version of the STEM application for testing, do the following.

   * Select the org.eclipse.stem.ui project
         o Select feature.product
         o From the Overview tab
               + Select Export.

It should request the name of the zip file to be generated and then build it. The build will run for many minutes. The resulting zip file can be expanded in a new directory like "c:\teststem" and invoked from a command window.

Software Engineering Documentation To Do's

TODO

   * Describe the basic development philosophy of the project and the basic ideas behind “agile” software development.
   * Explain how to test the system.
   * Explain how to submit a bug report.
   * Explain how to use JUnit. 

Building The System

STEM developers can use the headless build mechanism (build plug-ins automatically outside the Eclipse IDE) for building distributions for various OS platforms.

The instructions below are for running the build on a Linux machine.

Before using the headless build, make sure you have the following software prerequisites:

   * Eclipse Platform 3.5, make sure that the following Eclipse features are also installed:
         o BIRT (and all its prerequisite plugins) 
   * (Alternatively, you can download the BIRT Report Designer All-in-one v2.5.0 or higher)
   * Eclipse DeltaPack that matches the version of the above Eclipse
   * JDK 5.0 or higher
   * SVN for Ant (from here) 


To build STEM using this mechanism you will need to follow these steps:

   * Check-out from STEM SVN the plugin org.eclipse.stem.releng
   * Copy the local.sh-template to local.sh, edit it and change the values within it to fit your local platform:
         o MAJOR_VERSION - The version number of the STEM product you are building (e.g., 0.3.0)
         o JAVA_HOME - Path to the JDK to be used
         o ECLIPSE_HOME - The Eclipse SDK to be used for building 
   * Put the lib directory from the SVN for Ant Zip file under a directory named '.ant' in your home dir
   * From a command line, change the directory into the above plugin directory and execute the build.sh script 


The build script first fetches the latest plugin code from SVN and then builds those plugins. The last step packs the plugins into an executable product package. If the build finishes successfully, you should see a "BUILD SUCCESSFUL" message at the end of the logging messages. The output of the build process is a set of Zip files, one for each OS platform. They are located under the folder build\I.WeeklyBuild.

STEM subprojects

STEM consists of a basic core system and (in the future) many projects that either use the basic system or enhance it. The following section describes the subprojects that use STEM or enhance it.

   * GoogleEarth
         o Interface to the GoogleEarth application. 
   * Reports
         o Interface to the Eclipse BIRT product. 
   * DataLog
         o Simple support for logging STEM data. 


STEM - GoogleEarth Interface

STEM is a computer software system for defining and visualizing simulations of the geographical spread of contagious diseases. One way to visualize this geographical information is to overlay it on the GoogleEarth 3D world model.

The STEM-GoogleEarth project (STEM-GE) enables a simulation to log its spatial data, with disease statistics included, as it runs. Either simultaneously or after the fact, the logged data (in the form of KML files) is read by GoogleEarth and displayed by mapping disease state to color intensity.

To run the STEM-GoogleEarth Interface you need to do the following:

Prerequisites

You must have installed the GoogleEarth application, which is available for personal use from the GoogleEarth download site. You should verify that GoogleEarth works correctly on your machine by starting it and checking that you can browse the 3D image, because some older computers do not have the 3D graphics capabilities required by GoogleEarth. It is a fun application to play with, and when you are done you can leave it running or not. If it is not active, STEM will start it.

Running STEM and GoogleEarth

   * If you are running from the STEM source distribution:
         o Update STEM to the latest code level and ensure it is refreshed and rebuilt.
         o From project org.eclipse.stem.ui.ge select servlet.xml
         o Select RunAs->AntBuild
         o From Eclipse, run STEM using stem.product in org.eclipse.stem.ui (described earlier).
   * If you are running the STEM standalone executable, the above steps were already done for you and you just need to start STEM.exe. 


   * In the STEM workspace window
         o Windows->GoogleEarth->View
               + A window will be displayed that contains a Display button and a set of Radio buttons that select the disease aspect to be displayed.
               + At this point the GoogleEarth application should start if it was not already started.
         o Optionally, select Windows->Preferences->STEM->visualization->GoogleEarth
               + Specify any preferences that you want to use. The defaults are probably OK. 
         o Using the Scenarios view: select your desired scenario.
               + For example: STEM->Geography->Continent->NorthAmerica
               + Double-click on Spanish Flu to start the simulation.
         o After the simulation has run for 7 or 8 cycles
               + Pause the simulation
               + Click the display button in the GoogleEarth View.
               + You should see the GoogleEarth map showing some red areas in the Northeast of the USA.
               + Start the simulation again
               + Pause and Click the GoogleEarth Display button every few cycles. These red areas should grow as the infections spread both across county borders and across the airline connections.
               + You can use the GoogleEarth view to change the aspect displayed to any of:
                     # Infections - Red
                     # Exposed - Yellow
                     # Recovered - Green
                     # Susceptible - Blue 
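
The aspect colors listed above, combined with the idea of mapping disease state to color intensity, can be sketched as follows. This is a minimal illustration and not the STEM-GE implementation: the class AspectColors, the enum Aspect, and the choice to express intensity through the alpha channel are all assumptions. The only external fact relied on is that KML writes colors as eight hexadecimal digits in the order alpha, blue, green, red.

```java
import java.util.Locale;

/** Hypothetical sketch (not STEM-GE code): aspect colors as KML color strings. */
class AspectColors {

    /** The four displayable aspects with the base colors listed in this article. */
    enum Aspect {
        INFECTIONS(255, 0, 0),    // red
        EXPOSED(255, 255, 0),     // yellow
        RECOVERED(0, 255, 0),     // green
        SUSCEPTIBLE(0, 0, 255);   // blue

        final int red;
        final int green;
        final int blue;

        Aspect(int red, int green, int blue) {
            this.red = red;
            this.green = green;
            this.blue = blue;
        }
    }

    /**
     * Builds a KML color (aabbggrr). The relative value, clamped to [0, 1],
     * is mapped to opacity, which is an assumption made for this sketch.
     */
    static String kmlColor(Aspect aspect, double relativeValue) {
        double clamped = Math.max(0.0, Math.min(1.0, relativeValue));
        int alpha = (int) Math.round(clamped * 255);
        return String.format(Locale.ROOT, "%02x%02x%02x%02x",
                alpha, aspect.blue, aspect.green, aspect.red);
    }

    public static void main(String[] args) {
        for (Aspect aspect : Aspect.values()) {
            System.out.println(aspect + " at full intensity: " + kmlColor(aspect, 1.0));
        }
    }
}
```

Under these assumptions, a fully infected region would be drawn in opaque red and a lightly affected region in a nearly transparent red.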

As distributed, the GoogleEarth interface runs in manual mode. Because of the memory usage of both STEM and GoogleEarth, they do not run well together unless your workstation has more than 1 Gigabyte of memory. If you do have plenty of memory, you can change the GoogleEarth preferences so that the results are shown on the GoogleEarth display as the simulation runs.


There are numerous ways that you can use GoogleEarth with your simulations. We will not describe them all. The STEM-GE Preference Page, described below, is where you specify many options that control what the STEM-GE interface does. The most common use of STEM-GE is to run a simulation and have it write KML files to a folder. Simultaneously, GoogleEarth reads these KML files via a web server and displays the results of the simulation on the GoogleEarth screen.

STEM-GoogleEarth Preferences

The STEM-GoogleEarth interface has a set of preferences that control how the application works. This preference page is accessed by going to windows->preferences->visualization and selecting the GoogleEarth entry.

You will get the following window: Image:GEPreferences.jpg

   * Preferences 
   * Choose the Method used to display STEM Results
         o LogOnly
           With this option, the KML files are written but not displayed by GoogleEarth. This would be used when you are either going to display the GoogleEarth visualization at a later time or if you are going to run GoogleEarth from another system with the KML files on a shared disk.
         o "Log+Servlet"
           With this option, the KML files are written and then displayed by GoogleEarth. GoogleEarth actually requests the file from a webserver Servlet which reads the file and sends it to GoogleEarth.
         o "AsyncServlet"
           The KML is written on every cycle, overwriting the file written for the previous cycle. GoogleEarth asks for new data at a predetermined interval and is sent the current KML. The advantage over the previous method is that it helps keep GoogleEarth from falling too far behind the STEM processing.
         o "DirectLaunch"
           With this option, the KML files are directly sent to GoogleEarth without using an intermediate web server. This can cause problems because GoogleEarth may get files faster than it can process them but it is more efficient.
         o "ManualDisplay"
           The map is generated when the user clicks the Display button. This is the default option.
   * "Folder for KML logging:"
     This is the folder where STEM will write the KML files that GoogleEarth will read. If it already contains KML files, the user will be given the opportunity to delete them, keep them, or choose a new folder.
   * "User internal webserver"
     This is used to cause the webserver built into Eclipse to be used.
   * "Hostname:port for external webserver"
     This is the required hostname and port for an external webserver. Normally the internal webserver would be used so this is not needed but there are cases where one might want to use an existing web server.
   * "Automatically startup GoogleEarth"
     If specified, the GoogleEarth application is launched when the STEM-GoogleEarth view is started.
   * "Automatically process every simulation"
     If specified, any simulation you start will automatically have its processing mapped to GoogleEarth. Only the first one will be displayed by GE, since it would be counterproductive to show 2 different views at the same time.
   * "Write KML files only every Nth cycle"
     If the simulation does not change rapidly from cycle to cycle, significant overhead can be saved by only sending data to GoogleEarth every Nth cycle.
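
The "every Nth cycle" preference amounts to a simple modulo test. The sketch below is only an illustration: the class name KmlWritePolicy is hypothetical, and it assumes that cycles are numbered from 0 and that cycle 0 is always written, neither of which is stated by this article.

```java
/** Hypothetical sketch (not STEM-GE code) of writing KML only every Nth cycle. */
class KmlWritePolicy {

    private final int everyNthCycle;

    /**
     * @param everyNthCycle write interval in cycles; 1 means every cycle
     */
    KmlWritePolicy(int everyNthCycle) {
        if (everyNthCycle < 1) {
            throw new IllegalArgumentException("interval must be at least 1");
        }
        this.everyNthCycle = everyNthCycle;
    }

    /** Returns true if a KML file should be written for the given cycle (0-based). */
    boolean shouldWrite(int cycle) {
        return cycle % everyNthCycle == 0;
    }

    public static void main(String[] args) {
        KmlWritePolicy policy = new KmlWritePolicy(3);
        for (int cycle = 0; cycle < 10; cycle++) {
            if (policy.shouldWrite(cycle)) {
                System.out.println("write KML for cycle " + cycle);
            }
        }
    }
}
```

With an interval of 3, only about one third of the cycles produce a file, which is where the saving in overhead would come from.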

STEM Reports SubProject

The Reports subproject provides an interface to the Eclipse BIRT component. BIRT is an open source Eclipse-based reporting system that integrates with STEM to produce custom reports.

   * Instructions to be provided later.


STEM Data

In-process documentation: This section is in the very early stages of documentation. Please help complete it.


An important component of STEM is the large collection of data that it contains. This section will attempt to describe the different data that is available, where it is and how it is used.

Definitions

Administration Level: The administration level for geographic data refers to the "political resolution" of the data sets. The United Nations defines political divisions called "Administration Levels".

The highest level is "UN Administration Level 0", which corresponds to "countries". We have 244 such "countries"; our definition of a country comes from the ISO-3166-1 codes, of which there are 244. There is an ISO-3166-1 code for the United States (a country), and one for Puerto Rico (not a country, but a slightly separate political division). For the United States the ISO-3166-1 2 character code is US and the ISO-3166-1 3 character code is USA.

The next level down is "UN Administration Level 1", which for the United States are the states and for Canada are the provinces and territories. Below that is "UN Administration Level 2". The definition of these areas varies from country to country, which is why we use the "level" instead. Level 2 is the limit of our current data sets. In the future we could add higher resolutions. For instance, for a small country like Israel we might go to level 3 and use something like a census tract as the location.


Latitude: Latitude gives the location of a place on Earth north or south of the Equator. Latitude is an angular measurement in degrees (marked with °) ranging from 0° at the Equator to 90° at the poles (90° N for the North Pole or 90° S for the South Pole). In STEM, latitudes north of the Equator are positive values and latitudes south of the Equator are negative.

Longitude: Longitude describes the location of a place on Earth east or west of a north-south line called the Prime Meridian. Longitude is given as an angular measurement ranging from 0° at the Prime Meridian to +180° eastward and −180° westward. The Greenwich meridian is the universal prime meridian or zero point of longitude.

ISO 3166 code: ISO 3166 is a three-part geographic coding standard for coding the names of countries and dependent areas, and the principal subdivisions thereof. The official name is Codes for the representation of names of countries and their subdivisions.

   * ISO 3166-1 codes for country and dependent area names.
         o ISO 3166-1 alpha-2 two-letter country codes
         o ISO 3166-1 alpha-3 three-letter country codes
         o ISO 3166-1 numeric three-digit country codes
   * ISO 3166-2 Codes for the representation of names of countries and their subdivisions, Part 2: Country subdivision code. Defines codes for the principal subdivisions of a country or dependent area.

A list of all the country (Administration Level 0) codes is here: http://www.davros.org/misc/iso3166.html

An Introduction to STEM's Properties Files

In STEM II, we store data about a country and its administrative divisions in properties files. Properties files are used to define identifiers and to set population and area data at all levels (i.e. country, state, or county level). There are four main types of properties files: area, population, nodes, and names. The purpose of each property file will be explained later.

In STEM II, most of the data is standardized ISO 3166 or FIPS (Federal Information Processing Standards) data. According to Wikipedia, "ISO 3166 is a three-part geographic coding standard for coding the names of countries and dependent areas, and the principal subdivisions thereof." ISO 3166 specifies standard alpha-2 codes, alpha-3 codes, and three-digit numeric country codes. For example, for the USA, the alpha-2 code is US, the alpha-3 code is USA, and the three-digit country code is 840. In addition, we use FIPS codes whenever possible to create identifiers that represent a political administration. For instance, the FIPS code for New York County (i.e., Manhattan) is 36061, while the code for Kings County (i.e., Brooklyn) is 36047. The first two digits of the FIPS code identify the state.

Administrative Levels and Properties Files

Administrative levels correspond to political divisions of a country. A level 0 administration identifies an entire country (e.g. USA or Mexico). A level 1 administration corresponds to a subdivision of a country, which can be a state, territory, parish, or province. For example, California, Colorado, and New York are all level 1 administrations of the USA. Similarly, level 2 administrations are subdivisions of a level 1 administration. Orange County, Monterey County, and Napa County are all level 2 administrations that belong to California.

A property file is a plain-text file that contains area data, population data, or other relevant data related to the political divisions of a country at different administration levels. Properties files are located under org.eclipse.ohf.stem.internal.data\resources\data\country. There are four types of properties files:

   * Names property file: This file defines the identifiers for every administrative division in a country at each level. For example, for the USA, at level 0 we would have "USA = United States". At level 1, we would have "US-CA = California", "US-CO = Colorado", and "US-NY = New York".

At level 2, for Orange, Monterey, and Napa counties within California we have "US-CA-06059 = Orange County", "US-CA-06053 = Monterey County", and "US-CA-06055 = Napa County". There is a single names property file for every country. The corresponding names property file for the USA is USA_names.properties. For level 2 administrations, the five digits found in the identifiers (e.g. "06053" in identifier "US-CA-06053") are defined as follows: the leftmost two digits identify the level 1 administration ("06" -> California), while the remaining digits, of which there can be up to four, identify the level 2 administration ("053" -> Monterey County).
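The identifier layout described above can be sketched in code. The following Python fragment is only an illustration and is not part of STEM; the names Level2Id and split_level2_id are hypothetical, and it assumes a level 2 identifier shaped like the examples above (three hyphen-separated segments, with digits in the last one).

```python
from typing import NamedTuple


class Level2Id(NamedTuple):
    """Parts of a level 2 identifier such as "US-CA-06053" (hypothetical type)."""
    country: str        # two-letter country code, e.g. "US"
    level1: str         # level 1 code, e.g. "CA"
    level1_digits: str  # leftmost two digits, e.g. "06"
    level2_digits: str  # remaining digits (up to four), e.g. "053"


def split_level2_id(identifier: str) -> Level2Id:
    """Split a level 2 identifier into its documented parts."""
    parts = identifier.split("-")
    if len(parts) != 3:
        raise ValueError(f"not a level 2 identifier: {identifier!r}")
    country, level1, digits = parts
    # Two digits for level 1 plus one to four digits for level 2.
    if not digits.isdigit() or not 3 <= len(digits) <= 6:
        raise ValueError(f"unexpected digit block: {digits!r}")
    return Level2Id(country, level1, digits[:2], digits[2:])
```

For example, split_level2_id("US-CA-06053") separates the California digits "06" from the Monterey County digits "053".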

   * Nodes property file: This file provides additional information about the identifiers of administrative divisions. There is a nodes property file for each administration level. These files use the following fixed format:
# Format: Code = Name, ISO-3166-2 numeric code, Two letter code
For example, at level 0 for the USA we have
"USA = United States, 840, US".
At level 1, for California, Colorado, and New York we have:
  "US-CA = California, 06, CA",
  "US-CO = Colorado, 08, CO", and
  "US-NY = New York, 36, NY".

The nodes property files corresponding to the USA are USA_0_node.properties, USA_1_node.properties, and USA_2_node.properties.
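As a rough illustration of the nodes format, the sketch below splits one entry into its fields. It is not STEM code: the function name parse_node_entry and the dictionary keys are made up for this example, and it only assumes the "Code = Name, numeric code, two letter code" layout shown above.

```python
def parse_node_entry(line: str) -> dict:
    """Parse one nodes-file entry of the form 'Code = Name, numeric, two-letter'."""
    key, sep, value = line.partition("=")
    if not sep:
        raise ValueError(f"missing '=' in entry: {line!r}")
    # Split from the right so a comma inside the name does not break parsing.
    fields = [field.strip() for field in value.rsplit(",", 2)]
    if len(fields) != 3:
        raise ValueError(f"expected three comma-separated fields: {line!r}")
    name, numeric_code, two_letter_code = fields
    return {
        "code": key.strip(),
        "name": name,
        "numeric_code": numeric_code,       # kept as text to preserve leading zeros
        "two_letter_code": two_letter_code,
    }
```

The numeric code is kept as a string on purpose, since values such as "06" would lose their leading zero if converted to an integer.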

   * Area property file: This file contains area data (in square kilometers) for administrative divisions. There is an area property file for each administration level.

In the case of the USA at level 0, we have "USA = 9161923". At level 1, we have "US-CA = 163695.57", "US-CO = 104093.57", and "US-NY = 54556.00" for California, Colorado and New York respectively. At level 2, for Orange, Monterey, and Napa counties within California we have "US-CA-06059 = 2043.5006", "US-CA-06053 = 8603.9405", and "US-CA-06055 = 1952.8510". The area property files corresponding to the USA are USA_0_area.properties, USA_1_area.properties, and USA_2_area.properties.


   * Population property file: This file contains population data for administrative divisions. There is a population property file for each administration level. In the case of the USA at level 0, we have "USA = 298444215".

At level 1, we have "US-CA = 33871648", "US-CO = 4301261", and "US-NY = 18976457" for California, Colorado and New York respectively. At level 2, for Orange, Monterey, and Napa counties within California we have "US-CA-06059 = 2846289", "US-CA-06053 = 401762", and "US-CA-06055 = 124279". The population property files corresponding to the USA are USA_0_human.properties, USA_1_human.properties, and USA_2_human.properties.

Updating values in a property file

To update a value in a property file, follow a few simple steps. First, given the name of the country and/or administrative division to be updated, look under org.eclipse.stem.internal.data\resources\data\country\<three letter identifier for the country> for the folder that belongs to that country. In the case of the USA, we would go to org.eclipse.stem.internal.data\resources\data\country\USA, where all the property files are located. Second, search the names property file to find the identifier of the administration we want to update. Third, open the corresponding file (area, population, or node file) where the update will take place. Finally, search for the identifier and, once found, update its value. As an example, let's say we want to increase the area of Orange County, California, USA by 100 square kilometers. We would open org.eclipse.stem.internal.data\resources\data\country\USA\USA_names.properties and search for the identifier of "Orange County", which is "US-CA-06059". Then, since we learn from the names property file that Orange County is a level 2 administration, we would open "org.eclipse.stem.internal.data\resources\data\country\USA\USA_2_area.properties" and search for the identifier "US-CA-06059". The search finds "US-CA-06059 = 2043.5006". Finally, we make the update by adding 100 square kilometers to the current value: "US-CA-06059 = 2143.5006".
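The manual edit described above could also be scripted. The following is a minimal Python sketch under stated assumptions, not a STEM utility: the function add_to_property is hypothetical, it works on the text of a property file rather than on the file itself, and it assumes simple "key = value" lines with "#" comments. Decimal arithmetic is used so that the stored digits are not disturbed by floating-point rounding.

```python
from decimal import Decimal


def add_to_property(text: str, key: str, delta: str) -> str:
    """Return the property-file text with `delta` added to the value of `key`."""
    updated_lines = []
    found = False
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        is_comment = line.lstrip().startswith("#")
        if sep and not is_comment and name.strip() == key:
            new_value = Decimal(value.strip()) + Decimal(delta)
            line = f"{key} = {new_value}"
            found = True
        updated_lines.append(line)
    if not found:
        raise KeyError(key)
    return "\n".join(updated_lines)


# Two entries taken from the level 2 area examples in this section.
area_text = "US-CA-06053 = 8603.9405\nUS-CA-06059 = 2043.5006"
print(add_to_property(area_text, "US-CA-06059", "100"))
```

Applied to the Orange County entry, this changes the value from 2043.5006 to 2143.5006 and leaves the other entries untouched.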

Once we've updated the correct file(s), we need to run org.eclipse.stem.internal.data\resources\src\Main.java. Running Main.java creates a new serialized file that takes the changes into account. The new serialized file can be found under org.eclipse.stem.internal.data\temp\data\scenario\disease. The next time STEM II executes a scenario, it will load the new serialized file.

An Introduction to STEM Map Files

STEM Maps

Maps for STEM are XML files that follow a GML format. These files are located under org.eclipse.stem.geography\resources\data\geo\country\XXX, where XXX is the three letter identifier for the country. There is also a second set of files at org.eclipse.stem.geography\resources\data\geo\country_reduced. Which set is used is determined by a preference at Windows->Preferences->STEM->Visualization->MapDataManagement.

There can be multiple map files for a country, one for each administration level. As an example, for the USA we have the following two maps: USA_1_MAP.xml and USA_2_MAP.xml. According to Wikipedia, GML is an XML grammar defined by the Open Geospatial Consortium (OGC) to express geographical features. As an example, the set of GML elements that defines Monterey County is:

 //Sample taken from
org.eclipse.stem.geography\resources\
  data\geo\country\USA\USA_2_MAP.xml
...
<gml:Polygon gml:id="US-CA-06053">
<gml:outerBoundaryIs>
<gml:LinearRing>
<gml:posList>
36.9186 -121.7019 36.9197 -121.6999 ... 36.9186 -121.7019
</gml:posList>
</gml:LinearRing></gml:outerBoundaryIs>
</gml:Polygon>
...

Updating a Map

To update a map, that is, to provide more accurate latitude and longitude data for a given location, we follow a few steps. First, we find the identifier for the location we are trying to update. This step is similar to the one described for updating values in a property file. Once we have found it, we replace the latitude and longitude values contained within the <gml:posList> </gml:posList> tags. We need to make sure that the starting and ending latitude and longitude values are the same; otherwise the shape won't be accepted as a closed polygon. In the example above, note that for Monterey County ("US-CA-06053") the starting latitude and longitude values "36.9186 -121.7019" are the same as the ending ones "36.9186 -121.7019".
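Before pasting new coordinates into a map file, the closed-polygon requirement can be checked with a few lines of code. This is a hedged Python sketch, not part of STEM: the helper names parse_pos_list and is_closed_ring are hypothetical, and the input is assumed to be the whitespace-separated "latitude longitude" text found between the posList tags. The minimum of four points follows the usual GML definition of a linear ring.

```python
def parse_pos_list(text: str) -> list:
    """Turn 'lat lon lat lon ...' text into a list of (lat, lon) tuples."""
    values = [float(token) for token in text.split()]
    if len(values) % 2 != 0:
        raise ValueError("posList must contain an even number of values")
    return list(zip(values[0::2], values[1::2]))


def is_closed_ring(points: list) -> bool:
    """A ring is closed when it has at least four points and ends where it starts."""
    return len(points) >= 4 and points[0] == points[-1]
```

A list whose last point differs from its first fails the check, which is the situation the paragraph above warns about.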

The country_reduced files have had the border data smoothed to reduce the number of lat/long points used to describe a border. It can sometimes take thousands of lat/long points to describe a border accurately. By smoothing the border data so that the border is described by far fewer points, we can reduce both CPU and memory usage.

Data Location and Usage

STEM Projects that contain data

The following STEM projects contain data files that are used in STEM. In most cases these projects also contain code that processes the data.

   * Eclipse Project names
         o org.eclipse.stem.geography
           Contains the Latitude/Longitude data for the Administration areas.
         o org.eclipse.stem.internal.data
           Contains the properties files that describe all of the data in STEM.
         o org.eclipse.stem.data
           No longer actively used and should be ignored.
         o org.eclipse.stem.utility
           Code and data used to prepare the properties files. 

Property files in org.eclipse.stem.internal.data

The following property files are found in the directories listed below. XXX is the three-character country code and N is the administration level.

   * resources/data/country/XXX/ There is a separate subdirectory for each country.
         o XXX_N_area.properties This property file contains the area of the administration level in square kilometers. The key is the key for the administration level (level 0 = USA; level 1 = US-CA; level 2 = US-CA-06075).
         o XXX_N_human_2006_population.properties This property file contains the population of the administration level.
         o XXX_N_node.properties This property file contains the full name of the administration area. >> we need an example of how to access it <<
         o XXX_names.properties This property file contains a cross-reference of both level 1 and level 2 names.
   * resources/data/relationship This contains the following subdirectories.
         o airtransport
         o commonborder See below.


   * resources/data/decorator/disease This contains property files that describe existing diseases.


Spatial Data (Geographic Coordinates): data that provides the latitude and longitude coordinates for the borders of the Administration areas.


Project: org.eclipse.stem.geography
Path:   resources/data/geo/country/XXX/
     XXX_0_MAP.xml
     XXX_1_MAP.xml
     XXX_2_MAP.xml

Example URI: platform:/plugin/org.eclipse.stem.geography/
                   resources/data/geo/country/USA/USA_2_MAP.xml


Key Format: 
  For level 1 data, the key is the ISO3166-2 code. 
    An ISO3166-2 code is composed as follows: 
      Two letter country code followed by up to three 
      alphanumeric characters for the level 1 administration.
  For level 2 data, the key is the ISO3166-2 code followed by
  up to six digits. 
    The leftmost two digits indicate the level 1 container of a 
    level 2 administration (e.g. California is the level 1
    container for Orange County, a level 2 administration).  
    The two digits were taken from a lexicographic sorting of 
    all the level 1 administrations within a country. 
    Similarly, the remaining digits (up to four) indicate the
    level 2 administration. 
    Again, these four digits are an index into the lexicographic
    sorting of all level 2 administrations within a level 1
    administration.
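Because each administration level adds one hyphen-separated segment to the key, the level can be read directly off a key. The Python sketch below illustrates this; the function admin_level is hypothetical (it is not a STEM API) and assumes keys shaped like the examples in this section, such as USA, US-CA, and US-CA-06075.

```python
def admin_level(key: str) -> int:
    """Infer the administration level (0, 1, or 2) from a STEM-style key."""
    level = key.count("-")
    if level > 2:
        raise ValueError(f"unexpected key format: {key!r}")
    return level
```

This is useful, for instance, when choosing which XXX_N_*.properties file a given key belongs to.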


Common Border Data that describes which Administration areas have common borders.


Project: org.eclipse.stem.internal.data
Path:   resources/data/relationship/commonborder/
     XXX_2_XXX_2.properties  
               Common Borders between admin areas within country XXX 
     XXX_2_YYY_2.properties  
               Common Borders between admin areas of country XXX to 
               the admin areas of Country YYY where XXX and YYY have 
               a common border.
     The property files will list the bordering admin level 2 areas. 
Example URI: platform:/plugin/org.eclipse.stem.internal.data/
               resources/data/relationship/
               commonborder/AFG_2_CHN_2.properties


Generated by:  NeighborUtility.java      
Located in:    org.eclipse.stem.internal.data
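The naming pattern above can be turned into a small helper for locating a common border file. The following Python sketch is illustrative only: the function names are hypothetical, the URI prefix is copied from the example URI above, and no assumption is made about which country must come first in the file name, so callers pass the codes in the order used by the file they are looking for.

```python
COMMON_BORDER_PREFIX = (
    "platform:/plugin/org.eclipse.stem.internal.data/"
    "resources/data/relationship/commonborder/"
)


def common_border_filename(country_a: str, country_b: str, level: int = 2) -> str:
    """Build a file name following the XXX_N_YYY_N.properties pattern."""
    return f"{country_a}_{level}_{country_b}_{level}.properties"


def common_border_uri(country_a: str, country_b: str, level: int = 2) -> str:
    """Build the platform URI for a common border property file."""
    return COMMON_BORDER_PREFIX + common_border_filename(country_a, country_b, level)
```

Passing the same code twice, as in common_border_filename("USA", "USA"), gives the name of the file that lists borders between admin areas within a single country.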

Other Information

Tools

The following is a list of software that is useful to have while working on the STEM project.

Tool                 Version   URL
OpenOffice           V2.0.1    http://www.openoffice.org
IBM's Java™ 5 JDK    5         http://w3.hursley.ibm.com/java/jim
Eclipse IDE          V3.3      http://www.eclipse.org/downloads/
Jdraw                V1.1.4    http://www.j-domain.de/homepage.php?page=20

Optional Eclipse Features and Plug-ins

Plug-in            Description                        URL
Subclipse          Subversion client
UMLet              UML class diagram drawing tool     http://www.umlet.com/
Metrics            Java™ code metrics computations    http://metrics.sourceforge.net
Merlin Generator   GEF generator from EMF             http://sourceforge.net/projects/merlingenerator

Reference Sources

Books

[8] Java in a Nutshell, Fifth Edition. Flanagan, D., O'Reilly, 2005, ISBN 0-596-00773-6. This is a great reference for Java™. Indispensable.

[9] Eclipse Modeling Framework. Budinsky, F., et al., Addison-Wesley, 2003, ISBN 0131425420. This is THE book on the Eclipse Modeling Framework (EMF). Note: a second edition is due out very soon and is available for pre-order.

[10] Official Eclipse 3.0 FAQs. Arthorne, J., Laffra, C., Addison-Wesley, 2004, ISBN 0321268385. This is an excellent source of quick answers to great questions.

[11] Eclipse: Building Commercial-Quality Plug-Ins. Clayberg, E., Rubel, D., Addison-Wesley, 2004, ISBN 0321228472. Good reference, but starts out a bit too simply.

[12] Eclipse Distilled. Carlson, D., Addison-Wesley, 2005, ISBN 0321288157. A concise guide with advanced explanations of how to exploit the features of Eclipse. Very useful for people who are already familiar with Eclipse because it adds an extra layer of sophistication to the reader's bag of tricks.

[13] Eclipse Rich Client Platform: Designing, Coding, and Packaging Java™ Applications. McAffer, J., et al., Addison-Wesley, 2005, ISBN 0321334612.

[14] Eclipse Cookbook. Holzner, S., O'Reilly, 2004, ISBN 0-596-00710-8. Good "how to" guide to using Eclipse, less so for Eclipse plug-in development itself.

[15] Eclipse. Holzner, S., O'Reilly, 2004, ISBN 0-596-00641-1. Good starter for new Eclipse users/developers.

The Definitive Guide to SWT and JFace. Harris, R., Warner, R., Apress, 2004, ISBN 1-59059-325-1.

[16] UML for Java™ Programmers. Martin, R., Prentice Hall, 2003, ISBN 0131428489. Good "what you need to know" introduction to UML.

[17] Producing Open Source Software. Fogel, K., O'Reilly, 2005, ISBN 0-596-00759-0. This is a great book on how to manage software development in general, not just open source. We're using it as a guide for this project.

Copyright © 2010 The Eclipse Foundation. All Rights Reserved

This page was last modified 19:16, 24 June 2010 by James Kaufman. Based on work by Werner Keil and Yossi Mesika, Eclipsepedia user(s) Edlund.almaden.ibm.com and others.

STEM Documentation

Introduction

A Global H1N1 Simulation.

What is the Spatiotemporal Epidemiological Modeler (STEM)?

The Spatiotemporal Epidemiological Modeler (STEM) tool is designed to help scientists and public health officials create and use spatial and temporal models of emerging infectious diseases. These models can aid in understanding and potentially preventing the spread of such diseases.

Policymakers responsible for strategies to contain disease and prevent epidemics need an accurate understanding of disease dynamics and the likely outcomes of preventive actions. In an increasingly connected world with extremely efficient global transportation links, the vectors of infection can be quite complex. STEM facilitates the development of advanced mathematical models, the creation of flexible models involving multiple populations (species) and interactions between diseases, and a better understanding of epidemiology.

How does it work?

The STEM application has built-in Geographic Information System (GIS) data for almost every country in the world. It comes with data about country borders, populations, shared borders (neighbors), interstate highways, state highways, and airports. This data comes from various public sources.

STEM is designed to make it easy for developers and researchers to plug in their choice of models. It comes with spatiotemporal Susceptible/Infectious/Recovered (SIR) and Susceptible/Exposed/Infectious/Recovered (SEIR) models pre-coded with both deterministic and stochastic engines.

The parameters in any model are specified in XML configuration files. Users can easily change the weight or significance of various disease vectors (such as highways, shared borders, airports, etc). Users can also create their own unique vectors for disease. Further details are available in the user manual and design documentation.

The original version of STEM was available for downloading on IBM's Alphaworks. It contained easy to follow instructions and many examples (various diseases and maps of the world).

New developers who want to work on STEM can find useful tools, conventions, and design information in the Welcome STEM Developers article.

The STEM source code is hosted in the Eclipse Technology code repository.

What's New

New Graph Wizard

STEM now supports user creation of custom lattices via the New Graph wizard. The wizard makes use of "Graph Generators," a new concept. A graph generator is a pluggable component that is able to generate a graph (a collection of nodes and edges) either algorithmically or from an external file. Currently, we have implemented an abstract Lattice Graph Generator with a Square Lattice implementation. A user can specify the size of the lattice as well as several options for how the nearest-neighbor (Common Border) relationships are organized. In the future we also plan to support creating a new graph from a file.
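To make the idea of an algorithmic graph generator concrete, here is a minimal Python sketch of a square lattice with nearest-neighbor edges. It is not the STEM Lattice Graph Generator: the function square_lattice is hypothetical, nodes are plain (row, column) tuples rather than STEM nodes, and only the simplest neighbor option (up, down, left, right, with no wrap-around) is shown.

```python
def square_lattice(rows: int, cols: int):
    """Return (nodes, edges) for a rows x cols lattice with 4-neighbor edges."""
    nodes = [(r, c) for r in range(rows) for c in range(cols)]
    edges = []
    for r, c in nodes:
        if c + 1 < cols:
            edges.append(((r, c), (r, c + 1)))  # neighbor to the right
        if r + 1 < rows:
            edges.append(((r, c), (r + 1, c)))  # neighbor below
    return nodes, edges


nodes, edges = square_lattice(2, 3)
print(len(nodes), len(edges))  # prints "6 7"
```

Each undirected edge is listed once. Other neighbor arrangements, such as including diagonals or wrapping the lattice edges around, would only change the rules inside the loop.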

Infectors, Inoculators, and Population Initializers Now Support UIDs for Arbitrary Graphs

Several wizards (Infector/Inoculator, Population Model, Population Initializer) now use a consolidated location picker dialog that gives users the option of selecting any location within the currently selected project. This allows the user to pick for instance one of the automatically generated grids on the lattice graph. Whatever UIDs exist in the user’s graph can be applied. The dialog also filters down the number of possible locations dramatically from all regions in the world to only the ones that are applicable in the current project.

External File Implemented for Creating New Infectors and Inoculators

There are now several new options available when a new infector or inoculator is created. First, it is possible to create an infector or inoculator by importing data from external files in the comma-separated format used by the STEM logger. A collection of infectors/inoculators can be created from the first, last, or any manually specified row of such files. We have found this feature extremely valuable when "boot-strapping" the initial state of a disease from a steady state (e.g., the seasonal flu state of the population in the summertime for the northern hemisphere).

Older News

Weekly STEM Conference Call


The STEM community has a weekly conference call. The call is held most Wednesdays at 1PM ET (10AM PDT). For more information, or if you wish to join, please send e-mail to judyvdouglas@verizon.net.

How You Can Contribute to STEM

New Contributors to STEM are always welcome (please contact the developers). This includes not only researchers interested in disease modeling but also experts in GIS data or any other data that might be important in understanding or modeling the spread of infectious disease. We also welcome input from users and contributions to our documentation.

To contribute to STEM, please use the standard Eclipse process. Open a "bug" in our Bugzilla (https://bugs.eclipse.org/bugs/). A bug can be more than just a new defect; it can also be a new feature or other contribution. You can attach your contribution as a "patch" to your bug (http://wiki.eclipse.org/index.php/Bug_Reporting_FAQ). Please also feel free to e-mail the STEM development team, many of whom are Eclipse committers (see The STEM Development Team). For those interested in joining the project, we also have a weekly phone call and a newsgroup.

Publications and Presentations on STEM

Talks Online Featuring a recent talk at Epidemics 2009

Keil WP. "Spatio-Temporal Epidemiologic Modeler (STEM)", Eclipse DemoCamp 2009, Vienna, Austria, November 2009. [1]

Edlund S, Bromberg M, Chodick G, Douglas J, Ford D, Kaufman Z, Lessler J, Marom R, Mesika Y, Ram R, Shalev V, Kaufman J. 2009. "A spatiotemporal model for influenza." HIC 2009, Frontiers of Health Informatics, Canberra, Australia, August 19-21, 2009. http://www.hisa.org.au/system/files/u2233/hic09-2_StefanEdlund.pdf

Kaufman J, Edlund S, Douglas J. 2009. "Infectious disease modeling: creating a community to respond to biological threats." Statistical Communications in Infectious Diseases, Vol 1, Issue 1, Article 1. The Berkeley Electronic Press. [2]

Kaufman J, Edlund S, Bromberg M, Chodick G, Lessler J, Mesika Y, Ram R, Douglas J, Kaufman Z, Leventhal A, Marom R, Shalev V. 2009. "Temporal and spatial effects of lunar calendar holidays on influenza A transmission in Israel." Accepted for presentation at Epidemics 2, Athens, Greece, December 2009.

Hulse, C. L., Conant, J. L., Kaufman, J. H., Edlund, S. B., Ford, D. A., “Development and Utilization of a Spatial and Temporal Modeling System to Investigate Disease Outbreaks in Vermont”, PHIN 2009 [3]

Edlund S, Bromberg M, Chodick G, Douglas J, Ford D, Kaufman Z, Lessler J, Marom R, Mesika Y, Ram R, Shalev V, Kaufman J. 2009. "A spatiotemporal model for influenza." submitted eJHI.

Keil WP. 2009. "The spatiotemporal epidemiological modeller (STEM)." Presentation at epicenter 2009 Conference, Dublin, Ireland, Aug 28, 2009.

Edlund S, Kaufman J, Douglas J, Bromberg M, Kaufman A, Chodick G, Marom R, Shalev V, Lessler J, Mesika Y, Ram R, Leventhal A. 2009. "A study of two spatiotemporal models for seasonal influenza." in preparation.

Ford DA, Kaufman JH, Mesika Y. 2009 (In press). "Modeling in space and time: a framework for visualization and collaboration." In D. Zeng et al. (eds), Infectious Disease Informatics. New York: Springer.

Lessler J, Kaufman JH, Ford DA, Douglas JV. 2009. "The cost of simplifying air travel when modeling disease spread," PLoS ONE 4(2): e4403. doi: 10.1371/journal.pone.0004403.

Kaufman, J.H., Edlund S, Ford, D.A. "Spatio-Temporal Epidemiologic Modeler: Application in Middle Eastern Countries", PHIFP colloquium at U.S. Centers For Disease Control, Atlanta. May 8, 2009

Kaufman, J.H., Edlund S, Ford, D.A., Renly, S., R. Ram, Y. Messika, et al. "Spatio-Temporal Epidemiologic Modeler: Evaluation of 10 years of Influenza Data from Maccabi Healthcare". Presentation at Israel Centers for Disease Control, Tel Aviv, Israel. April 16, 2009

Kaufman JH, Conant JL, Ford DA, Kirihata W, Jones B, Douglas JV. "Assessing the accuracy of spatiotemporal epidemiological models," in D. Zeng et al. (Eds): BioSecure 2008, LNCS 5354, pp. 143-154, 2008. Also presented at BioSecure 2008, Biosurveillance and Biosecurity Workshop, Raleigh, NC, Dec 2, 2008.

Ford DA, Kaufman JH. 2008. "The spatiotemporal epidemiological modeller (STEM)." Presentation at the Joint Session Homeland/Humanitarian Preparedness for Pandemic Influenza, Washington, DC, Oct 13, 2008.

Kaufman JH, Conant JL, Ford DA, Kirihata W, Douglas JV, Jones BA. 2008 (December). "Assessing the accuracy of spatiotemporal epidemiological models." In D. Zeng et al. (eds): BioSecure 2008, LNCS 5354, pp. 143-154.

Kaufman JH, Ford DA, Mesika Y, Lessler J. 2008 (December). Modeling disease spread in space and time: extending and validating an open source tool for public health. Epidemics (EPID2008): First International Conference on Infectious Disease Dynamics. Asilomar, CA, December 1-3, 2008.

Keil WP, Ford DA, Kaufman JH. 2007 (In press), "Eclipse OHF STEM", Eclipse Magazin, S&S Verlag, July 2007

Ford DA, Kaufman JH, Eiron I, "An extensible spatial and temporal epidemiological modeling system," International Journal of Health Geographics 2006, 5:4 http://www.ij-healthgeographics.com/content/5/1/4 (17Jan2006)

Keil WP. 2005. "Eclipse Open Health Framework (OHF)." Presentation at JavaPolis 2005 Conference, Antwerp, Belgium, Dec 12, 2005.

More About STEM

STEM Plugins

STEM takes advantage of Equinox (the Eclipse implementation of OSGi) to make all of its components available as Eclipse plug-ins. This means that both the models and data that come with STEM are reusable, exchangeable, extendable, and, if you don't like them, replaceable. The disease models in STEM are based on standard compartment models at the level of Anderson and May. Today we have both stochastic and deterministic implementations of SI, SIR, and SEIR models. We are also developing models for specific diseases, including a global model for seasonal influenza. Any model in STEM requires a solver to integrate the differential equations. Today we have both Finite Difference and Runge-Kutta mathematical solvers for every model. The solvers themselves are provided as plug-ins.
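To illustrate what a solver does with a compartment model, the Python sketch below integrates a textbook deterministic SIR model with two interchangeable step functions: a forward finite-difference (Euler) step and a classical fourth-order Runge-Kutta step. This is a generic illustration and not STEM's solver code; the function names, the normalized form of the equations (s + i + r = 1), and the parameter names beta and gamma are assumptions made for the example.

```python
def sir_derivatives(state, beta, gamma):
    """Right-hand side of the normalized SIR equations."""
    s, i, r = state
    new_infections = beta * s * i
    recoveries = gamma * i
    return (-new_infections, new_infections - recoveries, recoveries)


def euler_step(state, dt, beta, gamma):
    """Forward finite-difference step: state + dt * derivative."""
    derivs = sir_derivatives(state, beta, gamma)
    return tuple(x + dt * dx for x, dx in zip(state, derivs))


def rk4_step(state, dt, beta, gamma):
    """Classical fourth-order Runge-Kutta step."""
    def advance(k, h):
        return tuple(x + h * dk for x, dk in zip(state, k))

    k1 = sir_derivatives(state, beta, gamma)
    k2 = sir_derivatives(advance(k1, dt / 2), beta, gamma)
    k3 = sir_derivatives(advance(k2, dt / 2), beta, gamma)
    k4 = sir_derivatives(advance(k3, dt), beta, gamma)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(step, state, dt, steps, beta, gamma):
    """Advance the model `steps` times with the chosen step function."""
    for _ in range(steps):
        state = step(state, dt, beta, gamma)
    return state
```

Because both step functions share the same signature, either can be passed to simulate, which mirrors in miniature the idea of solvers being swappable components.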

Data

Regardless of the computational approach used to predict the future state of a disease, any simulation or model requires "denominator" data. The numerator data usually comes from public health data on specific reportable conditions. You can think of numerator data as an "initial condition" or input into a model. Historic (numerator) data can be used to validate a model. However, for every model there is a need for a large variety of denominator data. This includes information on the population itself as well as information on the vectors of disease transmission. Depending on the disease of interest, these vectors might include a transportation model or even a model of a migrating wild animal population. The Eclipse Framework gives users a simple "plug and play" software architecture with a drag and drop interface, so when composing a new model or scenario users can drag specific data of interest into their model and literally "compose" a new scenario. The data sets distributed with STEM include geography, transportation systems (e.g., roads, air travel), and population for the 244 countries and dependent areas defined by the International Organization for Standardization (ISO). We make no guarantees about the accuracy of the data provided, and users may certainly choose to plug in their own data, replacing what comes with STEM. The data provided comes from open public sources including:

Population

World Population Density.

The population in STEM is based on census data for the administrative divisions that correspond to the polygons visible on the STEM maps. Where possible, administrative data was included down to ISO-3166 administrative level 2 (equivalent of counties in the U.S.). In some cases we were only able to obtain data at administrative level 0 (country level data). For those instances where we do have admin 2 data, the relative population distributions between subdivisions within the countries were validated against The ORNL LandScan 2007(TM)/UT-Battelle, LLC data set. The LandScan 2007™ High Resolution Global Population Data Set copyrighted by UT-Battelle, LLC, operator of Oak Ridge National Laboratory under Contract No. DE-AC05-00OR22725 with the United States Department of Energy. The United States Government has certain rights in this Data Set. NEITHER UT-BATTELLE, LLC NOR THE UNITED STATES DEPARTMENT OF ENERGY, NOR ANY OF THEIR EMPLOYEES, MAKES ANY WARRANTY, EXPRESS OR IMPLIED, OR ASSUMES ANY LEGAL LIABILITY OR RESPONSIBILITY FOR THE ACCURACY, COMPLETENESS, OR USEFULNESS OF THE DATA SET.

STEM itself does not include the LandScan(TM) data. In fact, in all cases the total population at the national level is based on national census facts by country and not based on LandScan(TM). Almost all the census Facts are referenced to (or scaled to) year 2006. For users interested in creating, on their own, a new STEM population set referenced to LandScan(TM) itself, LandScan(TM) Dataset licenses are available free of charge for U.S. Federal Government, for United Nations Humanitarian efforts, and educational research use. A user defined data set could, if desired, also replace the administrative divisions in STEM with a user defined grid of arbitrary resolution.

Development of STEM is supported, in part by the U.S. Air Force Surgeon General’s Office (USAF/SG) and administered by the Air Force District of Washington (AFDW) under Contract Number FA7014-07-C-0004. Neither Eclipse, the United States Airforce, IBM, nor any of their employees, nor any contributors to STEM, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of the stem data sets. The Air Force has not accepted the products depicted and issuance of a contract does not constitute Federal endorsement of the IBM Almaden Research Center.

Population Models

Even though the population data is based on a particular set of facts derived or interpolated from a census in some year, you may want to do a simulation in a different year. Even if you replace the data included in STEM with your own data set, you may want to study years other than the year in which that data set was "considered valid." For this reason, population in STEM is not represented as a static label value. Population is, instead, represented by a population model. The purpose of a population model is to handle general effects on a population that are not caused by a specific disease outbreak. Right now, there is only one Population Model available that allows you to define a background birth and death rate for your scenario. Some things to note:

1. Background birth rate and death rate are no longer available within the standard disease models. They have been moved to the Population Model. So for old scenarios where you had your birth/death rate defined, you need to create a new population model, specify your birth/death rate, and drag it into your scenario.

2. To create a population model, go to the menu (New -> Population Model).

3. A population model is just like a disease model (it ends up under decorators in the project explorer), and it will store its own log files under its name in the log folder.

4. Disease models and population models should work well together and synchronize background births/deaths and disease deaths with each other at each iteration. If you don't have a population model in your scenario, the background birth/death rate is 0. When two or more diseases are running simultaneously, they will incorporate each other's disease deaths into their calculations.
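
As a rough illustration of what a background birth and death rate does to a population between the census reference year and the simulated year, the following minimal sketch applies constant per-person rates once per time step. It is not STEM's implementation (STEM runs on Eclipse and uses its own solvers); the function names, the daily time step, and the example rates are assumptions chosen only for illustration.

```python
def step_population(population, birth_rate, death_rate, disease_deaths=0.0):
    """Advance a population count by one time step.

    birth_rate and death_rate are hypothetical per-person, per-step
    background rates; disease_deaths is the number of deaths reported
    by disease models during the same step.
    """
    births = birth_rate * population
    background_deaths = death_rate * population
    return population + births - background_deaths - disease_deaths


def project_population(population, birth_rate, death_rate, steps):
    """Apply the background rates for a number of steps with no disease."""
    for _ in range(steps):
        population = step_population(population, birth_rate, death_rate)
    return population


# With background rates of 0 (the case when a scenario has no population
# model), the population stays at its census value.
unchanged = project_population(1_000_000.0, 0.0, 0.0, steps=365)

# Illustrative (made-up) daily rates: births slightly exceed deaths.
grown = project_population(1_000_000.0, 0.00004, 0.00002, steps=365)
```

Passing a nonzero `disease_deaths` value to `step_population` mirrors, in a simplified way, how disease deaths and background births/deaths are combined in each iteration.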

Scenarios

User-designed scenarios include selected denominator data, initial conditions, geographic data (a region of the world or the entire world), mathematical disease models, a solver, start and stop dates, etc. These scenarios themselves are represented in the Eclipse framework so they too can be built upon and exchanged. STEM is thus intended to provide a common collaborative platform that enables sharing: models can be imported and exported so that they can be easily exchanged among researchers. A researcher who has developed a detailed country model that includes population demographics can import a component with specialized disease mathematics from another researcher and combine the two. The new combination can then be re-exported (with descriptive metadata) for others to use. The design goal of STEM is for the country sub-model to be a standardized community resource maintained and refined by many different contributors. Over time, the country model will become more accurate, detailed, and valuable. With data available as plug-ins, researchers will be free to contribute data in their field of expertise. Thanks to Eclipse, researchers will find it easier to compare and share different models because their underlying components will be the same. On the STEM homepage you can find links to some scenarios that you can import, run, and modify in this way.

The STEM Community

Realizing the potential of STEM as an open source tool depends upon the involvement of researchers across settings. STEM developers are working closely with early leaders in the STEM community to provide them with the tools they need.


The STEM Development Team

  • James H. Kaufman, Ph.D., is manager of the Public Health Research project in the Department of Computer Science at the IBM Almaden Research Center. He received his B.A. in Physics from Cornell University and his Ph.D. in Physics from U.C.S.B. He is a Fellow of the American Physical Society and a Distinguished Scientist of the ACM. During his research career Dr. Kaufman has made contributions to several fields ranging from simulation science to magnetic device technology. His scientific contributions include work on pattern formation, conducting polymers, diamond-like carbon, superconductivity, experimental studies of the Moon Illusion, and contributions to distributed computing, privacy protection, and grid middleware. His current research interests include Public Health, Interoperable Health Information Infrastructure, Electronic Health Records, and Epidemiological Modeling. He is one of the original creators of STEM. His hobbies include martial arts, watercolor painting, and tropical fish. (kaufman@almaden.ibm.com)
  • Daniel Ford, Ph.D., is a Research Staff Member in the Department of Computer Science at the IBM Almaden Research Center. Daniel is the Eclipse Project Lead for STEM. He designed and implemented the initial versions of STEM, including the core composable graph framework that gives STEM its ability to represent arbitrary models. He received his Ph.D. in Computer Science from the University of Waterloo. (daford@almaden.ibm.com)
  • Stefan Edlund is a senior software engineer in the Healthcare Research team at IBM Almaden, developing new technologies related to the public health domain. Stefan has over 10 years of experience at IBM, having worked on a broad range of technologies such as DB2 query visualization, intelligent personal calendars, exploratory Lotus applications, location-based services, and more recently content management and content replication as well as development of an email search and discovery product (IBM eDiscovery Manager). Stefan's current research interests include development of new STEM disease models, including diseases involving multiple populations and multiple serotypes. Stefan holds an MS degree in computer science from the Royal Institute of Technology in Stockholm. He currently holds over 15 US patents. (edlund@almaden.ibm.com)
  • Matthew Davis is a graduate of the University of Oklahoma, where he earned both his BS and MS in computer science. During his time at OU, he spent a significant amount of time developing a series of web portals with the aim of aiding in building student communities. These applications were later open sourced and adopted by several universities across the United States. Additionally, he spent time teaching in the computer science department at OU, primarily as an instructor for a senior-level computer graphics course. During the summer of 2006, Matt participated in the IBM Extreme Blue internship program at Almaden. He later joined IBM Research where, as an Eclipse Committer on the OHF project, he helped develop the Eclipse Open Healthcare Framework (OHF) Bridge. The OHF Bridge is a web services platform that enables access to OHF actor profiles from non-Java applications. Matt is currently working on a server-side implementation of STEM. (mattadav@us.ibm.com)
  • Judith V. Douglas is the lead technical writer for STEM. She coordinates STEM documentation and is coauthor on many of the STEM scientific publications. She holds master's degrees from Northwestern University and the Johns Hopkins Bloomberg School of Public Health and has published extensively in healthcare informatics. (judyvdouglas@verizon.net)
  • Barbara A. Jones, Ph.D., is a theoretical physicist at the IBM Almaden Research Center. She has contributed many mathematical algorithms, and has a long-term interest in applying advanced mathematical and physics methods to problems in epidemiology.
  • Justin Lessler, Ph.D., is in the Epidemiology Department at the Bloomberg School of Public Health at the Johns Hopkins University in Baltimore, Maryland.
  • Yossi Mesika is a research staff member in Healthcare and Life Sciences, at the IBM Haifa Research Lab in Haifa, Israel. He received his B.Sc. in Computer Engineering from Technion, the Israel Institute of Technology in Haifa. He joined IBM in 2003 and has contributed to several Healthcare projects that deal with interoperability. An Eclipse committer, he has also contributed to the WADO component in the Eclipse Open Health Framework. (mesika@il.ibm.com)
  • Roni Ram is a research staff member in the Healthcare and Life Sciences group, IBM Haifa Research Lab. She received her B.Sc. and M.Sc. in computer sciences from the Technion, Israel Institute of Technology in Haifa, Israel. Since joining IBM in 1996, she has worked on several projects involving user interfaces and IP telephony. For the last three years, she has been working on interoperability among health care organizations with a focus on the public health domain.
  • Arik Kershenbaum works part-time at the IBM Haifa Research Lab and is a doctoral student at the Department of Evolutionary and Environmental Biology at the University of Haifa, Israel (personal website). He is developing add-ons and applications for STEM in the fields of zoonotic disease spread, particularly vector-borne diseases. In addition, he is looking at other collaborative applications for the STEM framework in the fields of ecology and zoology, where ecosystems can be represented as a graph network. He has an undergraduate degree in Natural Sciences from the University of Cambridge in England.
  • Werner Keil is a freelance IT Architect, Eclipse RCP Developer, and Consultant who has worked for governments and Global 500 companies worldwide. He has worked for more than 20 years as project manager, software architect, analyst and consultant on leading-edge technologies for Banking, Insurance, Telco/Mobile, Media and Public sector/Healthcare. Werner is a committing member of the Eclipse Foundation and Babel Language Champion (German). He is also an active member of the Java Community Process, with roles including UCUM/Java Lead, JavaEE 6 EG and Executive Committee Member (SE/EE). (werner.keil@gmx.net)
  • Matthias Filter is a research staff member at the Federal Institute for Risk Assessment (BfR), Germany.

VERMONT

Researchers in Vermont are using STEM to model disease outbreaks in the state. They have created a model at the town/city level and are using transportation corridors (interstates and highways) as the pathways for disease spread. Currently they are investigating the potential spread of Pandemic Flu (based on Spanish Flu disease parameters) in a variety of scenarios and are examining how various interventions would mitigate the spread of the disease. In the future, they hope to look at how environmental changes will affect the emergence and spread of zoonotic infections.

The Vermont researchers are:

  • Joanna "Jo" Conant. Jo graduated from Middlebury College in 2003 and is now a medical student at the University of Vermont College of Medicine. She is considering a career in Public Health, though she is also exploring other specialties. An avid skier, Jo moved from the deserts of Phoenix, Arizona, to the mountains of Vermont, where she enjoys skiing 100+ days each year. She now lives in Warren, Vermont, with her husband and dog.
  • Charles "Chuck" Hulse. Chuck graduated from Bucknell University in 1982, received his PhD in Chemistry from the University of Virginia in 1989 and his MD from the University of North Carolina at Chapel Hill in 1995. He completed his family medicine residency at the Department of Family Medicine of the University of Vermont College of Medicine in 1998. After serving as chief resident, he joined the faculty and is now an Associate Professor of Family Medicine. A native of Eastern Long Island, Chuck has an intense interest in nature and is an aspiring nature photographer. He lives with his family in the beautiful Champlain Islands, where he raises heirloom vegetables, fruits and berries, bees, chickens, goats, and sheep.


The STEM Developers Emeritus

  • Iris Eiron was a researcher at the IBM Almaden Research Lab before relocating to the IBM Research Lab in Haifa, Israel, where she continues to contribute to the development and implementation of a national health care information infrastructure. Together with Matthew Hammer and James Kaufman, Iris was one of the creators of the original version of STEM.
  • Matthew Hammer was an undergraduate at the University of Wisconsin, where he majored in computer science with an interest in the field of programming languages. Mr. Hammer worked as an IBM research intern in the summers of 2003 and 2004. Together with Iris Eiron and James Kaufman, Matthew was one of the creators of the original version of STEM.
  • Ohad Greenshpan is part of the Healthcare and Life Sciences group at the IBM Haifa Research Labs. Mr. Greenshpan is an MSc student in Bioinformatics at Ben-Gurion University, concentrating on Protein Folding algorithms and Structural Bioinformatics. Prior to joining IBM, Mr. Greenshpan was a member of the GeneCards team at the Weizmann Institute of Science.
  • Nelson A. Perez was a software engineer for the Healthcare Informatics Research Group at IBM Almaden. Nowadays, Nelson is mostly interested in software engineering, distributed computing, social computing, and web technologies. He holds an MS degree in computer science from the University of California at Riverside.
  • John Thomas is a Java developer for IBM. He was previously one of the lead programmers for the IBM Almaden TSpaces project and also a member of the OptimalGrid Project at the Almaden Research Center. (jthomas119@gmail.com)

Eclipse Project Mentors

Our Eclipse Project Mentors are Ed Merks (ed.merks@gmail.com) and Chris Aniszczyk (zx@eclipsesource.com).
