Difference between revisions of "Welcome STEM Developers"
Revision as of 12:14, 19 January 2010
This article is intended to be the starting point for developers working on the Spatio-Temporal Epidemiological Modeler Project (STEM) code base. It contains a detailed description of how the project is organized, how to find all of the resources associated with the project, and how to install, configure and use the necessary development environment.
One of the advantages of this document is that it gets newcomers “up and running” faster and allows them to contribute to the project immediately by correcting any inaccuracies or omissions the document contains or by clarifying any descriptions. As a newcomer to the project you have a unique and extremely valuable perspective: if you can't find something or it doesn't make sense to you, then you're probably not alone. When you find the answer to your quandary, please record it here so that those who come after you can “stand on your shoulders”. Welcome aboard!
The STEM code base is written in Java™ Version 5 and is organized as a set of well-defined components using the Eclipse (www.eclipse.org) plug-in tool framework. The components are integrated through a well-defined “extension point” mechanism that makes the entire code base highly extensible. The use of the Eclipse framework also provides for multi-platform portability.
Interestingly, Eclipse is also the STEM project's Java™ development environment. If this seems a bit strange at first, don't worry: as you understand more about Eclipse and the project, this will make more sense. The project is organized and managed as an open source project. Much of the inspiration, philosophy and techniques for its management are taken directly from the book Producing Open Source Software by Karl Fogel (see the references later in this document).
Step-by-step for STEM developers
- Read this article.
- When you find missing, confusing, or erroneous information, please contribute by fixing the problem. This Wiki allows any Bugzilla registered user to update content. Please do!
- Visit the STEM project page. Bookmark it.
- Obtain a Bugzilla account. 
- Subscribe to the project mailing list. The mailing list is used for all internal developer communication.
- Subscribe to the Newsgroup. 
- Install Java 5.0 and Eclipse 3.3 
- Obtain the source code for the STEM project from the code repository. 
You can also review the article on Eclipse Development Resources http://wiki.eclipse.org/index.php/Development_Resources
Coding Conventions and Guidelines
As with any product being built by a team, there are various areas where standards, conventions, and other guidelines can play a role in helping to ensure that the resulting product presents to developers and customers as a unified whole rather than as a loose collection of parts worked on by a variety of individuals each with their own styles and ways of working.
The STEM project will use the conventions and guidelines used by the Eclipse project. See the Eclipse Development Conventions and Guidelines.
The most important guidelines for STEM code are that it have good JavaDoc documentation and not generate compiler warnings.
The Eclipse compiler verifies the code against a preference list and will generate warning messages for code that does not follow the rules specified in the compiler preferences. For example, it will check for "unchecked Generic type operations" and give a warning. We ask that you remove all of the Eclipse compiler warnings before submitting the code. In addition, the preferences for the following Javadoc options should be changed to generate warnings, and the warnings removed by fixing the code and/or creating accurate and informative Javadoc comments:
Window->Preferences->Java->Compiler->Javadoc:
- Process Javadoc comments
- Malformed Javadoc comments: Warning
- Missing Javadoc tags (public): Warning
- Missing Javadoc comments (public): Warning
If you are the owner or creator of a STEM subproject, you can create more restrictive standards by checking in a set of project preferences. However, please do not remove warnings by checking in project preferences that ignore bad coding practices.
A copyright statement should be on any human-created text document, including HTML, Java, properties, XML, and XSD files. Auto-generated (machine-made) text files do not need to have a copyright statement. Such files are compiled from a human-created file and it is not practical to add a copyright statement to them (just as we wouldn't add one to class files). To be specific, we are referring to auto-generated JavaDocs (HTML) and EMF models (Java). For .java files the copyright statement should follow the package statement and precede the import statements. The Copyright statement is further described here
package org.eclipse.stem;

/*************************************************************************
 * Copyright (c) 2006,2007 IBM Corporation and others.
 * All rights reserved. This program and the accompanying materials
 * are made available under the terms of the Eclipse Public License v1.0
 * which accompanies this distribution, and is available at
 * http://www.eclipse.org/legal/epl-v10.html
 *
 * Contributors:
 *     IBM Corporation - initial API and implementation
 *************************************************************************/

import ... ;
To have the above automatically inserted in new code, do the following:
- select the above comment (from /* to */)
- select Window->preferences
- select Java->Code Style->CodeTemplates->Types
- select edit
- paste the copyright statement in place of the existing "Author" statement.
- select OK
National Language Support
STEM supports languages other than English. The basic process is to provide properly named properties files that mirror the "native" English properties files. These files are grouped into a "plug-in fragment" which acts as an "add on" to a plug-in and adds the files of the fragment to the plug-in as if they were there originally. We separate them so that additional languages can be added without changing the original plug-in.
If a native file is named "messages.properties", then the corresponding file with Spanish translations for the messages would be named "messages_es.properties". It is also possible to provide translations that are specific to a particular country, for instance, for Canadian English, the corresponding file would be "messages_en_CA.properties". For Californian English it would be "messages_en_US_CA.properties". This follows the regular Java conventions.
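The naming rule above can be sketched in a few lines of Java. This is only an illustration: the class BundleFileNames and its method are hypothetical and not part of STEM, and the fallback order shown (most specific name first) is an approximation of what Java's ResourceBundle lookup does, ignoring the additional fallback to the default locale.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the STEM code base. */
class BundleFileNames {

    /**
     * Lists the properties file names that could apply to a locale,
     * most specific first. Pass "" for a locale part that is not used.
     */
    static List<String> candidates(String baseName, String language,
                                   String country, String variant) {
        String[] parts = { language, country, variant };
        // Count how many leading locale parts are actually present.
        int present = 0;
        while (present < parts.length && !parts[present].isEmpty()) {
            present++;
        }
        List<String> names = new ArrayList<String>();
        // Drop one trailing part at a time until only the base name is left.
        for (int count = present; count >= 0; count--) {
            StringBuilder name = new StringBuilder(baseName);
            for (int i = 0; i < count; i++) {
                name.append('_').append(parts[i]);
            }
            names.add(name.append(".properties").toString());
        }
        return names;
    }

    public static void main(String[] args) {
        // "Californian English": language en, country US, variant CA.
        for (String name : candidates("messages", "en", "US", "CA")) {
            System.out.println(name);
        }
    }
}
```

For Spanish, candidates("messages", "es", "", "") yields messages_es.properties followed by the native messages.properties.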
To get us started, I've created one new plug-in fragment called "org.eclipse.ohf.stem.ui.nl1" which contains the properties files for the main UI component of STEM in the plug-in "org.eclipse.ohf.stem.ui". Each of the other plug-ins with translatable strings will require its own (yet to be created) plug-in fragment. I also created properly named, untranslated properties files for Californian English, Canadian English, Spanish, Hebrew, Tamil and Chinese in the new fragment.
To start STEM (Eclipse) in a language different from the working language of the operating system, you need to use the "-nl" command line parameter. For example, use "-nl en_US_CA" to start STEM in Californian English, or "-nl es" to start it in Spanish. You might find it useful to create a new launch configuration in Eclipse with the parameter specified. I have one for each language.
1) Use the "Run..." menu item to open the "Create, manage, and run configurations" dialogue.
2) Duplicate the Eclipse application you use to launch STEM and rename the new one to reflect the language (e.g., "STEM Californian").
3) Select the "Arguments" tab.
4) In the "Program arguments" text box enter the command line parameter (e.g., "-nl en_US_CA" without the quotes, for Californian English).
5) Apply
6) Run
Open the Active Simulations View and you should see a different title than normal.
Here are some good resources for more detailed information:
"Building Commercial Quality Plug-ins", Clayberg,
"The Java Developer's Guide to ECLIPSE", D'Anjou, et al.
Eclipse uses the Bugzilla system for reporting and processing bug reports and enhancement requests for the STEM project. Once a problem or enhancement has been submitted to Bugzilla, the Bugzilla entry should be used for subsequent discussion of the issue.
- The following page is used for creation of a Bugzilla account. 
- The following page is used to set the user email preferences. 
- The bug reporting page is:  Specify STEM as the component
- The following page can be used to find bugs in STEM https://bugs.eclipse.org/bugs/query.cgi
- Enter Technology as the Classification
- Enter STEM as the Product
The following article shows the life cycle of a bug from entering the system to being closed. http://www.eclipse.org/projects/dev_process/bugzilla-use.php
Before reporting a bug, please read the Eclipse bug writing guidelines and please search to see if the bug has already been reported.
There are some important considerations when you submit a Bugzilla report. If you are reporting a problem, the promptness of a resolution depends very much on the completeness of the report. In most cases, a problem needs to be reproduced in order to fix it. Before you press submit, read your report over and ask yourself whether, if you were the STEM developer who receives it, you would have the information you need to reproduce the problem.
An important help in problem determination is the Error Log view on the STEM window. If any exceptions or serious errors occurred they will show up there. These entries can be exported to a file and attached to the bugzilla entry.
Most of the development of STEM was done on Windows with Eclipse 3.2 and lately 3.3. If you are using a different platform, be sure to mention this just in case this is a platform problem.
If you are submitting a suggestion for a new feature or enhancement, include some justification and enough details that it can be evaluated. Since others may want to contribute to a discussion of the issue, it would be a good idea to post a pointer to the bugzilla entry on the STEM newsgroup.
STEM Mailing List
Send stem-dev mailing list submissions to <firstname.lastname@example.org>
To subscribe or unsubscribe via the World Wide Web, visit
or, via email, send a message with subject or body 'help' to
You can reach the person managing the list at
When replying, please edit your Subject line so it is more specific than "Re: Contents of stem-dev digest..."
The developer's mailing list should be used to communicate with other STEM developers. It is preferred over private emails.
The Newsgroup for STEM is news://news.eclipse.org/eclipse.technology
To access the newsgroup you will need to go to the following page and request a userid and password.
A special newsreader userid and password will be emailed to you. This userid and password are used to subscribe to the Newsgroup using whatever news reader you choose.
The newsgroup is for all of the components of Eclipse, not just STEM. We suggest that when you post to the newsgroup, you include STEM in the title to ensure that it is read by the STEM community.
More information about access to the newsgroup is here: http://www.eclipse.org/newsgroup.php
For those who use FireFox and Thunderbird as their web browser and email client, the following may help in setting up Thunderbird to read the STEM newsgroup.
- From the menubar, select Tools->Account Settings
- Select Add Account
- Select Newsgroup Account then Next
- Enter your name and email address then Next .
- Enter news.eclipse.org for Newsgroup Server then Next
- Enter the userid and password that were sent to you when you signed up.
- Enter the account name. The default of News.Eclipse.org is OK.
- You should now have "News.Eclipse.Org" listed in your "Folders"
- Select the News.Eclipse.org folder and then select Manage NewsGroup Subscriptions
- Expand the list of newsgroups until you find Eclipse.technology and select it.
- You should now have the list of previous postings to the newsgroup and it will be updated every time you get your mail from the server.
STEM IRC Chat
The IRC Chat channel for OHF STEM developers is on the freenode server with the other eclipse chat channels (see http://wiki.eclipse.org/index.php/IRC )
The URL is: irc://irc.freenode.net/#eclipse-stem
You can access it either through a browser with a chat extension (like "ChatZilla" for Firefox) or through the IRC client available in the Eclipse Communications Framework (ECF). This page will probably be useful: http://freenode.net/faq.shtml#nicksetup
Using ECF to access the chat
You'll find the ECF IRC client in the "Communications Perspective". There is a drop down menu with an icon of a human with a voice bubble to the left. Select IRC from that menu. When connecting to the irc server use "email@example.com" where "yourname" will be your login name otherwise known as your "nick" in IRC speak.
Follow the directions on this page http://freenode.net/faq.shtml#nicksetup
and then "join" the channel with "/join #eclipse-stem"
Using Chatzilla to access the chat
Chatzilla is a plug-in for the Firefox browser. To install it go to https://addons.mozilla.org/en-US/firefox/addon/16 and click on the Install Now button. After restarting Firefox, click on Tools->Chatzilla to start Chatzilla.
- To connect to the eclipse-stem channel, click on:
- To join the channel automatically on startup, right-click on the channel tab and click Open This Channel at Startup
- To register your nickname, issue the following commands in the Message input area:
/msg nickserv register <PASSWORD>
Under the Chatzilla menubar entry is a Preferences item that will allow you to customize Chatzilla to make IRC chatting easier.
- To set it up so that you’re automatically identified when you join the network, use these instructions:
- ChatZilla → Preferences. On the side menu, you’ll see a list of networks to which you’ve connected. Select irc.freenode.net.
- Click the Lists tab. You’ll see an area where you can set commands to auto-perform.
- Hit Add.
- Enter the following. Do not include the leading slash, and replace <PASSWORD> with your actual password:
msg nickserv identify <PASSWORD>
More hints on using Chatzilla are available at http://chatzilla.hacksrus.com/faq/
Submitting code to the STEM project
If you are not a project member with full committer access to SVN, the best way to contribute source code changes is to follow these steps:
- Obtain the source code with anonymous access to SVN.
- Make your changes using Eclipse.
- Test the changes using the most recent version of STEM.
- Prepare a patch as described below.
- If there is a Bugzilla item that describes the need for the change that you are making, then post a description of the patch to the Bugzilla entry and attach the patch.
- If there is no Bugzilla entry, please create one and attach your patch to it. It is much preferred that all patches be related to an appropriate bugzilla entry.
- If for some reason a bugzilla entry is not appropriate, email your patch to a member of the STEM team who has committer authority, along with a note describing the patch.
Preparing a patch (Eclipse users)
Eclipse can create and apply patches in the unified diff format. Assuming you are working out of an Eclipse SVN project created from the STEM SVN repository, you can create a (possibly multi-file) patch by selecting the desired resource scope in the package explorer and choosing "Team/Create Patch...". Because the patch must be applied to the same Eclipse resource it was generated from, it is probably safest to select the project itself.
Notification of code changes
You can be notified when anyone commits code to the STEM project. This can be useful if you are actively working on STEM coding and want to know when changes are made that might affect you. And if you are not a committer, you might want to know when a committer has committed your code. The instructions for subscribing to the cvs-commit mailing list are at the following URL:
Creating a new standalone STEM application
At some point you may want to create a new version of the standalone STEM application. Or, more likely, you may need to verify that changes you have made will work in the standalone version. To create a new temporary version of the STEM application for testing, do the following.
- Select the org.eclipse.stem.ui project
- Select feature.product
- From the Overview tab
- select Export.
It should request the name of the zip file to be generated and build it. It will run for many minutes. The resulting zip file can be expanded in a new directory like "c:\teststem" and invoked from a command window.
Software Engineering Documentation To Do's
- Describe the basic development philosophy of the project and the basic ideas behind “agile” software development.
- Explain how to test the system.
- Explain how to submit a bug report.
- Explain how to use JUnit.
Building The System
STEM developers can use the headless build mechanism (build plug-ins automatically outside the Eclipse IDE) for building distributions for various OS platforms.
The instructions below are for running the build on a Linux machine.
Before using the headless build, make sure you have the following software prerequisites:
- Eclipse Platform 3.5. Make sure that the following Eclipse features are also installed:
- BIRT (and all its prerequisite plugins)
- (Alternatively, you can download the BIRT Report Designer All-in-one v2.5.0 or higher)
- Eclipse DeltaPack that matches the version of the above Eclipse
- JDK 5.0 or higher
- SVN for Ant (from here)
To build STEM using this mechanism you will need to follow these steps:
- Check-out from STEM SVN the plugin org.eclipse.stem.releng
- Copy the local.sh-template to local.sh, edit it and change the values within it to fit your local platform:
- MAJOR_VERSION - The version number of the STEM product you are building (e.g., 0.3.0)
- JAVA_HOME - Path to the JDK to be used
- ECLIPSE_HOME - The Eclipse SDK to be used for building
- Put the lib directory from the SVN for Ant Zip file under a directory named '.ant' in your home dir
- From a command line, change the directory into the above plugin directory and execute the build.sh script
The build script first fetches the latest plug-in code from SVN and then builds those plug-ins. The last step packs the plug-ins into an executable product package.
If the build finishes successfully, you should see a "BUILD SUCCESSFUL" message at the end of logging messages.
The output of the build process is a set of Zip files, one for each different OS platform. Those are located under folder: build\I.WeeklyBuild.
STEM consists of a basic core system and (in the future) many projects that either use the basic system or enhance it. The following sections describe the subprojects that use STEM or enhance it.
- Interface to the GoogleEarth application.
- Interface to the Eclipse BIRT product.
- Simple support for logging STEM data.
STEM - GoogleEarth Interface
STEM is a computer software system for defining and visualizing simulations of the geographical spread of contagious diseases. One way to visualize this geographical information is to overlay it on the GoogleEarth 3D world model.
The STEM-GoogleEarth project (STEM-GE) enables a simulation to log its spatial data, with disease statistics included, as it runs. Either simultaneously or after the fact, the logged data (in the form of KML files) is read by GoogleEarth and displayed by mapping disease state to color intensity.
To run the STEM-GoogleEarth Interface you need to do the following:
You must have installed the GoogleEarth application which is available for personal use from the GoogleEarth download site. You should verify that GoogleEarth works correctly on your machine by starting it and verifying that you can browse the 3D image because some older computers do not have the 3D graphics capabilities required by GoogleEarth. It is a fun application to play with and when you are done you can leave it running or not. If it is not active, STEM will start it.
Running STEM and GoogleEarth
- If you are running from the STEM source distribution:
- Update STEM to the latest code level and ensure it is refreshed and rebuilt.
- From project org.eclipse.stem.ui.ge select servlet.xml
- Select RunAs->AntBuild
- From Eclipse, run STEM using stem.product in org.eclipse.stem.ui (described earlier).
- If you are running the STEM standalone executable, the above steps were already done for you and you just need to start STEM.exe.
- In the STEM workspace window
- A window will be displayed that contains a Display button and a set of radio buttons that select the disease aspect to be displayed.
- At this point the GoogleEarth application should start if it was not already started.
- Optionally, select Windows->Preferences->STEM->visualization->GoogleEarth
- Specify any preferences that you want to use. The defaults are probably OK.
- Using the Scenarios view: select your desired scenario.
- For example: STEM->Geography->Continent->NorthAmerica
- Double-click on Spanish Flu to start the simulation.
- After the simulation has run for 7 or 8 cycles
- Pause the simulation
- Click the display button in the GoogleEarth View.
- You should see the GoogleEarth map showing some red areas in the northeastern USA.
- Start the simulation again
- Pause and Click the GoogleEarth Display button every few cycles. These red areas should grow as the infections spread both across county borders and across the airline connections.
- You can use the GoogleEarth view to change the aspect displayed to any of:
- Infections - Red
- Exposed - Yellow
- Recovered - Green
- Susceptible - Blue
As distributed, the GoogleEarth interface runs in manual mode. Because of memory usage of both STEM and GoogleEarth they do not run well together unless your workstation has more than 1 Gigabyte of memory. If you do have lots of memory, you can use the GoogleEarth preferences to change so that as the simulation runs the results are shown simultaneously on the GoogleEarth display.
There are numerous ways that you can use GoogleEarth with your simulations. We will not describe them all. The STEM-GE Preference Page is described below. That is where you specify many options that control what the STEM-GE interface does. The most common use of STEM-GE is to run a simulation and have it write KML files to a folder. Simultaneously GoogleEarth is reading these KML files via a webServer and displaying the results of the simulation on the GoogleEarth screen.
The STEM-GoogleEarth interface has a set of preferences that control how the application works. This preference page is accessed by going to windows->preferences->visualization and selecting the GoogleEarth entry.
You will get the following window: File:GEPreferences.jpg
- Choose the Method used to display STEM Results
With this option, the KML files are written but not displayed by GoogleEarth. This would be used when you are either going to display the GoogleEarth visualization at a later time or if you are going to run GoogleEarth from another system with the KML files on a shared disk.
With this option, the KML files are written and then displayed by GoogleEarth. GoogleEarth actually requests the file from a webserver Servlet which reads the file and sends it to GoogleEarth.
The KML is written on every cycle, overwriting the file written for the previous cycle. GoogleEarth asks for new data at a predetermined interval and is sent the current KML. The advantage over the previous method is that it helps keep GoogleEarth from falling too far behind the STEM processing.
With this option, the KML files are directly sent to GoogleEarth without using an intermediate web server. This can cause problems because GoogleEarth may get files faster than it can process them but it is more efficient.
The map is generated when the user clicks the Display button. This is the default option.
- "Folder for KML logging:"
This is the folder where STEM will write the KML files that GoogleEarth will read. If it already contains KML files, the user will be given the opportunity to delete them, keep them or choose a new folder.
- "User internal webserver"
Select this to use the webserver built into Eclipse.
- "Hostname:port for external webserver"
This is the required hostname and port for an external webserver. Normally the internal webserver would be used so this is not needed but there are cases where one might want to use an existing web server.
- "Automatically startup GoogleEarth"
If specified, the GoogleEarth application is launched when the STEM-GoogleEarth view is started.
- "Automatically process every simulation"
If specified, any simulation you start will automatically have its processing mapped to GoogleEarth. Only the first one will be displayed by GE, since it would be counterproductive to show two different views at the same time.
- "Write KML files only every N th cycle"
If the simulation does not change rapidly from cycle to cycle, significant overhead can be saved by only sending data to GoogleEarth every Nth cycle.
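The "every Nth cycle" option amounts to a simple modulo check. The sketch below is a minimal illustration of that idea; the class KmlWriteSchedule is hypothetical, and it assumes cycles are numbered from 1, which the text above does not state.

```java
/** Hypothetical illustration of "write only every Nth cycle"; not STEM code. */
class KmlWriteSchedule {

    private final int everyNthCycle;

    KmlWriteSchedule(int everyNthCycle) {
        if (everyNthCycle < 1) {
            throw new IllegalArgumentException("N must be at least 1");
        }
        this.everyNthCycle = everyNthCycle;
    }

    /** Returns true when the given cycle (assumed to start at 1) should be written. */
    boolean shouldWrite(int cycle) {
        return cycle % everyNthCycle == 0;
    }

    public static void main(String[] args) {
        KmlWriteSchedule schedule = new KmlWriteSchedule(3);
        for (int cycle = 1; cycle <= 7; cycle++) {
            System.out.println("cycle " + cycle + ": write = " + schedule.shouldWrite(cycle));
        }
    }
}
```

With N set to 1 every cycle is written, which corresponds to leaving the option at its least restrictive value.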
STEM Reports SubProject
The Reports subproject provides an interface to the Eclipse BIRT component. BIRT is an open source Eclipse-based reporting system that integrates with STEM to produce custom reports.
Instructions to be provided later.
In Process documentation
This section is in the very early stages of documentation. Please help complete it.
An important component of STEM is the large collection of data that it contains. This section will attempt to describe the different data that is available, where it is and how it is used.
Administration Level
The administration level for geographic data refers to the "political resolution" of the data sets. The United Nations defines political divisions called "Administration Levels".
The highest level is "UN Administration Level 0" which corresponds to "countries". We have 244 such "countries"; we get our definition of a country from the ISO-3166-1 codes, of which there are 244. There is an ISO-3166-1 code for the United States (a country), and one for Puerto Rico (not a country, but a slightly separate political division). For the United States the ISO-3166-1 2 character code is US and the ISO-3166-1 3 character code is USA.
The next level down is "UN Administration Level 1", which for the United States are the states and for Canada are the provinces and territories. Below that is "UN Administration Level 2". The definition of these areas varies from country to country, which is why we use the "level" instead. Level 2 is the limit of our current data sets. In the future we could add higher resolutions. For instance, for a small country like Israel we might go to level 3 and use something like a census tract as the location.
Latitude: Latitude gives the location of a place on Earth north or south of the Equator. Latitude is an angular measurement in degrees (marked with °) ranging from 0° at the Equator to 90° at the poles (90° N for the North Pole or 90° S for the South Pole). In STEM, latitudes north of the Equator are positive values and latitudes south of the Equator are negative.
Longitude: Longitude describes the location of a place on Earth east or west of a north-south line called the Prime Meridian. Longitude is given as an angular measurement ranging from 0° at the Prime Meridian to +180° eastward and −180° westward. The Greenwich meridian is the universal prime meridian or zero point of longitude.
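The sign conventions described above (north and east positive, south and west negative) can be illustrated with a short Java sketch. The class SignedCoordinates and its methods are hypothetical helpers written for this article and are not taken from the STEM code base.

```java
/** Hypothetical helper illustrating the sign convention; not STEM code. */
class SignedCoordinates {

    /** Converts an unsigned angle plus hemisphere letter (N, S, E, W) to a signed value. */
    static double toSigned(double degrees, char hemisphere) {
        switch (Character.toUpperCase(hemisphere)) {
            case 'N':
            case 'E':
                return degrees;   // north and east are positive
            case 'S':
            case 'W':
                return -degrees;  // south and west are negative
            default:
                throw new IllegalArgumentException("Unknown hemisphere: " + hemisphere);
        }
    }

    /** Latitude ranges from -90 (South Pole) to +90 (North Pole). */
    static boolean isValidLatitude(double latitude) {
        return latitude >= -90.0 && latitude <= 90.0;
    }

    /** Longitude ranges from -180 (westward) to +180 (eastward). */
    static boolean isValidLongitude(double longitude) {
        return longitude >= -180.0 && longitude <= 180.0;
    }

    public static void main(String[] args) {
        System.out.println(toSigned(33.9, 'S'));
        System.out.println(toSigned(74.0, 'W'));
    }
}
```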
ISO3166 code
ISO 3166 is a three-part geographic coding standard for coding the names of countries and dependent areas, and the principal subdivisions thereof. The official name is Codes for the representation of names of countries and their subdivisions.
- ISO 3166-1 codes for country and dependent area names.
- ISO 3166-1 alpha-2 two-letter country codes
- ISO 3166-1 alpha-3 three-letter country codes
- ISO 3166-1 numeric three-digit country codes
- ISO 3166-2 Codes for the representation of names of countries and their subdivisions -- Part 2: Country subdivision code - defines codes for the principal subdivisions of a country or dependent area.
A list of all the country (Administration Level 0) codes is here: http://www.davros.org/misc/iso3166.html
An Introduction to STEM's Properties Files
In STEM II, we store data about a country and its administrative divisions in properties files. Properties files are used to define identifiers and to set population and area data at all levels (i.e. country, state, or county level). There are four main types of properties files: area, population, nodes, and names. The purpose of each property file will be explained later.
In STEM II, most of the data is standardized ISO 3166 or FIPS (Federal Information Processing Standards) data. According to Wikipedia, "ISO 3166 is a three-part geographic coding standard for coding the names of countries and dependent areas, and the principal subdivisions thereof." ISO 3166 specifies standard alpha-2 codes, alpha-3 codes, and three-digit country codes as well. For example, for the USA, the alpha-2 code is US, the alpha-3 code is USA, and the three-digit country code is 840. In addition, we use FIPS codes whenever possible to create identifiers that represent a political administration. For instance, the FIPS code for New York County (i.e., Manhattan) is 36061 while the code for Kings County (i.e., Brooklyn) is 36047. The first two digits of the FIPS code identify the state.
Administrative Levels and Properties Files
Administrative levels correspond to political divisions of a country. A level 0 administration identifies an entire country (e.g. USA or Mexico). A level 1 administration corresponds to a subdivision of a country which can be a state, territory, parish, or a province. As an example we have that California, Colorado, and New York are all level 1 administrations of the USA. Similarly, level 2 administrations are subdivisions of a level 1 administration. Orange County, Monterey County, and Napa County are all level 2 administrations that belong to California.
A property file is a plain-text file that contains either area data, population data, or other relevant data related to the political divisions of a country at different administration levels. Properties files are located under org.eclipse.ohf.stem.internal.data\resources\data\country. There are four types of properties files :
- Names property file : This file defines the identifiers for every administrative division in a country at each level. For example, for the USA, at level 0 we would have "USA = United States". At level 1, we would have "US-CA = California" , "US-CO = Colorado", and "US-NY = New York".
At level 2, for Orange, Monterey, and Napa counties within California we have "US-CA-06059 = Orange County", "US-CA-06053 = Monterey County", and "US-CA-06055 = Napa County". There is a single names property file for every country. The corresponding names property file for the USA is USA_names.properties. For level 2 administrations, the five digits found in the identifiers (e.g. "06053" for identifier "US-CA-06053") are defined as follows: the leftmost two digits identify the level 1 administration ("06" -> California) while the remaining digits, which can be up to four, identify the level 2 administration ("053" -> Monterey County).
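As a rough illustration of the identifier layout just described, the following Java sketch derives the administration level from the number of hyphen-separated parts and splits the numeric part of a level 2 identifier. The class AdminIdentifier is hypothetical, and inferring the level from the hyphen count is an assumption based only on the examples in this section.

```java
/** Hypothetical parser for identifiers such as "US-CA-06053"; not STEM code. */
class AdminIdentifier {

    /** Assumes "USA" is level 0, "US-CA" is level 1 and "US-CA-06053" is level 2. */
    static int level(String identifier) {
        return identifier.split("-").length - 1;
    }

    /** Returns the numeric part of a level 2 identifier, e.g. "06053". */
    private static String numericPart(String identifier) {
        String[] parts = identifier.split("-");
        if (parts.length != 3 || parts[2].length() < 3) {
            throw new IllegalArgumentException("Not a level 2 identifier: " + identifier);
        }
        return parts[2];
    }

    /** Leftmost two digits: the level 1 administration ("06" for California). */
    static String level1Digits(String identifier) {
        return numericPart(identifier).substring(0, 2);
    }

    /** Remaining digits: the level 2 administration ("053" for Monterey County). */
    static String level2Digits(String identifier) {
        return numericPart(identifier).substring(2);
    }

    public static void main(String[] args) {
        String id = "US-CA-06053";
        System.out.println(level(id) + " " + level1Digits(id) + " " + level2Digits(id));
    }
}
```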
- Nodes property file : This file provides additional information about identifiers of administrative divisions. There is a nodes property file for each administration level. There is a fixed format that is used in these files and it is as follows:
# Format: Code = Name, ISO-3166-2 numeric code, Two letter code
For example, at level 0 for the USA we have "USA = United States, 840, US". At level 1, for California, Colorado, and New York we have: "US-CA = California, 06, CA", "US-CO = Colorado, 08, CO", and "US-NY = New York, 36, NY".
The nodes property files corresponding to the USA are USA_0_node.properties, USA_1_node.properties, and USA_2_node.properties.
- Area property file : This file contains area data (in square kilometers) for administrative divisions. There is an area property file for each administration level.
In the case of the USA at level 0, we have "USA = 9161923". At level 1, we have "US-CA = 163695.57", "US-CO = 104093.57", and "US-NY = 54556.00" for California, Colorado and New York respectively. At level 2, for Orange, Monterey, and Napa counties within California we have "US-CA-06059 = 2043.5006", "US-CA-06053 = 8603.9405", and "US-CA-06055 = 1952.8510". The area property files corresponding to the USA are USA_0_area.properties, USA_1_area.properties, and USA_2_area.properties.
- Population property file : This file contains population data for administrative divisions. There is a population property file for each administration level. In the case of the USA at level 0, we have "USA = 298444215".
At level 1, we have "US-CA = 33871648", "US-CO = 4301261", and "US-NY = 18976457" for California, Colorado and New York respectively. At level 2, for Orange, Monterey, and Napa counties within California we have "US-CA-06059 = 2846289", "US-CA-06053 = 401762", and "US-CA-06055 = 124279". The population property files corresponding to the USA are USA_0_human.properties, USA_1_human.properties, and USA_2_human.properties.
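The entries quoted above use the plain "key = value" layout that java.util.Properties reads directly. The following is a minimal sketch, not STEM code, of how an area entry and a population entry for the same identifier could be combined into a population density. The class name DensitySketch and its methods are hypothetical, and the text is loaded from strings rather than from the real files.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Sketch (hypothetical helper): combine an area entry and a population entry. */
final class DensitySketch {

    /** Parses "key = value" lines in the standard java.util.Properties format. */
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns inhabitants per square kilometer for the given identifier. */
    static double density(Properties area, Properties population, String id) {
        String squareKm = area.getProperty(id);
        String people = population.getProperty(id);
        if (squareKm == null || people == null) {
            throw new IllegalArgumentException("No entry for " + id);
        }
        return Double.parseDouble(people.trim()) / Double.parseDouble(squareKm.trim());
    }

    public static void main(String[] args) {
        // Values copied from the level 2 examples in the text above.
        Properties area = load("US-CA-06059 = 2043.5006\nUS-CA-06053 = 8603.9405\n");
        Properties population = load("US-CA-06059 = 2846289\nUS-CA-06053 = 401762\n");
        System.out.printf("Orange County: %.1f people per square km%n",
                density(area, population, "US-CA-06059"));
    }
}
```

Because the area and population files share identifiers, the same key can be used to look up both values.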
Updating values in a property file
To update a value in a property file, follow a few simple steps. First, given the name of the country or administrative division to be updated, look under org.eclipse.stem.internal.data\resources\data\country\<three letter identifier for the country> for the folder that belongs to that country. In the case of the USA, this is org.eclipse.stem.internal.data\resources\data\country\USA, where all of the country's property files are located. Second, search the names property file for the identifier of the administration to be updated. Third, open the file (area, population, or node file) in which the update will take place. Finally, search for the identifier and, once found, update its value.
As an example, suppose we want to increase the area of Orange County, California, USA by 100 square kilometers. We open org.eclipse.stem.internal.data\resources\data\country\USA\USA_names.properties and search for "Orange County", whose identifier is "US-CA-06059". Since the names property file shows that Orange County is a level 2 administration, we then open "org.eclipse.stem.internal.data\resources\data\country\USA\USA_2_area.properties" and search for the identifier "US-CA-06059". The search finds "US-CA-06059 = 2043.5006". Finally, we add 100 square kilometers to the current value, giving "US-CA-06059 = 2143.5006".
Once we've updated the correct file(s), we need to run org.eclipse.stem.internal.data\resources\src\Main.java. By running Main.java we create a new serialized file that will take the changes into account. The new serialized file can be found under org.eclipse.stem.internal.data\temp\data\scenario\disease. Next time STEM II executes a scenario, it will load the new serialized file.
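The manual edit described above can also be scripted. The following is a rough sketch, assuming the file has already been read into a list of lines (for example with java.nio.file.Files.readAllLines). It changes only the line for the requested identifier and copies every other line, including comments, unchanged. The class PropertyUpdateSketch and the comment line in the sample data are hypothetical; this is not part of the STEM code base.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch (hypothetical helper): change one "key = value" entry, keep all other lines. */
final class PropertyUpdateSketch {

    /** Returns a copy of lines in which the number stored under key is increased by delta. */
    static List<String> addToValue(List<String> lines, String key, String delta) {
        List<String> result = new ArrayList<>();
        boolean found = false;
        for (String line : lines) {
            int eq = line.indexOf('=');
            boolean isEntry = !line.startsWith("#") && eq > 0
                    && line.substring(0, eq).trim().equals(key);
            if (isEntry) {
                // BigDecimal keeps the decimal digits exactly as written in the file.
                BigDecimal updated = new BigDecimal(line.substring(eq + 1).trim())
                        .add(new BigDecimal(delta));
                result.add(key + " = " + updated.toPlainString());
                found = true;
            } else {
                result.add(line);
            }
        }
        if (!found) {
            throw new IllegalArgumentException("Identifier not found: " + key);
        }
        return result;
    }

    public static void main(String[] args) {
        // The comment line is a hypothetical header; the entries come from the text above.
        List<String> lines = Arrays.asList(
                "# hypothetical header comment",
                "US-CA-06059 = 2043.5006",
                "US-CA-06053 = 8603.9405");
        addToValue(lines, "US-CA-06059", "100").forEach(System.out::println);
    }
}
```

After writing the updated lines back to the file, Main.java still has to be run as described above so that the serialized file picks up the change.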
An Introduction to STEM Map Files
Maps for STEM are XML files that follow a GML format. These files are located under org.eclipse.stem.geography\resources\data\geo\country\XXX, where XXX is the three letter identifier for the country. There is also a second set of files at org.eclipse.stem.geography\resources\data\geo\country_reduced. Which set is used is determined by a preference at Windows->Preferences->STEM->Visualization->MapDataManagement.
There can be multiple map files for a country, one for each administration level. As an example, for the USA we have the following two maps: USA_1_MAP.xml and USA_2_MAP.xml. According to Wikipedia, GML is an XML grammar defined by the Open Geospatial Consortium (OGC) to express geographical features. As an example, the set of GML elements that define Monterey County is:
//Sample taken from org.eclipse.stem.geography\resources\data\geo\country\USA\USA_2_MAP.xml
...
<gml:Polygon gml:id="US-CA-06053">
  <gml:outerBoundaryIs>
    <gml:LinearRing>
      <gml:posList>
        36.9186 -121.7019 36.9197 -121.6999 ... 36.9186 -121.7019
      </gml:posList>
    </gml:LinearRing>
  </gml:outerBoundaryIs>
</gml:Polygon>
...
Updating a Map
To update a map, that is, to provide more accurate latitude and longitude data for a given location, we follow a few steps. First, we find the identifier for the location we are trying to update. This step is similar to the step described for updating values in a property file. Once we have found it, we replace the latitude and longitude values contained within the <gml:posList> </gml:posList> tags. We need to make sure that the starting and ending latitude and longitude values are the same, otherwise the data won't be accepted as a closed polygon. In the example above, note that for Monterey County ("US-CA-06053") the starting latitude and longitude values "36.9186 -121.7019" are the same as the ending ones "36.9186 -121.7019".
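Before pasting new coordinates into a map file, the closure requirement can be checked mechanically. The following is a small sketch, assuming the contents of a posList are available as a string of whitespace-separated "latitude longitude" values. The class RingCheckSketch is hypothetical and checks only that the first and last positions match; it does not validate the polygon in any other way.

```java
/** Sketch (hypothetical helper): check that a posList starts and ends at the same position. */
final class RingCheckSketch {

    /** Parses "lat lon lat lon ..." text into a flat array of coordinate values. */
    static double[] parse(String posList) {
        String[] tokens = posList.trim().split("\\s+");
        if (tokens.length % 2 != 0) {
            throw new IllegalArgumentException("Expected latitude/longitude pairs");
        }
        double[] values = new double[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            values[i] = Double.parseDouble(tokens[i]);
        }
        return values;
    }

    /** Returns true if the first and last latitude/longitude pairs are identical. */
    static boolean isClosed(String posList) {
        double[] v = parse(posList);
        int n = v.length;
        // Exact comparison is intended: both ends must be written with the same digits.
        return n >= 4 && v[0] == v[n - 2] && v[1] == v[n - 1];
    }

    public static void main(String[] args) {
        // Shortened version of the Monterey County sample; the elided points are omitted.
        String ring = "36.9186 -121.7019 36.9197 -121.6999 36.9186 -121.7019";
        System.out.println("closed: " + isClosed(ring));
    }
}
```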
The country_reduced files have had the border data smoothed to reduce the number of lat/long points used to describe a border. It can sometimes take thousands of lat/long points to describe a border accurately. By smoothing the border data so that the border is described by far fewer points, we reduce both CPU and memory usage.
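This document does not say which smoothing method was used to produce the country_reduced files. To illustrate the general idea of describing a border with fewer points, the sketch below uses the well-known Douglas-Peucker line simplification algorithm. This choice is an assumption made for illustration, not a description of the STEM tooling. The class BorderSmoothingSketch is hypothetical, and the sample coordinates in main are invented. It treats latitude and longitude as plane coordinates, which is only a rough approximation.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch (hypothetical): Douglas-Peucker simplification of a list of lat/long points. */
final class BorderSmoothingSketch {

    /** Returns the points that remain after simplification with the given tolerance (in degrees). */
    static List<double[]> smooth(double[][] pts, double tolerance) {
        List<double[]> kept = new ArrayList<>();
        if (pts.length == 0) {
            return kept;
        }
        kept.add(pts[0]);
        if (pts.length > 1) {
            simplify(pts, 0, pts.length - 1, tolerance, kept);
        }
        return kept;
    }

    /** Simplifies pts[first..last] and appends every kept point after pts[first]. */
    private static void simplify(double[][] pts, int first, int last,
                                 double tolerance, List<double[]> kept) {
        int farthest = -1;
        double maxDistance = 0.0;
        for (int i = first + 1; i < last; i++) {
            double d = distance(pts[i], pts[first], pts[last]);
            if (d > maxDistance) {
                maxDistance = d;
                farthest = i;
            }
        }
        if (farthest >= 0 && maxDistance > tolerance) {
            // The farthest point matters: keep it and simplify both halves.
            simplify(pts, first, farthest, tolerance, kept);
            simplify(pts, farthest, last, tolerance, kept);
        } else {
            // Every point in between is within tolerance: keep only the end point.
            kept.add(pts[last]);
        }
    }

    /** Distance from p to the segment a-b (to the point a if a and b coincide). */
    static double distance(double[] p, double[] a, double[] b) {
        double dx = b[0] - a[0];
        double dy = b[1] - a[1];
        double lengthSquared = dx * dx + dy * dy;
        if (lengthSquared == 0.0) {
            return Math.hypot(p[0] - a[0], p[1] - a[1]);
        }
        double t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / lengthSquared;
        t = Math.max(0.0, Math.min(1.0, t));
        return Math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy));
    }

    public static void main(String[] args) {
        // Invented sample ring: a unit square with one nearly collinear extra point.
        double[][] ring = {{0, 0}, {0.5, 0.0001}, {1, 0}, {1, 1}, {0, 1}, {0, 0}};
        System.out.println(ring.length + " points reduced to " + smooth(ring, 0.01).size());
    }
}
```

The first and last points are always kept, so a closed ring stays closed after simplification.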
Data Location and Usage
STEM Projects that contain data
The following STEM projects contain data files that are used in STEM. In most cases these projects also contain code that processes the data.
- Eclipse Project names
Contains the Latitude/Longitude data for the Administration areas.
Contains the properties files that describe all of the data in STEM.
No longer actively used and should be ignored.
Code and data used to prepare the properties files.
Property files in org.eclipse.stem.internal.data
The following property files are found in the directories listed below. XXX is the three-character country code and N is the administration level.
- resources/data/country/XXX/ There is a separate subdirectory for each country.
- XXX_N_area.properties This property file contains the area of the administration level in square kilometers. The key is the identifier of the administration (level 0 = USA; level 1 = US-CA; level 2 = US-CA-06075).
- XXX_N_human_2006_population.properties This property file contains the population of the administration level.
- XXX_N_node.properties This property file contains the full name of the administration area. >> we need an example of how to access it <<
- XXX_names.properties This property file contains a cross-reference of both level 1 and level 2 names.
- resources/data/relationship This contains the following subdirectories.
- commonborder See below.
- resources/data/decorator/disease This contains property files that describe existing diseases.
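As one possible way to read the full name of an administration from a node property file, the sketch below builds the file path from the country code and level and then splits an entry using the "Name, numeric code, two letter code" format described earlier. It is a plain Java illustration, not the access path STEM itself uses. The class NodeFileSketch is hypothetical, the path prefix assumes the "resources" directory name used in the example URIs, and names that themselves contain commas are not handled.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Sketch (hypothetical helper): locate a node property file and read a full name from it. */
final class NodeFileSketch {

    /** Assumed location of the country folders inside org.eclipse.stem.internal.data. */
    static final String COUNTRY_DIR = "resources/data/country";

    /** Builds the relative path of XXX_N_node.properties for a country code and level. */
    static String nodeFilePath(String countryCode, int level) {
        return COUNTRY_DIR + "/" + countryCode + "/"
                + countryCode + "_" + level + "_node.properties";
    }

    /** Splits "Name, numeric code, two letter code" into its three trimmed fields. */
    static String[] parseNodeValue(String value) {
        String[] fields = value.split(",");
        if (fields.length != 3) {
            throw new IllegalArgumentException("Unexpected node entry: " + value);
        }
        for (int i = 0; i < fields.length; i++) {
            fields[i] = fields[i].trim();
        }
        return fields;
    }

    /** Returns the full name stored for id in the given node property file contents. */
    static String fullName(String fileContents, String id) {
        Properties nodes = new Properties();
        try {
            nodes.load(new StringReader(fileContents));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String value = nodes.getProperty(id);
        if (value == null) {
            throw new IllegalArgumentException("Identifier not found: " + id);
        }
        return parseNodeValue(value)[0];
    }

    public static void main(String[] args) {
        // Entries copied from the level 1 node examples earlier in this document.
        String contents = "# Format: Code = Name, ISO-3166-2 numeric code, Two letter code\n"
                + "US-CA = California, 06, CA\n"
                + "US-NY = New York, 36, NY\n";
        System.out.println(nodeFilePath("USA", 1) + " -> " + fullName(contents, "US-NY"));
    }
}
```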
Spatial Data (Geographic Coordinates) that provides the latitude and longitude coordinates for the borders of the Administration areas.
Project: org.eclipse.stem.geography
Path: resources/data/geo/country/XXX/
XXX_0_MAP.xml
XXX_1_MAP.xml
XXX_2_MAP.xml
Example URI: platform:/plugin/org.eclipse.stem.geography/resources/data/geo/country/USA/USA_2_MAP.xml
Key Format: For level 1 data, the key is the ISO3166-2 code. An ISO3166-2 code is composed of the two letter country code followed by up to three alphanumeric characters for the level 1 administration. For level 2 data, the key is the ISO3166-2 code followed by up to six digits. The leftmost two digits indicate the level 1 container of a level 2 administration (e.g. California is the level 1 container for Orange County, a level 2 administration). The two digits were taken from a lexicographic sorting of all the level 1 administrations within a country. Similarly, the remaining digits (up to four) indicate the level 2 administration. Again, these digits are an index into the lexicographic sorting of all level 2 administrations within a level 1 administration.
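The key layout described above can be taken apart with a few string operations. The following sketch is illustrative only. The class AdminKeySketch and its method names are hypothetical, and it assumes keys are written with hyphens as in the examples in this document ("USA", "US-CA", "US-CA-06053").

```java
/** Sketch (hypothetical helper): decode the parts of a STEM administration key. */
final class AdminKeySketch {

    /** Returns 0 for "USA", 1 for "US-CA" and 2 for "US-CA-06053". */
    static int level(String key) {
        return key.split("-").length - 1;
    }

    /** Returns the level 1 key that contains the given level 2 key, e.g. "US-CA". */
    static String level1Key(String level2Key) {
        requireLevel2(level2Key);
        return level2Key.substring(0, level2Key.lastIndexOf('-'));
    }

    /** Returns the leftmost two digits, which identify the level 1 administration. */
    static String level1Digits(String level2Key) {
        return digits(level2Key).substring(0, 2);
    }

    /** Returns the remaining digits, which identify the level 2 administration. */
    static String level2Digits(String level2Key) {
        return digits(level2Key).substring(2);
    }

    private static String digits(String level2Key) {
        requireLevel2(level2Key);
        String digits = level2Key.substring(level2Key.lastIndexOf('-') + 1);
        if (!digits.matches("\\d{3,6}")) {
            throw new IllegalArgumentException("Expected three to six digits: " + level2Key);
        }
        return digits;
    }

    private static void requireLevel2(String key) {
        if (level(key) != 2) {
            throw new IllegalArgumentException("Not a level 2 key: " + key);
        }
    }

    public static void main(String[] args) {
        String key = "US-CA-06053"; // Monterey County, from the examples in this document
        System.out.println(level1Key(key) + " / " + level1Digits(key) + " / " + level2Digits(key));
    }
}
```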
Common Border Data that describes which Administration areas have common borders.
Project: org.eclipse.stem.internal.data
Path: resources/data/relationship/commonborder/
XXX_2_XXX_2.properties Common borders between admin areas within country XXX
XXX_2_YYY_2.properties Common borders between the admin areas of country XXX and the admin areas of country YYY, where XXX and YYY have a common border.
The property files list the bordering admin level 2 areas.
Example URI: platform:/plugin/org.eclipse.stem.internal.data/resources/data/relationship/commonborder/AFG_2_CHN_2.properties
Generated by: NeighborUtility.java
Located in: org.eclipse.stem.internal.data
The following is a list of software that is useful to have while working on the STEM project.
IBM's Java™ 5 JDK
Optional Eclipse Features and Plug-ins
|Subclipse||Subversion client||Subversion client|
|UMLet||UML class diagram drawing tool||http://www.umlet.com/|
|Metrics||Java™ code metrics computations||http://metrics.sourceforge.net|
|Merlin Generator||GEF generator from EMF||http://sourceforge.net/projects/merlingenerator|
Eclipse Modeling Framework Budinsky, F., et al., Addison-Wesley, 2003. ISBN 0131425420. This is THE book on the eclipse modeling framework (EMF). Note: a second edition is due out very soon and is available for pre-order.
Eclipse: Building Commercial-Quality Plug-Ins Clayberg, E., Rubel, D., Addison-Wesley, 2004. ISBN 0321228472. Good reference, but starts out a bit too simply.
Eclipse Distilled Carlson, D., Addison-Wesley, 2005. ISBN 0321288157. A concise guide with advanced explanations of how to exploit the features of eclipse. Very useful for people who are already familiar with eclipse because it adds an extra layer of sophistication to the reader's bag-of-tricks.
Eclipse Cookbook Holzner, S., O'Reilly, 2004. ISBN 0-596-00710-8. Good “how to” guide to using eclipse, less so for eclipse plug-in development itself.
Eclipse Holzner, S., O'Reilly, 2004. ISBN 0-596-00641-1. Good starter for new eclipse users/developers.
The Definitive Guide to SWT and JFace Harris, R., Warner, R., Apress, 2004. ISBN 1-59059-325-1.
Producing Open Source Software Fogel, K., O'Reilly, 2005. ISBN 0-596-00759-0. This is a great book on how to manage software development in general, not just open source. We're using it as a guide for this project.