PTP/designs/SCI

Revision as of 08:20, 20 August 2010 by G.watson.computer.org (Talk | contribs) (Installation)

Introduction

SCI (Scalable Communication Infrastructure) is a lightweight communication library that provides scalable message transmission functions for a client-server model, especially for a central server associated with a large number of clients. Internally, SCI uses a classical tree-based hierarchical structure to build message transmission paths between the server and the clients. Typically, the server can be considered the front end and the clients the back ends.

Installation

Get the source code from the Eclipse CVS repository or download it into your Eclipse workspace. To download the SCI source from the Eclipse CVS repository:

  • Change to the directory where the SCI source will be extracted
  • Set CVSROOT by issuing the command export CVSROOT=:pserver:anonymous@dev.eclipse.org/cvsroot/tools
  • Checkout the SCI source by issuing the command cvs checkout org.eclipse.ptp/tools/sci/org.eclipse.ptp.sci
  • The SCI source will be located in the org.eclipse.ptp/tools/sci/org.eclipse.ptp.sci subdirectory
  • Note that you can specify qualifiers to the cvs command to extract SCI source other than at the HEAD (latest) level.

To download the SCI source into your Eclipse workspace:

  • Start Eclipse
  • Open the Eclipse installation wizard by clicking the Help menu then clicking Install New Software.
  • To download the initial SCI source, select the latest Eclipse release download site, Helios in 2010, from the Work with: dropdown. To download SCI updates, select the "Eclipse Project Update site"
  • Open the General Purpose Tools node in the software list and check the checkbox next to "PTP Scalable Communication Infrastructure (SCI)"
  • Click Next and follow the remaining prompts in the installation wizard
  • The SCI source code will be installed in the plugins directory of your Eclipse installation. The installation process will create a subdirectory where the directory name includes a time stamp. For instance, org.eclipse.ptp.sci_1.0.0.201006142322.
  • Once you download the SCI source, you must transfer the entire contents of this directory to the system where you will build SCI using FTP or other file transfer mechanism.

1. Build and install SCI from source. In SCI’s root directory, run

./configure
make
make install

If you want to enable the OpenSSL security mechanism in SCI, pass the --enable-openssl option to configure:

./configure --enable-openssl

2. Launch scid. Assuming SCI is installed in the directory /opt/sci, scid is located in /opt/sci/sbin. You must have root privileges to start scid. Once you have root privileges, start scid by running /opt/sci/sbin/scid. You can also modify your system startup scripts to start scid at system startup.

Topology

Typically, a SCI session contains the following processes:

  • A front end process (FE).
  • One or more back end processes (BEs).

In stand-alone agent mode, which is explained below, there are also:

  • Zero or more agent processes (scia).

The processes form a tree-based structure. The front end is the tree root and the back ends are the leaves. Communication takes place between the front end and the back ends. Messages are forwarded by agents or embedded agents, and can also be filtered by plug-ins running in the front end or in the agents/embedded agents as they pass upstream or downstream.

SCI supports both stand-alone agent mode and embedded agent mode, selected with the environment variable SCI_EMBED_AGENT=[yes|no]. The interfaces for the two modes are almost identical and the choice is transparent to users, with the exception of a connection call-back function that can be used only in embedded mode. In stand-alone mode, separate scia processes forward messages; in embedded mode, the agents are embedded into the user’s back end and are called EAs here. The tree-based hierarchical structure is the same for both modes. The front end is actually the root agent.
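As an illustration of the SCI_EMBED_AGENT setting, the sketch below shows how a program could check which mode has been requested. The function names are hypothetical, and treating an unset variable or any value other than "yes" as stand-alone mode is an assumption; the source does not state the default.

```c
#include <stdlib.h>
#include <string.h>

/* Returns 1 if the given setting selects embedded agent mode, 0 otherwise.
 * Assumption: NULL (unset) or any value other than "yes" means stand-alone. */
int embed_agent_enabled(const char *value)
{
    return value != NULL && strcmp(value, "yes") == 0;
}

/* Reads SCI_EMBED_AGENT from the environment of the current process. */
int use_embedded_agent(void)
{
    return embed_agent_enabled(getenv("SCI_EMBED_AGENT"));
}
```

A launcher script would typically export SCI_EMBED_AGENT=yes before starting the front end, and a helper like this could be used to log which mode is in effect.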

Stand-alone agent mode
Embedded agent mode

Note: a big block stands for a real entire back end which contains the user application code (BE) and the embedded agents (EA). Embedded agents are linked with the back end and run on separate threads from the application code. Not all the back ends have embedded agents. A back end will receive configuration information during SCI_Initialize() which indicates if it should create embedded agents.

Transmitting Messages

The SCI_Bcast and SCI_Upload functions are the most important functions for transmitting messages. SCI_Bcast sends messages from the front end to any of the back ends, while SCI_Upload sends messages from a back end to the front end.

SCI provides a one-sided communication model for message transmission, that is, there is a message handler which is a call-back function registered when SCI_Initialize() is called. The call-back function is called automatically when a message is received. SCI has both interrupt mode and polling mode. The message handler is triggered by an incoming message in either mode. For interrupt mode, this handler is called in a separate SCI thread while for polling mode, it is called within the SCI_Poll() function. Typically, it is the main thread which is blocking on SCI_Poll() when waiting for a message.
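The polling-mode behaviour described above can be modelled with a small self-contained sketch. This is not SCI code: every name starting with demo_ is hypothetical, and the handler signature is an assumption (the real SCI_msg_hndlr type is declared in sci.h and may differ). It only shows the idea that a handler registered once is invoked from inside the poll call, on the caller's thread.

```c
#include <stddef.h>

/* Hypothetical handler type; not the real SCI_msg_hndlr. */
typedef void (*demo_msg_hndlr)(void *param, const void *buf, int size);

/* Toy endpoint: a handler registered once plus a small inbox. */
typedef struct {
    demo_msg_hndlr hndlr;    /* registered at initialization time */
    void          *param;    /* passed back to the handler on each call */
    const char    *msgs[4];  /* pending messages, oldest first */
    int            sizes[4]; /* length of each pending message */
    int            next;     /* index of the next message to deliver */
    int            count;    /* number of messages in the inbox */
} demo_endpoint;

/* Polling-mode model: the handler runs inside the poll call.
 * Returns 1 if a message was dispatched, 0 if the inbox was empty. */
int demo_poll(demo_endpoint *ep)
{
    if (ep->next == ep->count)
        return 0;
    int i = ep->next++;
    ep->hndlr(ep->param, ep->msgs[i], ep->sizes[i]);
    return 1;
}

/* Example handler: accumulates the number of bytes received. */
void count_bytes(void *param, const void *buf, int size)
{
    (void)buf;
    *(int *)param += size;
}
```

In interrupt mode the same handler would instead be called from a separate SCI thread, so any state it touches (such as the counter here) would need to be protected accordingly.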

Filtering

SCI provides a plug-in mechanism for message filtering, which is done in the SCI agents (scia or EA, and also the FE). Only one filter can be used per agent for downstream messages (FE->BEs), while multiple filters can be cascaded per agent/embedded agent for upstream messages (BEs->FE).


Groups

A group may contain one or more back end ids. Users can create groups and communicate with groups of back ends through a set of group APIs. Any back end id is considered a predefined group id consisting of only that back end. SCI_GROUP_ALL is also a predefined group, which contains all the back ends.

Internally, each agent, embedded agent, or front end keeps two kinds of information for each group: the back end ids belonging to the group, and the direct successors of this agent, embedded agent, or front end.

Launch modes

SCI implements two launch modes, internal launch and external launch. In internal launch mode, the SCI back ends are forked directly by scid when the front end calls SCI_Initialize(). External launch mode is intended for cases where the back ends have to be launched by a third-party job launcher other than scid; for example, the back end is a debug engine that has to be launched by another job launcher such as POE/PMD. To use this mode, set the SCI_CLIENT_ID and SCI_JOB_KEY environment variables before the back end calls SCI_Initialize(). If the front end has the same job key and calls SCI_Initialize(), the back end can find its parent agent and connect back to the SCI tree hierarchy. Both the front end and the back ends should export SCI_USE_EXTLAUNCHER=yes. Embedded agent mode cannot be used with external launch mode.
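The environment setup for an externally launched back end might look like the following sketch. The helper name is hypothetical, and passing the client id and job key as decimal integers is an assumption; the source names the variables but does not specify their value format. The helper would be called before SCI_Initialize().

```c
#define _POSIX_C_SOURCE 200112L /* for setenv() */
#include <stdio.h>
#include <stdlib.h>

/* Exports the variables a back end needs in external launch mode.
 * Returns 0 on success, -1 if any variable could not be set.
 * Assumption: both ids are written as decimal integers. */
int prepare_external_backend(int client_id, int job_key)
{
    char buf[32];

    if (setenv("SCI_USE_EXTLAUNCHER", "yes", 1) != 0)
        return -1;

    snprintf(buf, sizeof buf, "%d", client_id);
    if (setenv("SCI_CLIENT_ID", buf, 1) != 0)
        return -1;

    snprintf(buf, sizeof buf, "%d", job_key);
    if (setenv("SCI_JOB_KEY", buf, 1) != 0)
        return -1;

    return 0;
}
```

In practice the third-party launcher would often export these variables itself, in which case no code in the back end is needed.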

Working model

The figures below are two examples for typical programming models. The handler is triggered once a message arrives, which is the concept of one-sided communication in this design.

Working model 1
Working model 2

APIs

All the function declarations and the related data structures can be found in sci.h and in the man pages. All the APIs except SCI_Initialize() and SCI_Terminate() are thread-safe.

SCI Environment initialize/terminate/query functions

int SCI_Initialize(sci_info_t *info);

This must be the first function called in a SCI job for both front end and back end.

typedef struct {
    sci_end_type_t          type;
    SCI_self_init_hndlr     *connect_hndlr;
    union {
        sci_fe_info_t           fe_info;
        sci_be_info_t           be_info;
    } _u;
} sci_info_t;
#define fe_info _u.fe_info
#define be_info _u.be_info
type
used to specify SCI_FRONT_END or SCI_BACK_END.
connect_hndlr
used when users want to define their own connection method. Only embedded agent mode can use it.
fe_info, be_info
fe_info is used in the front end, while be_info is used in the back end.
typedef struct {
    sci_mode_t           mode;
    SCI_msg_hndlr        *hndlr;
    void                 *param;
    SCI_err_hndlr        *err_hndlr;
    char                 *hostfile;
    char                 *bepath;
    char                 **beenvp;
    sci_filter_list_t    filter_list;
    char                 **host_list;
    char                 reserve[64];
} sci_fe_info_t;
mode
used to specify SCI_INTERRUPT or SCI_POLLING.
hndlr
a call back function which is called when messages arrive.
param
the message handler hndlr’s input parameter.
hostfile or host_list
used to specify the host names or IP addresses where the back ends will be launched. These two parameters are mutually exclusive. If the hostfile field is set in this structure, it can be overridden at runtime through the environment variable SCI_HOST_FILE. Using a hostfile is normally convenient, but if the host list is obtained at runtime, using host_list directly is preferable.

The host list entries can be either host names or IP addresses. If IP addresses are used, no name resolution happens internally. When working at a very large scale, users may prefer IP addresses to reduce processing time.

bepath
used to specify the path of the back end, which can be changed at runtime through the environment variable SCI_BACKEND_PATH.
beenvp
used to pass environment variables other than the SCI_ variables to the back ends. The last element in this array is required to be NULL. The format of each environment variable is “XXX=YYY”.
filter_list
used to specify a set of filters to be loaded during initialization. The filters will be loaded before the back ends are launched and before any messages are uploaded.
err_hndlr
intended for failover and recovery in the future. Simply set it to NULL to ignore it.
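The beenvp rules above (NULL-terminated array, entries of the form “XXX=YYY”) can be checked before the array is handed to SCI_Initialize(). The sketch below is illustrative only: check_beenvp is a hypothetical helper, the source does not say that SCI performs such validation itself, and rejecting names that start with SCI_ reflects one reading of "other than SCI_".

```c
#include <stddef.h>
#include <string.h>

/* Checks a NULL-terminated environment array of the kind passed in beenvp.
 * Returns the number of entries, or -1 if an entry is not NAME=VALUE
 * or if its name starts with the SCI_ prefix (assumed to be reserved). */
int check_beenvp(char *const envp[])
{
    int n = 0;

    for (; envp[n] != NULL; n++) {
        const char *eq = strchr(envp[n], '=');
        if (eq == NULL || eq == envp[n])
            return -1;                  /* missing '=' or empty name */
        if (strncmp(envp[n], "SCI_", 4) == 0)
            return -1;                  /* assumed reserved for SCI */
    }
    return n;
}
```

A front end could build an array such as { "DEBUG_LEVEL=2", NULL }, run it through this check, and then assign it to the beenvp field.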
typedef struct {
    sci_mode_t       mode;
    SCI_msg_hndlr    *hndlr;
    void             *param;
    SCI_err_hndlr    *err_hndlr;
    char             reserve[64];
} sci_be_info_t;
mode
used to specify SCI_INTERRUPT or SCI_POLLING.
hndlr
the message handler
param
the message handler’s input parameter.
err_hndlr
intended for failover and recovery in the future. This field should be set to NULL.

int SCI_Terminate();

This is the last function called in a SCI job. The entire SCI session terminates and all resources are freed when this function returns.

int SCI_Query(sci_query_t query, void *ret_val);

typedef enum {
    JOB_KEY,
    NUM_BACKENDS,
    BACKEND_ID,
    POLLING_FD,
    NUM_FILTERS,
    FILTER_IDLIST,
    AGENT_ID,
    NUM_SUCCESSORS,
    SUCCESSOR_IDLIST,
    HEALTH_STATUS,
    AGENT_LEVEL
} sci_query_t;

The individual enumerations for sci_query_t have the following meanings

  • JOB_KEY: get the job key.
  • NUM_BACKENDS: get the number of back ends under this agent or FE.
  • BACKEND_ID: get the back end id.
  • POLLING_FD: get the file descriptor which can be used by select or poll.
  • NUM_FILTERS: get the number of filters.
  • FILTER_IDLIST: get the filters’ ids. Query NUM_FILTERS first to get the number of filters, then allocate enough space to hold the filter id list. The output variable ret_val should be an int array.
  • AGENT_ID: get the agent id.
  • NUM_SUCCESSORS: get the number of successors.
  • SUCCESSOR_IDLIST: get the ids of the successors. Query NUM_SUCCESSORS first, then allocate enough space to hold the successor id list. The output variable ret_val should be an int array.
  • HEALTH_STATUS: get the working status of a SCI front end, agent, or back end (normal or exited).
  • AGENT_LEVEL: get the agent’s level. (FE is at level 0, the FE’s direct successors are level 1, and the successors’ successors are level 2 and so on).

Below is a table to summarize where each SCI query may be issued:

Front End Filter Back End
JOB_KEY X X X
NUM_BACKENDS X X
BACKEND_ID X
POLLING_FD X X
NUM_FILTERS X X X
FILTER_IDLIST X X X
AGENT_ID X X
NUM_SUCCESSORS X X
SUCCESSOR_IDLIST X X
HEALTH_STATUS X X X
AGENT_LEVEL X X
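The two-step pattern described for FILTER_IDLIST (query the count, allocate, then query the list) can be sketched as follows. To keep the example compilable on its own, demo_query is a hypothetical stand-in for SCI_Query() that pretends three filters are loaded, and demo_query_t repeats only two of the sci_query_t values. Treating a return value of 0 as success and the count as an int are assumptions.

```c
#include <stdlib.h>

/* Two of the sci_query_t values, repeated so the sketch stands alone. */
typedef enum { NUM_FILTERS, FILTER_IDLIST } demo_query_t;

/* Hypothetical stand-in for SCI_Query(); reports three loaded filters. */
static int demo_query(demo_query_t query, void *ret_val)
{
    static const int ids[] = { 1, 2, 7 };

    if (query == NUM_FILTERS) {
        *(int *)ret_val = 3;
    } else {
        for (int i = 0; i < 3; i++)
            ((int *)ret_val)[i] = ids[i];
    }
    return 0; /* assumed to mean success */
}

/* Queries the count, allocates, then queries the id list.
 * Returns a malloc'd array (the caller frees it) and stores its length
 * in *count; returns NULL on failure or when no filters are loaded. */
int *get_filter_ids(int *count)
{
    int n = 0;

    *count = 0;
    if (demo_query(NUM_FILTERS, &n) != 0 || n <= 0)
        return NULL;

    int *ids = malloc((size_t)n * sizeof *ids);
    if (ids == NULL)
        return NULL;

    if (demo_query(FILTER_IDLIST, ids) != 0) {
        free(ids);
        return NULL;
    }
    *count = n;
    return ids;
}
```

The same shape applies to NUM_SUCCESSORS followed by SUCCESSOR_IDLIST.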

SCI Communication functions

int SCI_Bcast(int filter_id, sci_group_t group, int num_bufs, void *bufs[], int sizes[]);

This function broadcasts messages from the FE to the BEs; only the FE can call it. SCI_Bcast() sends a single message that is composed of all the message fragments in bufs.

filter_id
the id of the filter which is going to filter a message when it is processed by an agent/embedded agent/front end.
group
the destination of the message; it can be a user defined group id or simply a back end id. SCI_GROUP_ALL means the destinations are all the back ends.
num_bufs
the number of entries in bufs[].
bufs[]
the array of message fragments.
sizes[]
the array of fragment lengths corresponding to bufs[].
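To make the bufs[]/sizes[] convention concrete, the sketch below gathers a set of fragments into one contiguous message. It illustrates the calling convention only and is not SCI's internal implementation; gather_fragments is a hypothetical helper, and concatenating the fragments in array order is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Copies num_bufs fragments into out, back to back, in array order.
 * Returns the total length, or -1 if a size is negative or out is
 * too small to hold all fragments. */
long gather_fragments(int num_bufs, void *bufs[], int sizes[],
                      char *out, size_t out_cap)
{
    size_t total = 0;

    for (int i = 0; i < num_bufs; i++) {
        /* total never exceeds out_cap, so the subtraction cannot wrap. */
        if (sizes[i] < 0 || (size_t)sizes[i] > out_cap - total)
            return -1;
        memcpy(out + total, bufs[i], (size_t)sizes[i]);
        total += (size_t)sizes[i];
    }
    return (long)total;
}
```

For example, a header in bufs[0] and a payload in bufs[1], with their lengths in sizes[0] and sizes[1], would be passed with num_bufs set to 2.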

int SCI_Upload(int filter_id, sci_group_t group, int num_bufs, void *bufs[], int sizes[]);

This function is used to upload messages from a back end to the front end.

filter_id
the id of the filter which is going to filter the message.
group
ignored
num_bufs, bufs[], sizes[]
have the same meaning as the parameters in SCI_Bcast.

int SCI_Poll(int timeout);

This function will block and wait until a message arrives or the timeout interval is reached.

timeout
the timeout in milliseconds. A value < 0 means no timeout, so the call blocks until a message arrives; a value >= 0 means the call waits at most that long.

The return value will be 0 for success and SCI_ERR_POLL_TIMEOUT when a timeout happens.
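A typical way to use these return values is a loop that keeps polling until a timeout is reported. The sketch below takes the poll function as a parameter so that it compiles without sci.h; DEMO_ERR_POLL_TIMEOUT is a hypothetical placeholder value (the real SCI_ERR_POLL_TIMEOUT is defined in sci.h), and fake_poll is a stand-in that succeeds three times and then times out.

```c
/* Hypothetical placeholder; the real constant comes from sci.h. */
#define DEMO_ERR_POLL_TIMEOUT (-1)

/* Signature matching int SCI_Poll(int timeout). */
typedef int (*poll_fn)(int timeout_ms);

/* Polls repeatedly until a timeout is reported.
 * Returns the number of successful polls, or -2 on any other error. */
int drain_messages(poll_fn poll_once, int timeout_ms)
{
    int handled = 0;

    for (;;) {
        int rc = poll_once(timeout_ms);
        if (rc == 0)
            handled++;                  /* a message was handled */
        else if (rc == DEMO_ERR_POLL_TIMEOUT)
            return handled;             /* nothing arrived in time */
        else
            return -2;                  /* unexpected error */
    }
}

/* Stand-in for SCI_Poll(): three messages, then timeouts. */
static int remaining = 3;

int fake_poll(int timeout_ms)
{
    (void)timeout_ms;
    return remaining-- > 0 ? 0 : DEMO_ERR_POLL_TIMEOUT;
}
```

With the real library, the loop would pass SCI_Poll and compare against SCI_ERR_POLL_TIMEOUT instead of the placeholder.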

SCI Group manipulation functions

int SCI_Group_create(int num_bes, int *be_list, sci_group_t *group);

This function is used to create a new group. It is a blocking call so the caller can assume group is ready to use upon the return of the function.

num_bes
the number of back ends in the be_list.
be_list
the back end id list to be contained in the new group.
group
an output parameter; it is the created new group and can be used by SCI_Bcast as a set of destinations.

int SCI_Group_free(sci_group_t group);

This function is used to free an existing group which was previously created by SCI_Group_create.

group
the group to be freed.

int SCI_Group_operate(sci_group_t group1, sci_group_t group2, sci_op_t op, sci_group_t *newgroup);

group1 & group2
the groups participating in the operation.
newgroup
the result group.
typedef enum {
    SCI_UNION,
    SCI_INTERSECTION,
    SCI_DIFFERENCE
} sci_op_t;
SCI_UNION
the newgroup is the union of group1 & group2.
SCI_INTERSECTION
the newgroup is the intersection of group1 & group2.
SCI_DIFFERENCE
the newgroup is the difference of group1 & group2.
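The three operations behave like ordinary set operations on back end ids. The sketch below models them on sorted id arrays; it is not how SCI stores groups, the demo_ names are hypothetical, and reading the difference as "members of group1 that are not in group2" is an assumption, since the source does not state the operand order.

```c
#include <stddef.h>

typedef enum { DEMO_UNION, DEMO_INTERSECTION, DEMO_DIFFERENCE } demo_op_t;

/* Combines two ascending, duplicate-free id lists a and b.
 * out must have room for na + nb ints. Returns the number of ids written. */
int combine_ids(const int *a, int na, const int *b, int nb,
                demo_op_t op, int *out)
{
    int i = 0, j = 0, n = 0;

    while (i < na || j < nb) {
        if (j >= nb || (i < na && a[i] < b[j])) {
            /* id is only in a: kept by union and difference */
            if (op != DEMO_INTERSECTION)
                out[n++] = a[i];
            i++;
        } else if (i >= na || b[j] < a[i]) {
            /* id is only in b: kept by union */
            if (op == DEMO_UNION)
                out[n++] = b[j];
            j++;
        } else {
            /* id is in both: kept by union and intersection */
            if (op != DEMO_DIFFERENCE)
                out[n++] = a[i];
            i++;
            j++;
        }
    }
    return n;
}
```

For group1 = {0, 1, 2, 3} and group2 = {2, 3, 4}, this model yields {0, 1, 2, 3, 4} for the union, {2, 3} for the intersection, and {0, 1} for the difference.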

int SCI_Group_operate_ext(sci_group_t group, int num_bes, int *be_list, sci_op_t op, sci_group_t *newgroup);

group
an existing group.
num_bes and be_list
the back end ids to be added to or removed from the group
newgroup
the result group.
op
has the same meaning as the one in SCI_Group_operate.

int SCI_Group_query(sci_group_t group, sci_group_query_t query, void *ret_val);

group
the group id to be queried.
ret_val
the output parameter that holds the result. The user is responsible for allocating sufficient space to hold the result. Typically GROUP_MEMBER_NUM is queried before GROUP_MEMBER, and GROUP_SUCCESSOR_NUM before GROUP_SUCCESSOR, in order to determine the size of the result.
typedef enum {
    GROUP_MEMBER_NUM,
    GROUP_MEMBER,
    GROUP_SUCCESSOR_NUM,
    GROUP_SUCCESSOR
} sci_group_query_t;
GROUP_MEMBER_NUM
get the number of members of this group. This number is the number of successors of the front end or agent where this function is called.
GROUP_MEMBER
get the member list of this group.
GROUP_SUCCESSOR_NUM
get the number of successors which have the group members. A successor means it is a direct child of the caller and some of the group members are under it. This is intended for a filter which wants to send messages through a specified path.
GROUP_SUCCESSOR
get the successor list which has the group members.

SCI Filter related functions

int SCI_Filter_load(sci_filter_info_t *filter_info);

This function is used to load a filter plug-in.

filter_info
contains the information for the filter plug-in to be loaded by dlopen in all the agents/embedded agents/front end. The back ends do not load the filters, but they have the filter list information.
typedef struct {
    int              filter_id;
    char             *so_file;
} sci_filter_info_t;
filter_id
the id of this filter.
so_file
location of this filter plug-in.

int SCI_Filter_unload(int filter_id);

This function is used to unload a filter whose id is filter_id.

int SCI_Filter_bcast(int filter_id, int num_successors, int *successor_list, int num_bufs, void *bufs[], int sizes[]);

This function broadcasts the messages downstream to the destinations specified by the successor_list; it must be called in the filter.

filter_id
stands for the filter that will handle this message in the next-hop agent/embedded agent; if set to SCI_FILTER_NULL, the original filter specified in the SCI_Bcast call in the front end is used.
num_bufs, bufs[], and sizes[]
have the same meaning as SCI_Bcast.

int SCI_Filter_upload(int filter_id, sci_group_t group, int num_bufs, void *bufs[], int sizes[]);

This function is used to transmit the messages to another filter or upper layer and it must be called in the filter.

filter_id
the destination filter in the same agent/embedded agent/front end; if set to SCI_FILTER_NULL, the message will be set back with the original filter id specified in SCI_Upload in the back end and transmitted to the parent agent/embedded agent/front end. The filters are cascaded with this function.
group
ignored.
num_bufs, bufs[], and sizes[]
have the same meaning as SCI_Upload.
