Jetty/Feature/Stress Testing CometD
Revision as of 13:27, 30 January 2012



Introduction

These instructions describe how to stress test CometD from Jetty7 running on Unix. The same basic steps apply to Windows or Mac; please feel free to add details and terminology specific to these platforms to this wiki.

The basic steps are:

  • Configuring/tuning the operating system of the test client and server machines
  • Installing, configuring and running CometD
  • Installing, configuring and running the Jetty server, including editing the Jetty configuration for CometD testing
  • Running the Jetty Bayeux test client
  • Interpreting the results

Configuring/tuning the operating system of the test client and server machines

The operating system must be able to support the number of connections (file descriptors) for the test on both the server machine and the required test client machines.

For a Linux system, change the file descriptor limit in the /etc/security/limits.conf file. Add the following two lines (or change any existing nofile lines):

* soft nofile 40000
* hard nofile 40000
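To confirm that the new limits have taken effect, open a fresh login session and query the shell's file descriptor limits (a quick check assuming a Bourne-compatible shell; limits.conf values apply per login session, not to shells that were already running):

```shell
# Print the current soft and hard open-file limits for this session;
# after editing limits.conf and logging in again, both should report 40000.
ulimit -Sn
ulimit -Hn
```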

You can tune many other values in the server stack; the Zeus ZXTM documentation provides a good overview.

Installing, configuring and running CometD

The CometD client and server are now in the CometD Project at The Dojo Foundation, including downloads and documentation.

Installing, configuring and running the Jetty server

Installing Jetty is straightforward. See Downloading Jetty and How to Run Jetty.

Editing the Jetty configuration for CometD testing

For the purposes of CometD testing, you need to edit the standard configuration of Jetty (etc/jetty.xml) to change the connector configuration as follows:

  • Increase the max idle time.
  • Increase the low resources connections.

The relevant section to update is:

 
 <Call name="addConnector">
   <Arg>
     <New class="org.eclipse.jetty.nio.SelectChannelConnector">
       <Set name="host"><SystemProperty name="jetty.host" /></Set>
       <Set name="port"><SystemProperty name="jetty.port" default="8080"/></Set>
       <Set name="maxIdleTime">300000</Set>
       <Set name="Acceptors">2</Set>
       <Set name="statsOn">false</Set>
       <Set name="confidentialPort">8443</Set>
       <Set name="lowResourcesConnections">25000</Set>
       <Set name="lowResourcesMaxIdleTime">5000</Set>
     </New>
   </Arg>
 </Call>

To run the server with the additional memory needed for the test, use:

 
 java -Xmx2048m -jar start.jar etc/jetty.xml

You should now be able to point a browser at the server at either:

  • http://localhost:8080/
  • http://yourServerIpAddress:8080/

Specifically try out the CometD chat room with your browser to confirm that it is working.
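The same check can be scripted; a minimal sketch using curl (the URL and timeout are assumptions — adjust to your host and port):

```shell
# Ask the server for its front page and report the HTTP status code.
# An empty or "000" code means the connection failed.
code=$(curl -s -o /dev/null -w "%{http_code}" --max-time 5 http://localhost:8080/ || true)
echo "HTTP status: ${code:-unreachable}"
```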

Running the Jetty Bayeux test client

The Jetty CometD Bayeux test client generates load simulating users in a chat room. To run the client:

 cd $JETTY_HOME/contrib/cometd/client
 bin/run.sh
Note:
Depending on the version you might need to create a lib/cometd directory and put the cometd-api, cometd-java-server and cometd-java-client in it.
  • lib/cometd/
  • lib/cometd/cometd-api-1.0.beta9.jar
  • lib/cometd/cometd-client-6.1.19.jar
  • lib/cometd/cometd-server-6.1.19.jar
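If your version needs the jars staged manually, the directory can be created as below (a sketch; the jar file names above are version-specific, so substitute the ones from your download):

```shell
# Create the directory the test client expects, then copy the cometd-api,
# cometd-client and cometd-server jars into it, e.g.:
#   cp cometd-api-1.0.beta9.jar lib/cometd/
mkdir -p lib/cometd
ls lib/cometd
```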


The client has a basic text UI that operates in two phases: (1) global configuration, and (2) test runs. An example global configuration phase looks like:

 # bin/run.sh
 2008-04-06 13:43:57.545::INFO:  Logging to STDERR via org.eclipse.log.StdErrLog
 server[localhost]: 192.126.8.11
 port[8080]:
 context[/cometd]:
 base[/chat/demo]:
 rooms [100]: 10
 rooms per client [1]:
 max Latency [5000]:

Use the Enter key to accept the default value, or enter a new value and then press Enter. The parameters and their meaning are:

  • server–Host name or IP address of the server running Jetty with CometD
  • port–Port number of the server (8080 unless you have changed it in jetty.xml)
  • context–Context of the web application running CometD (CometD in the test server).
  • base–Base Bayeux channel name used for chat room. Normally you would not change this.
  • rooms–Number of chat rooms to create. This value combines with the number of users to determine the users per room. If you have 100 rooms and 1000 users, then you will have 10 users per room and every message sent is delivered 10 times. For runs with >10k users, 1000 rooms is a reasonable value.
  • rooms per client–Allows a simulated user to subscribe to multiple rooms. However, as these are randomly selected, values greater than 1 mean that the client is unable to accurately predict the number of messages that will be delivered. Leave this at 1 unless you are testing something specific.
  • max Latency–Instructs Jetty to abort the test if the latency for delivering a message is greater than this value (in ms).

After the global configuration, the test client loops through individual test cycles. Again, press Enter to accept the default value. Two example iterations of the test cycle follow:

 clients [100]: 100
 clients = 0010
 clients = 0020
 clients = 0030
 clients = 0040
 clients = 0050
 clients = 0060
 clients = 0070
 clients = 0080
 clients = 0090
 clients = 0100
 Clients: 100 subscribed:100
 publish [1000]: 
 publish size [50]: 
 pause [100]: 
 batch [10]: 
 0011111111221111111111111111100000000000000000000000000000000000000000000000000000000000000000000000
 
 Got:10000 of 10000
 Got 10000 at 901/s, latency min/ave/max =2/41/922ms
 --
 clients [100]: 
 Clients: 100 subscribed:100
 publish [1000]: 
 publish size [50]: 
 pause [100]: 
 batch [10]: 
 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
 
 Got:10000 of 10000
 Got 10000 at 972/s, latency min/ave/max =3/26/172ms
 --

The parameters that you can set follow:

  • clients–Number of clients to simulate. The clients are kept from one test iteration to the next, so if the number of clients changes, or an incremental number of new clients are created or destroyed, take that into account here. (Currently reducing clients produces a noisy exception as the connection is retried. You can ignore this exception).
  • publish–Number of chat messages to publish for this test. The number of messages received is this number multiplied by the users per chat room (which is the number of clients divided by the global number of rooms).
  • publish size–Size in bytes of the chat message to publish.
  • pause–A period (in ms) to pause between batches of published messages.
  • batch–Size of the batch of published messages to send in a burst.
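These parameters determine the totals reported at the end of a run. For the sample session above (100 clients joined across 10 rooms, publishing 1000 messages), the expected receive count works out as follows (plain shell arithmetic, matching the "Got:10000 of 10000" lines):

```shell
# users per room = clients / rooms; each published message is delivered
# once per user in the room, so expected = publish * users_per_room.
clients=100 rooms=10 publish=1000
users_per_room=$((clients / rooms))
expected=$((publish * users_per_room))
echo "expected messages: $expected"   # prints 10000
```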

While the test is executing, the client outputs a series of digits to show progress. Each digit represents the current average latency in units of 100 ms: 0 represents a latency of < 100 ms from the time the client published the message to when it was received, 1 represents a latency >= 100 ms and < 200 ms, and so on. At the end of the test cycle a summary is printed showing the total messages received, the message rate, and the min/ave/max latency.
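The mapping from latency to progress digit can be sketched as follows (capping at 9 for latencies of a second or more is an assumption, since only single digits appear in the output):

```shell
# floor(latency_ms / 100), capped at 9, reproduces the digits seen above:
# 41 ms -> 0, 172 ms -> 1, 922 ms -> 9.
for lat in 41 172 922; do
  d=$((lat / 100))
  [ "$d" -gt 9 ] && d=9
  echo "$lat ms -> digit $d"
done
```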

Interpreting the results

Before producing numbers for interpretation, it is important to run a number of trials, which allows the system to "warm up." During the initial runs, the Java JIT compiler optimizes the code and populates object pools with reusable objects, so the first runs for a given number of clients are often slower. This can be seen in the test cycle shown above, where the average latency initially grew to over 200 ms before it fell back to < 100 ms. The average and maximum latency for the second run were far superior to the first run. It is also important to use long runs for producing results, for the following reasons:

  • To reduce any statistical effect of the ramp-up and ramp-down periods.
  • To ensure that any resources (for example, queues, memory, file descriptors) that are being used in a non-sustainable way have a chance to max out and cause errors, garbage collections or other adverse effects.
  • To include in the results any occasional system hiccups caused by other system events.

Typically it is best to start with short, low-volume test cycles, and to gradually reduce the pause or increase the batch to determine approximate maximum message rates. Then you can extend the test duration by increasing the number of messages published or the number of clients (which also increases the message rate, as there are more users per room). A normal run should report no exceptions or timeouts. For a single server and single test client with one room per simulated client, the number of messages expected should always equal the number received. If the server is running clustered, the number of messages received is reduced by a factor equal to the number of servers. Similarly, if you are using multiple test clients, each test client sees messages published by the other test clients, so the number of messages received will exceed the number sent.
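The cluster adjustment above can be sketched numerically (assuming, as the reduction implies, that messages are not relayed between cluster nodes, so each test client only sees its own node's share):

```shell
# With N servers, each test client sees roughly expected/N messages.
# Example: a single-client expectation of 10000 against a 2-node cluster.
expected=10000 servers=2
echo "per-client expectation in a ${servers}-node cluster: $((expected / servers))"
```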

Testing load balancers

When testing a load balancer, be aware of the following:

  • Start with a cluster of one so that you can verify that no messages are being lost. Then increase the cluster size.
  • You will not have exact message counts, and must adjust according to the number of nodes.
  • It is very important that there is affinity, as the Bayeux client ID must be known on the worker node used, and both connections from the same simulated client must arrive at the same worker node. However, the test does not use HTTP sessions, so the balancer must set any cookies used for affinity (the test client handles set cookies).
Note:
In reality, an IP source hash is sufficient affinity for Bayeux, but in this test all clients come from the same IP address. Note also that the real Dojo CometD client has good support for migrating to new nodes if affinity fails or the cluster changes, and a real chat room server implementation would probably be backed by JMS so that multiple nodes would still represent a single chat space.