Jetty/Tutorial/Apache
Latest revision as of 15:02, 23 April 2013
Apache httpd is an HTTP server written in C that is often used to front other web services. Jetty is a fully functional and optimized HTTP server and has no need of an Apache httpd instance between it and the internet. However, deployers often want to place an instance of Apache between Jetty and the internet for some of the following "reasons":
- Performance. Apache httpd does have slightly superior performance to Jetty for pure HTTP request handling. However, for dynamic response generation, Apache must pass the request to another process, and the resulting double handling reduces total throughput to less than that of direct requests to Jetty. Moreover, with the advent of Comet-style web applications, long-held requests are common, and because the Apache thread model assigns a thread per outstanding request, Apache does not scale to large numbers of Comet connections.
- Static content. Apache httpd is very good at serving static content fast. However, Jetty is no slouch either, as it can use direct memory-mapped buffers for static content so that only kernel space is used for the data transfer. Besides, if your application has a lot of static content, you will get much better results by either ensuring good client caching or serving the content from a CDN edge cache.
- Security. Some believe that Apache gives them a more secure solution because no TCP/IP connections terminate on Jetty. However, since Jetty is written in Java, it is not vulnerable to the class of security exploits (such as buffer overflows) that can affect a server written in C. Jetty has a good security record; the issues it has had in the past were mostly of a nature that a fronting instance of Apache would not have helped with.
- Load Balancing. Apache has several options for load balancing between multiple servlet servers. These solutions are reasonable, but there are better software and appliance load balancers available. The main limitation of Apache as a load balancer is that its threading model is not asynchronous, so scaling is limited (especially for Comet traffic).
- Administration. Often an enterprise has staff who are very familiar with Apache and thus have a strong preference to deploy everything behind Apache. Avoiding chaos in a deployment environment can be a good reason to do so, as long as the performance and scalability limitations do not affect your web application.
So if we have not yet convinced you not to use Apache, read on for the best way to do it. This tutorial can be followed step by step to build more and more capabilities into your Apache configuration.
Which Module?
Apache provides two mechanisms by which a request it receives can be forwarded to a servlet container like Jetty: mod_jk and mod_proxy.
Mod_jk is a module written specifically for communicating with the Apache Tomcat server via the AJP protocol. It includes a load balancer and some management interfaces. Jetty supports this protocol via its AJP connector, but we do not recommend using mod_jk because:
- While the binary AJP protocol is more compact than HTTP, there is little benefit from this, as the link between Apache and the servlet container is often either local or over a fast LAN. Jetty is highly optimized for handling HTTP, and HTTP semantics are well known and documented. Using AJP can change those semantics and defeat some key optimizations.
- The mod_jk module is maintained with the Tomcat project rather than with the httpd project. It is therefore not documented to the same standard as other Apache modules, and there are frequent questions about which version of mod_jk goes with which version of Apache.
- The AJP protocol has been at version 13 for some time, but the protocol has changed without a change to the version number. Incompatibilities can frequently result.
The mod_proxy modules are superior in features, are maintained with Apache httpd, support both HTTP and AJP, and include a rich load balancer. We highly recommend using mod_proxy when using Jetty with Apache.
Distributions of Apache differ greatly in their approach to Apache configuration files. The main difference is whether the entire configuration is placed in a single file (apache.conf or httpd.conf) or split across multiple files and directories (conf.d, ports.conf, mods-available, mods-enabled), with symlinks used to activate modules. Configuration may also be done at the server level or embedded within a VirtualHost configuration of the server.
This tutorial does not recommend or discuss in detail either approach and simply outlines the configuration directives needed. Where these directives are placed will depend greatly on your distribution and existing configuration.
In order to use any of the modules described below, they must first be loaded into the httpd server. The following directives load all the modules discussed:
LoadModule proxy_module /usr/lib/apache2/modules/mod_proxy.so
LoadModule proxy_balancer_module /usr/lib/apache2/modules/mod_proxy_balancer.so
LoadModule proxy_http_module /usr/lib/apache2/modules/mod_proxy_http.so
LoadModule proxy_ajp_module /usr/lib/apache2/modules/mod_proxy_ajp.so
LoadModule jk_module /usr/lib/apache2/modules/mod_jk.so
In some distributions, these load directives can be enabled with symlinks:
cd $APACHE_HOME/mods-enabled
ln -s ../mods-available/proxy.load proxy.load
ln -s ../mods-available/proxy_http.load proxy_http.load
The following directives form a good base configuration for mod_proxy:
# Turn off support for true Proxy behaviour as we are acting as
# a transparent proxy
ProxyRequests Off

# Turn off VIA header as we know where the requests are proxied
ProxyVia Off

# Turn on Host header preservation so that the servlet container
# can write links with the correct host and rewriting can be avoided.
ProxyPreserveHost On

# Set the permissions for the proxy
<Proxy *>
    AddDefaultCharset off
    Order deny,allow
    Allow from all
</Proxy>

# Turn on Proxy status reporting at /status
# This should be better protected than: Allow from all
ProxyStatus On
<Location /status>
    SetHandler server-status
    Order Deny,Allow
    Allow from all
</Location>
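The comment in the base configuration notes that /status should be better protected than "Allow from all". A minimal sketch of one way to tighten it, assuming the same Apache 2.2 style access directives used in this tutorial and assuming the status page only needs to be read from the server itself (the 127.0.0.1 address is an assumption; substitute your own administration host or network):

```apache
<Location /status>
    SetHandler server-status
    # With "Order deny,allow", Deny rules are evaluated first and
    # Allow rules can then override them for specific clients.
    Order deny,allow
    Deny from all
    # Assumption: status is only viewed from the local machine.
    Allow from 127.0.0.1
</Location>
```

This block would replace the more permissive Location block in the base configuration rather than being added alongside it.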
To connect to a servlet container with the HTTP protocol, the ProxyPass directive can be used to send requests received on a particular URL to a Jetty instance. The following example proxies all requests received by Apache on /test/* to the /context webapp running on the local Jetty instance on port 8080:
ProxyPass /test http://localhost:8080/context
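Alternatively, the Location directive can be used to group multiple directives for the same URL. Inside a Location block, ProxyPass takes only the target URL, because the local path is taken from the Location itself:
<Location /test/>
    ProxyPass http://localhost:8080/context/
    SetEnv proxy-nokeepalive 1
</Location>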
The mod_proxy_http module sets some additional headers on the requests that it proxies:
- X-Forwarded-For - The IP address of the client
- X-Forwarded-Host - The original host requested by the client in the Host HTTP request header
- X-Forwarded-Server - The hostname of the proxy server
While not supported directly by mod_proxy_http, Jetty also understands the following experimental request header:
- X-Forwarded-Proto - The URL protocol scheme of the original request
One option for setting this header, if the protocol scheme is static, is to use the RequestHeader directive of mod_headers.
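As a minimal sketch of that option, assuming mod_headers is loaded and assuming every request handled by the enclosing server or virtual host arrived over https (so the value can be hard-coded):

```apache
# Requires mod_headers.
# Assumption: all traffic through this (virtual) host is https, so the
# scheme is static. "set" replaces any value supplied by the client.
RequestHeader set X-Forwarded-Proto "https"
```

If the same Apache instance also serves plain http, the directive would typically be placed only inside the SSL virtual host so that http requests are not mislabelled.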
If the values of these headers are meaningful to your web application, then Jetty can be configured to interpret them and make their values available via the servlet API. The setForwarded(true) method should be called on the connector. This can be done in jetty.xml like:
<Call name="addConnector">
  <Arg>
    <New class="org.eclipse.jetty.server.nio.SelectChannelConnector">
      <Set name="host"><SystemProperty name="jetty.host" /></Set>
      <Set name="port"><SystemProperty name="jetty.port" default="8080"/></Set>
      <Set name="forwarded">true</Set>
    </New>
  </Arg>
</Call>
Proxying SSL on Apache to HTTP on Jetty
The situation here is:
   https            http
---------> Apache -------> Jetty
In other words, you have offloaded your SSL onto Apache and you want to use plain http to proxy to Jetty. You want Jetty to return all redirected pages using https:// to your Apache server. You can do that by setting the X-Forwarded-Proto header as described above.
If you need access on Jetty to some of the SSL information available on Apache, then you need some configuration tricks on Apache to insert the SSL information as headers on outgoing requests. Follow the Apache configuration suggestions at http://www.zeitoun.net/articles/client-certificate-x509-authentication-behind-reverse-proxy/start, which show how to use mod_headers to insert the appropriate request headers. Of course, you will also need to code your application to look for the corresponding custom request headers bearing the SSL information.
To connect to a servlet container with the AJP protocol, the ProxyPass directive can be used to send requests received on a particular URL to a Jetty instance, using "ajp" as the protocol in the URL. The following example proxies all requests received by Apache on /test/* to the /context webapp running on the local Jetty instance accepting AJP on port 8009:
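A minimal sketch of the mod_headers approach, assuming mod_ssl and mod_headers are both loaded and that these directives are placed in the SSL virtual host. The X-SSL-* header names are hypothetical; pick whatever names your application is coded to read. The %{VARNAME}s format is the mod_headers specifier for the value of an SSL environment variable:

```apache
# Requires mod_ssl and mod_headers.
# The X-SSL-* header names below are hypothetical, chosen for illustration.
# "set" overwrites any header of the same name sent by the client.
RequestHeader set X-SSL-Cipher        "%{SSL_CIPHER}s"
RequestHeader set X-SSL-Client-DN     "%{SSL_CLIENT_S_DN}s"
RequestHeader set X-SSL-Client-Verify "%{SSL_CLIENT_VERIFY}s"
```

Because these headers are trusted by the application, Jetty should only be reachable through Apache in such a setup; otherwise a client connecting directly could supply the headers itself.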
ProxyPass /test ajp://localhost:8009/context
In order to accept AJP, the Jetty instance must be started with an AJP connector configured. This can normally be done with a command line like:
java -jar start.jar OPTIONS=Server,ajp etc/jetty.xml etc/jetty-ajp.xml
The jetty-ajp.xml file simply adds an AJP connector with the following:
<Call name="addConnector">
  <Arg>
    <New class="org.eclipse.jetty.ajp.Ajp13SocketConnector">
      <Set name="port">8009</Set>
    </New>
  </Arg>
</Call>
We recommend NOT using the AJP protocol; superior performance and clearer semantics will be achieved using HTTP.
The balancer allows a received request to be proxied to one of several Jetty instances using either HTTP or AJP as the protocol. The following example shows how all requests to /test can be proxied to a two node cluster:
ProxyPass /test balancer://mycluster/context
<Proxy balancer://mycluster>
    BalancerMember http://myhost1.org:8080
    BalancerMember http://myhost2.org:8080
</Proxy>
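Since the balancer can use either protocol, the same cluster could be sketched with AJP members instead. This assumes mod_proxy_ajp is loaded alongside mod_proxy_balancer and that each node runs an AJP connector on port 8009 (the port is an assumption carried over from the earlier AJP example):

```apache
ProxyPass /test balancer://mycluster/context
<Proxy balancer://mycluster>
    # Assumption: each Jetty node accepts AJP on port 8009.
    BalancerMember ajp://myhost1.org:8009
    BalancerMember ajp://myhost2.org:8009
</Proxy>
```

As noted above, HTTP members remain the recommended choice; the AJP form is shown only to illustrate that the balancer syntax is otherwise unchanged.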
If your webapp uses sessions, then it is highly desirable to ensure that all requests for the same session are sent to the same node in the cluster. This can be achieved by appending a worker name to the session ID used by Jetty, which is done by configuring an instance of one of the Jetty session ID managers with the worker name. Usually, there is only a single session ID manager per Jetty instance, and it is referenced by all per-context session managers. Here's an example of configuring a HashSessionIdManager in a jetty.xml file:
<Configure id="Server" class="org.eclipse.jetty.server.Server">
  <Set name="sessionIdManager">
    <New id="hashIdMgr" class="org.eclipse.jetty.server.session.HashSessionIdManager">
      <Set name="workerName">node1</Set>
    </New>
  </Set>
  <Call name="setAttribute">
    <Arg>hashIdMgr</Arg>
    <Arg><Ref id="hashIdMgr"/></Arg>
  </Call>
</Configure>
We then need to configure each context with a reference to that session ID manager. You can do that either in code, in a WEB-INF/jetty-web.xml file inside the webapp, or in an external context.xml file. Here's an example of using a context.xml file to set up a session manager that references the single session ID manager:
<Ref name="Server" id="Server">
  <Call id="hashIdMgr" name="getAttribute">
    <Arg>hashIdMgr</Arg>
  </Call>
</Ref>
<Set name="sessionHandler">
  <New class="org.eclipse.jetty.server.session.SessionHandler">
    <Arg>
      <New id="hashMgr" class="org.eclipse.jetty.server.session.HashSessionManager">
        <Set name="idManager">
          <Ref id="hashIdMgr"/>
        </Set>
      </New>
    </Arg>
  </New>
</Set>
Once your jetty instances have been configured with worker names, then the following configuration will set up mod_proxy_balancer to look for those worker names in the JSESSIONID cookie and jsessionid URL parameter:
ProxyPass /test balancer://mycluster/context stickysession=JSESSIONID|jsessionid nofailover=On
<Proxy balancer://mycluster>
    BalancerMember http://myhost1.org:8080 route=node1
    BalancerMember http://myhost2.org:8080 route=node2
</Proxy>
If your cluster supports distributed sessions (via a database, WADI, Terracotta, GigaSpaces, etc.), then you can set nofailover=Off so that if a node fails, the balancer reroutes the request to another node in the cluster. Jetty will automatically rewrite the worker ID of a cookie for a rerouted request. With nofailover=On, a 503 (Service Unavailable) response is sent if a worker node fails.
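For a cluster with distributed sessions, the failover variant of the configuration might look like the following sketch. It reuses the host names and route names from the previous example and changes only the nofailover parameter:

```apache
# Sticky sessions are still honoured, but if the preferred node is down
# the balancer may send the request to another member instead of
# failing. Assumption: sessions are replicated or shared between nodes.
ProxyPass /test balancer://mycluster/context stickysession=JSESSIONID|jsessionid nofailover=Off
<Proxy balancer://mycluster>
    BalancerMember http://myhost1.org:8080 route=node1
    BalancerMember http://myhost2.org:8080 route=node2
</Proxy>
```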
When a request has been proxied to another server, the response can often be generated with incorrect links, cookie domains and redirection headers. However, a well-written web application will use relative links, the Host header, or both to generate addresses. So if the ProxyPreserveHost directive is on, often no rewriting is required.
However, not all web applications are well written with regard to the Host header, and some hard-code domain names. If this is the case with your webapp, then you may need to rewrite some headers and links. The following example shows how the ProxyPassReverse directives can be used to rewrite headers and cookies:
ProxyPass /mirror/foo/ http://backend.example.com/
ProxyPassReverse /mirror/foo/ http://backend.example.com/
ProxyPassReverseCookieDomain backend.example.com public.example.com
ProxyPassReverseCookiePath / /mirror/foo/
If there are links within the body of the response that need to be rewritten, then the non-apache mod_proxy_html may be used.
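A minimal sketch of body rewriting for the /mirror/foo/ mapping above, assuming mod_proxy_html version 3.1 or later is installed and loaded (it is bundled with httpd from 2.4 onward) and that mod_headers is available. Only absolute links to the backend host are mapped here; other link forms would need additional rules, so check the module's documentation before relying on this:

```apache
# Assumes mod_proxy_html 3.1+ (ProxyHTMLEnable is not available earlier).
<Location /mirror/foo/>
    ProxyHTMLEnable On
    # Rewrite absolute links that point at the backend host so that
    # they point at the public, proxied path instead.
    ProxyHTMLURLMap http://backend.example.com/ /mirror/foo/
    # The filter cannot parse compressed bodies, so ask the backend
    # for uncompressed responses.
    RequestHeader unset Accept-Encoding
</Location>
```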