GeoServer in Production
Implement Production Grade GeoServer Deployments

OpenGeo
Updated Spring 2013

GeoServer in a production environment can be evaluated according to three criteria: reliability, availability, and performance. This paper discusses methods for implementing production-grade GeoServer deployments.

1. Executive Summary

The quality of a GeoServer deployment in a production environment is measured by three criteria:

• Reliability: the server’s ability to successfully fulfill requests for maps and data.
• Availability: the overall uptime of the server (including both planned and unplanned outages).
• Performance: how quickly the server can fulfill client requests.

This paper presents various techniques for tuning GeoServer to improve the reliability, availability, and performance of production instances.

opengeo.org 2

2. Reliability

Reliability is a measure of how many requests fail in a given period of time. A failed request is one that does not correctly return the requested resource (such as a map image or feature dataset). Request failures may be caused by a shortage or poor configuration of resources such as CPU cycles, memory, and network bandwidth. Requests may also fail because other requests are consuming an excessive amount of resources. For example, a particular request may consume 95% of the allocated server resources, causing other requests to fail due to lack of resources. (Note that reliability does not consider requests that fail because they are incorrectly specified; e.g. asking for layers or styles that do not exist.)

GeoServer reliability can be increased by implementing the following general tuning strategies:

• Disable unused services
• Control server resources
• Limit concurrent requests

2.1 Disable Unused Services

To improve resource usage and reduce failure risk, disable all GeoServer services that are not being used. For example, if a GeoServer instance is used only to serve maps, enable WMS and read-only WFS, and disable WFS-T (Web Feature Service–Transactional) and Web Coverage Service (WCS).

The Web Administration Interface provides controls for enabling and disabling specific services (see Figure 2.1).

2.2 Control Server Resources

Reliability can be improved by setting limits on the server resources that can be used by individual requests. For WMS services, the maximum memory allocated and the maximum time allowed per request can be controlled. For WFS services, the maximum number of features returned can be controlled. In addition, choosing efficient WFS response formats improves resource utilization, and allocating sufficient network bandwidth will maximize WFS throughput.

Limit WMS Rendering Memory

The amount of rendering memory a single WMS GetMap request is allowed to use is controlled by the WMS / Max rendering memory parameter in the Web Administration Interface (stored as maxRequestMemory in wms.xml).

Figure 2.1: Service Metadata and Service Level options for WFS services


The amount of rendering memory needed depends on the requested image size. There is also a dependency on the number of FeatureTypeStyle elements used in the Styled Layer Descriptor (SLD). FeatureTypeStyles (FTS) are used in GeoServer to control the graphical stacking of features. For example, drawing cased lines (e.g. a line style for highways) requires using two FTS to draw first thick lines and then thin lines. Each FTS uses a separate drawing buffer, which allows a single scan of the source and improves performance.

Figure 2.2: Configuration options for WMS services

The formula to calculate the amount of rendering memory used by a WMS request is:

    BytesUsed = PixelDepth × ImageWidth × ImageHeight × MaximumFeatureTypeStyles

where:

• PixelDepth is 3 bytes with no transparency, 4 bytes with transparency
• ImageWidth is the width of the requested image, in pixels
• ImageHeight is the height of the requested image, in pixels
• MaximumFeatureTypeStyles is the maximum number of FeatureTypeStyles used across all feature types in the request

The following chart (Figure 2.3) shows the amount of memory used for common request sizes. Rendering memory usage ranges from about 200 kilobytes for a typical 256×256 tile request up to 6 megabytes for a 1024×768 screen-sized request with transparency and using 2 FeatureTypeStyle buffers.


Figure 2.3: Memory use of common requests

In general, 16 megabytes is sufficient to render a 2048×2048 image at 4 bytes per pixel (full color and transparency). This is the size of an 8×8 meta-tile if tiling is being used.
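The figures quoted above can be checked with a few lines of Python. This is a minimal sketch of the BytesUsed formula as stated in the text; the function and parameter names are illustrative and are not part of GeoServer.

```python
def rendering_memory_bytes(width_px, height_px, transparent, max_feature_type_styles=1):
    """BytesUsed = PixelDepth x ImageWidth x ImageHeight x MaximumFeatureTypeStyles."""
    # Pixel depth as defined in the text: 3 bytes opaque, 4 bytes with transparency.
    pixel_depth = 4 if transparent else 3
    return pixel_depth * width_px * height_px * max_feature_type_styles


if __name__ == "__main__":
    examples = [
        ("256x256 tile, opaque, 1 FTS", 256, 256, False, 1),
        ("1024x768 screen, transparent, 2 FTS", 1024, 768, True, 2),
        ("2048x2048 meta-tile, transparent, 1 FTS", 2048, 2048, True, 1),
    ]
    for label, width, height, transparent, fts in examples:
        used = rendering_memory_bytes(width, height, transparent, fts)
        print(f"{label}: {used} bytes ({used / 1024 ** 2:.2f} MiB)")
```

The three cases correspond to the tile, screen-sized, and meta-tile requests mentioned in the text (roughly 0.19 MiB, 6 MiB, and 16 MiB respectively).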

Limit WMS Rendering Time

GeoServer allows specifying the maximum amount of time that will be spent processing individual WMS requests. This is controlled by the WMS / Max rendering time parameter in the Web Administration Interface (stored as maxRenderingTime in wms.xml). The parameter limits map rendering time, which includes the time it takes to retrieve features and rasterize them into a map image. This does not include the time to send the response back to the client. For example, when a PNG or JPEG image is created, the parameter limits the total rendering time, but not the time used to transmit the image.

When setting the maximum rendering time, it is important to take peak load conditions into consideration. For example, under average load GetMap requests may take less than a second. It is desirable to allow requests to take longer under high load, but the limit should not exceed a few minutes. In most environments a setting of 120 seconds is sufficient to render map requests. A request that processes for a full 120 seconds likely indicates an anomalous situation, such as a GetMap request against a layer using a style that does not have appropriate scale dependencies, and is thus rendering too many features.


Limit WFS Response Size

Excessively large WFS responses can take a long time to process and occupy a large fraction of outgoing bandwidth. GeoServer allows setting a limit on the number of features in a WFS request via the WFS / Maximum number of Features parameter in the Web Administration Interface (stored as maxFeatures in wfs.xml).

The default limit is 1,000,000 features per response. This can be set lower if it is known that WFS response size will always be smaller (e.g. by determining the maximum feature count over all datasets being served).

Optimize WFS Responses

For most WFS output formats (such as GML, JSON, and CSV) GeoServer operates in full streaming mode by default. Streaming the response allows the WFS to use only a small amount of memory when processing a request.

Some formats cannot be streamed, and hence use more server resources. Output formats that are compressed (such as zipped Shapefiles) are staged on disk, so response time may be increased by disk latency. Other output formats (such as Microsoft Excel) are staged in memory, and can use potentially large amounts of server memory. If resources are limited, these formats should be avoided or limited in size.

Some output formats (such as GML and JSON) are verbose and can result in large response payloads. GeoServer transparently compresses responses to reduce their size (using HTTP 1.1 GZIP compression). If front-end layers such as proxy servers are in use, ensure that they do not defeat this strategy by expanding compressed HTTP responses.
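To get a feel for why compression matters for verbose formats, the following sketch compresses a synthetic GeoJSON-like payload with Python's standard gzip module. The feature data is made up for illustration, and the resulting ratio says nothing about any particular real dataset.

```python
import gzip
import json


def build_sample_payload(feature_count):
    """Build a synthetic, repetitive GeoJSON-like document (illustrative data only)."""
    features = [
        {
            "type": "Feature",
            "id": f"roads.{i}",
            "geometry": {"type": "Point", "coordinates": [-71.0 + i * 0.001, 42.0 + i * 0.001]},
            "properties": {"name": f"Road {i}", "class": "residential"},
        }
        for i in range(feature_count)
    ]
    return json.dumps({"type": "FeatureCollection", "features": features}).encode("utf-8")


def compression_summary(payload):
    """Return (raw size, gzip-compressed size) in bytes."""
    return len(payload), len(gzip.compress(payload))


if __name__ == "__main__":
    raw_size, gz_size = compression_summary(build_sample_payload(1000))
    print(f"raw: {raw_size} bytes, gzip: {gz_size} bytes ({gz_size / raw_size:.0%} of raw)")
```

A proxy that decompresses such responses before forwarding them sends the raw size over the wire instead of the compressed size.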

Allocate Network Bandwidth

WFS requests can produce large responses, which can result in high network bandwidth consumption and long request times. The following chart (Figure 2.5) shows, for various values of bandwidth, the time taken to deliver a 300 KB response payload (this is a typical screen-sized WMS image, or 8,000 WFS features GZIP-compressed).

Figure 2.4: Configuration options for WFS services

Figure 2.5: Response times to send 300 KB as a function of bandwidth

The outgoing bandwidth is shared across all active requests, so each response gets only a fraction of it. More concurrent requests result in a smaller portion of bandwidth for each response, and thus longer response times. If possible, network bandwidth should be sized to accommodate expected peak request load. To handle situations where available bandwidth is fixed, the following section shows how to control the number of concurrent requests.
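The effect of sharing bandwidth can be estimated with simple arithmetic. The sketch below assumes that bandwidth is split evenly between concurrent responses, that 1 KB is 1024 bytes, and that protocol overhead and latency can be ignored; the bandwidth values are illustrative and are not taken from Figure 2.5.

```python
def transfer_time_seconds(payload_bytes, bandwidth_bits_per_s, concurrent_responses=1):
    """Estimate delivery time when bandwidth is shared evenly between responses."""
    if concurrent_responses < 1:
        raise ValueError("concurrent_responses must be at least 1")
    share_bits_per_s = bandwidth_bits_per_s / concurrent_responses
    return payload_bytes * 8 / share_bits_per_s


if __name__ == "__main__":
    payload = 300 * 1024  # the 300 KB payload used in the text
    for mbit in (1, 10, 100):
        for concurrent in (1, 20):
            seconds = transfer_time_seconds(payload, mbit * 1_000_000, concurrent)
            print(f"{mbit:>3} Mbit/s, {concurrent:>2} concurrent: {seconds:.2f} s per response")
```

Under these assumptions, twenty concurrent responses on a fixed link each take twenty times as long as a single response on the same link.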

2.3 Limit Concurrent Requests

Limiting the number of requests processing concurrently inside GeoServer is important for a number of reasons:

• Performance: Testing shows that for GetMap requests against local data sources, maximum throughput is achieved when the number of parallel requests is at most twice the number of CPU cores. Also, too many concurrent requests can saturate the network bandwidth available for response, causing all requests to slow down.

• Resource usage: Requests such as GetMap can use a significant amount of memory. The WMS Rendering Memory Limit controls the amount of memory used for each request, but if too many requests run concurrently it is still possible to exceed the heap memory allocated to the host JVM. By limiting the number of concurrent requests the total amount of memory used can be kept below the maximum memory size.

• Fairness: Individual users should be prevented from flooding the server with requests, which denies other users access to services.
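The performance and resource usage points above can be turned into a rough sizing check. This is only a sketch: the function names are illustrative, and the rendering_share parameter (the fraction of the heap that rendering buffers may occupy) is an assumption introduced here, not a GeoServer setting.

```python
MIB = 1024 * 1024


def throughput_concurrency_ceiling(cpu_cores):
    """Rule of thumb from the text: at most twice the number of CPU cores."""
    return 2 * cpu_cores


def worst_case_rendering_memory(concurrent_requests, max_request_memory_bytes):
    """Memory used if every concurrent request hits the per-request memory limit."""
    return concurrent_requests * max_request_memory_bytes


def fits_in_heap(concurrent_requests, max_request_memory_bytes, heap_bytes, rendering_share=0.5):
    """Check the worst case against an assumed share of the JVM heap."""
    worst_case = worst_case_rendering_memory(concurrent_requests, max_request_memory_bytes)
    return worst_case <= heap_bytes * rendering_share


if __name__ == "__main__":
    heap = 2048 * MIB
    per_request = 16 * MIB
    for concurrent in (throughput_concurrency_ceiling(4), 200):
        worst = worst_case_rendering_memory(concurrent, per_request) // MIB
        print(f"{concurrent} concurrent requests: worst case {worst} MiB, "
              f"fits: {fits_in_heap(concurrent, per_request, heap)}")
```

With a 16 MiB per-request limit and a 2 GB heap, 8 concurrent requests stay well within the assumed budget, while 200 concurrent requests could in the worst case need more memory than the whole heap.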

GeoServer can control request concurrency by either limiting the number of server threads available, or by controlling request flow via queueing.
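The queueing idea can be pictured as a fixed pool of execution slots: a request waits for a free slot and is rejected if it waits too long. The sketch below illustrates the concept with a standard-library semaphore; it is not how GeoServer implements request queueing, and the class and parameter names are hypothetical.

```python
import threading


class RequestGate:
    """Allow at most max_concurrent handlers to run; others wait up to queue_timeout_s."""

    def __init__(self, max_concurrent, queue_timeout_s):
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._queue_timeout_s = queue_timeout_s

    def run(self, handler):
        # Wait for a free execution slot; give up once the queue timeout expires.
        if not self._slots.acquire(timeout=self._queue_timeout_s):
            raise TimeoutError("request timed out while queued")
        try:
            return handler()
        finally:
            # Free the slot so that a queued request can start executing.
            self._slots.release()


if __name__ == "__main__":
    gate = RequestGate(max_concurrent=8, queue_timeout_s=60)
    print(gate.run(lambda: "rendered map"))
```

Limiting application server threads achieves a similar cap on concurrency, but without the per-service rules and queue timeout that a flow-control layer can offer.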


Application Server Thread Limit

The simplest way to control concurrent requests is to limit the number of server threads available to process requests. Application servers usually provide a setting to control this. Note that the default setting may be too high for GeoServer’s potentially long processing times and large response sizes. It is recommended that GeoServer instances be limited to around 20 concurrent requests.

Each application server provides its own way of limiting available threads, typically via a configuration parameter. For example, Apache Tomcat uses the maxThreads parameter in the server.xml configuration file (conf/server.xml in a standard Tomcat installation). By default, Tomcat allows 200 concurrent requests. The following snippet shows this reduced to the recommended 20:

    <Server port="8005" shutdown="SHUTDOWN">
      ...
      <Connector port="8080" protocol="HTTP/1.1"
                 connectionTimeout="20000"
                 redirectPort="8443"
                 maxThreads="20"
                 minSpareThreads="20" />
      ...
    </Server>

Control Request Flow

The GeoServer control-flow module provides control over the number of requests executing concurrently inside the server. When the control-flow module is enabled, excess requests are queued up and processed as execution slots become available. If desired, the module can be configured to reject requests that have been queued longer than a certain threshold.

Control Flow Rules

The GeoServer control-flow module provides a number of rules for controlling request flow. The rules are specified in the controlflow.properties file located in the GeoServer data directory.

Total Request Count Control

The maximum number of OWS requests allowed to execute concurrently can be specified with:

    ows.global=<count>

Requests in excess of this count are queued, and executed as other requests complete and execution slots become free.

Per-Request Type Control

Limits on concurrent requests can be set for specific services and request types using the following syntax:

    ows.<service>[.<request>[.<outputFormat>]]=<count>

where:

• <service>: the OWS service (WMS, WFS, WCS)
• <request>: [optional] the request type. For example, for WMS requests the type can be GetMap, GetFeatureInfo, DescribeLayer, GetLegendGraphics, GetCapabilities
• <outputFormat>: [optional] the output format of the request. For example, for WMS GetMap requests the format can be image/png, image/gif, or any other supported output format

Examples:

    # don't allow more than 16 WCS requests in parallel
    ows.wcs=16

    # don't allow more than 8 WMS GetMap requests in parallel
    ows.wms.getmap=8

    # don't allow more than 2 WFS GetFeature requests with Excel output format
    ows.wfs.getfeature.application/msexcel=2

Per-User Control

A limit on the number of requests from individual users can be specified with:

    user=<count>

where <count> is the maximum number of concurrent requests a single user can execute.

Note that user identity is tracked using HTTP cookies, so this will work for browser-based clients, but possibly not for other kinds of clients.

Request Queue Timeout

The request queue timeout limit can be specified with:

    timeout=<seconds>

where <seconds> is the number of seconds a request remains queued while waiting for execution. If a queued request does not enter execution before the timeout expires it is dropped from the queue.

The following is an example of a typical controlflow.properties file for a server having 4 cores:

    # if a request waits in queue for more than 60 seconds then it's not
    # worth executing, as the client will likely have given up by then
    timeout=60

    # don't allow the execution of more than 100 requests total in parallel
    ows.global=100

    # don't allow more than 10 GetMap requests in parallel
    ows.wms.getmap=10

    # don't allow more than 4 outputs with Excel output as it is memory-bound
    ows.wfs.getfeature.application/msexcel=4

    # don't allow a single user to perform more than 6 requests in parallel
    # (6 being the Firefox default concurrency level at the time of writing)
    user=6

2.4 Summary: Reliability

There are a number of strategies that can be used to increase the reliability of GeoServer instances in production environments. The following checklist summarizes the strategies discussed in this section:

• Disable unused services to simplify the instance and reduce demand on the server
• Set WMS rendering memory allocation and processing time limits
• Set WFS response size limits
• Optimize the types of WFS response formats
• Ensure that WFS response compression is being maintained
• Allocate sufficient bandwidth to handle expected WFS request load
• Limit available server threads to reduce request concurrency
• Implement control flow to queue requests and avoid server overload

3. Availability

Availability measures the fraction of time a service is available to respond to client requests. It is computed as the available time divided by the total elapsed time (including periods when the service is unavailable due to failure or scheduled downtime).

Availability = Available Time / Total Time
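As a worked example of this ratio, the sketch below computes availability for a hypothetical month and the downtime allowed by a given availability target. The outage figures are invented for illustration.

```python
def availability(available_time, total_time):
    """Availability = Available Time / Total Time (both in the same unit)."""
    if total_time <= 0:
        raise ValueError("total_time must be positive")
    return available_time / total_time


def allowed_downtime_hours(target_availability, period_hours=365 * 24):
    """Downtime budget, in hours, that still meets the target over the period."""
    return (1 - target_availability) * period_hours


if __name__ == "__main__":
    total_hours = 30 * 24        # a 30-day month
    downtime_hours = 2.0 + 1.5   # hypothetical planned plus unplanned outages
    ratio = availability(total_hours - downtime_hours, total_hours)
    print(f"availability: {ratio:.4%}")
    print(f"yearly downtime budget at 99.9%: {allowed_downtime_hours(0.999):.2f} hours")
```

Note that planned maintenance counts against availability in exactly the same way as an unplanned failure.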


Strategies for achieving high availability include:

• Monitoring the server for failure
• Balancing processing load across multiple servers
• Removing single points of failure

3.1 Server Monitoring

A high availability deployment can employ a watchdog process to monitor the status of GeoServer instances and restart them if the host application server fails or the GeoServer instance becomes unresponsive. The watchdog process is software that runs externally to the application server. It periodically checks that the server is up and running and takes action if it is not.

The following figure shows the three situations that a watchdog addresses.

Figure 3.1: Service monitoring with watchdog

Some remarks on each situation:

• In normal operation, the watchdog checks to see if GeoServer is running. If it responds within a defined time limit, the watchdog does nothing.
• An abnormal situation can occur where the GeoServer instance is running, but takes too long to respond to a simple query (due to being overloaded or hung). In this case the watchdog kills the application server process and then restarts it.
• Another abnormal situation is where the application server itself has failed. In this case the watchdog simply restarts the application server.

This example shows a single server, but a watchdog can be used to monitor any number of application servers.

A watchdog can be implemented using a full-featured monitoring application such as Nagios [1], or it can be as simple as a custom shell script. The following is an example of a watchdog script for GeoServer running in Apache Tomcat on a Unix system. The script checks that the Tomcat server process is running, and if so requests a small image file from GeoServer to test responsiveness. If the process is not running or if the response takes too long, the script kills and restarts Tomcat. Typically this script would be run on a regular basis as a cron job.

    #!/bin/bash

    # Set up script variables (adjust the paths and URL to match the local installation)
    PID_FILE=/var/run/catalina/catalina.pid
    HTTP_URL=http://localhost:8080/geoserver/openlayers/img/west-mini.png
    CATALINA_SCRIPT=/opt/tomcat-6.0/bin/catalina.sh
    GEOSERVER_LOG=/var/lib/geoserver_data/default/logs/geoserver.log
    CATALINA_LOG=/opt/tomcat-6.0/logs/catalina.out
    LOG_COPY=/home/tomcat
    PID=$(cat "$PID_FILE" 2> /dev/null)

    # Function to kill and restart application server
    function catalinarestart() {
        $CATALINA_SCRIPT stop
        sleep 5
        # Force-kill the process in case the normal shutdown did not complete
        kill -9 "$PID" 2> /dev/null
        # Keep copies of the logs for later diagnosis
        cp "$GEOSERVER_LOG" "$LOG_COPY"
        cp "$CATALINA_LOG" "$LOG_COPY"
        $CATALINA_SCRIPT start
    }

    if [ -n "$PID" ] && [ -d "/proc/$PID" ]
    then
        # App server is running - kill and restart it if there is no response.
        # A single attempt (-t 1) with a 20 second timeout.
        wget "$HTTP_URL" -t 1 --timeout=20 -O /dev/null &> /dev/null
        if [ $? -ne 0 ]
        then
            echo "Restarting Catalina because $HTTP_URL does not respond, pid $PID"
            catalinarestart
        # else
        #     echo No Problems!
        fi
    else
        # App server process is not running - restart it
        echo "Restarting Catalina because pid $PID is dead."
        catalinarestart
    fi

[1] http://nagios.org/


3.2 Load Balancing

Load balancing distributes processing across a pool of servers. By allowing redundancy of computing resources such as servers, CPUs and data stores, it eliminates some single points of failure that may reduce service reliability.

Load balancing can also increase throughput by enabling horizontal scale-out. For example, OGC protocols such as WMS and WFS are stateless. Because no session state is maintained across client requests, requests can be routed to any server for processing. This means that GeoServer can be scaled out via a relatively simple approach utilizing load balancing. The more complex technique of clustering at the application server level is not required.
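Because requests are stateless, even a very simple dispatch policy is sufficient. The following sketch shows round-robin selection over a pool of backends while skipping those marked as failed; the class, method names, and backend URLs are hypothetical and do not correspond to any particular load balancer product.

```python
import itertools


class RoundRobinPool:
    """Hand out backends in turn, skipping any that are marked as down."""

    def __init__(self, backends):
        self._backends = list(backends)
        if not self._backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(self._backends)
        self._down = set()

    def mark_down(self, backend):
        self._down.add(backend)

    def mark_up(self, backend):
        self._down.discard(backend)

    def next_backend(self):
        # Look at each backend at most once per call before giving up.
        for _ in range(len(self._backends)):
            candidate = next(self._cycle)
            if candidate not in self._down:
                return candidate
        raise RuntimeError("no backend available")


if __name__ == "__main__":
    pool = RoundRobinPool([
        "http://gs1.example.org:8080/geoserver",  # hypothetical hosts
        "http://gs2.example.org:8080/geoserver",
    ])
    print([pool.next_backend() for _ in range(4)])
```

No session affinity is needed here, which is why application-server clustering can be avoided.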

3.3 High Availability Architecture

A high availability (HA) architecture is a configuration that has no single point of failure. This requires that GeoServer, the data source, and the load balancer itself are all replicated to provide redundancy. In this architecture the load balancing layer may be implemented using either a hardware load balancer or purely software-based components.

Figure 3.2: Services are redundant, but the load balancer is a single point of failure


Figure 3.3: Minimal high availability configuration

For example, Figure 3.3 shows a straightforward architecture for a high availability GeoServer environment with only two servers, though more could easily be added. Open source software can be used to provide load balancing and failover capabilities. A suggested list of components is:

• VRRPd daemon [2], providing transparent fail-over behind a single IP address
• Balance [3], a TCP proxy providing load balancing
• Replicated GeoServer and PostGIS instances

Figure 3.4 shows how this high availability deployment operates.

[2] http://off.net/~jme/vrrpd/
[3] http://www.inlab.de/balance.html


Figure 3.4: High availability configuration in action

In this deployment, the network router sends the request to one of the VRRPd routers. The VRRPd routers communicate to elect a Master router for transmitting requests. If the Master router fails, the election process provides dynamic fail-over to the next available VRRPd router. This allows the virtual router IP address to be used as the default first-hop address on the hosts. Once routed, the request is sent by the load balancer to the next available instance of GeoServer to be processed.

3.4 Summary: Availability

Availability is the amount of uptime during which a system is able to service client requests. Techniques to increase the availability of GeoServer deployments include:

• Use a watchdog process to monitor server responsiveness and restart the server if necessary
• Use load balancing to distribute request load across a pool of servers
• Provide high availability by using redundant hardware and software components throughout the system


4. Performance

Performance is a measure of how fast GeoServer can fulfill client requests. Factors that affect GeoServer performance include hardware capacity, data tuning, software versions and configuration, network saturation, caching, and many others. Because of this range of factors, performance of production systems should be analyzed on a case-by-case basis. However, there are some general strategies for improving performance that are effective in most cases.

Note that while performance is often the main focus of system tuning, it is advisable to ensure that the service is reliable before attempting to improve performance. Fortunately, many of the strategies presented earlier for increasing reliability also provide a boost to performance.

4.1 Java Virtual Machine

For best performance, use the Oracle (Sun) Java HotSpot Virtual Machine (JVM). Testing has shown that the Oracle JVM is significantly faster than other JVM implementations. For best results use the latest release of the JVM, since each new version has offered significant performance improvements. Oracle’s Java SE 6 Performance White Paper [4] describes the JVM improvements that were introduced in Java SE 6 (specifically see Section 2.3 - Ergonomics in the 6.0 Virtual Machine).

For production use, GeoServer should be run using the Server mode of the JVM. This mode is used by default on some platforms (such as Linux and Solaris), but not on others (such as Windows or OS X). The -server JVM option forces the use of the Server VM. To determine the default JVM mode, run java -version, and the output should be as follows:

    java version "1.6.0_26"
    Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
    Java HotSpot(TM) Server VM (build 20.1-b02, mixed mode)

Note that the final line says Server VM.

4.2 JVM Tuning

Certain JVM operating characteristics can be tuned to optimize performance when running GeoServer. The following parameters can be configured:

• -server forces the use of the Java HotSpot Server VM
• -Xms2048m -Xmx2048m sets the JVM to use 2048 megabytes (2 GB) of memory for the heap, and allocates it all on startup (the heap size should be adjusted to fit the actual memory available)
• -XX:+UseParallelOldGC -XX:+UseParallelGC enables multi-threaded garbage collection, which improves performance if more than two cores are present
• -XX:NewRatio=2 tunes the JVM for handling a large number of short-lived objects
• -XX:+AggressiveOpt enables experimental optimizations that will be defaults in future versions of the JVM

The method of setting these parameters is container-specific. For example, in Apache Tomcat, they are configured by defining them in the CATALINA_OPTS variable in a setenv script file located in the installation bin directory.

[4] http://www.oracle.com/us/products/middleware/application-server/6-performance-137236.html

4.3 JAI and JAI Image I/O

GeoServer uses the Java Advanced Imaging (JAI) API for raster manipulation and analysis, and the JAI Image I/O API for image encoding and decoding. Figure 4.1 shows where JAI and JAI Image I/O are utilized in the WMS and WCS processing pipeline.

Figure 4.1: JAI and JAI I/O usage in GeoServer

The JAI and JAI Image I/O APIs provide both Java and native code implementations for most operating system platforms. GeoServer ships with only the pure Java implementations, so for best performance ensure that the native code extensions are installed and configured to be used. GeoServer will use the native code implementations by default if they are present.

4.4 JDK and JAI Performance Comparison

Figure 4.2 compares the performance of GeoServer running on the Oracle (Sun) JDK and OpenJDK, with and without JAI native code enabled. The test uses random map requests for TIGER roads data at 1:3M scale, styled with a simple black line. The results demonstrate that using the Oracle JDK with the JAI native code implementation provides the best overall performance by a significant margin.


Figure 4.2: Performance comparison

4.5 Data Optimization

A major factor affecting GeoServer performance is data optimization. Data that is not optimized reduces performance by requiring more disk I/O and increasing CPU load. Vector (feature) and raster (coverage) data can both be tuned to improve performance by taking advantage of software optimizations and by choosing appropriate formats.

Vector Data

The first step to improve vector data performance is to use a format that is designed for rapid data retrieval. This means choosing formats that support indexes, such as spatially-enabled databases or file formats such as Shapefiles. Avoid using data interchange formats such as GML, since they are not designed to allow rapid access.

Always use indexes when available for querying. Indexing increases performance by improving the efficiency of queries and data retrieval. Indexes should be defined on all attributes used in GeoServer queries, including geometry and any non-spatial attributes used in filters.

Reprojecting vector data into a different coordinate system is processor-intensive. For optimal performance data should be stored in the coordinate system that is most commonly requested by service clients.

If the application requires multi-scale rendering, consider using multiple data layers with different levels of generalization. The classic example is storing multiple levels of coastline features with detail dependent on the scale.

Cartographic styling also affects performance. Using scale dependencies (via the MaxScaleDenominator and MinScaleDenominator SLD elements) can reduce rendering costs and time by drawing fewer features at small scales. Using a complex style at all zoom levels is usually unnecessary. Use simpler styling at small scales, and reserve complex styling for higher zoom levels.
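To choose sensible scale thresholds it helps to know what scale denominator a given request corresponds to. The sketch below uses the standardized rendering pixel size of 0.28 mm from the SLD specification and assumes the map extent is expressed in metres; the function names are illustrative, and the inclusive-minimum, exclusive-maximum comparison reflects the SLD convention as understood here.

```python
STANDARD_PIXEL_SIZE_M = 0.00028  # 0.28 mm standardized rendering pixel (SLD specification)


def scale_denominator(ground_width_m, image_width_px):
    """Scale denominator of a map covering ground_width_m metres in image_width_px pixels."""
    return ground_width_m / (image_width_px * STANDARD_PIXEL_SIZE_M)


def rule_applies(scale, min_scale_denominator=None, max_scale_denominator=None):
    """True if a rule with the given limits is drawn at this scale denominator."""
    if min_scale_denominator is not None and scale < min_scale_denominator:
        return False
    if max_scale_denominator is not None and scale >= max_scale_denominator:
        return False
    return True


if __name__ == "__main__":
    scale = scale_denominator(ground_width_m=10_000, image_width_px=256)
    print(f"256 px tile covering 10 km: 1:{scale:,.0f}")
    print("detailed rule (MaxScaleDenominator 100000) drawn:",
          rule_applies(scale, max_scale_denominator=100_000))
```

In this example the tile is at roughly 1:140,000, so a detailed rule limited to scales larger than 1:100,000 is skipped and its features are never rendered.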


These map styling guidelines help to improve rendering performance:

• Draw fewer features at small scales (when zoomed out)
• Draw important features at middle and large scales
• Draw no more than approximately 1,000 features per request
• Minimize the use of complex styling such as partial transparency, labeling, halos, multiple feature type styles, and multiple symbolizers per feature, as they can add significant processing overhead.

Raster Data

Optimizing raster data is crucial to obtaining good rendering performance. Often raster data is stored in a format that is suitable for archival and distribution, but this usually does not provide optimum performance when serving image data via GeoServer.

When serving single raster images, performance can be enhanced by storing imagery in the GeoTIFF format. For maximum performance, avoid using image compression. For large images, internal tiling and image overviews should be used to provide fast access to sub-areas and lower-resolution versions of the image. The open source Geospatial Data Abstraction Library [5], or GDAL, is a powerful set of tools for restructuring raster data formats. The gdaladdo tool from this library allows creating overviews for single image files. When using multiple files to create image mosaics, the gdal_retile tool can be used to create external image pyramids in either the file system or a database.
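Overview levels are usually given as a series of power-of-two reduction factors (for example 2 4 8 16 on the gdaladdo command line). The sketch below derives such a series from the image size; stopping once the reduced image fits within 256 pixels is an assumption chosen for illustration, not a GDAL or GeoServer rule.

```python
import math


def overview_factors(width_px, height_px, target_px=256):
    """Power-of-two reduction factors until the larger side is at most target_px."""
    largest_side = max(width_px, height_px)
    factors = []
    if largest_side <= target_px:
        return factors  # image is already small enough; no overviews needed
    factor = 2
    while True:
        factors.append(factor)
        if math.ceil(largest_side / factor) <= target_px:
            return factors
        factor *= 2


if __name__ == "__main__":
    for size in ((16384, 16384), (1024, 768), (200, 100)):
        print(size, "->", overview_factors(*size))
```

Each additional level adds progressively less storage, since every overview has a quarter of the pixels of the previous one.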

Raster formats based on wavelet transforms (such as ECW, MrSID, and JPEG 2000) also offer very good performance. GeoServer supports using these formats when the appropriate licenses are procured and drivers are installed.

As with vector data, reprojecting rasters to a different coordinate system is computationally intensive and will degrade performance. Raster data should be stored in the coordinate system most commonly requested.

4.6 Summary: Performance

There are many factors that can affect GeoServer performance. This section has presented the following general tuning strategies:

• Use the most recent version of the Oracle JVM
• Ensure the JVM is run in Server mode
• Configure JVM options for maximum performance
• Install the native code extensions for JAI and Image I/O
• Store vector data using formats such as spatial databases or shapefiles
• Use spatial and attribute indexes where available
• For multi-scale data use multiple layers with different levels of generalization
• Use styling scale dependencies, and avoid performance-intensive styling when rendering large numbers of features
• Store raster data in efficient formats such as GeoTIFF
• Use image tiling and overviews where possible
• Store vector and raster data in the most frequently requested coordinate system

[5] http://gdal.org/

5. Real-world Examples

This section presents real-world examples of architectures for GeoServer deployment in production.

5.1 MassGIS

Site: http://lyceum.massgis.state.ma.us/

MassGIS is the Massachusetts state agency assigned to the collection, storage and dissemination of geographic data. MassGIS had one of the earliest production installations of GeoServer. Their system provides access to over a terabyte of spatial data, organized into more than 850 layers. They actively encourage the development of external applications against the OGC services they provide, and currently the system handles over 23,000 requests per day from many different clients in a wide variety of formats and coordinate systems.

GeoServer is used to serve both raster and vector data stored in the ESRI ArcSDE sub-system. The Squid caching proxy is utilized to cache common WMS and WFS requests. A pool of GeoServer instances across multiple servers is deployed to handle the load, with watchdogs running to ensure availability. The implementation is a good example of using commodity hardware to deploy a production GeoServer instance. For instance, the load balancer is a refurbished PC that also hosts a shared Samba folder where the GeoServer configuration files are stored.

Figure 5.1: MassGIS Architecture


5.2 TriMet

Site: http://ride.trimet.org/

TriMet manages bus, light rail, and commuter rail services in the Portland, Oregon metro area. They provide an interactive web-based trip planner and routing service which uses GeoServer as the mapping and spatial data access engine.

To provide scalability, the TriMet deployment has multiple server nodes, with two GeoServer instances in each node. TileCache is used for a map tile cache, and PostGIS provides data storage. Two separate systems are in operation: a production environment and a staging environment for testing new applications. The staging environment is used to populate the production tile cache offline, and is also used to test configuration changes, which are then pushed to the production environment via svn version control.

Figure 5.2: TriMet Architecture

5.3 CartoCiudad

Site: http://www.cartociudad.es/ and http://www.cartociudad.es/portal/

CartoCiudad manages a database of geospatial datasets contributed by a number of public agencies in Spain. The database contains a topological road network together with parcel, census and postal information covering the entire Spanish national territory. Maps are made available via an interactive web application, and data is disseminated via SDI services including WFS and WPS.

GeoServer is used to provide the WMS service, with TileCache providing caching of map tiles. Redundant enterprise load balancers redirect to three server nodes running GeoServer. GeoServer accesses the enterprise database, which is stored in an Oracle RAC (Real Application Cluster) composed of five servers.


Figure 5.3: CartoCiudad Architecture

5.4 Italian Civil Protection GeoSDI

Site: http://www.geosdi.org

GeoSDI is a program of the Italian Civil Protection department for implementing open software solutions for Spatial Data Infrastructure (SDI). The program allows government agencies to integrate and exchange spatial data based on international open geospatial standards.

GeoSDI provides OGC services using a “private cloud” style of architecture. Processing takes place on five blade servers (with a spare blade kept in reserve). Each blade has 8 cores and 32 GB of memory, and runs eight instances of GeoServer and one instance of PostgreSQL/PostGIS. Storage is provided by a NAS (Network Attached Storage) system. All software is run in VMware virtual machines, giving maximum flexibility in provisioning services. High availability is ensured by using redundant software load balancers.

Figure 5.4: GeoSDI Architecture

© 2013 OpenGeo. Redistributable under the Creative Commons Attribution-Share Alike license.