Best Practice - Pentaho ETL Servers with High Availability


Software        Version
Pentaho         7.0
Apache Tomcat   8.0

Overview

We have collected a set of best practice recommendations and information for you to use when you set up your Pentaho ETL servers with a clustered High Availability solution.

High Availability solutions for our ETL servers need to be set up as ACTIVE/PASSIVE and not as ACTIVE/ACTIVE. The load balancer sits in front of the cluster, and only one ETL server receives traffic. The secondary server receives traffic only if the primary server goes down. This model allows the servers to handle failover: if one Pentaho server goes down, service is not interrupted but is instead taken over by a live server.
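
The ACTIVE/PASSIVE routing rule described above can be illustrated with a minimal sketch. This is not Pentaho or load balancer code; the `Node` class, the `pick_active` function, and the host names `etl-primary` and `etl-secondary` are hypothetical names introduced only for illustration, and a real load balancer would determine health through its own checks.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Node:
    """Hypothetical stand-in for one ETL server behind the load balancer."""
    name: str
    healthy: bool = True


def pick_active(nodes: Sequence[Node]) -> Optional[Node]:
    """Return the first healthy node in priority order.

    With the primary listed first, the secondary is chosen only when the
    primary is down, which is the ACTIVE/PASSIVE behavior. Returns None
    if every node is down.
    """
    for node in nodes:
        if node.healthy:
            return node
    return None


# Assumed two-node cluster: primary first, passive secondary second.
primary = Node("etl-primary")
secondary = Node("etl-secondary")
cluster = [primary, secondary]
```

While `primary.healthy` is true, `pick_active(cluster)` always returns the primary, so the secondary never receives traffic. Setting `primary.healthy = False` makes the same call return the secondary, which corresponds to failover.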

Keep these Pentaho Architecture principles in mind while you are working through this document:

  1. Architecture is important, above all else.
  2. Platforms are always evolving: sometimes you will have to think creatively.

Some of the things discussed here include clustering the ETL server nodes, configuring a load balancer for HA, and Tomcat and Apache configuration for Windows and Linux.

The intention of this document is to speak about topics generally; however, these are the specific versions covered here: 

   -  Best Practice - Pentaho ETL Servers in High Availability
