
Pentaho Platform



This page is a library of the best practices we have put together for installing and upgrading Pentaho software, and for setting it up according to your organization's needs.


  • Pentaho with the Cloud
    • Best Practices for Pentaho and Amazon Web Services 
    • Using Pentaho with Microsoft Azure
  • Pentaho Platform Document Library
    • Installing Pentaho Server on Hadoop Edge Nodes
    • Downloading Pentaho Content Using REST APIs 
    • Guidelines for Pentaho and VMs
    • Best Practices for Report Bursting
    • Best Practices for Installation & Upgrade
    • Pentaho Server in High Availability
    • Pentaho DI-Server in High Availability
    • Configuring JVM Heap Size
    • Guidelines - Merging Custom Configuration Files for Pentaho Upgrades
    • Connecting the PDI Client to a Secure Hadoop Cluster
    • Daylight Saving Time and the Pentaho Scheduler - updated!

    The Components Reference in Pentaho Documentation has a complete list of supported software and hardware.

    Pentaho with the Cloud

    Best Practices for Pentaho with Amazon Web Services
    For versions 7.x, 8.x / published Aug 2019

    This document covers some best practices on the installation of Pentaho’s Server and Client products on Amazon Web Services (AWS) and serves to complement Pentaho product installation instructions by showing how the server and database configurations can be applied specifically in an AWS elastic compute cloud (EC2) environment.

    Audience: Pentaho administrators or anyone with a background in infrastructure architecture or Amazon Web Services who is interested in working with Pentaho on AWS.

    Using Pentaho with Microsoft Azure
    For versions 7.x, 8.x / published April 2018

    This document presents best practices around the installation of Pentaho's server and client products on Microsoft Azure, and gives an overview of the server, network, and storage architecture recommended to run Pentaho.

    Audience: Pentaho administrators or anyone with a background in infrastructure architecture or Azure who is interested in installing Pentaho on Azure.



    Pentaho Platform Document Library

    Installing Pentaho Server on Hadoop Edge Nodes
    For versions 7.x, 8.x / published September 2019

    This document discusses the pros and cons of installing Pentaho Server software on Hadoop edge nodes, as well as considerations for splitting the web server tier from the web application tier (Tomcat container).

    Audience: Pentaho or system administrators or anyone needing to manage a clustered Hadoop environment

    Downloading Pentaho Content Using REST APIs
    For versions 7.x, 8.x / published September 2018

    This document describes how to download reports to a folder outside the repository, how to download the entire contents of your repository, and how to download your entire repository to an external location using client URL (cURL).

    Audience: Pentaho or system administrators, or anyone with a background in cURL or REST APIs who is interested in downloading information to a location outside of the Pentaho Repository.
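
As a rough illustration of the kind of request involved, here is a minimal sketch in Python rather than cURL. It rests on several assumptions that are not stated on this page: that the server exposes a download endpoint of the form `/api/repo/files/<pathId>/download`, that repository paths are encoded by replacing `/` with `:`, and that basic authentication is enabled. The base URL, repository path, and function names are hypothetical; check the referenced document for the exact endpoints your version supports.

```python
import base64
import urllib.parse
import urllib.request


def repo_path_to_id(repo_path: str) -> str:
    """Encode a repository path as a path id.

    ASSUMPTION: '/' is replaced with ':' (e.g. '/public/Reports' becomes
    ':public:Reports'), and the result is then percent-encoded.
    """
    return urllib.parse.quote(repo_path.replace("/", ":"), safe="")


def build_download_url(base_url: str, repo_path: str) -> str:
    """Build the download URL for a repository file or folder.

    ASSUMPTION: the endpoint shape is <base>/api/repo/files/<pathId>/download.
    """
    return f"{base_url.rstrip('/')}/api/repo/files/{repo_path_to_id(repo_path)}/download"


def download(base_url: str, repo_path: str, user: str, password: str, dest: str) -> None:
    """Fetch the content and write it to a local file outside the repository."""
    request = urllib.request.Request(build_download_url(base_url, repo_path))
    # ASSUMPTION: the server accepts HTTP basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request) as response, open(dest, "wb") as out:
        out.write(response.read())
```

A call such as `download("http://localhost:8080/pentaho", "/public/Reports", "admin", "password", "reports.zip")` would then save the response to `reports.zip`; all of these values are placeholders.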

    Guidelines for Pentaho and VMs
    For versions 6.x, 7.x, 8.x / published July 2018

    This document contains information and guidelines around using Pentaho in Virtual Machines. It includes information about the configuration of hosts, guest servers, and guest clients.

    Audience: Pentaho or system administrators, or anyone interested in servicing VMs and the Pentaho platform and design tools

    Report Bursting with Pentaho
    For versions 7.x, 8.x / published July 2018

    This document contains best practices on report bursting using the Pentaho Platform, including ways in which you can configure Pentaho components to achieve report bursting.

    Audience: Pentaho or system administrators, or anyone working with Pentaho Report Designer and data integration jobs and transformations

    Best Practices for Installation & Upgrade
    For versions 7.x, 8.0 / published April 2018

    Here is a set of best practices for the installation or upgrade of your Pentaho software, including resource allocation and post-installation cleanup.

    Audience: Pentaho administrators or anyone needing to install or upgrade Pentaho software

    Pentaho Server in High Availability
    For versions 7.x, 8.x / published October 2019

    We have collected a set of best practice recommendations for using the Pentaho Server for analytics and setting it up within a clustered High Availability solution.

    Audience: System administrators or anyone needing to manage increasing data processing and concurrent user connections

    Pentaho DI-Server in High Availability
    For versions 7.x, 8.x / published October 2019

    This document lists our best practice recommendations for using the Pentaho DI-Server within a clustered High Availability solution.

    Audience: System administrators or anyone needing to manage increasing data processing and concurrent user connections

    Configuring JVM Heap Size
    For versions 6.x, 7.x, 8.x / published May 2018

    The purpose of this document is to highlight the available options within the Oracle Java Development Kit (JDK) Java Virtual Machine (JVM) to improve overall speed and performance, particularly with garbage collection. Each section of this document discusses the various aspects of the JVM and gives general information on how they operate.

    Audience: Pentaho administrators, or anyone with a background in Java who is interested in maximizing VM speed and performance

    Guidelines - Merging Custom Configuration Files for Pentaho Upgrades
    For versions 6.x, 7.x, 8.x / published April 2018

    These guidelines are intended to assist you in upgrading from Pentaho 6.x to Pentaho 7.x. This document contains information on which files need to be merged if you have a lot of customizations, notes on upgrading Pentaho plugins, and what to configure if you are using CAS or IWA security. It also includes a class mappings table for Spring Security upgrades and reference tables of fixes for repository, extensions, and core compile errors.

    Audience: System administrators or anyone needing to upgrade Pentaho


    Connecting the PDI Client to a Secure Hadoop Cluster
    For versions 6.x, 7.x, 8.0 / published May 2018

    This document covers best practices for the different options available to execute processes and authenticate users with Big Data when using the Windows operating system.

    Audience: Pentaho administrators or anyone with PDI experience who is interested in improving authentication setups.

    Daylight Saving Time (DST) and the Pentaho Scheduler
    For versions 7.x, 8.x, 9.0 / published February 2020

    This document describes the behavior of the Pentaho Scheduler when a DST change occurs, and it includes strategies for how to reduce the impact of the time change.

    Audience: Pentaho users or administrators or anyone with a background in scheduling on the Pentaho Server.






