Hitachi Vantara Pentaho Customer Portal

Integrating Pentaho with MapR using Apache Drill

Apache Drill is a schema-free SQL-on-Hadoop tool that lets you run SQL queries against data sets stored in your Hadoop filesystem in a variety of formats and sources, such as JSON, CSV, Parquet, and HBase. Using Pentaho Data Integration (PDI) together with Apache Drill lets you do data integration work on that data through PDI.

Note: PDI's support for Drill is limited and is provided through its support for JDBC 3 and JDBC 4 drivers. Support for the Apache Drill driver itself is provided by MapR.

This document covers configuring Apache Drill for Pentaho Data Integration and connecting PDI to Drill, and provides links to recommended settings and best practices.

We assume that you have administrator permissions on the cluster, that a MapR Converged Data Platform is running with Apache Drill installed, and that Apache ZooKeeper is running in replicated mode.
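
Because PDI reaches Drill through JDBC and the cluster is assumed to run ZooKeeper in replicated mode, a client would typically point the Drill JDBC driver at the ZooKeeper quorum rather than at a single Drillbit. The following is a minimal sketch of how such a connection URL could be assembled. The host names, the port 5181, the ZooKeeper root "drill", and the cluster ID "my.cluster.com-drillbits" are illustrative assumptions, not values taken from this document; check your own drill-override.conf for the real ones. The class name DrillUrl and its methods are hypothetical helpers.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Arrays;
import java.util.List;

class DrillUrl {

    /**
     * Builds a Drill JDBC URL of the form
     * jdbc:drill:zk=host1:port,host2:port/zkRoot/clusterId
     * from a list of ZooKeeper "host:port" entries.
     */
    static String build(List<String> zkQuorum, String zkRoot, String clusterId) {
        if (zkQuorum == null || zkQuorum.isEmpty()) {
            throw new IllegalArgumentException("At least one ZooKeeper node is required");
        }
        return "jdbc:drill:zk=" + String.join(",", zkQuorum) + "/" + zkRoot + "/" + clusterId;
    }

    /**
     * Opens a connection with the standard JDBC DriverManager.
     * This only works if the Drill JDBC driver (org.apache.drill.jdbc.Driver)
     * is on the classpath and the cluster is reachable.
     */
    static Connection open(String url) throws SQLException {
        return DriverManager.getConnection(url);
    }

    public static void main(String[] args) {
        // Assumed values for illustration only.
        List<String> quorum = Arrays.asList(
                "zk1.example.com:5181",
                "zk2.example.com:5181",
                "zk3.example.com:5181");
        String url = build(quorum, "drill", "my.cluster.com-drillbits");

        // Print the URL; calling open(url) requires a live cluster and the driver jar.
        System.out.println(url);
    }
}
```

Listing every ZooKeeper node in the URL lets the driver locate a Drillbit even if one ZooKeeper node is down, which is the reason for assuming replicated mode. The same URL string could be pasted into a PDI database connection that accepts a custom JDBC URL.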

Software    Version(s)
Pentaho     6.x, 7.x, 8.0

The Components Reference in Pentaho Documentation has a complete list of supported software and hardware.











