Hitachi Vantara Pentaho Customer Portal

Best Practices - Test Driven Development – Pentaho Solution

Your feedback is important to us!  Email us how we can improve these documents.

Software Version
Pentaho  5.4, 7.x

Test Driven Development (TDD)

The Pentaho BA and DI platform is not a java development platform (developers don’t write code in Pentaho, but they create content) however Pentaho provides the flexibility required to embed it and extend it to adapt to the business requirements, for such a reason there is no standard nor framework to do TDD. This guide will show some of the integration points that will allow developers to create a test plan and test cases to facilitate TDD.

This guide will be focusing in the 2 core Pentaho servers using 5.4.0.1 GA release and assuming they are running in localhost using the default ports without SSL security (BA = http://localhost:8080 and DI = http://localhost:9080). Pentaho does not provide any recommendations or suggestions on what Test Platform to use, the techniques described below can be used in any such platforms (as long as they support http / https calls).

The plan consists of 3 main tasks: Define what needs to be tested, execute the defined tests, and compare the results with a validation point.

Define. Either it is the data that is loaded using the DI server, the visualizations that are presented in the BA server, or the data that feeds these visualizations; it is responsibility of the client to define what will be tested. For example:

  • Data loading: I will test that the process used to load the data into the fact table fact_xxx is loading the data accurate. For this test I will need to get the current state of the table, load data, under certain controlled environment, into the table, compare the result table with the original state and finally make sure these changes reflect the true data that needs to be loaded.
  • Flow in charts: I will test that when clicking chart “x” in dashboard “z” it is correctly opening the new dashboard “y”. For this test I will need to run the URL that renders dashboard “z”, then click on chart “x”, and finally capture the result dashboard and compare it with dashboard “y” to make sure the resulting dashboard is what was expected.
  • Data in widget: I will test that the data loading widget “n” in dashboard “m” is correct. For this test I will need to get a sample of the data that feeds widget “n”, then I will run the URL to export data from the CDA query that is used in this widget, and finally I will compare the generated data with the sample data obtained in the first step.

Execute. Once the test scenarios are defined, the next step is to execute those tests. Pentaho provides integration points that allow the execution of every piece of content within Pentaho (executing an ETL job, an Analyzer Report, or a CTools dashboard). The next section of the document shows all of the integration points required as REST services. Pentaho recommends using either PDI or CDA (part of CTools) to test the data.

Compare. Once the test scenarios were executed, it is required to check whether the results obtained are accurate (and the test succeeded) or inaccurate (and the test failed). This can be done with any technology that the client chooses; it can be manual comparison, with shell scripts or with any other testing suite (such as Selenium[1]).

 

Environment Setup

This document was tested using Pentaho Business Analytics 5.4.0.1 installer (pentaho-business-analytics-5.4.0.1-130-x64.bin) with Sample Data installed, in Ubuntu Server 14.04 fully updated.

Other components installed / used in the tests:

  • Pentaho Integration Examples
  • WriteToLog ETL package (to test DI Services)
  • curl 7.35.0

 

cURL Usage

It is recommended to use cURL to run all of this services via a shell script[2], bellow it is explained the different steps and usages recommended.

  • Login[3]: curl -d "j_username=<USERNAME>&j_password=<PASSWORD>" -c /path/to/cookie/file/cookie.txt ‘http://<HOST>:<PORT>/pentaho/j_spring_security_check’
    • -d: Use POST method to submit username and password
    • -c: Specify the file used to store the cookie details from the session
    • Substitute <USERNAME>, <PASSWORD>, <HOST>, and <PORT> with the specific connection details
  • Execute call: curl -b /path/to/cookie/file/cookie.txt '<URL REST CALL>'
    • -b: Specify what file contains the cookie with the session details
    • Substitute <URL REST CALL> with the correct URL call specified belloe
  • Invalidate the session[4]: curl -b cookie.txt 'http://<HOST>:<PORT>/pentaho/Logout'
    • Substitute <HOST> and <PORT> with the specific connection details

 

BA Server

Every content displayed in the BA Server can be accessed with REST services (web calls), and test cases can be written for Data Validation and for Visual Validation. These REST services are documented in Pentaho’s Integration Examples (follow the guide in http://help.pentaho.com/Documentation/5.4/0R0/0Q0/020 to obtain these samples)[5]. Below are the different type of contents and the API that can be used.

 

Analyzer

  • Displaying a report in viewer mode
    • /pentaho/api/repos/:path:to:saved:analyzer:report.xanalyzer/viewer
    • http://localhost:8080/pentaho/api/repos/:public:Steel%20Wheels:Leading%20Product%20Lines%20(pivot%20table).xanalyzer/viewer
  • Displaying a report in editor mode
    • /pentaho/api/repos/:path:to:saved:analyzer:report.xanalyzer/editor
    • http://localhost:8080/pentaho/api/repos/:public:Steel%20Wheels:Leading%20Product%20Lines%20(pivot%20table).xanalyzer/editor?showRepositoryButtons=true
  • Creating a new report
    • /pentaho/api/repos/xanalyzer/editor?catalog=YourCatalog&cube=YourCube

 

Interactive Reports

  • Displaying an Interactive Report in view mode
    • /pentaho/api/repos/:path:to:saved:pir:report.prpti/prpti.view
    • http://localhost:8080/pentaho/api/repos/:public:Steel%20Wheels:Vendor%20Sales%20Report%20(interactive%20report).prpti/prpti.view
  • Displaying an Interactive Report in edit mode
    • /pentaho/api/repos/:path:to:saved:pir:report.prpti/prpti.edit
    • http://localhost:8080/pentaho/api/repos/:public:Steel%20Wheels:Vendor%20Sales%20Report%20(interactive%20report).prpti/prpti.edit
  • Export an Interactive Report
    • /pentaho/api/repos/:path:to:saved:pir:report.prpti/report
      • To select the output type, pass the parameter output-target with one of the following values:
        • table/html;page-mode=page for HTML (Paginated)
        • table/html;page-mode=stream for HTML (Single Page)
        • pageable/pdf for PDF
        • table/excel;page-mode=flow for Excel
        • application/vnd.openxmlformats-officedocument.spreadsheetml.sheet;page-mode=flow for Excel 2007
        • table/csv;page-mode=stream for Comma Separated Value
        • table/rtf;page-mode=flow for Rich-Text-Format
        • pageable/text for Text
      • http://localhost:8080/pentaho/api/repos/:public:Steel%20Wheels:Vendor%20Sales%20Report%20(interactive%20report).prpti/report?output-target=table/csv;page-mode=stream
    • Creating a new Interactive Report
      • /pentaho/api/repos/pentaho-interactive-reporting/prpti.new

 

Reports

  • Displaying a Report
    • /pentaho/api/repos/:path:to:saved:report.prpt/generatedContent
      • It is possible to pass parameters either as GET or as POST
      • To select the output type, pass the parameter output-target with one of the following values:
        • table/html;page-mode=page for HTML (Paginated)
        • table/html;page-mode=stream for HTML (Single Page)
        • pageable/pdf for PDF
        • table/excel;page-mode=flow for Excel
        • application/vnd.openxmlformats-officedocument.spreadsheetml.sheet;page-mode=flow for Excel 2007
        • table/csv;page-mode=stream for Comma Separated Value
        • table/rtf;page-mode=flow for Rich-Text-Format
        • pageable/text for Text
      • http://localhost:8080/pentaho/api/repos/:public:Steel%20Wheels:Top%20Customers%20 (report).prpt/generatedContent?sLine=[Product].[All Products].[Classic Cars]&sMarket=[Markets].[All Markets].[NA]&sYear=[Time].[All Years].[2003]&TopCount=3&output-target=table/html;page-mode=page
    • Displaying a Report with parameter panel
      • /pentaho/api/repos/:path:to:saved:report.prpt/viewer
      • http://localhost:8080/pentaho/api/repos/:public:Steel%20Wheels:Top%20Customers%20 (report).prpt/viewer?sLine=[Product].[All Products].[Classic Cars]&sMarket=[Markets].[All Markets].[NA]&sYear=[Time].[All Years].[2003]&TopCount=3&output-target=table/html;page-mode=page

 

Repository and Scheduler

  • Download Content
    • /pentaho/api/repo/files/:path:to:saved:content/download
      • Will generate a zip file
      • The zip file will contain the file/folder requested
      • The zip will include a exportManifest.xml file which identifies the content of the file
      • For each file in the zip, a .locale file will specify the internationalization details.
    • http://localhost:8080/pentaho/api/repo/files/:public:Steel%20Wheels:Top%20Customers%20(report).prpt/download
  • Browse Content
    • /pentaho/api/repo/files/:path:to:saved:content/children
      • Parameters are to be passed via GET
        • depth: Specify the depth it should fetch for content. 0 will only fetch the file/folder specified, -1 will not limit, positive integer will fetch the integer level of children
        • filter: Includes only tree entries that match the given filter string. The filter may use one or more wildcard characters ("*"), or a disjunction (using the "|" character to represent logical OR). Default is "*"
        • showHidden: If true, hidden content will be included in the result
      • The response can be in XML or JSON format, to determine it, it is needed to specify the Accept HTTP header
        • With curl, add the -H 'Accept:application/json parameter JSON
        • With curl, add the -H 'Accept:application/xml' parameter for XML
      • http://localhost:8080/pentaho/api/repo/files/:public:Steel%20Wheels/children?depth=-1&filter*.prpt&showHidden=true
        • Will display all files in the /public/Steel Wheels folder (including its subfolders and hidden files) that ends with the .prpt extension
      • Content Properties
        • /pentaho/api/repo/files/:path:to:saved:content/properties
          • The response can be in XML or JSON format, to determine it, it is needed to specify the Accept HTTP header
            • With curl, add the -H 'Accept:application/json parameter JSON
            • With curl, add the -H 'Accept:application/xml' parameter for XML
          • http://localhost:8080/pentaho/api/repo/files/:public:Steel%20Wheels:Top%20Customers%20(report).prpt /properties
        • List Scheduled Content
          • /pentaho/api/scheduler/jobs
            • Required method is GET, since POST is for creating jobs.
            • The response can be in XML or JSON format, to determine it, it is needed to specify the Accept HTTP header
              • With curl, add the -H 'Accept:application/json parameter JSON
              • With curl, add the -H 'Accept:application/xml' parameter for XML
            • http://localhost:8080/pentaho/api/scheduler/jobs

 

CTools

  • Dashboards and Widgets
    • /pentaho/api/repos/:path:to:saved:dashboard.wcdf/generatedContent?ts=undefined
      • The ts=undefined parameter I normally a timestamp to prevent caching, however this can be either removed or replaced with a random number / value
    • http://localhost:8080/pentaho/api/repos/:public:plugin-samples:pentaho-cdf-dd:cde_sample1.wcdf/generatedContent?ts=1405266243858
  • Execute a CDA query
    • /pentaho/plugin/cda/api/doQuery?path=/path/to/cda/datasource.cda&dataAccessId=ID
      • It is possible to pass parameters for the query
    • http://localhost:8080/pentaho/plugin/cda/api/doQuery?path=/public/plugin-samples/cda/cdafiles/mondrian-jndi.cda&dataAccessId=1&paramstatus=Shipped
  • Export a CDA query to excel
    • /pentaho/plugin/cda/api/doQuery?path=/path/to/cda/datasource.cda&dataAccessId=ID&outputType=xls
    • http://localhost:8080/pentaho/plugin/cda/api/doQuery?path=/public/plugin-samples/cda/cdafiles/mondrian-jndi.cda&dataAccessId=1&paramstatus=Shipped&outputType=xls
  • List available CDA queries
    • /pentaho/plugin/cda/api/listQueries?path=/path/to/cda/datasource.cda
    • http://localhost:8080/pentaho/plugin/cda/api/listQueries?path=/public/plugin-samples/cda/cdafiles/mondrian-jndi.cda
  • Refresh CDA Cache (full)
    • /pentaho/plugin/cda/api/cacheMonitor/removeAll
    • method=removeAll needs to be send as POST

 

DI Server

Similar to the BA Server, you can run and schedule jobs / transformations in the DI Server using REST services, which will allow these tasks to be done programmatically and, therefore, provides a very easy integration point with any Test Platform to automate all the testing use cases. Below are the different API calls that can be used. Please make sure to configure correctly the DI Server for these calls to work

  1. Make sure the repositories.xml in the home folder/.kettle (for linux it is /home/<user>/.kettle or ~/.kettle/repositories.xml) with the correct information of the repository for the DI Server. Example:
  <repository>    <id>PentahoEnterpriseRepository</id>
    <name>localee</name>
    <description>Localhost EE Repo</description>
    <repository_location_url>http://localhost:9080/pentaho-di</repository_location_url>
    <version_comment_mandatory>N</version_comment_mandatory>
  </repository>
  1. Make sure the file data-integration-server/pentaho-solutions/system/kettle/slave-server-config.xml has the <repository> entry pointing to the configuration in the repositories.xml. Example:
<repository>
  <name>localee</name>
</repository>
  • Run a Job
    • /pentaho-di/kettle/runJob
      • Use the parameter job=/path/to/job/in/repository to specify which job
      • Use the parameter level=Level to specify the log level (be aware that the higher the log level, the performance of the job might get affected). Use one of the following:
        • Nothing
        • Error
        • Minimal
        • Basic
        • Detailed
        • Debug
        • Rowlevel
      • Use the parameter xml=Y to get the response in XML format
      • The response will have a <id> tag. Capture this value since it will be used to check the status of the Job.
    • http://localhost:9080/pentaho-di/kettle/runJob?job=/home/admin/WriteToLog&xml=Y&level=Debug
      • The response returned <id>e4e107bf-5ff6-4da4-bb27-42275f8f2a22</id>
    • Status of a Job
      • /pentaho-di/kettle/jobStatus
        • Use the parameter name=<JobName> to specify the name of the job.
        • Use the parameter id=<JobID> to specify the ID of the job.
        • Use the parameter xml=Y to get the response in XML.
        • The response will contain information, such as: Status (<status_desc>), resulting rows (<result>), and the full log text (<log_text>).
      • http://localhost:9080/pentaho-di/kettle/jobStatus?name=WriteToLog&id=e4e107bf-5ff6-4da4-bb27-42275f8f2a22&xml=Y
    • Run a Transformation
      • /pentaho-di/kettle/runTrans
        • Use the parameter trans=/path/to/trans/in/repository to specify which transformation
        • Use the parameter level=Level to specify the log level (be aware that the higher the log level, the performance of the job might get affected). Use one of the following:
          • Nothing
          • Error
          • Minimal
          • Basic
          • Detailed
          • Debug
          • Rowlevel
        • Use the parameter xml=Y to get the response in XML format
        • The response will have a <id> tag. Capture this value since it will be used to check the status of the Transformation.
      • http://localhost:9080/pentaho-di/kettle/runTrans?trans=/home/admin/WriteToLogTrans&xml=Y&level=Debug
        • The response returned <id>bce47eae-2d4f-4e36-a60f-61e57d897afb</id>
      • Status of a Transformation
        • /pentaho-di/kettle/transStatus
          • Use the parameter name=<TransName> to specify the name of the transformation.
          • Use the parameter id=<TransID> to specify the ID of the transformation.
          • Use the parameter xml=Y to get the response in XML.
          • The response will contain information, such as: Status (<status_desc>), error description (<error_desc>), and the result of each step (<stepstatus>).
        • http://localhost:9080/pentaho-di/kettle/transStatus?name=WriteToLogTrans&id=bce47eae-2d4f-4e36-a60f-61e57d897afb

 

[1] Pentaho does not have a recommendation of Selenium over any other technologies, the Client can use any technology they are comfortable with.

[2] Further reading http://rpbouman.blogspot.com/2014/10/performing-administrative-tasks-on.html

[3] http://help.pentaho.com/Documentation/5.4/0R0/070/010/00A

[4] Not required, but recommended to avoid leaving a session opened.

[5] Full API documentation in http://help.pentaho.com/Documentation/5.4/0R0/070

Have more questions? Submit a request

Comments

Powered by Zendesk