Environment:
Operating Systems: Windows 10, RHEL 8, RHEL 9
PDI Versions: 9.3.0.0-428 (with OpenJDK 11 and OpenJDK 1.8), 9.5.0.0-240 (with OpenJDK 11)
Error: The REST client step in Pentaho Data Integration (PDI) receives a 400 Bad Request response when the HTTP body contains JSON with German special characters such as "ä", "Ü", "ß".
Problem Description:
When using the REST client step in PDI to send a POST request containing a JSON body with German special characters (e.g., "ä", "Ü", "ß"), the server responds with a 400 Bad Request error. This behavior occurs regardless of how the HTTP body is read (e.g., from a text file, CSV file, set variables, or set field value to a constant). Removing these special characters allows the request to succeed with an HTTP 200 OK response.
Note: The same request works using other client tools (Postman, Apache Hop).
Cause:
The issue is likely caused by the JVM that runs the PDI client (Spoon) using a default character encoding other than UTF-8 when it sends HTTP requests. Although UTF-8 encoding was specified in the text file/CSV file input steps and in the ContentType field, the special characters in the HTTP body are still not encoded as UTF-8, so the server cannot parse the JSON and responds with 400 Bad Request.
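The suspected mechanism can be illustrated outside PDI with a minimal sketch. It does not reproduce the REST client step's internals. It only shows that the same characters produce different bytes under different charsets, and that the non-UTF-8 bytes are malformed for a server that expects UTF-8 JSON. The class name EncodingMismatchDemo and the choice of ISO-8859-1 as a stand-in for a non-UTF-8 platform default are assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of PDI.
public class EncodingMismatchDemo {

    // Render bytes as space-separated upper-case hex, e.g. "C3 A4".
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Return true only if the bytes decode as strict UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file itself encoding-neutral.
        String special = "\u00e4\u00dc\u00df"; // the characters ä, Ü, ß
        String body = "{\"name\":\"" + special + "\"}";

        byte[] utf8 = special.getBytes(StandardCharsets.UTF_8);
        // Assumption: ISO-8859-1 stands in for a non-UTF-8 platform default.
        byte[] latin1 = special.getBytes(StandardCharsets.ISO_8859_1);

        System.out.println("JVM default charset: " + Charset.defaultCharset());
        System.out.println("UTF-8 bytes:      " + toHex(utf8));   // C3 A4 C3 9C C3 9F
        System.out.println("ISO-8859-1 bytes: " + toHex(latin1)); // E4 DC DF
        System.out.println("Body valid as UTF-8 when sent as UTF-8:      "
                + isValidUtf8(body.getBytes(StandardCharsets.UTF_8)));
        System.out.println("Body valid as UTF-8 when sent as ISO-8859-1: "
                + isValidUtf8(body.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

If the first line of output reports a charset other than UTF-8 on the machine where Spoon runs, that is consistent with the cause described above, although it does not prove it.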
Resolution:
To resolve this issue, you need to configure the PDI client (Spoon) to use UTF-8 encoding explicitly by modifying the Java options in the Spoon startup script.
Steps:
- Stop the Spoon application if it is running.
- Open the spoon.bat (on Windows) or spoon.sh (on Linux) file in a text editor.
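- In spoon.bat, locate the line that reads: set OPT=%OPT% %PENTAHO_DI_JAVA_OPTIONS%
- Append the option -Dfile.encoding=utf8 to the end of that line, so that it reads: set OPT=%OPT% %PENTAHO_DI_JAVA_OPTIONS% -Dfile.encoding=utf8
- The line shown above uses Windows batch syntax. In spoon.sh, add the same -Dfile.encoding=utf8 option to the line that assigns the OPT variable.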
- Save the file and restart Spoon.
- Retry the POST request with the special characters in the JSON body.
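To check that the option has the intended effect on the JDK that Spoon uses, a small standalone program can print the effective default charset. This is a sketch that runs outside PDI, and the class name CharsetCheck is hypothetical. It also shows that the JVM accepts "utf8" as an alias for UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; not part of PDI.
public class CharsetCheck {

    // Resolve a charset name the way the JVM does; aliases such as "utf8" are accepted.
    static Charset resolve(String name) {
        return Charset.forName(name);
    }

    public static void main(String[] args) {
        System.out.println("file.encoding property: " + System.getProperty("file.encoding"));
        System.out.println("Default charset:        " + Charset.defaultCharset());
        System.out.println("\"utf8\" resolves to:     " + resolve("utf8"));
        System.out.println("Default is UTF-8:       "
                + StandardCharsets.UTF_8.equals(Charset.defaultCharset()));
    }
}
```

With OpenJDK 11 the file can be run directly, for example java -Dfile.encoding=utf8 CharsetCheck.java. With OpenJDK 1.8, compile it with javac first and then run java -Dfile.encoding=utf8 CharsetCheck. Running it once without the option shows the platform default that Spoon would otherwise use.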
Important Notes:
- This workaround sets UTF-8 as the default character encoding of the JVM that runs the PDI client, so it applies to every operation that relies on that default and should resolve similar encoding issues in other contexts as well.
- If other encoding issues arise, consider reviewing all steps involving text input or output to ensure they are configured to use UTF-8 consistently.
Reference: This issue is logged under Ticket 114391 / JIRA PDI-19291.