EmbeddedPython
Run a document through a Java embedded Graal Python environment.
Why Use It?
EmbeddedPython executes per-document Python code inside the Lucille JVM using GraalPy. Instead of returning a JSON object, your script mutates the current document directly through a Python-friendly proxy bound as doc (and the raw Java document as rawDoc). This avoids ports, subprocesses, venvs, and per-document JSON round trips.
When To Use It
Use EmbeddedPython when you need one or more of the following:
- Minimal operational overhead (ports, subprocess lifecycle, venv creation, pip installs).
- No use of any external Python libraries or native dependencies that require a real Python environment.
- Lightweight field enrichment/transformation.
When To Use ExternalPython Instead
Avoid EmbeddedPython and use ExternalPython when you need one or more of the following:
- Real Python compatibility (including packages with native dependencies).
- Dependency management via a requirements.txt installed into a managed venv.
- Process isolation apart from the JVM.
Example
Input Document
{
"id": "doc-1",
"title": "Hello",
"author": "Test",
"views": 123
}
Python Script
doc["title"] = doc["title"].upper()
Output Document
{
"id": "doc-1",
"title": "HELLO",
"author": "Test",
"views": 123
}
Config Parameters
{
name: "EmbeddedPython-Example"
class: "com.kmwllc.lucille.stage.EmbeddedPython"
# Specify exactly one of the following:
script_path: "/path/to/my_script.py"
script: "doc['title'] = doc['title'].upper()"
}