Pipelines

Pipelines are a specialised type of node designed to efficiently schedule and manage notebook execution within your workflow. Each pipeline consists of multiple elements, each of which contains a notebook and specific configuration settings for its execution.

Creating a Pipeline

You can create a pipeline in any folder of your project: click the + button in the top right-hand corner of the File System Explorer, select New..., and choose Pipeline.

Configuration

Upon accessing the detail page of a pipeline, you'll find three tabs:

General settings

Here you can configure the basic settings of the pipeline:

| Option | Type | Description | Default |
| --- | --- | --- | --- |
| Execution timeout | number | The maximum time in seconds that a pipeline is allowed to run. | 300 |
| Global parameters | string/JSON | A JSON object containing the pipeline's global parameters. | {} |
| Worker image | string | The worker image used to run the pipeline. The default is to use the project's base image. | <default_project_image> |
| Worker environment | string | You can select the available machine resource options (RAM, CPU, GPU) that will be used to run the pipeline. | |
| Active | boolean | Whether the automatic executions are enabled for the pipeline. | false |
| Crontab expression | string | The crontab expression that defines the schedule of the pipeline. See Crontab.guru | |
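
As an illustration of the value formats listed for these options, the following sketch checks a set of values before they are entered in the form. It is not part of the product: the function name and the keys of the returned dictionary are hypothetical, and the five-field crontab layout (minute, hour, day of month, month, day of week) is an assumption based on the Crontab.guru reference.

```python
import json


def check_pipeline_settings(global_parameters, crontab_expression, execution_timeout=300):
    """Validate values destined for the General settings form (hypothetical helper)."""
    if execution_timeout <= 0:
        raise ValueError("Execution timeout must be a positive number of seconds")

    # Global parameters are described as a JSON object, so lists or bare strings are rejected.
    parameters = json.loads(global_parameters)
    if not isinstance(parameters, dict):
        raise ValueError("Global parameters must be a JSON object")

    # Assumption: standard five-field crontab syntax.
    fields = crontab_expression.split()
    if len(fields) != 5:
        raise ValueError("Expected five crontab fields")

    return {
        "execution_timeout": execution_timeout,
        "global_parameters": parameters,
        "crontab_expression": " ".join(fields),
    }


settings = check_pipeline_settings('{"region": "eu", "retries": 2}', "0 6 * * 1-5")
```

In standard crontab syntax, `0 6 * * 1-5` requests a run at 06:00 from Monday to Friday.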

Below the form options there are two buttons, 'See logs' and 'Clear logs'. Pipeline logs work in the same way as Notebook logs.

Elements

A pipeline is composed of elements that are executed in sequence. Each element contains a Notebook and the following settings:

| Option | Type | Description | Default |
| --- | --- | --- | --- |
| Notebook | string | The notebook to be executed. You can choose from all the notebooks in your project. | |
| Parameters | string/JSON | A JSON object containing the parameters of the element. | {} |
| Order | number | The order in which pipeline elements are executed. Elements with a lower execution order are executed first. | |

You can also delete, clone or edit elements in this element table.
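
The ordering rule can be pictured with a small sketch. The `Element` class and `execution_sequence` function are hypothetical names used only for illustration; they mirror the three element options and sort by Order, lowest first. How the platform treats two elements with the same Order is not stated here, so the tie behaviour of this sketch (listed order is kept) is an assumption.

```python
import json
from dataclasses import dataclass


@dataclass
class Element:
    """Hypothetical stand-in for a pipeline element."""
    notebook: str
    parameters: str = "{}"  # JSON object, stored as a string
    order: int = 0


def execution_sequence(elements):
    """Return elements in the order they would run: lowest Order first."""
    # sorted() is stable, so elements with equal Order keep their listed order (assumption).
    return sorted(elements, key=lambda element: element.order)


elements = [
    Element("notebooks/train.ipynb", '{"epochs": 3}', order=20),
    Element("notebooks/extract.ipynb", order=10),
    Element("notebooks/report.ipynb", order=30),
]

for element in execution_sequence(elements):
    print(element.order, element.notebook, json.loads(element.parameters))
```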

Executions

Info: On the Free Plan, pipelines can only be run manually, by clicking the Execute button in the top right-hand corner of the pipeline detail page.

The Executions tab displays a table of pipeline execution statistics, filterable by status (Completed, Errored, Pending, Running or OTHER) or creation date. From the Actions column, you can restart or cancel tasks based on their current status.

Public API

The API lets you run pipelines programmatically through a REST interface, so that third-party applications and processes can start pipeline executions and check their status.

All endpoints are secured and require a valid token for access. You must add the token to the Authorization header as follows:

Authorization: Token YOUR_ACCESS_TOKEN
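
A minimal sketch of attaching this header from Python's standard library is given here. The helper name `authorized_request` is hypothetical, and `YOUR_ACCESS_TOKEN` and `http://your-server` are placeholders to be replaced with real values.

```python
import urllib.request


def authorized_request(url, token, method="GET"):
    """Build a request that carries the token in the Authorization header."""
    if not token:
        raise ValueError("An access token is required")
    return urllib.request.Request(
        url,
        method=method,
        headers={"Authorization": f"Token {token}"},
    )


request = authorized_request("http://your-server/api/", "YOUR_ACCESS_TOKEN")
print(request.get_header("Authorization"))
```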

You can open the API tab of a specific pipeline to generate a token and to see an example request for running that pipeline.

Endpoints

Execute

POST /api/project/{project_node_uuid}/pipeline/{node_uuid}/execute/

Starts executing a Pipeline.

Response:

{
    "execution_status_urls": {
        "method": "GET",
        "detail": "http://your-server/api/project/{project_node_uuid}/pipeline/{node_uuid}/executions/{execution_uuid}/",
        "list": "http://your-server/api/project/{project_node_uuid}/pipeline/{node_uuid}/executions/"
    }
}
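
As a rough sketch, the Execute endpoint could be called from Python's standard library as follows. The function names are hypothetical, the base URL and UUIDs are placeholders, and it is assumed that the endpoint needs no request body; the API tab of the pipeline remains the reference for the exact call.

```python
import json
import urllib.request


def build_execute_request(base_url, project_node_uuid, node_uuid, token):
    """Prepare the POST request that starts a pipeline execution."""
    url = (
        f"{base_url.rstrip('/')}/api/project/{project_node_uuid}"
        f"/pipeline/{node_uuid}/execute/"
    )
    # Assumption: no request body is required.
    return urllib.request.Request(
        url, method="POST", headers={"Authorization": f"Token {token}"}
    )


def status_urls(response_body):
    """Extract the status URLs from the Execute response."""
    return json.loads(response_body)["execution_status_urls"]


def execute_pipeline(base_url, project_node_uuid, node_uuid, token, timeout=30.0):
    """Send the request and return the 'detail' and 'list' status URLs."""
    request = build_execute_request(base_url, project_node_uuid, node_uuid, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return status_urls(response.read().decode("utf-8"))
```

The `detail` URL returned by `execute_pipeline` points at the execution that was just started, so it can be stored and queried later.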

Get Execution

GET /api/project/{project_node_uuid}/pipeline/{node_uuid}/executions/{execution_uuid}/

Get a specific Pipeline execution.

Response:

{
    "uuid": "execution_uuid",
    "pipeline_uuid": "pipeline_uuid",
    "elapsed_time": "00:30:00",
    "creation_datetime": "2023-10-02T15:20:30Z",
    "status": "Running",
    "exception": ""
}
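
A client will often call this endpoint repeatedly until the execution has finished. The sketch below shows one way to do that; `wait_for_execution` is a hypothetical helper, `fetch` is any callable that returns the parsed JSON of this endpoint, and it is assumed that the API reports the same status names as the Executions tab, with Pending and Running meaning the execution is still in progress.

```python
import time

# Assumption: these two statuses mean the execution has not finished yet.
IN_PROGRESS_STATUSES = {"Pending", "Running"}


def wait_for_execution(fetch, poll_seconds=5.0, max_polls=60, sleep=time.sleep):
    """Poll until the execution leaves the in-progress statuses.

    fetch: callable returning the execution as a dict (parsed JSON).
    sleep: injectable so the loop can be exercised without real delays.
    """
    for _ in range(max_polls):
        execution = fetch()
        if execution["status"] not in IN_PROGRESS_STATUSES:
            return execution
        sleep(poll_seconds)
    raise TimeoutError(f"Execution still in progress after {max_polls} polls")


# Usage with canned responses instead of real HTTP calls.
responses = iter([
    {"status": "Pending", "exception": ""},
    {"status": "Running", "exception": ""},
    {"status": "Completed", "exception": ""},
])
final = wait_for_execution(lambda: next(responses), sleep=lambda seconds: None)
print(final["status"])
```

Treating every status other than Pending and Running as final keeps the loop from spinning on values it does not recognise.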

List Executions

GET /api/project/{project_node_uuid}/pipeline/{node_uuid}/executions/

Get a list of executions for a specific Pipeline.

Response:

{
    "items": [
        {
            "uuid": "execution_uuid1",
            "pipeline_uuid": "pipeline_uuid",
            "elapsed_time": "00:15:00",
            "creation_datetime": "2023-10-02T12:20:30Z",
            "status": "Running",
            "exception": ""
        },
        {
            "uuid": "execution_uuid2",
            "pipeline_uuid": "pipeline_uuid",
            "elapsed_time": "00:20:00",
            "creation_datetime": "2023-10-02T13:20:30Z",
            "status": "Pending",
            "exception": "Some error occurred"
        }
    ]
}
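
Once parsed, a response of this shape can be summarised with a few lines of Python. The helper names are hypothetical, and the sketch assumes that `elapsed_time` is always formatted as HH:MM:SS and that `creation_datetime` is always a UTC timestamp ending in `Z`, as in the sample.

```python
from datetime import datetime, timedelta, timezone


def parse_elapsed(value):
    """Convert an 'HH:MM:SS' string into a timedelta (assumed format)."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def parse_created(value):
    """Convert a 'YYYY-MM-DDTHH:MM:SSZ' string into an aware UTC datetime."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def executions_with_exception(payload):
    """Return the executions whose 'exception' field is not empty."""
    return [item for item in payload["items"] if item.get("exception")]


def latest_execution(payload):
    """Return the most recently created execution."""
    return max(payload["items"], key=lambda item: parse_created(item["creation_datetime"]))


payload = {
    "items": [
        {"uuid": "execution_uuid1", "elapsed_time": "00:15:00",
         "creation_datetime": "2023-10-02T12:20:30Z", "status": "Running", "exception": ""},
        {"uuid": "execution_uuid2", "elapsed_time": "00:20:00",
         "creation_datetime": "2023-10-02T13:20:30Z", "status": "Pending",
         "exception": "Some error occurred"},
    ]
}

print([item["uuid"] for item in executions_with_exception(payload)])
print(latest_execution(payload)["uuid"])
```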