Pipelines
Pipelines are a specialised type of node designed to efficiently schedule and manage notebook execution within your workflow. Each pipeline consists of multiple elements, each of which contains a notebook and specific configuration settings for its execution.
Creating a Pipeline
You can create a pipeline in any folder of your project: click the + button in the top right-hand corner of the File System Explorer, select New..., and choose Pipeline.
Configuration
Upon accessing the detail page of a pipeline, you'll find three tabs:
General settings
Here you can configure the basic settings of the pipeline:
Option | Type | Description | Default |
---|---|---|---|
Execution timeout | number | The maximum time in seconds that a pipeline is allowed to run. | 300 |
Global parameters | string/JSON | A JSON object containing the pipeline's global parameters. | {} |
Worker image | string | The worker image used to run the pipeline. The default is to use the project's base image. | <default_project_image> |
Worker environment | string | The machine resources (RAM, CPU, GPU) used to run the pipeline, selected from the available options. | |
Active | boolean | Whether automatic executions are enabled for the pipeline. | false |
Crontab expression | string | The crontab expression that defines the schedule of the pipeline. See Crontab.guru. | |
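As a rough illustration of the two free-text options above, the following sketch serialises a set of global parameters to JSON and runs a basic shape check on a crontab expression. It assumes the standard five-field crontab syntax that Crontab.guru describes; the helper name `is_five_field_cron` and the parameter keys are hypothetical and not part of the product.

```python
import json


def is_five_field_cron(expression: str) -> bool:
    """Shape check only: five whitespace-separated numeric/wildcard fields.

    This does not validate value ranges and does not handle names such as
    MON or JAN; it only catches obviously malformed expressions.
    """
    allowed = set("0123456789*/,-")
    fields = expression.split()
    return len(fields) == 5 and all(set(f) <= allowed for f in fields)


# Hypothetical global parameters; the key names are illustrative only.
global_parameters = {"region": "eu", "sample_size": 1000}
global_parameters_json = json.dumps(global_parameters)

# "Every day at 06:30" in five-field crontab syntax
# (minute, hour, day of month, month, day of week).
daily_schedule = "30 6 * * *"
```

The resulting JSON string is what you would paste into the Global parameters field, and the schedule string into the Crontab expression field.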
Below the form there are two buttons: 'See logs' and 'Clear logs'. Pipeline logs work in the same way as Notebook logs.
Elements
Pipelines are composed of elements that are executed in sequence. Each element contains a Notebook and some Settings:
Option | Type | Description | Default |
---|---|---|---|
Notebook | string | The notebook to be executed. You can choose from all the notebooks in your project. | |
Parameters | string/JSON | A JSON object containing the parameters of the element. | {} |
Order | number | The order in which pipeline elements are executed. Elements with a lower execution order are executed first. | |
You can also delete, clone, or edit elements from this table.
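To make the ordering rule concrete, here is a minimal sketch of how a list of elements resolves into an execution sequence. The `Element` class and the notebook names are hypothetical stand-ins for the three settings in the table; how the platform treats two elements with the same Order is not documented here, so the sketch avoids ties.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical model of a pipeline element (not a platform class)."""
    notebook: str
    order: int
    parameters: dict = field(default_factory=dict)


def execution_sequence(elements):
    """Return elements sorted so that lower Order values run first."""
    return sorted(elements, key=lambda element: element.order)


elements = [
    Element("report.ipynb", order=3),
    Element("ingest.ipynb", order=1, parameters={"source": "s3"}),
    Element("clean.ipynb", order=2),
]

print([e.notebook for e in execution_sequence(elements)])
# prints ['ingest.ipynb', 'clean.ipynb', 'report.ipynb']
```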
Executions
On the Free Plan, pipelines can only be run manually, by clicking the Execute button in the top right-hand corner of the pipeline detail page.
The Executions tab displays a table of pipeline execution statistics, filterable by status (Completed, Errored, Pending, Running or OTHER) or creation date.
From the Actions column, you can restart or cancel tasks, depending on their current status.
Public API
The API allows users to run pipelines programmatically through a REST API interface. This feature is designed to provide data pipeline users with better integration and connectivity with third-party applications and processes.
All endpoints are secured and require a valid token for access. You must add the token to the Authorization header as follows:
Authorization: Token YOUR_ACCESS_TOKEN
Open the API tab of a specific pipeline to generate a token and see an example of how to run that pipeline.
Endpoints
Execute
POST /api/project/{project_node_uuid}/pipeline/{node_uuid}/execute/
Starts executing a Pipeline.
Response:
{
  "execution_status_urls": {
    "method": "GET",
    "detail": "http://your-server/api/project/{project_node_uuid}/pipeline/{node_uuid}/executions/{execution_uuid}/",
    "list": "http://your-server/api/project/{project_node_uuid}/pipeline/{node_uuid}/executions/"
  }
}
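A call to this endpoint could be sketched in Python with the standard library as follows. The function names and the `base_url` argument are assumptions made for illustration, and sending an empty request body is also an assumption, since the body the endpoint expects is not specified above. Only the path, the POST method, and the `Authorization: Token ...` header come from this page.

```python
import json
import urllib.request


def build_execute_request(base_url, project_node_uuid, node_uuid, token):
    """Build (but do not send) the POST request that starts a pipeline."""
    url = (
        f"{base_url.rstrip('/')}/api/project/{project_node_uuid}"
        f"/pipeline/{node_uuid}/execute/"
    )
    return urllib.request.Request(
        url,
        data=b"",  # assumption: an empty body is acceptable
        method="POST",
        headers={"Authorization": f"Token {token}"},
    )


def execute_pipeline(request):
    """Send the request and return the status URLs from the response."""
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["execution_status_urls"]


# Usage (requires a reachable server and a valid token):
# request = build_execute_request("http://your-server", "<project uuid>",
#                                 "<pipeline uuid>", "YOUR_ACCESS_TOKEN")
# urls = execute_pipeline(request)
# print(urls["detail"])
```

Building the request separately from sending it keeps the URL and header logic easy to check without touching the network.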
Get Execution
GET /api/project/{project_node_uuid}/pipeline/{node_uuid}/executions/{execution_uuid}/
Get a specific Pipeline execution.
Response:
{
  "uuid": "execution_uuid",
  "pipeline_uuid": "pipeline_uuid",
  "elapsed_time": "00:30:00",
  "creation_datetime": "2023-10-02T15:20:30Z",
  "status": "Running",
  "exception": ""
}
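Because an execution starts asynchronously, a client typically polls this endpoint until the status changes. The sketch below shows one way to do that. It assumes that `Pending` and `Running` are the only non-final statuses, which this page does not state explicitly, and it takes a hypothetical `fetch_execution` callable (any function that returns the decoded JSON shown above) so that the polling logic stays independent of the HTTP client.

```python
import time

# Assumption: any status other than these two means the execution has ended.
ACTIVE_STATUSES = {"Pending", "Running"}


def wait_for_execution(fetch_execution, poll_seconds=5.0, max_polls=60,
                       sleep=time.sleep):
    """Poll until the execution leaves an active status, then return it.

    fetch_execution: callable returning the execution as a dict.
    Raises TimeoutError if the execution is still active after max_polls.
    """
    for _ in range(max_polls):
        execution = fetch_execution()
        if execution["status"] not in ACTIVE_STATUSES:
            return execution
        sleep(poll_seconds)
    raise TimeoutError(f"execution still active after {max_polls} polls")
```

After the function returns, checking the `status` and `exception` fields tells you whether the run succeeded.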
List Executions
GET /api/project/{project_node_uuid}/pipeline/{node_uuid}/executions/
Get a list of executions for a specific Pipeline.
Response:
{
  "items": [
    {
      "uuid": "execution_uuid1",
      "pipeline_uuid": "pipeline_uuid",
      "elapsed_time": "00:15:00",
      "creation_datetime": "2023-10-02T12:20:30Z",
      "status": "Running",
      "exception": ""
    },
    {
      "uuid": "execution_uuid2",
      "pipeline_uuid": "pipeline_uuid",
      "elapsed_time": "00:20:00",
      "creation_datetime": "2023-10-02T13:20:30Z",
      "status": "Errored",
      "exception": "Some error occurred"
    }
  ]
}
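Once decoded, a list response can be filtered and its time fields converted into Python objects. The helpers below are a sketch under two assumptions drawn only from the sample values on this page: `elapsed_time` is formatted as `HH:MM:SS`, and `creation_datetime` is an ISO 8601 timestamp ending in `Z`. The function names and the sample payload are illustrative.

```python
from datetime import datetime, timedelta


def parse_elapsed(value: str) -> timedelta:
    """Convert an 'HH:MM:SS' string into a timedelta."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def parse_created(value: str) -> datetime:
    """Convert an ISO 8601 timestamp with a trailing 'Z' into an aware datetime."""
    # Older Python versions do not accept 'Z' in fromisoformat.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def executions_with_status(payload: dict, status: str) -> list:
    """Return the executions in a list response that have the given status."""
    return [item for item in payload["items"] if item["status"] == status]


# Illustrative payload in the same shape as a list response.
sample = {
    "items": [
        {"uuid": "a", "elapsed_time": "00:15:00",
         "creation_datetime": "2023-10-02T12:20:30Z", "status": "Running"},
        {"uuid": "b", "elapsed_time": "00:20:00",
         "creation_datetime": "2023-10-02T13:20:30Z", "status": "Completed"},
    ]
}

running = executions_with_status(sample, "Running")
```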