Manage Workflows
This section explains how to use the Manage Workflows UI in Qualiz to create and organize workflows within a project.
Workflows List
The Manage Workflows page displays all the workflows available within the selected project. From this page, you can:
- View existing workflows along with their details.
- Edit workflow configurations.
- Delete workflows that are no longer needed.
Screenshot: Workflows List Page
Actions
- Edit – Click the edit icon next to a workflow to update its details or structure.
- Delete – Click the delete icon to remove the workflow from the project.
At the top-right corner of the page, click the New button to create a new workflow.
Create Workflow
When you click the New button, a popup window opens where you can enter the initial details required to set up the workflow.
Screenshot: Create Workflow Popup
Fields
- Name – Enter a name to identify the workflow.
- Project – Select the project under which the workflow will be created.
- No of Retry – Define how many times the workflow should retry upon failure.
- Params – Enter additional parameters in JSON format as input for the workflow.
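For example, the Params field might hold a small JSON object whose values your tasks read at run time. The keys below are placeholders for illustration, not fields Qualiz requires; use whatever keys and values your tasks expect.

```json
{
  "run_date": "2025-01-31",
  "source_bucket": "raw-data",
  "batch_size": 500
}
```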
After filling in the required details, click Create to proceed to the workflow editor.
Workflow Editor
The workflow editor provides a drag-and-drop interface where you can visually design the workflow by adding tasks and connecting them to define the data processing flow.
Screenshot: Workflow Editor
Workflow Editor – Task Fields
Each task in the workflow editor has common fields, along with additional fields specific to the task type.
✅ Common Fields for All Tasks
- Task Name – Enter a descriptive name for the task.
- Task Type – Select the type of task.
Below are the task types with their specific fields:
✅ Data Ingestion (Connection)
- Connection – Select the source connection.
- Post-process – Choose between:
    - Archive – Move files after processing.
    - Delete – Remove files after processing. (Only applicable for storage sources like S3, GCS, Azure Blob.)
- Archive Options (only if Archive is selected):
    - Bucket – Destination bucket for archived files.
    - Folder – Destination folder path.
✅ Run Python Script
- Bucket – Storage bucket containing the Python file.
- Folder – Folder path within the bucket.
- File – Python file name.
- Params – Input parameters in JSON format.
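As a rough illustration, a script stored in the configured bucket could look like the sketch below. It assumes, purely for illustration, that the Params JSON reaches the script as its first command-line argument; confirm how parameters are actually delivered in your Qualiz environment and adjust the parsing accordingly. The parameter keys are the hypothetical ones from the earlier Params example.

```python
import json
import sys


def main() -> None:
    # Assumption for this sketch: the Params JSON arrives as the first
    # command-line argument. Adapt this if your environment delivers
    # parameters differently (for example, via an environment variable).
    params = json.loads(sys.argv[1]) if len(sys.argv) > 1 else {}

    # "run_date" and "batch_size" are hypothetical keys, not fields Qualiz defines.
    run_date = params.get("run_date", "1970-01-01")
    batch_size = int(params.get("batch_size", 100))

    print(f"Processing a batch of {batch_size} records for {run_date}")


if __name__ == "__main__":
    main()
```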
✅ Run Python on Cluster (Beam)
- Bucket – Storage bucket containing the Python file.
- Folder – Folder path within the bucket.
- File – Python file name.
- Params – Input parameters in JSON format.
- Number of Parallel Workers – Define how many workers should execute the task concurrently.
- Additional Python Packages – List of extra packages required for execution.
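For this task type, the referenced Python file would typically build an Apache Beam pipeline. The sketch below is a minimal, hypothetical example: it again assumes the Params JSON is passed as the first command-line argument and that the params contain an "input_path" key; the actual runner, pipeline options, and parameter delivery depend on how your cluster is configured. Any extra package the script imports beyond Beam and the standard library should be listed under Additional Python Packages.

```python
import json
import sys

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def run() -> None:
    # Assumption for this sketch: Params JSON arrives as the first
    # command-line argument and includes a hypothetical "input_path" key.
    params = json.loads(sys.argv[1]) if len(sys.argv) > 1 else {}
    input_path = params.get("input_path", "gs://example-bucket/input/*.txt")

    # Runner and worker settings are supplied by the cluster configuration.
    options = PipelineOptions()
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "Read lines" >> beam.io.ReadFromText(input_path)
            | "Count lines" >> beam.combiners.Count.Globally()
            | "Log count" >> beam.Map(lambda n: print(f"Processed {n} lines"))
        )


if __name__ == "__main__":
    run()
```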
✅ Execute SQL Script
- Bucket – Storage bucket containing the SQL file.
- Folder – Folder path within the bucket.
- SQL File – SQL script file name.
- Params – Input parameters in JSON format.
- Target Data Source – The database where the SQL script will be executed.
✅ Data Cleansing
- Cleansing Rule ID – Select the cleansing rule to apply.
- Number of Parallel Workers – Define how many workers should process the task in parallel.
✅ Data Deduplication
- Deduplication Rule ID – Select the deduplication rule to apply.
✅ Notification
- Notification Template ID – Select the template used to send alerts or messages.