Skip to content

Dag

datapizza.pipeline.dag_pipeline.DagPipeline

A pipeline that runs a graph of a dependency graph.

a_run async

a_run(data)

Run the pipeline asynchronously.

Parameters:

Name Type Description Default
data dict

The input data to the pipeline.

required

Returns:

Name Type Description
dict

The results of the pipeline.

add_module

add_module(node_name, node)

Add a module to the pipeline.

Parameters:

Name Type Description Default
node_name str

The name of the module.

required
node PipelineComponent

The module to add.

required

connect

connect(
    source_node, target_node, target_key, source_key=None
)

Connect two nodes in the pipeline.

Parameters:

Name Type Description Default
source_node str

The name of the source node.

required
target_node str

The name of the target node.

required
target_key str

The key to store the result of the target node in the source node.

required
source_key str

The key to retrieve the result of the source node from the target node. Defaults to None.

None

from_yaml

from_yaml(config_path)

Load the pipeline from a YAML configuration file.

Parameters:

Name Type Description Default
config_path str

Path to the YAML configuration file.

required

Returns:

Name Type Description
DagPipeline DagPipeline

The pipeline instance.

run

run(data)

Run the pipeline.

Parameters:

Name Type Description Default
data dict

The input data to the pipeline.

required

Returns:

Name Type Description
dict dict

The results of the pipeline.