Steps
Steps are the fundamental building blocks of StepFlow workflows. Each step executes a specific component and produces output that can be used by subsequent steps.
Basic Step Structure
A step has the following structure:
steps:
- id: step_name
component: /path/component
input:
# Input data for the component
# Optional fields:
skip: condition # Skip this step if condition is truthy
on_error: # How to handle errors
action: continue|skip|terminate
default_output: {} # Default output if action is continue
Step Anatomy
Required Fields
id
: Unique identifier for the step within the workflowcomponent
: URL specifying which component to executeinput
: Data to pass to the component
Optional Fields
skip
: Condition to skip step executionon_error
: Error handling configurationdescription
: Human-readable description
Step Execution
Execution Order
Steps execute based on their dependencies, not their order in the YAML file. StepFlow automatically determines the execution order by analyzing data dependencies.
steps:
# This step runs first (no dependencies)
- id: load_data
component: /builtin/load_file
input:
path: "data.json"
# This step waits for load_data to complete
- id: process_data
component: /python/udf
input:
data: { $from: { step: load_data } }
# This step also waits for load_data (runs in parallel with process_data)
- id: validate_data
component: /validate
input:
data: { $from: { step: load_data } }
# This step waits for both process_data and validate_data
- id: combine_results
component: /merge
input:
processed: { $from: { step: process_data } }
validated: { $from: { step: validate_data } }
Parallel Execution
Steps without dependencies run in parallel automatically. This provides optimal performance without requiring explicit parallelization.
Data References
Steps reference data using the $from
syntax:
Workflow Input
input:
user_id: { $from: { workflow: input }, path: "user_id" }
Step Output
input:
processed_data: { $from: { step: previous_step } }
# Or reference a specific field
user_name: { $from: { step: user_lookup }, path: "name" }
Literal Values
input:
config: { $literal: { timeout: 30, retries: 3 } }
message: { $literal: "Hello World" }
Nested Path References
Use dot notation or array syntax for nested data:
input:
# Dot notation
user_email: { $from: { step: user_data }, path: "profile.email" }
# Array index
first_item: { $from: { step: list_data }, path: "items[0]" }
# Complex nested path
deep_value: { $from: { step: complex_data }, path: "level1.level2[0].value" }
Skip Handling
Basic Skip Condition
steps:
- id: expensive_analysis
component: /ai/analyze
skip: { $from: { workflow: input }, path: "skip_analysis" }
input:
data: { $from: { step: load_data } }
Skip Propagation
When a step is skipped, consuming steps are also skipped by default:
steps:
- id: optional_step
component: /data/process
skip: { $from: { workflow: input }, path: "skip_optional" }
# This step is skipped if optional_step is skipped
- id: dependent_step
component: /data/analyze
input:
data: { $from: { step: optional_step } }
Skip Interruption
Use $on_skip
to provide default values when dependencies are skipped:
steps:
- id: consumer_step
component: /data/process
input:
required_data: { $from: { step: required_step } }
optional_data:
$from: { step: optional_step }
$on_skip: "use_default"
$default: { status: "not_processed" }
Skip Actions
skip_step
(default): Skip the consuming stepuse_default
: Use the provided default value
Error Handling
Error Actions
steps:
- id: risky_operation
component: /external/api_call
on_error:
action: continue # continue|skip|terminate
default_output:
result: "fallback_value"
status: "error_handled"
input:
endpoint: "https://api.example.com"
Error Action Types
terminate
(default): Stop workflow executioncontinue
: Usedefault_output
and mark step as successfulskip
: Mark step as skipped (triggers skip propagation)
Default Output Requirements
When using action: continue
, the default_output
must:
- Conform to the component's output schema
- Include all required fields
- Match expected data types
Component Types
Built-in Components
# Blob storage
- id: store_data
component: /builtin/put_blob
input:
data: { $from: { step: prepare_data } }
# Sub-workflow execution
- id: sub_process
component: /builtin/eval
input:
workflow: { $from: { step: load_workflow } }
input: { $from: { step: prepare_input } }
External Components
# Python UDF
- id: custom_logic
component: /python/udf
input:
blob_id: { $from: { step: create_udf } }
input: { $from: { workflow: input } }
# Custom component server
- id: specialized_task
component: /my_plugin/advanced_processor
input:
configuration: { $from: { step: load_config } }
data: { $from: { step: prepare_data } }
OpenAI Integration
# OpenAI API call
- id: ai_analysis
component: /builtin/openai
input:
messages:
- role: system
content: "You are a helpful assistant."
- role: user
content: { $from: { step: user_input }, path: "question" }
model: "gpt-4"
max_tokens: 150
Advanced Patterns
Dynamic Components
Create components at runtime:
steps:
- id: create_custom_component
component: /factory/create_processor
input:
type: "data_transformer"
config: { $from: { workflow: input }, path: "processor_config" }
- id: use_custom_component
component: { $from: { step: create_custom_component }, path: "component_url" }
input:
data: { $from: { step: load_data } }
Conditional Processing
steps:
- id: process_if_large
component: /data/heavy_processor
skip:
$from: { step: check_size }
path: "is_small"
input:
data: { $from: { step: load_data } }
- id: process_if_small
component: /data/light_processor
skip:
$from: { step: check_size }
path: "is_large"
input:
data: { $from: { step: load_data } }
Multi-step Dependencies
steps:
- id: final_step
component: /data/combine
input:
# Wait for multiple steps
processed: { $from: { step: process_data } }
validated: { $from: { step: validate_data } }
enriched: { $from: { step: enrich_data } }
# With fallback for optional data
optional_metadata:
$from: { step: optional_metadata }
$on_skip: "use_default"
$default: {}
Best Practices
Step Naming
- Use descriptive, action-oriented names
- Follow consistent naming conventions
- Avoid generic names like
step1
,process
# Good
- id: load_user_data
- id: validate_email_format
- id: send_welcome_email
# Avoid
- id: step1
- id: process
- id: do_stuff
Input Organization
- Group related inputs logically
- Use meaningful parameter names
- Provide default values where appropriate
# Good organization
input:
# Data inputs
user_data: { $from: { step: load_user } }
settings: { $from: { step: load_settings } }
# Configuration
timeout: { $literal: 30 }
retries: { $literal: 3 }
# Optional parameters with defaults
debug_mode:
$from: { workflow: input }
path: "debug"
$on_skip: "use_default"
$default: false
Error Handling Strategy
- Use
terminate
for critical failures - Use
continue
for recoverable errors with meaningful defaults - Use
skip
for optional operations
# Critical operation - must succeed
- id: authenticate_user
component: /auth/verify
# on_error defaults to terminate
# Optional enhancement - can fail gracefully
- id: enrich_profile
component: /data/enrich
on_error:
action: continue
default_output:
enriched: false
metadata: {}
# Completely optional - skip if fails
- id: log_analytics
component: /analytics/track
on_error:
action: skip