Reference
This document provides the complete specification for StepFlow workflow files, including all fields, formats, and advanced features supported by the workflow engine.
Complete Workflow Structure
# Optional metadata
name: "My Workflow"
description: "A complete workflow demonstrating all features"
version: "1.2.0"
# Input validation schema
input_schema:
type: object
properties:
# Input field definitions
required: []
# Output validation schema
output_schema:
type: object
properties:
# Output field definitions
required: []
# Workflow execution steps
steps:
- id: step_identifier
component: /component/name
# Step configuration...
# Output mapping
output:
result_field: { $from: { step: step_name } }
# Test configuration
test:
stepflow_config: "path/to/test-config.yml"
cases:
- name: test_case_name
# Test case definition...
# Example inputs (for documentation and tooling)
examples:
- name: example_name
description: "Description of this example"
input:
# Example input data
Metadata Fields
Core Metadata
name: "Workflow Name"
description: "Detailed description of what this workflow does"
version: "1.0.0"
Field Descriptions:
name
(optional): Human-readable workflow name for documentation and UIsdescription
(optional): Detailed description of workflow purpose and behaviorversion
(optional): Semantic version for workflow tracking and compatibility
Usage Guidelines:
- Use descriptive names that clearly indicate the workflow's purpose
- Include version numbers for workflows that evolve over time
- Write descriptions that help other developers understand the workflow's use case
Version Management
# Version with metadata
version: "2.1.0"
# Complex versioning (future feature)
version_info:
version: "2.1.0"
compatibility: ">=2.0.0"
deprecated_features: ["old_syntax"]
migration_guide: "docs/migration-v2.md"
Schema Definitions
Input Schema
Defines the structure and validation rules for workflow input:
input_schema:
type: object
properties:
user_data:
type: object
properties:
name:
type: string
minLength: 1
maxLength: 100
email:
type: string
format: email
age:
type: integer
minimum: 0
maximum: 150
required: ["name", "email"]
processing_options:
type: object
properties:
format:
type: string
enum: ["json", "xml", "csv"]
default: "json"
include_metadata:
type: boolean
default: true
additionalProperties: false
file_paths:
type: array
items:
type: string
pattern: "^[a-zA-Z0-9._/-]+$"
minItems: 1
maxItems: 10
required: ["user_data"]
additionalProperties: false
Output Schema
Defines the structure of workflow output:
output_schema:
type: object
properties:
processed_data:
type: object
description: "The main processed result"
metadata:
type: object
properties:
processing_time_ms:
type: integer
minimum: 0
records_processed:
type: integer
minimum: 0
warnings:
type: array
items:
type: string
required: ["processing_time_ms", "records_processed"]
status:
type: string
enum: ["success", "partial", "failed"]
required: ["processed_data", "metadata", "status"]
Schema Features
Supported JSON Schema Features:
- Type validation (
string
,number
,integer
,boolean
,array
,object
,null
) - String constraints (
minLength
,maxLength
,pattern
,format
) - Number constraints (
minimum
,maximum
,multipleOf
) - Array constraints (
minItems
,maxItems
,uniqueItems
) - Object constraints (
required
,additionalProperties
,patternProperties
) - Composition (
allOf
,anyOf
,oneOf
,not
) - Conditional schemas (
if
/then
/else
) - References (
$ref
) - Enumerations (
enum
) - Constants (
const
)
Step Configuration
Basic Step Structure
steps:
- id: unique_step_identifier
component: /path/component_name
input:
param1: { $from: { workflow: input }, path: "field" }
param2: { $literal: "static_value" }
# Optional fields
input_schema:
type: object
# Schema for this step's input
output_schema:
type: object
# Schema for this step's output
skip_if: { $from: { workflow: input }, path: "skip_condition" }
on_error:
action: continue
default_output:
result: "fallback_value"
Step Schemas
Individual steps can define their own input/output schemas:
steps:
- id: data_processor
component: /custom/processor
input_schema:
type: object
properties:
data:
type: array
items:
type: object
properties:
id: { type: string }
value: { type: number }
required: ["id", "value"]
rules:
type: object
additionalProperties:
type: string
required: ["data"]
output_schema:
type: object
properties:
processed_data:
type: array
items:
type: object
summary:
type: object
properties:
total_records: { type: integer }
processing_time_ms: { type: integer }
required: ["processed_data", "summary"]
input:
data: { $from: { workflow: input }, path: "raw_data" }
rules: { $from: { step: load_rules } }
Step Schema Benefits:
- Validation: Ensures data integrity at each step
- Documentation: Self-documenting step interfaces
- Tooling: Enables IDE autocompletion and validation
- Debugging: Clear error messages for data mismatches
Test Configuration
Test Structure
test:
# Optional: Override stepflow config for tests
stepflow_config: "test-config.yml"
# Test cases
cases:
- name: test_case_identifier
description: "Human-readable description of test case"
input:
# Input data matching workflow input_schema
output:
outcome: success # success | skipped | failed
result:
# Expected output data matching workflow output_schema
# Optional: Override specific configuration
config_overrides:
timeout: 60
log_level: debug
Complete Test Example
test:
stepflow_config: "test/test-config.yml"
cases:
- name: successful_processing
description: "Process valid user data successfully"
input:
user_data:
name: "Alice Johnson"
email: "alice@example.com"
age: 30
processing_options:
format: "json"
include_metadata: true
file_paths: ["data/input1.csv", "data/input2.csv"]
output:
outcome: success
result:
processed_data:
user_id: "alice_johnson_30"
formatted_name: "Alice Johnson"
email_domain: "example.com"
metadata:
processing_time_ms: 250
records_processed: 2
warnings: []
status: "success"
- name: missing_required_field
description: "Handle missing required input gracefully"
input:
user_data:
name: "Bob Smith"
# Missing required email field
processing_options:
format: "json"
file_paths: ["data/input1.csv"]
output:
outcome: failed
error:
code: "VALIDATION_ERROR"
message: "Required field 'email' missing from user_data"
- name: empty_file_list
description: "Handle empty file list"
input:
user_data:
name: "Charlie Brown"
email: "charlie@example.com"
file_paths: []
output:
outcome: failed
error:
code: "VALIDATION_ERROR"
message: "file_paths must contain at least 1 item"
- name: skip_condition_triggered
description: "Test skip condition behavior"
input:
user_data:
name: "David Wilson"
email: "david@example.com"
skip_processing: true # This triggers skip condition
processing_options:
format: "json"
file_paths: ["data/input1.csv"]
output:
outcome: skipped
reason: "Processing skipped due to skip_processing flag"
Test Configuration Overrides
test:
# Global test configuration
stepflow_config: "test/base-config.yml"
# Global test settings
defaults:
timeout: 30
log_level: info
cases:
- name: quick_test
# Inherits global settings
input: { }
output: { }
- name: long_running_test
# Override timeout for this specific test
config_overrides:
timeout: 300
log_level: debug
input: { }
output: { }
- name: custom_config_test
# Use completely different config
stepflow_config: "test/special-config.yml"
input: { }
output: { }
Example Inputs
Basic Examples
examples:
- name: basic_usage
description: "Simple example with minimal required fields"
input:
user_data:
name: "John Doe"
email: "john@example.com"
age: 25
file_paths: ["sample.csv"]
- name: advanced_usage
description: "Example showing all optional features"
input:
user_data:
name: "Jane Smith"
email: "jane@company.com"
age: 35
processing_options:
format: "xml"
include_metadata: true
file_paths: ["data1.csv", "data2.csv", "data3.csv"]
- name: edge_case
description: "Edge case with maximum values"
input:
user_data:
name: "Very Long Name That Tests Maximum Length Constraints"
email: "test@subdomain.example-domain.com"
age: 150
processing_options:
format: "csv"
include_metadata: false
file_paths: [
"file1.csv", "file2.csv", "file3.csv", "file4.csv", "file5.csv",
"file6.csv", "file7.csv", "file8.csv", "file9.csv", "file10.csv"
]
Example Categories
examples:
# Production examples
- name: production_scenario_1
description: "Typical production use case"
tags: ["production", "common"]
input: { }
# Development examples
- name: development_test
description: "Quick test for development"
tags: ["development", "quick"]
input: { }
# Edge cases
- name: maximum_load
description: "Test with maximum input size"
tags: ["edge-case", "stress-test"]
input: { }
# Error scenarios
- name: invalid_input_example
description: "Example that should fail validation"
tags: ["error", "validation"]
expected_outcome: "validation_error"
input: { }
Example Usage
Examples serve multiple purposes:
- Documentation: Show users how to use the workflow
- Testing: Provide test data for automated testing
- Tooling: Enable UI dropdowns and quick-start templates
- Validation: Verify workflow works with realistic data
Accessing Examples:
- Workflow editors can provide example selection
- CLI tools can run workflows with example inputs
- Documentation generators can include examples
- Test frameworks can use examples as test cases
Advanced Features
Conditional Workflows
steps:
- id: check_environment
component: /env/environment_check
input:
environment: { $from: { workflow: input }, path: "env" }
# Different processing based on environment
- id: production_processing
component: /production/processor
skip_if:
$from: { step: check_environment }
path: "is_development"
input:
data: { $from: { workflow: input }, path: "data" }
- id: development_processing
component: /development/processor
skip_if:
$from: { step: check_environment }
path: "is_production"
input:
data: { $from: { workflow: input }, path: "data" }
Multi-Environment Support
# Environment-specific configurations
environments:
development:
default_timeout: 10
enable_debug: true
data_source: "dev_db"
staging:
default_timeout: 30
enable_debug: false
data_source: "staging_db"
production:
default_timeout: 60
enable_debug: false
data_source: "prod_db"
# Use environment variables in steps
steps:
- id: connect_database
component: /database/connect
input:
connection_string:
$from: { environment: current }
path: "data_source"
timeout:
$from: { environment: current }
path: "default_timeout"
Workflow Composition
# Import and compose other workflows
imports:
- path: "common/data-validation.yml"
alias: "validate"
- path: "processors/text-processing.yml"
alias: "text_proc"
steps:
# Use imported workflow as a step
- id: validate_input
component: /builtin/eval
input:
workflow: { $import: "validate" }
input: { $from: { workflow: input } }
# Chain workflows
- id: process_text
component: /builtin/eval
input:
workflow: { $import: "text_proc" }
input: { $from: { step: validate_input } }
Metadata and Annotations
# Workflow metadata
metadata:
author: "Development Team"
created: "2024-01-15"
last_modified: "2024-01-20"
tags: ["data-processing", "production"]
category: "data-pipelines"
complexity: "medium"
estimated_runtime: "5-10 minutes"
# Documentation links
documentation:
readme: "docs/workflow-guide.md"
api_docs: "https://api.example.com/docs"
runbook: "https://runbooks.example.com/data-pipeline"
# Resource requirements
resources:
memory: "512Mi"
cpu: "0.5"
timeout: "10m"
max_retries: 3
# Security annotations
security:
required_permissions: ["data:read", "storage:write"]
sensitive_data: ["user_data.email", "api_keys"]
compliance: ["GDPR", "SOX"]
Validation and Best Practices
Schema Validation
# Enable strict validation
validation:
strict_mode: true
fail_on_extra_properties: true
validate_step_schemas: true
# Custom validation rules
validation_rules:
- rule: "no_empty_strings"
description: "String fields cannot be empty"
pattern: "^.+$"
applies_to: ["string"]
- rule: "valid_email_domains"
description: "Email must be from allowed domains"
pattern: "@(company|partner)\\.com$"
applies_to: ["email"]
Performance Optimization
# Performance hints
performance:
parallel_steps: ["step1", "step2", "step3"]
cache_results: ["expensive_computation"]
memory_efficient: true
streaming_mode: false
# Resource limits
limits:
max_execution_time: "1h"
max_memory: "2Gi"
max_blob_size: "100Mi"
max_steps: 50
Documentation Standards
# Rich documentation
name: "User Data Processing Pipeline"
description: |
This workflow processes user data through multiple validation and
transformation steps. It supports multiple output formats and includes
comprehensive error handling.
Key features:
- Multi-format output (JSON, XML, CSV)
- Email validation and domain checking
- Age-based processing rules
- Detailed error reporting
Usage:
- Development: Use with test data for feature development
- Staging: Full integration testing with realistic data
- Production: Live user data processing
# Step documentation
steps:
- id: validate_user_data
description: |
Validates user data against business rules:
- Name must be 1-100 characters
- Email must be valid format from allowed domains
- Age must be realistic (0-150)
component: /validation/user_validator
# ... rest of step configuration
This specification provides the complete reference for StepFlow workflow authoring, covering all supported features and advanced patterns for building robust, maintainable workflows.