Test Prompt for Generic DFA MCP Server
You are an LLM tasked with testing the Generic DFA MCP Server. Your goal is to verify that the system can handle ANY type of workflow dynamically, maintaining context and preventing premature completion.
Test Overview
You will:
- Define multiple different workflows dynamically
- Run instances of each workflow
- Test state transitions and context preservation
- Verify checkpoint/rollback functionality
- Confirm the system is truly generic
Test 1: Customer Support Ticket Workflow
Step 1.1: Define the Workflow
Use workflow.define to create a customer support ticket system:
{
"name": "support-ticket",
"description": "Customer support ticket lifecycle",
"states": {
"new": {
"transitions": {
"assign": "assigned",
"close": "closed"
}
},
"assigned": {
"transitions": {
"start": "in_progress",
"escalate": "escalated",
"unassign": "new"
}
},
"in_progress": {
"transitions": {
"resolve": "resolved",
"escalate": "escalated",
"need_info": "waiting_customer"
}
},
"waiting_customer": {
"transitions": {
"customer_replied": "in_progress",
"timeout": "closed"
}
},
"escalated": {
"transitions": {
"resolve": "resolved",
"close": "closed"
}
},
"resolved": {
"transitions": {
"reopen": "in_progress",
"close": "closed"
}
},
"closed": { "final": true }
},
"initialState": "new"
}
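Before sending a definition like the one above, it can help to sanity-check it locally. The following is a minimal sketch, not the server's own validation: `validate_workflow` is a hypothetical helper, and the three rules it checks (the initial state exists, every transition target is defined, final states have no outgoing transitions) are assumptions about what a well-formed definition looks like.

```python
def validate_workflow(defn):
    """Return a list of problems; an empty list means the definition looks sound.

    The rules here are assumptions, not the server's documented validation.
    """
    problems = []
    states = defn.get("states", {})
    initial = defn.get("initialState")
    if initial not in states:
        problems.append(f"initialState {initial!r} is not a defined state")
    for name, spec in states.items():
        transitions = spec.get("transitions", {})
        is_final = spec.get("final", False)
        if is_final and transitions:
            problems.append(f"final state {name!r} declares transitions")
        if not is_final and not transitions:
            problems.append(f"non-final state {name!r} has no outgoing transitions")
        for action, target in transitions.items():
            if target not in states:
                problems.append(f"{name!r} --{action}--> undefined state {target!r}")
    return problems


# The support-ticket definition from Step 1.1, written as a Python dict.
support_ticket = {
    "name": "support-ticket",
    "initialState": "new",
    "states": {
        "new": {"transitions": {"assign": "assigned", "close": "closed"}},
        "assigned": {"transitions": {"start": "in_progress", "escalate": "escalated", "unassign": "new"}},
        "in_progress": {"transitions": {"resolve": "resolved", "escalate": "escalated", "need_info": "waiting_customer"}},
        "waiting_customer": {"transitions": {"customer_replied": "in_progress", "timeout": "closed"}},
        "escalated": {"transitions": {"resolve": "resolved", "close": "closed"}},
        "resolved": {"transitions": {"reopen": "in_progress", "close": "closed"}},
        "closed": {"final": True},
    },
}

# A deliberately broken definition: 'assigned' is referenced but never defined.
broken = {"initialState": "new", "states": {"new": {"transitions": {"assign": "assigned"}}}}

print(validate_workflow(support_ticket))
print(validate_workflow(broken))
```

If the server accepts a definition that this check flags, that is worth noting in your test report rather than treating it as a failure, since the server's actual rules may differ.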
Step 1.2: Run the Workflow
- Start a support ticket with context: {ticketId: "T-001", customer: "alice@example.com", issue: "Cannot login", priority: "high"}
- Assign to agent: action='assign', data={agent: "bob@support.com", assignedAt: "<timestamp>"}
- Start working: action='start'
- Create checkpoint: description='Ticket in progress'
- Need customer info: action='need_info', data={question: "Which browser are you using?"}
- Customer replies: action='customer_replied', data={response: "Chrome v120", repliedAt: "<timestamp>"}
- Resolve ticket: action='resolve', data={solution: "Cleared browser cache", resolvedAt: "<timestamp>"}
- Close ticket: action='close'
- Verify final state is 'closed' and context contains full history
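To know what to expect from the final verification step, the run above can be modelled in a few lines. This is a sketch under stated assumptions: the helpers `start` and `advance` are hypothetical, and it assumes the server shallow-merges each `data` payload into the context and records one history entry per transition. The checkpoint step and timestamp fields are left out for brevity.

```python
# Transition table for the support-ticket workflow from Step 1.1.
TRANSITIONS = {
    "new": {"assign": "assigned", "close": "closed"},
    "assigned": {"start": "in_progress", "escalate": "escalated", "unassign": "new"},
    "in_progress": {"resolve": "resolved", "escalate": "escalated", "need_info": "waiting_customer"},
    "waiting_customer": {"customer_replied": "in_progress", "timeout": "closed"},
    "escalated": {"resolve": "resolved", "close": "closed"},
    "resolved": {"reopen": "in_progress", "close": "closed"},
    "closed": {},  # final state: no outgoing transitions
}


def start(context):
    """Create a new instance in the initial state (hypothetical helper)."""
    return {"state": "new", "context": dict(context), "history": []}


def advance(instance, action, data=None):
    """Apply one action; assumes data is shallow-merged into the context."""
    state = instance["state"]
    allowed = TRANSITIONS[state]
    if action not in allowed:
        raise ValueError(f"action {action!r} is not valid in state {state!r}")
    instance["history"].append({"from": state, "action": action, "to": allowed[action]})
    instance["state"] = allowed[action]
    instance["context"].update(data or {})
    return instance


ticket = start({"ticketId": "T-001", "customer": "alice@example.com",
                "issue": "Cannot login", "priority": "high"})
advance(ticket, "assign", {"agent": "bob@support.com"})
advance(ticket, "start")
advance(ticket, "need_info", {"question": "Which browser are you using?"})
advance(ticket, "customer_replied", {"response": "Chrome v120"})
advance(ticket, "resolve", {"solution": "Cleared browser cache"})
advance(ticket, "close")

print(ticket["state"])
print(sorted(ticket["context"]))
```

Under these assumptions the instance ends in 'closed' with six history entries, and the context holds the original ticket fields plus everything added along the way.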
Test 2: Document Approval Workflow
Step 2.1: Define the Workflow
Use workflow.define to create a document approval process:
{
"name": "document-approval",
"description": "Multi-level document approval",
"states": {
"draft": {
"transitions": {
"submit": "level1_review",
"delete": "deleted"
}
},
"level1_review": {
"transitions": {
"approve": "level2_review",
"reject": "draft",
"request_changes": "draft"
}
},
"level2_review": {
"transitions": {
"approve": "approved",
"reject": "level1_review",
"request_changes": "draft"
}
},
"approved": {
"transitions": {
"publish": "published",
"archive": "archived"
}
},
"published": { "final": true },
"archived": { "final": true },
"deleted": { "final": true }
},
"initialState": "draft"
}
Step 2.2: Run the Workflow
- Start with context: {documentId: "DOC-2024-001", title: "Q4 Report", author: "finance@company.com"}
- Submit for review: action='submit', data={submittedAt: "<timestamp>"}
- Level 1 requests changes: action='request_changes', data={comments: ["Add revenue projections"]}
- Resubmit: action='submit', data={changes: "Added projections", version: 2}
- Level 1 approves: action='approve', data={approver: "manager@company.com"}
- Create checkpoint: description='Before final approval'
- Level 2 rejects: action='reject', data={reason: "Need CEO input"}
- List checkpoints and roll back to 'Before final approval'
- Level 2 approves: action='approve', data={approver: "director@company.com"}
- Publish: action='publish', data={publishedAt: "<timestamp>", url: "https://..."}
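The checkpoint and rollback steps in this run can be sketched as follows. This is an illustrative model, not the server's implementation: `checkpoint` and `rollback` are hypothetical helpers, and the sketch assumes a checkpoint snapshots both the current state and a copy of the context, so that rolling back discards data added afterwards (here, the rejection reason). The real server may keep that data; record what you observe.

```python
import copy

# Transition table for the document-approval workflow from Step 2.1.
TRANSITIONS = {
    "draft": {"submit": "level1_review", "delete": "deleted"},
    "level1_review": {"approve": "level2_review", "reject": "draft", "request_changes": "draft"},
    "level2_review": {"approve": "approved", "reject": "level1_review", "request_changes": "draft"},
    "approved": {"publish": "published", "archive": "archived"},
    "published": {}, "archived": {}, "deleted": {},  # final states
}


def advance(inst, action, data=None):
    target = TRANSITIONS[inst["state"]].get(action)
    if target is None:
        raise ValueError(f"action {action!r} is not valid in state {inst['state']!r}")
    inst["state"] = target
    inst["context"].update(data or {})


def checkpoint(inst, description):
    """Assumption: a checkpoint stores the state plus a deep copy of the context."""
    inst["checkpoints"].append({"description": description, "state": inst["state"],
                                "context": copy.deepcopy(inst["context"])})


def rollback(inst, description):
    """Restore the most recent checkpoint with a matching description."""
    for cp in reversed(inst["checkpoints"]):
        if cp["description"] == description:
            inst["state"] = cp["state"]
            inst["context"] = copy.deepcopy(cp["context"])
            return
    raise KeyError(description)


doc = {"state": "draft", "context": {"documentId": "DOC-2024-001"}, "checkpoints": []}
advance(doc, "submit")
advance(doc, "request_changes", {"comments": ["Add revenue projections"]})
advance(doc, "submit", {"changes": "Added projections", "version": 2})
advance(doc, "approve", {"approver": "manager@company.com"})
checkpoint(doc, "Before final approval")
advance(doc, "reject", {"reason": "Need CEO input"})
state_after_reject = doc["state"]          # level1_review
rollback(doc, "Before final approval")
state_after_rollback = doc["state"]        # level2_review
advance(doc, "approve", {"approver": "director@company.com"})
advance(doc, "publish")
print(state_after_reject, state_after_rollback, doc["state"])
```

The key observation for the test is that the second 'approve' only reaches 'approved' if the rollback really restored 'level2_review'; from 'level1_review' the same action would land back in 'level2_review'.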
Test 3: Order Processing State Machine
Step 3.1: Define the Workflow
Use workflow.define to create an e-commerce order workflow:
{
"name": "order-processing",
"description": "E-commerce order fulfillment",
"states": {
"pending": {
"transitions": {
"pay": "paid",
"cancel": "cancelled"
}
},
"paid": {
"transitions": {
"process": "processing",
"refund": "refunded"
}
},
"processing": {
"transitions": {
"ship": "shipped",
"backorder": "backordered",
"cancel": "cancelled"
}
},
"backordered": {
"transitions": {
"ship": "shipped",
"cancel": "cancelled"
}
},
"shipped": {
"transitions": {
"deliver": "delivered",
"return": "returned"
}
},
"delivered": { "final": true },
"returned": { "final": true },
"refunded": { "final": true },
"cancelled": { "final": true }
},
"initialState": "pending"
}
Step 3.2: Run the Workflow
- Start order: context={orderId: "ORD-123", items: ["laptop", "mouse"], total: 1200}
- Process payment: action='pay', data={paymentId: "PAY-456", method: "credit_card"}
- Start processing: action='process'
- Ship order: action='ship', data={trackingNumber: "TRACK-789", carrier: "FedEx"}
- Deliver: action='deliver', data={deliveredAt: "<timestamp>", signature: "John Doe"}
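The run above only exercises the happy path. To see what state enforcement should look like, the sketch below models two invalid attempts against the same definition: shipping before payment, and acting on an order that is already in a final state. `try_action` is a hypothetical helper, and the error wording is invented for illustration; the server's actual messages will differ.

```python
# Transition table for the order-processing workflow from Step 3.1.
TRANSITIONS = {
    "pending": {"pay": "paid", "cancel": "cancelled"},
    "paid": {"process": "processing", "refund": "refunded"},
    "processing": {"ship": "shipped", "backorder": "backordered", "cancel": "cancelled"},
    "backordered": {"ship": "shipped", "cancel": "cancelled"},
    "shipped": {"deliver": "delivered", "return": "returned"},
    "delivered": {}, "returned": {}, "refunded": {}, "cancelled": {},  # final states
}


def try_action(state, action):
    """Return (ok, resulting_state, error). The state is unchanged on failure."""
    allowed = TRANSITIONS[state]
    if action in allowed:
        return True, allowed[action], None
    if not allowed:
        return False, state, f"state {state!r} is final; no actions are accepted"
    return False, state, (f"action {action!r} is not valid in {state!r}; "
                          f"expected one of {sorted(allowed)}")


skip_payment = try_action("pending", "ship")       # tries to skip 'pay' and 'process'
after_delivery = try_action("delivered", "return")  # acts on a final state
happy = try_action("pending", "pay")

for result in (skip_payment, after_delivery, happy):
    print(result)
```

Issuing one or two such invalid actions against the real server, and confirming that the instance stays in its current state, gives direct evidence for the enforcement items in the checklist.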
Test 4: Feature Flag Rollout
Step 4.1: Define the Workflow
Use workflow.define to create a feature flag rollout process:
{
"name": "feature-rollout",
"description": "Gradual feature flag deployment",
"states": {
"planning": {
"transitions": {
"approve": "canary",
"reject": "cancelled"
}
},
"canary": {
"transitions": {
"expand": "partial",
"rollback": "rolled_back"
}
},
"partial": {
"transitions": {
"expand": "full",
"rollback": "canary",
"emergency_stop": "rolled_back"
}
},
"full": {
"transitions": {
"finalize": "completed",
"rollback": "partial"
}
},
"completed": { "final": true },
"rolled_back": { "final": true },
"cancelled": { "final": true }
},
"initialState": "planning"
}
Step 4.2: Run with Checkpoints
- Start: context={feature: "dark-mode", targetUsers: 1000000}
- Approve: action='approve', data={approvedBy: "product-team"}
- Create checkpoint: description='Canary deployment started'
- Expand to partial: action='expand', data={percentage: 10, metrics: {errors: 0}}
- Create checkpoint: description='10% rollout stable'
- Simulate an error scenario: action='emergency_stop'
- List all checkpoints
- Roll back to the '10% rollout stable' checkpoint
- Continue expansion: action='expand', data={percentage: 100}
- Finalize: action='finalize'
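This workflow uses the word "rollback" in two senses: as a transition name defined in the workflow (for example partial → canary) and as the checkpoint operation that restores a saved state. The sketch below keeps the two apart. It is a simplified model with hypothetical names (`apply_action`, a plain `checkpoints` dict), and it assumes that restoring a checkpoint is permitted even when the instance sits in a final state, which Step 4.2 relies on.

```python
# Transition table for the feature-rollout workflow from Step 4.1.
TRANSITIONS = {
    "planning": {"approve": "canary", "reject": "cancelled"},
    "canary": {"expand": "partial", "rollback": "rolled_back"},
    "partial": {"expand": "full", "rollback": "canary", "emergency_stop": "rolled_back"},
    "full": {"finalize": "completed", "rollback": "partial"},
    "completed": {}, "rolled_back": {}, "cancelled": {},  # final states
}


def apply_action(state, action):
    """Follow a workflow transition; 'rollback' here is just another action name."""
    target = TRANSITIONS[state].get(action)
    if target is None:
        raise ValueError(f"action {action!r} is not valid in state {state!r}")
    return target


checkpoints = {}  # description -> saved state (context omitted for brevity)

state = "planning"
state = apply_action(state, "approve")
checkpoints["Canary deployment started"] = state
state = apply_action(state, "expand")
checkpoints["10% rollout stable"] = state
state = apply_action(state, "emergency_stop")   # now in final state 'rolled_back'

# The 'rollback' ACTION is not defined on the final state, so it must fail...
try:
    apply_action(state, "rollback")
    action_allowed = True
except ValueError:
    action_allowed = False

# ...but restoring a CHECKPOINT bypasses the transition table entirely.
state = checkpoints["10% rollout stable"]
state = apply_action(state, "expand")
state = apply_action(state, "finalize")
print(action_allowed, state)
```

When running Step 4.2, make sure you invoke the checkpoint rollback operation rather than sending action='rollback', which the definition does not allow from 'rolled_back'.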
Test 5: Complex Scenario - Interview Process
Step 5.1: Define the Workflow
Create an interview process with multiple paths:
{
"name": "interview-process",
"description": "Candidate interview workflow",
"states": {
"applied": {
"transitions": {
"screen": "screening",
"reject": "rejected"
}
},
"screening": {
"transitions": {
"pass": "phone_interview",
"fail": "rejected"
}
},
"phone_interview": {
"transitions": {
"pass": "technical_interview",
"fail": "rejected",
"no_show": "rescheduling"
}
},
"rescheduling": {
"transitions": {
"reschedule": "phone_interview",
"withdraw": "withdrawn"
}
},
"technical_interview": {
"transitions": {
"pass": "final_interview",
"fail": "rejected",
"maybe": "additional_round"
}
},
"additional_round": {
"transitions": {
"pass": "final_interview",
"fail": "rejected"
}
},
"final_interview": {
"transitions": {
"hire": "offer_extended",
"reject": "rejected"
}
},
"offer_extended": {
"transitions": {
"accept": "hired",
"decline": "declined",
"negotiate": "negotiating"
}
},
"negotiating": {
"transitions": {
"accept": "hired",
"decline": "declined"
}
},
"hired": { "final": true },
"rejected": { "final": true },
"declined": { "final": true },
"withdrawn": { "final": true }
},
"initialState": "applied"
}
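With this many states, it is worth confirming that the definition has no unreachable states and no dead ends before running it. The sketch below does this with a breadth-first search over the transition table. It is a local check under the assumption that a state with no transitions is final; `reachable_from` is a hypothetical helper, not part of the server.

```python
from collections import deque

# Transition table for the interview-process workflow from Step 5.1.
TRANSITIONS = {
    "applied": {"screen": "screening", "reject": "rejected"},
    "screening": {"pass": "phone_interview", "fail": "rejected"},
    "phone_interview": {"pass": "technical_interview", "fail": "rejected", "no_show": "rescheduling"},
    "rescheduling": {"reschedule": "phone_interview", "withdraw": "withdrawn"},
    "technical_interview": {"pass": "final_interview", "fail": "rejected", "maybe": "additional_round"},
    "additional_round": {"pass": "final_interview", "fail": "rejected"},
    "final_interview": {"hire": "offer_extended", "reject": "rejected"},
    "offer_extended": {"accept": "hired", "decline": "declined", "negotiate": "negotiating"},
    "negotiating": {"accept": "hired", "decline": "declined"},
    "hired": {}, "rejected": {}, "declined": {}, "withdrawn": {},  # final states
}
INITIAL = "applied"


def reachable_from(start, graph):
    """Return every state reachable from start (including start) via BFS."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for target in graph[state].values():
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


final_states = {s for s, t in TRANSITIONS.items() if not t}
# States that can never be entered from the initial state.
unreachable = set(TRANSITIONS) - reachable_from(INITIAL, TRANSITIONS)
# States from which no final state can ever be reached.
dead_ends = {s for s in TRANSITIONS if not (reachable_from(s, TRANSITIONS) & final_states)}

print(sorted(final_states))
print(unreachable, dead_ends)
```

For this definition both sets come out empty: all thirteen states are reachable from 'applied', and each of them can eventually reach one of the four final states.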
Step 5.2: Run Complex Scenario
- Start: context={candidateId: "C-001", position: "Senior Engineer", appliedAt: "<timestamp>"}
- Screen candidate: action='screen'
- Pass screening: action='pass', data={score: 85}
- No show for phone interview: action='no_show'
- Reschedule: action='reschedule', data={newDate: "<future-date>"}
- Pass phone interview: action='pass', data={interviewer: "tech-lead"}
- Technical interview needs additional round: action='maybe', data={reason: "Need to assess system design"}
- Pass additional round: action='pass'
- Pass final interview and extend offer: action='hire', data={salary: 150000, startDate: "<date>"}
- Candidate negotiates: action='negotiate', data={requestedSalary: 165000}
- Accept negotiated offer: action='accept', data={finalSalary: 160000}
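The same action name resolves to different targets depending on the current state: 'pass' appears four times in this run and lands somewhere different each time, and 'hire' moves the candidate into 'offer_extended' rather than 'hired'. The sketch below replays the action sequence against the definition and records the state after each step, so you can compare it with what the server reports. `replay` is a hypothetical helper, and context data is omitted.

```python
# Transition table for the interview-process workflow from Step 5.1.
TRANSITIONS = {
    "applied": {"screen": "screening", "reject": "rejected"},
    "screening": {"pass": "phone_interview", "fail": "rejected"},
    "phone_interview": {"pass": "technical_interview", "fail": "rejected", "no_show": "rescheduling"},
    "rescheduling": {"reschedule": "phone_interview", "withdraw": "withdrawn"},
    "technical_interview": {"pass": "final_interview", "fail": "rejected", "maybe": "additional_round"},
    "additional_round": {"pass": "final_interview", "fail": "rejected"},
    "final_interview": {"hire": "offer_extended", "reject": "rejected"},
    "offer_extended": {"accept": "hired", "decline": "declined", "negotiate": "negotiating"},
    "negotiating": {"accept": "hired", "decline": "declined"},
    "hired": {}, "rejected": {}, "declined": {}, "withdrawn": {},  # final states
}


def replay(initial, actions):
    """Return the list of states visited, starting with the initial state."""
    trace = [initial]
    for action in actions:
        state = trace[-1]
        if action not in TRANSITIONS[state]:
            raise ValueError(f"action {action!r} is not valid in state {state!r}")
        trace.append(TRANSITIONS[state][action])
    return trace


actions = ["screen", "pass", "no_show", "reschedule", "pass",
           "maybe", "pass", "hire", "negotiate", "accept"]
trace = replay("applied", actions)
print(" -> ".join(trace))
```

Note that 'phone_interview' is visited twice because of the no-show, and that the run ends in 'hired' only after passing through 'offer_extended' and 'negotiating'.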
Verification Checklist
After completing all tests, verify:
- Dynamic Definition: You successfully defined 5 different workflows on the fly
- State Enforcement: Each workflow enforced its defined transitions (couldn't skip states)
- Context Preservation: All data added during transitions was preserved
- No Hardcoding: The system handled completely different domains without any domain-specific logic
- Checkpoints Work: Successfully created and rolled back to checkpoints
- Multiple Workflows: Could run different workflow types simultaneously
- Proper Completion: Workflows only completed when reaching actual final states
- Error Prevention: Invalid actions (for example, 'ship' on an order that is still 'pending') were rejected with clear errors
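For the "Multiple Workflows" item, the property to look for is that instances of different workflow types keep separate state and context even when their transitions are interleaved. The sketch below models that with an in-memory registry. All names (`WORKFLOWS`, `instances`, `start`, `advance`, and the instance ids) are hypothetical and only illustrate the expected behaviour.

```python
# Two of the workflows defined earlier, reduced to the transitions used here.
WORKFLOWS = {
    "order-processing": {
        "initial": "pending",
        "transitions": {
            "pending": {"pay": "paid", "cancel": "cancelled"},
            "paid": {"process": "processing", "refund": "refunded"},
        },
    },
    "feature-rollout": {
        "initial": "planning",
        "transitions": {
            "planning": {"approve": "canary", "reject": "cancelled"},
            "canary": {"expand": "partial", "rollback": "rolled_back"},
        },
    },
}

instances = {}  # instance id -> {"workflow", "state", "context"}


def start(instance_id, workflow, context):
    instances[instance_id] = {"workflow": workflow,
                              "state": WORKFLOWS[workflow]["initial"],
                              "context": dict(context)}


def advance(instance_id, action, data=None):
    inst = instances[instance_id]
    table = WORKFLOWS[inst["workflow"]]["transitions"]
    allowed = table.get(inst["state"], {})
    if action not in allowed:
        raise ValueError(f"{instance_id}: action {action!r} is not valid in {inst['state']!r}")
    inst["state"] = allowed[action]
    inst["context"].update(data or {})


start("order-1", "order-processing", {"orderId": "ORD-123"})
start("flag-1", "feature-rollout", {"feature": "dark-mode"})

# Interleave transitions across the two instances.
advance("order-1", "pay", {"paymentId": "PAY-456"})
advance("flag-1", "approve", {"approvedBy": "product-team"})
advance("order-1", "process")
advance("flag-1", "expand", {"percentage": 10})

print(instances["order-1"]["state"], instances["flag-1"]["state"])
```

If the real server behaves the same way, neither instance should ever show keys from the other's context, and an action valid in one workflow (such as 'pay') should be rejected on the other.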
Success Criteria
The test is successful if:
- All 5 workflows were defined and executed without any hardcoded logic
- Context was never lost between transitions
- The system prevented invalid state transitions
- Checkpoints and rollbacks worked in every workflow that used them (checkpoints in Tests 1, 2, and 4; rollbacks in Tests 2 and 4)
- Each workflow reached its final state only through valid paths
Additional Tests (Optional)
Try defining your own creative workflows:
- A game state machine (menu → playing → paused → game_over)
- A content moderation flow
- A subscription lifecycle
- A build/deployment pipeline
- Any multi-step process you can imagine
The system should handle ANY valid state machine you can define!
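As a starting point, the game state machine suggested above could be written in the same shape as the earlier definitions. This is only a sketch: the workflow name, description, and the action names ('start', 'pause', 'resume', 'quit', 'lose') are invented for illustration, and the dict is serialised to JSON so it can be pasted into a workflow.define call.

```python
import json

# Hypothetical definition for: menu -> playing -> paused -> game_over
game = {
    "name": "game-session",
    "description": "Simple game loop",
    "states": {
        "menu": {"transitions": {"start": "playing"}},
        "playing": {"transitions": {"pause": "paused", "lose": "game_over"}},
        "paused": {"transitions": {"resume": "playing", "quit": "game_over"}},
        "game_over": {"final": True},
    },
    "initialState": "menu",
}

# Every transition target should name a defined state before the definition is sent.
targets = {t for spec in game["states"].values()
           for t in spec.get("transitions", {}).values()}
undefined = targets - set(game["states"])

payload = json.dumps(game, indent=2)
print(payload)
```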