agents SDK when deployed to Cloudflare Workers. Subsequent requests in the same chat room stall unpredictably at random steps, while the same code works perfectly locally. The reproduction adds a fetch call to check whether network calls are also broken once the agent stalls.
Environment: Cloudflare Workers (production deployment)
Affected: Agents SDK chat flow with multiple requests per session
Status: ❌ Broken in Workers, ✅ Works locally
We're migrating a Node.js project using the Agents SDK to Cloudflare Workers. The application uses:
- One agent per user serving multiple chat sessions
- SQLite for internal session mapping
- E2B sandbox integration
- Similar flow to the official chat bot example
- Deploy this starter to Cloudflare Workers
- Send the first chat message → ✅ Works perfectly
- Send multiple messages in the same chat room but different sessions → ❌ Hangs indefinitely after some requests (~4 in this minimal reproduction, but about 1-2 requests in our production environment)
- Subsequent requests stall at random steps in production:
  - Authentication
  - E2B sandbox loading
  - Tool initialization
  - Network calls never resolve or time out
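The fetch check mentioned in the summary could look roughly like the sketch below. It is not the code from the reproduction: `probeNetwork`, `FetchLike`, and the 5-second default are hypothetical names and values chosen for illustration. The idea is to abort the call after a deadline so that a stalled network request is reported instead of hanging silently.

```typescript
// Hypothetical helper: not part of the Agents SDK or the starter template.
type FetchLike = (
  url: string,
  init?: { signal?: AbortSignal },
) => Promise<{ status: number; body?: { cancel(): Promise<void> } | null }>;

interface ProbeResult {
  ok: boolean;
  elapsedMs: number;
  detail: string;
}

// Calls `url` and aborts after `timeoutMs`, so a stalled network call
// produces a result instead of blocking the request forever.
// This relies on the fetch implementation honoring the abort signal.
async function probeNetwork(
  url: string,
  timeoutMs = 5000,
  fetchImpl: FetchLike = fetch,
): Promise<ProbeResult> {
  const controller = new AbortController();
  const started = Date.now();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(url, { signal: controller.signal });
    // The body is not needed here, so cancel it to release the connection.
    await res.body?.cancel();
    return { ok: true, elapsedMs: Date.now() - started, detail: `status ${res.status}` };
  } catch (err) {
    const detail = controller.signal.aborted
      ? `no response within ${timeoutMs} ms`
      : String(err);
    return { ok: false, elapsedMs: Date.now() - started, detail };
  } finally {
    clearTimeout(timer);
  }
}
```

Logging the result of `probeNetwork` at the start of each chat request would show whether plain outbound calls stop working at the same moment the agent stalls.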
| Environment | First Request | Subsequent Requests |
|---|---|---|
| Local Development | ✅ Works | ✅ Works |
| Cloudflare Workers | ✅ Works | ❌ Hangs randomly |
// This flow works locally but fails on Workers after first request
1. Authentication
2. Load E2B sandbox
3. Billing service check
4. Initialize tools
5. Return stream from Agents SDK // ← Hangs here or at earlier steps
- Cloudflare account
- OpenAI API key
- This exact starter template
- Clone and Install:
  npx create-cloudflare@latest --template cloudflare/agents-starter
  cd agents-starter
  npm install
- Configure Environment:
  # Create .dev.vars file
  echo "OPENAI_API_KEY=your_openai_api_key" > .dev.vars
- Test Locally (this works):
  npm start
  # Open browser, send multiple messages → All work fine
- Deploy to Workers (this breaks):
  npm run deploy
  # Visit deployed URL, send first message → Works
  # Send second message → Hangs indefinitely
Architecture:
- One agent instance per user
- Multiple chat sessions mapped via SQLite
- Hyperdrive connection to PostgreSQL (GCP)
- E2B sandbox integration
- Streaming responses via Agents SDK
Failure Pattern:
- ✅ First request after deployment: Always succeeds
- ❌ Subsequent requests: Hang at unpredictable steps
- 🔄 No timeout or error - requests just never resolve
- 🏠 Local development: No issues whatsoever
Affected Components:
- Authentication flow
- E2B sandbox initialization
- Tool system setup
- Agents SDK streaming
- Network calls in general
- Timing: Issue started recently (wasn't happening before)
- Scope: Only affects Cloudflare Workers deployment
- Pattern: First request always works, subsequent ones fail
- Randomness: Failure occurs at different steps unpredictably
- No Errors: Requests don't time out or throw errors, they just hang
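Because the hanging requests never time out or throw, one way to make the failing step visible is to wrap each step of the flow in a watchdog. The sketch below is only an illustration under assumptions: `withTimeout` and `StepTimeoutError` are hypothetical names, not part of the Agents SDK, and the wrapped work is not cancelled when the watchdog fires. The caller is simply unblocked with an error that names the step.

```typescript
// Hypothetical helper: turns a silent hang into an error naming the step.
class StepTimeoutError extends Error {
  readonly step: string;

  constructor(step: string, timeoutMs: number) {
    super(`step "${step}" did not settle within ${timeoutMs} ms`);
    this.name = "StepTimeoutError";
    this.step = step;
  }
}

// Races `work()` against a timer. Whichever settles first wins.
// Note: the underlying work keeps running after a timeout; this only
// stops the caller from waiting on it forever.
async function withTimeout<T>(
  step: string,
  timeoutMs: number,
  work: () => Promise<T>,
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const watchdog = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new StepTimeoutError(step, timeoutMs)), timeoutMs);
  });
  try {
    return await Promise.race([work(), watchdog]);
  } finally {
    clearTimeout(timer);
  }
}

// Usage (authenticate and loadSandbox are placeholders for the real steps):
//   const user = await withTimeout("authentication", 10_000, () => authenticate(request));
//   const sandbox = await withTimeout("e2b-sandbox", 30_000, () => loadSandbox(user));
```

With every step wrapped this way, a stalled request would fail with a `StepTimeoutError` whose `step` field shows where it stopped, which is easier to correlate across requests than a request that never returns.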
- Workers Runtime Differences:
  - Different event loop behavior
  - Request/response lifecycle differences
  - Memory or state management issues
  - Max allowed connections
- Agents SDK Integration:
  - Potential Workers-specific compatibility issue
  - State persistence between requests
  - Streaming response handling
- External Dependencies:
  - E2B sandbox connection pooling
  - Hyperdrive connection management
  - SQLite state between requests
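On the "Max allowed connections" point: Cloudflare's platform limits describe a cap on simultaneous open outbound connections per Worker invocation (six, as far as we know at the time of writing), and a response whose body is never read or cancelled may keep its connection open. Whether that is what happens here is unconfirmed. As a way to test the hypothesis, the sketch below caps concurrent outbound calls with a small limiter; `createLimiter` is a hypothetical helper, and the limit of 4 in the usage comment is an arbitrary value below the documented cap.

```typescript
// Hypothetical helper: caps how many async tasks run at the same time.
function createLimiter(maxConcurrent: number) {
  let active = 0;
  const waiting: Array<() => void> = [];

  return async function run<T>(task: () => Promise<T>): Promise<T> {
    if (active >= maxConcurrent) {
      // Wait until a finishing task hands its slot over to us.
      await new Promise<void>((resolve) => waiting.push(resolve));
    } else {
      active++;
    }
    try {
      return await task();
    } finally {
      const next = waiting.shift();
      if (next) {
        next(); // pass the slot directly to the next waiter
      } else {
        active--; // nobody waiting, free the slot
      }
    }
  };
}

// Usage: route every outbound call through the same limiter and release
// the connection when the body is not needed.
//   const limit = createLimiter(4);
//   const status = await limit(async () => {
//     const res = await fetch(url);
//     await res.body?.cancel();
//     return res.status;
//   });
```

If the hangs disappear once outbound calls go through the limiter and unused bodies are cancelled, that would point toward connection exhaustion rather than an Agents SDK issue.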
- Confirmed local development works perfectly
- Verified first request always succeeds in Workers
- Identified random failure points in subsequent requests
- Ruled out API key or authentication issues
- Need investigation into Workers-specific behavior
- Need Agents SDK team input on Workers compatibility
├── src/
│ ├── app.tsx # Chat UI implementation
│ ├── server.ts # ⚠️ Main agent logic (where issues occur)
│ ├── tools.ts # Tool definitions (hangs during init)
│ ├── utils.ts # Helper functions
│ └── styles.css # UI styling
├── wrangler.jsonc # Workers configuration
└── .dev.vars.example # Environment template
- src/server.ts: Contains the main chat flow that hangs
- src/tools.ts: Tool initialization that sometimes fails
- wrangler.jsonc: Workers configuration that might affect behavior
- Network calls: Any external API calls that hang in Workers
- Workers Runtime Investigation:
  - Are there known issues with persistent connections in Workers?
  - How should long-running agent sessions be handled?
  - Any Workers-specific considerations for the Agents SDK?
- Debugging Assistance:
  - Best practices for debugging hanging requests in Workers
  - Logging/monitoring recommendations for this type of issue
  - Workers-specific profiling tools
- Workers Compatibility:
  - Is the Agents SDK fully tested on Cloudflare Workers?
  - Any known limitations or required configurations?
  - Recommended patterns for multi-request agent sessions?
- State Management:
  - How should agent state persist between requests in Workers?
  - Are there Workers-specific initialization patterns?
  - Connection pooling best practices?
- Similar Issues:
  - Has anyone experienced similar hanging request issues?
  - Any workarounds or solutions found?
  - Alternative deployment patterns that work?
- Testing Help:
  - Can others reproduce this issue with the same setup?
  - Different Workers configurations to try?
  - Alternative agent architectures that work reliably?
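For anyone trying to reproduce this, one low-tech way to narrow down where a request stops is to log the start and end of every step with a request identifier. The sketch below is an assumption about how that could be done, not existing code from the starter: `createStepTracer` is a hypothetical helper that writes one JSON line per event, which can then be followed with `wrangler tail` on the deployed Worker.

```typescript
// Hypothetical helper: logs one JSON line when a step starts and another
// when it finishes or throws. A "start" line with no matching "done" or
// "error" line marks the step that hung.
type LogFn = (line: string) => void;

function createStepTracer(requestId: string, log: LogFn = console.log) {
  return async function trace<T>(step: string, work: () => Promise<T>): Promise<T> {
    const started = Date.now();
    log(JSON.stringify({ requestId, step, phase: "start" }));
    try {
      const result = await work();
      log(JSON.stringify({ requestId, step, phase: "done", ms: Date.now() - started }));
      return result;
    } catch (err) {
      log(
        JSON.stringify({
          requestId,
          step,
          phase: "error",
          ms: Date.now() - started,
          error: String(err),
        }),
      );
      throw err;
    }
  };
}

// Usage (the step bodies are placeholders for the real flow):
//   const trace = createStepTracer(crypto.randomUUID());
//   await trace("authentication", () => authenticate(request));
//   await trace("initialize-tools", () => initTools());
```

Sharing the resulting log lines for a hanging request alongside a working one would make reports from different setups directly comparable.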
Original starter template documentation:
- 💬 Interactive chat interface with AI
- 🛠️ Built-in tool system with human-in-the-loop confirmation
- 📅 Advanced task scheduling (one-time, delayed, and recurring via cron)
- 🌓 Dark/Light theme support
- ⚡️ Real-time streaming responses
- 🔄 State management and chat history
- 🎨 Modern, responsive UI
Add new tools in tools.ts using the tool builder:
import { tool } from "ai";
import { z } from "zod";

// Example of a tool that requires confirmation
const searchDatabase = tool({
  description: "Search the database for user records",
  parameters: z.object({
    query: z.string(),
    limit: z.number().optional(),
  }),
  // No execute function = requires confirmation
});

// Example of an auto-executing tool
const getCurrentTime = tool({
  description: "Get current server time",
  parameters: z.object({}),
  execute: async () => new Date().toISOString(),
});

The starting implementation uses the ai-sdk and OpenAI provider, but you can use alternatives like workers-ai-provider or anthropic.
MIT
