feature · workflows · volumes

Workflows That Pass Files, Not Just Text

2 min read

Until now, workflow steps communicated by passing text. Step 1's output got pasted into step 2's prompt. It worked for summaries and short answers, but fell apart when the output was a CSV, a JSON blob, or anything too large to stuff into a prompt.

Now steps share a filesystem and write structured output.

Shared volumes

Every multi-step workflow run gets a shared directory at /home/daytona/shared. Files written by step 1 are immediately available to step 2. No serialization, no token overhead, no size limits beyond disk space.

Write a spreadsheet in step 1, read it in step 2. Download an image in step 1, process it in step 3. The filesystem is the interface.
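A minimal sketch of that handoff in Python. The step functions are hypothetical, and a temp directory stands in for /home/daytona/shared so the snippet runs anywhere:

```python
import csv
import tempfile
from pathlib import Path

# Stand-in for /home/daytona/shared; in a real run the workflow provisions it.
SHARED = Path(tempfile.mkdtemp())

def step_1():
    # Write a spreadsheet for later steps.
    with open(SHARED / "competitors.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "url"])
        writer.writerow(["Acme", "https://acme.example"])

def step_2():
    # Read what step 1 wrote -- same directory, no serialization.
    with open(SHARED / "competitors.csv", newline="") as f:
        return list(csv.DictReader(f))

step_1()
rows = step_2()
print(rows[0]["name"])
```

Step 2 never sees step 1's prompt or output text; it just opens the file.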

A 3-step research pipeline looks like this:

  • Step 1 scrapes URLs and writes /home/daytona/shared/competitors.csv
  • Step 2 reads that CSV, fetches each page, appends extracted data
  • Step 3 reads the enriched data and writes /home/daytona/shared/report.md

Each step sees the same directory, so nothing needs to fit in a prompt and nothing has to be re-parsed from text.
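Steps 2 and 3 of that pipeline might look like this sketch, assuming step 1 already wrote the CSV. The extraction is a placeholder and a temp directory stands in for the shared volume:

```python
import csv
import tempfile
from pathlib import Path

shared = Path(tempfile.mkdtemp())  # stand-in for /home/daytona/shared

# Step 1's output (normally written by the scraping step).
(shared / "competitors.csv").write_text("name,url\nAcme,https://acme.example\n")

# Step 2: read the CSV, fetch each page, append extracted data.
with open(shared / "competitors.csv", newline="") as f:
    rows = list(csv.DictReader(f))
for row in rows:
    row["price"] = "$99"  # placeholder for real page extraction

with open(shared / "enriched.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "url", "price"])
    writer.writeheader()
    writer.writerows(rows)

# Step 3: read the enriched data, write the report.
lines = ["# Competitor report", ""]
for row in rows:
    lines.append(f"- {row['name']}: {row['price']}")
(shared / "report.md").write_text("\n".join(lines))
```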

The shared directory is cleaned up automatically when the workflow finishes. Single-step workflows skip it entirely, so there's no overhead when you don't need it.

Structured output

Agents can write /output/output.json with a standard shape:

	{
	  "summary": "Extracted pricing from 5 competitors",
	  "files": ["/home/daytona/shared/competitors.csv"],
	  "data": { "competitor_count": 5, "data_points": 12 }
	}

When this file exists, the next step gets the structured summary instead of raw free-form text. Clean, predictable context. When the file doesn't exist, everything falls back to text passing. Nothing breaks.
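The precedence can be sketched with a hypothetical helper (next_step_context is an illustration, not the real API):

```python
import json
import tempfile
from pathlib import Path

def next_step_context(output_dir: Path, raw_text: str) -> dict:
    """Prefer structured output.json; fall back to plain text passing."""
    structured = output_dir / "output.json"
    if structured.exists():
        return json.loads(structured.read_text())
    # No structured file -- the next step gets the raw text, nothing breaks.
    return {"summary": raw_text, "files": [], "data": {}}

out = Path(tempfile.mkdtemp())

# Without output.json: text fallback.
fallback = next_step_context(out, "raw step output")

# With output.json: the structured summary wins.
(out / "output.json").write_text(json.dumps({
    "summary": "Extracted pricing from 5 competitors",
    "files": ["/home/daytona/shared/competitors.csv"],
    "data": {"competitor_count": 5},
}))
structured = next_step_context(out, "raw step output")
```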

Why this matters

Before: each step got a wall of text from the previous one and had to parse it. Token-heavy, error-prone, limited to what fits in a prompt.

After: steps share real files and communicate with typed JSON. Prompts stay focused on the task, not on parsing the previous step's output.

This is what makes workflows practical for data pipelines such as ETL, report generation, and multi-source research, where the intermediate artifacts are bigger than a paragraph.
