
  ![Production AI Agents in Laravel: Streaming, Token Budgets, and Structured Output Contracts](https://cdn.msaied.com/325/ee4e73c7fdfa922304594beaac92c93c.png)

  #laravel   #ai   #openai   #production  

 Production AI Agents in Laravel: Streaming, Token Budgets, and Structured Output Contracts 
============================================================================================

30 Jun 2026 · 4 min read · Mohamed Said

Table of contents

1. [Production AI Agents in Laravel: Streaming, Token Budgets, and Structured Output Contracts](#production-ai-agents-in-laravel-streaming-token-budgets-and-structured-output-contracts)
2. [Streaming Responses Over SSE](#streaming-responses-over-sse)
3. [Enforcing Token Budgets](#enforcing-token-budgets)
4. [Structured Output Contracts](#structured-output-contracts)
5. [Putting It Together in a Job](#putting-it-together-in-a-job)
6. [Takeaways](#takeaways)

 Production AI Agents in Laravel: Streaming, Token Budgets, and Structured Output Contracts
------------------------------------------------------------------------------------------

Most tutorials stop at "call the API, dump the response." Production agents are a different beast. You need to stream tokens to the browser so users are not left waiting 30 seconds for the first word, enforce hard token budgets so a runaway prompt doesn't drain your quota, and guarantee that the model returns data your application can actually parse. This article covers all three.

---

### Streaming Responses Over SSE

Symfony's `StreamedResponse`, which Laravel also exposes through `response()->stream()`, pairs naturally with OpenAI's streaming API. The key is to flush each chunk as it arrives instead of buffering the entire completion.

```php
use Illuminate\Support\Facades\Route;
use Symfony\Component\HttpFoundation\StreamedResponse;
use OpenAI\Laravel\Facades\OpenAI;

Route::get('/chat/stream', function () {
    return new StreamedResponse(function () {
        $stream = OpenAI::chat()->createStreamed([
            'model' => 'gpt-4o',
            'messages' => [['role' => 'user', 'content' => request('prompt')]],
        ]);

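        foreach ($stream as $response) {
            // Role-only and final chunks carry no text, so default to ''.
            $delta = $response->choices[0]->delta->content ?? '';
            if ($delta !== '') {
                echo 'data: ' . json_encode(['token' => $delta]) . "\n\n";
                // ob_flush() raises a notice when no output buffer is active.
                if (ob_get_level() > 0) {
                    ob_flush();
                }
                flush();
            }
        }

        echo "data: [DONE]\n\n";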
    }, 200, [
        'Content-Type' => 'text/event-stream',
        'Cache-Control' => 'no-cache',
        'X-Accel-Buffering' => 'no', // critical for nginx
    ]);
});

```

`X-Accel-Buffering: no` is the header most people forget. Without it, nginx buffers the upstream response and forwards it in large chunks, often only once the completion has finished, which defeats the purpose of streaming.
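
The wire format each `echo` produces is simple enough to isolate in a helper. A minimal sketch, assuming a hypothetical `sseFrame()` function that is not part of Laravel or the OpenAI client: it JSON-encodes a payload and optionally adds an `event:` name, so the browser can route tokens and errors to different listeners.

```php
/**
 * Format one Server-Sent Events frame.
 * Hypothetical helper for illustration; not part of Laravel or openai-php.
 */
function sseFrame(array $payload, ?string $event = null): string
{
    $frame = '';

    if ($event !== null) {
        // Named events are delivered to addEventListener($event) in the browser.
        $frame .= "event: {$event}\n";
    }

    // json_encode escapes newlines, so the payload always fits on one data: line.
    $frame .= 'data: ' . json_encode($payload, JSON_THROW_ON_ERROR) . "\n";

    // A blank line terminates the frame.
    return $frame . "\n";
}

echo sseFrame(['token' => "Hello\nworld"]);
echo sseFrame(['message' => 'Upstream timed out'], 'error');
```

Encoding the payload as JSON is what keeps a token containing a line break from being split into two `data:` lines, which is why the route above wraps each delta in `json_encode()` rather than echoing it raw.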

---

### Enforcing Token Budgets

Token overruns are a billing and latency problem. Enforce budgets at two layers: before the request (prompt token estimation) and inside the request (`max_tokens`).

```php
final class TokenBudget
{
    public function __construct(
        private readonly int $maxPromptTokens = 3_000,
        private readonly int $maxCompletionTokens = 1_000,
    ) {}

    public function assertPromptFits(string $prompt): void
    {
        // ~4 chars per token is a rough average for English text;
        // code and non-English input can tokenize more densely.
        $estimated = (int) ceil(mb_strlen($prompt) / 4);

        if ($estimated > $this->maxPromptTokens) {
            throw new \OverflowException(
                "Prompt exceeds budget: ~{$estimated} tokens (max {$this->maxPromptTokens})"
            );
        }
    }

    public function completionLimit(): int
    {
        return $this->maxCompletionTokens;
    }
}

```

Register it as a scoped binding that picks the limits from the current user's plan:

```php
$this->app->scoped(TokenBudget::class, function () {
    $plan = auth()->user()?->plan ?? 'free';
    return match ($plan) {
        'pro'  => new TokenBudget(8_000, 2_000),
        default => new TokenBudget(3_000, 500),
    };
});

```

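Using `scoped` rather than `singleton` matters under Octane: scoped instances are flushed whenever Laravel starts a new lifecycle (a request or a queued job), so one user's budget never leaks into the next request handled by the same worker.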
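
Throwing is a reasonable response to a single oversized prompt, but a multi-turn agent often prefers to shed old context instead. A rough sketch under the same 4-characters-per-token assumption: `estimateTokens()` and `trimToBudget()` are hypothetical helpers, and the message shape mirrors the `role`/`content` arrays used earlier.

```php
/** Rough token estimate: ~4 characters per token (an assumption, not a tokenizer). */
function estimateTokens(string $text): int
{
    return (int) ceil(mb_strlen($text) / 4);
}

/**
 * Drop the oldest conversation turns until the estimated total fits.
 * System messages and the most recent message are always kept.
 *
 * @param  list<array{role: string, content: string}>  $messages
 * @return list<array{role: string, content: string}>
 */
function trimToBudget(array $messages, int $maxPromptTokens): array
{
    $total = array_sum(array_map(
        fn (array $m): int => estimateTokens($m['content']),
        $messages,
    ));
    $lastIndex = array_key_last($messages);

    // foreach iterates over a copy, so unsetting from $messages is safe here.
    foreach ($messages as $index => $message) {
        if ($total <= $maxPromptTokens) {
            break;
        }
        if ($message['role'] === 'system' || $index === $lastIndex) {
            continue;
        }
        $total -= estimateTokens($message['content']);
        unset($messages[$index]);
    }

    return array_values($messages);
}

$history = [
    ['role' => 'system', 'content' => str_repeat('s', 40)],     // ~10 tokens
    ['role' => 'user', 'content' => str_repeat('a', 400)],      // ~100 tokens
    ['role' => 'assistant', 'content' => str_repeat('b', 400)], // ~100 tokens
    ['role' => 'user', 'content' => str_repeat('c', 200)],      // ~50 tokens
];

$trimmed = trimToBudget($history, 170); // drops only the oldest user turn
```

Trimming can still leave the prompt over budget when the system message or the latest turn is large on its own, so the hard check in `assertPromptFits()` remains the final guard.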

---

### Structured Output Contracts

Asking a model to "return JSON" is not a contract. OpenAI's `response_format` with `json_schema` mode (available on `gpt-4o` and later) lets you enforce a schema server-side. Pair it with a DTO and a Pest assertion.

```php
$response = OpenAI::chat()->create([
    'model' => 'gpt-4o',
    'messages' => [
        ['role' => 'system', 'content' => 'Extract the invoice fields.'],
        ['role' => 'user', 'content' => $rawText],
    ],
    'response_format' => [
        'type' => 'json_schema',
        'json_schema' => [
            'name' => 'invoice',
            'strict' => true,
            'schema' => [
                'type' => 'object',
                'properties' => [
                    'vendor'  => ['type' => 'string'],
                    'amount'  => ['type' => 'number'],
                    'due_date'=> ['type' => 'string', 'format' => 'date'],
                ],
                'required' => ['vendor', 'amount', 'due_date'],
                'additionalProperties' => false,
            ],
        ],
    ],
]);

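$choice = $response->choices[0];

// A truncated or refused completion has no usable JSON body.
if ($choice->finishReason !== 'stop' || $choice->message->content === null) {
    throw new \RuntimeException("Unusable completion (finish reason: {$choice->finishReason})");
}

$data = json_decode($choice->message->content, true, flags: JSON_THROW_ON_ERROR);
$invoice = InvoiceData::from($data); // Spatie Data DTO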

```

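With `strict: true`, OpenAI constrains generation to the schema, so the response cannot contain extra fields or omit required ones. That guarantees shape, not correctness: the values can still be wrong, a completion cut off by the token limit can end mid-object, and a safety refusal arrives in the message's `refusal` field instead of `content`. Validate the DTO immediately after hydration and never trust the model's output downstream without a type check. With Spatie's package, `validateAndCreate()` runs the validation rules; by default, `from()` on a plain array does not.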
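
A minimal sketch of what validating the decoded payload can look like without any package. `InvoiceFields` is a hypothetical plain-PHP stand-in for the Spatie DTO above, and the non-negative amount check is an assumed business rule, not something the schema enforces.

```php
final class InvoiceFields
{
    private function __construct(
        public readonly string $vendor,
        public readonly float $amount,
        public readonly \DateTimeImmutable $dueDate,
    ) {}

    /** @param array<string, mixed> $data Decoded model output. */
    public static function fromArray(array $data): self
    {
        $vendor = $data['vendor'] ?? null;
        $amount = $data['amount'] ?? null;
        $dueDate = $data['due_date'] ?? null;

        if (!is_string($vendor) || trim($vendor) === '') {
            throw new \InvalidArgumentException('vendor must be a non-empty string');
        }
        // JSON numbers decode to int or float; numeric strings are rejected.
        if (!is_int($amount) && !is_float($amount)) {
            throw new \InvalidArgumentException('amount must be a number');
        }
        if ($amount < 0) {
            throw new \InvalidArgumentException('amount must not be negative');
        }

        // The "!" resets unparsed fields, so the time component is midnight.
        $date = is_string($dueDate)
            ? \DateTimeImmutable::createFromFormat('!Y-m-d', $dueDate)
            : false;

        // Round-tripping rejects rollover dates such as 2026-02-30.
        if ($date === false || $date->format('Y-m-d') !== $dueDate) {
            throw new \InvalidArgumentException('due_date must be a valid YYYY-MM-DD date');
        }

        return new self(trim($vendor), (float) $amount, $date);
    }
}

$invoice = InvoiceFields::fromArray(json_decode(
    '{"vendor":"Acme","amount":120.5,"due_date":"2026-07-15"}',
    true,
    flags: JSON_THROW_ON_ERROR,
));
```

The schema already pins down field names and JSON types, so the checks that earn their keep here are the semantic ones: a real calendar date and an amount that makes sense for an invoice.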

---

### Putting It Together in a Job

For non-interactive agents, run the completion inside a queued job with a timeout that matches your token budget:

```php
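use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;

class ExtractInvoiceJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable;

    public int $timeout = 60;
    public int $tries = 2;

    public function __construct(private readonly string $rawText) {}

    public function handle(TokenBudget $budget, InvoiceExtractor $extractor): void
    {
        // Fail fast before spending tokens on an oversized prompt.
        $budget->assertPromptFits($this->rawText);
        $invoice = $extractor->extract($this->rawText, $budget->completionLimit());
        InvoiceExtracted::dispatch($invoice);
    }
}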

```

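Set `$timeout` conservatively: a 500-token completion at peak load can still take 20+ seconds. Keep it a few seconds below the queue connection's `retry_after` value, or a slow job may be picked up a second time while the first attempt is still running. Also note that `auth()->user()` is null inside a queue worker, so the plan-based binding shown earlier resolves to the default budget unless the job carries the plan itself.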
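
One way to make "conservatively" concrete is to derive the timeout from the completion limit. A back-of-the-envelope sketch: `timeoutForBudget()` is a hypothetical helper, and the 25 tokens-per-second floor and 10-second overhead are assumed figures to replace with latencies measured on your own traffic (25 tokens per second corresponds to 500 tokens in 20 seconds).

```php
/**
 * Derive a job timeout in seconds from a completion token limit.
 * Hypothetical helper; the default throughput and overhead are assumptions.
 */
function timeoutForBudget(
    int $completionTokens,
    float $minTokensPerSecond = 25.0,
    int $overheadSeconds = 10,
): int {
    if ($completionTokens < 1 || $minTokensPerSecond <= 0) {
        throw new \InvalidArgumentException('Token limit and throughput must be positive');
    }

    // Worst-case generation time plus a fixed allowance for connection
    // setup, network round trips, and post-processing.
    return (int) ceil($completionTokens / $minTokensPerSecond) + $overheadSeconds;
}

echo timeoutForBudget(500), "\n";   // 30
echo timeoutForBudget(2_000), "\n"; // 90
```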

---

### Takeaways

- Add `X-Accel-Buffering: no` to every SSE response or nginx will swallow your stream.
- Use `scoped()` for per-request token budgets under Octane, not `singleton()`.
- OpenAI's `json_schema` response format with `strict: true` is a real contract, not a prompt suggestion.
- Validate and hydrate into a typed DTO immediately — never pass raw model output into business logic.
- Set explicit job `$timeout` values that reflect your worst-case token budget, not a generic default.

 Frequently Asked Questions 
----------------------------

**Why does my nginx proxy buffer the SSE stream even with StreamedResponse?**

Nginx buffers upstream responses by default. Set the `X-Accel-Buffering: no` response header to tell nginx to pass chunks through immediately. If that header is stripped or ignored in your setup, disable buffering in the nginx config instead: `proxy_buffering off` for proxied upstreams, or `fastcgi_buffering off` when nginx talks to PHP-FPM.

**Is the 4-characters-per-token heuristic accurate enough for budget enforcement?**

It is a rough average for English prose, not a guaranteed overestimate: code, non-English text, and unusual formatting can tokenize more densely, so the estimate can come in low. It is sufficient as a cheap pre-flight guard as long as you leave headroom below the model's real limit. For precise counts, use a tiktoken-compatible tokenizer library.

**Does the `json_schema` response format work with all OpenAI models?**

Structured output with `strict: true` requires `gpt-4o` (2024-08-06 snapshot or later), `gpt-4o-mini`, or a newer model. Earlier models support `response_format: {type: json_object}` but without schema enforcement, so the model can still produce non-conforming output.
