Streaming Resources with Symfony 2

7 April 2015, Rhodri Pugh

One of our products, BindHQ, makes heavy use of S3 for storage of a large number of varied document types.

When users access a document, the ideal delivery method is to hand them a pre-signed URL pointing straight at the object in the bucket. That way our application carries no overhead for serving it. But we’ve been working on new features that involve annotating or amending documents for various reasons before they reach the user.

Architecture

One method of solving this would be to keep cached versions of these documents with the amendments already applied to them. We could then serve these objects directly from S3, the same way we serve unedited ones.

The problem with this is that some of these features allow the users to amend and view documents on-the-fly, so we’d need to re-generate the cache back into S3 pretty quickly to give a decent user experience.

The approach we’ve gone with for our first version involves doing these modifications dynamically and serving them through our frontend application.

The user-facing portion of the application serving this information is written in Symfony2, so I was looking for ways to stay performant when doing this. My main concern was avoiding realising data in memory: firstly, these documents can be quite large, and secondly, there’s no reason to slow down the response by buffering it up at different stages as it moves along its course.

StreamedResponse

Symfony offers a response type called StreamedResponse. Instead of a body string, it takes a callback that Symfony invokes when the response content is sent, after your controller has returned. This means that you can avoid having to realise information before returning a response from your controller, and generate or stream the data lazily.

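<?php

use Symfony\Bundle\FrameworkBundle\Controller\Controller;
use Symfony\Component\HttpFoundation\StreamedResponse;

class MyController extends Controller
{
    public function someAction()
    {
        // The callback runs when the response is sent, not when it is constructed
        return new StreamedResponse(
            function () {
                echo "Send content";
                // do some stuff
                flush();
                // etc...
            }
        );
    }
}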

So this allows us to read data from our internal services (that do all the clever bits of annotating documents and whatnot) and send them to the client, without having to buffer it all up in memory first.

Stream Helper

To handle the streaming part, the obvious solution would be simply to read from the input and write to the output. Something like…

<?php

// fread() returns an empty string at end of file rather than false,
// so check feof() to know when to stop
while (!feof($input)) {
    echo fread($input, 1024);
}

But luckily there’s a PHP function from the wonderfully eclectic standard library that will do the job for us.

<?php

$output = fopen('php://output', 'w');

stream_copy_to_stream($input, $output);
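
To see what stream_copy_to_stream does in isolation, here is a minimal sketch that copies between two in-memory streams. The php://temp input and its 5000 bytes of filler are hypothetical stand-ins for a real document, and php://memory stands in for php://output so the result can be inspected.

```php
<?php

// Hypothetical stand-in for a document fetched from an internal service.
$input = fopen('php://temp', 'r+');
fwrite($input, str_repeat('x', 5000));
rewind($input);

// php://memory replaces php://output here so the copy can be checked.
$output = fopen('php://memory', 'w+');

// Copies until end of stream and returns the number of bytes copied.
$copied = stream_copy_to_stream($input, $output);

rewind($output);
$result = stream_get_contents($output);

fclose($input);
fclose($output);

var_dump($copied);         // int(5000)
var_dump(strlen($result)); // int(5000)
```

The return value is handy for logging or for spotting a truncated copy, since the function returns false on failure.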

I hoped there’d be support for this in Symfony already, but it’s not much bother to implement ourselves, so here’s my entire wrapper.

<?php

namespace MyApp\MyBundle;

use Symfony\Component\HttpFoundation\StreamedResponse;

class ResourceResponse extends StreamedResponse
{
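    /**
     * @param resource $resource A readable stream to send as the response body
     * @param integer  $status   The HTTP status code
     * @param array    $headers  Any additional response headers
     */
    public function __construct($resource, $status, array $headers = [])
    {
        // Runs when the response is sent, copying straight to the output stream
        $streamer = function () use ($resource) {
            stream_copy_to_stream(
                $resource,
                fopen('php://output', 'w')
            );
        };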

        parent::__construct($streamer, $status, $headers);
    }
}
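
One way to sanity-check a streaming closure like this without booting Symfony is to run it under output buffering, since writes to php://output pass through PHP’s output buffers. This is only a sketch and not part of the wrapper itself: the php://temp stream and its contents are hypothetical stand-ins for a real document.

```php
<?php

// Hypothetical stand-in for a document read from an internal service.
$resource = fopen('php://temp', 'r+');
fwrite($resource, 'annotated document body');
rewind($resource);

// Same shape as the closure built inside ResourceResponse.
$streamer = function () use ($resource) {
    $output = fopen('php://output', 'w');
    stream_copy_to_stream($resource, $output);
    fclose($output);
};

// Capture what the callback would otherwise send to the client.
ob_start();
$streamer();
$body = ob_get_clean();

fclose($resource);

var_dump($body === 'annotated document body'); // bool(true)
```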

Guzzle Stream

A nice side effect of adding a resource-to-resource helper is that it made me check that all our current code was using this lazy streaming approach. The bad news was that not all of it was; the main culprit was our use of Guzzle when calling various internal services.

Looking into the docs it turned out that Guzzle does provide a method of accessing response streams.

<?php

use Guzzle\Stream\PhpStreamRequestFactory;

function requestToStream($request) {
    $factory = new PhpStreamRequestFactory();
    $stream = $factory->fromRequest($request);
    $wrappedStream = $stream->getStream();

    return $wrappedStream;
}

Just call getStream and you have yourself a resource to read from. I tried using this though and started getting an error… Inspecting the resource I was using suggested that it wasn’t actually a resource at all.

I didn’t get into the nitty gritty of it, but after some Googling I realised the resource seemed to be getting GC’d somehow, presumably when the stream object wrapping it went out of scope. So a quick update to detach the stream first…

<?php

use Guzzle\Stream\PhpStreamRequestFactory;

function requestToStream($request) {
    $factory = new PhpStreamRequestFactory();
    $stream = $factory->fromRequest($request);
    $wrappedStream = $stream->getStream();

    $stream->detachStream();

    return $wrappedStream;
}

… and all was good again!
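
My best guess at the mechanism, sketched below with plain PHP: if a wrapper object closes its resource in its destructor, then returning only the raw resource from a function lets the wrapper be destroyed on the way out, taking the resource with it. ClosingWrapper here is a hypothetical, simplified stand-in and not Guzzle’s actual class; it just mirrors the getStream and detachStream calls used above.

```php
<?php

// Hypothetical stand-in for a stream wrapper; NOT Guzzle's real implementation.
class ClosingWrapper
{
    private $stream;

    public function __construct($stream)
    {
        $this->stream = $stream;
    }

    public function getStream()
    {
        return $this->stream;
    }

    // Forget the resource so the destructor leaves it alone.
    public function detachStream()
    {
        $this->stream = null;
    }

    // Close the resource when the wrapper itself is destroyed.
    public function __destruct()
    {
        if (is_resource($this->stream)) {
            fclose($this->stream);
        }
    }
}

function withoutDetach()
{
    $wrapper = new ClosingWrapper(fopen('php://memory', 'r+'));

    // $wrapper is destroyed as the function returns, closing the resource.
    return $wrapper->getStream();
}

function withDetach()
{
    $wrapper = new ClosingWrapper(fopen('php://memory', 'r+'));
    $stream = $wrapper->getStream();
    $wrapper->detachStream();

    return $stream;
}

var_dump(is_resource(withoutDetach())); // bool(false)
var_dump(is_resource(withDetach()));    // bool(true)
```

That matches the symptom of a value that no longer looks like a resource, although I haven’t confirmed this is exactly what happens inside Guzzle.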

Conclusion

No conclusion really, just happy to get to the solution that we were hoping for. Happy hacking!