Skip to content

Fix #80931: HTTP stream hangs if server doesn't close connection #6874

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: PHP-7.4
Choose a base branch
from

Conversation

cmb69
Copy link
Member

@cmb69 cmb69 commented Apr 16, 2021

As is, we read HTTP streams until the server closes the connection,
what is standard behavior prior to HTTP/1.1 anyway. However, HTTP/1.1
defaults to not closing the connection, and although we are sending a
Connection: close header, there is no guarantee that the server heeds
it. Even worse, if a user passes a Connection: keep-alive header via
the stream context, the server is not supposed to close the connection,
so the request hangs (and will only terminated by a timeout).

The proper way to read HTTP responses is to the heed the sent Content-
Length, or in case of Transfer-encoding: chunked, to read until after
the final chunk (zero bytes). To avoid baking this into the general
streams behavior, we use stream filters for this purpose (as suggested
by @IMSoP), and if the end of the stream is detected by those, we set
the eof flag. Since there are already the quite suitable filters for
our purpose (dechunk and consumed) we attach the desired behavior
to these.

We avoid the filter overhead in cases where the server is going to
close the connection anyway, i.e. prior to HTTP/1.1, or if the server
responds with Connection: close.

Currently, the end of content check is enabled for the existing
filters by passing filter parameters. I kept that simple for now, by
just passing a bool or a long, but that might be to big a BC break;
neither the dechunk nor the consumed filter are documented, but
easily recognizable via stream_get_filters(), so existing code may
already pass parameters, although these are so far unused. That could
be mitigated by passing an array with respective members instead.
Another option to avoid the BC altogether would be to add the required
info to the stream name (similar to the convert.iconv.* filter).

For best BC, it might also be desireable to only set the eof flag
from the dechunk filter, if dechunk is explicitly requested.


Any suggestions for improvement are welcome, especially regarding the hackish regression test.

cmb69 added 2 commits April 16, 2021 16:59
As is, we read HTTP streams until the server closes the connection,
what is standard behavior prior to HTTP/1.1 anyway.  However, HTTP/1.1
defaults to not closing the connection, and although we are sending a
`Connection: close` header, there is no guarantee that the server heeds
it.  Even worse, if a user passes a `Connection: keep-alive` header via
the stream context, the server is not supposed to close the connection,
so the request hangs (and will only terminated by a timeout).

The proper way to read HTTP responses is to the heed the sent Content-
Length, or in case of `Transfer-encoding: chunked`, to read until after
the final chunk (zero bytes).  To avoid baking this into the general
streams behavior, we use stream filters for this purpose (as suggested
by @IMSoP), and if the end of the stream is detected by those, we set
the `eof` flag.  Since there are already the quite suitable filters for
our purpose (`dechunk` and `consumed`) we attach the desired behavior
to these.

We avoid the filter overhead in cases where the server is going to
close the connection anyway, i.e. prior to HTTP/1.1, or if the server
responds with `Connection: close`.

Currently, the end of content check is enabled for the existing
filters by passing filter parameters.  I kept that simple for now, by
just passing a bool or a long, but that might be to big a BC break;
neither the `dechunk` nor the `consumed` filter are documented, but
easily recognizable via `stream_get_filters()`, so existing code may
already pass parameters, although these are so far unused.  That could
be mitigated by passing an array with respective members instead.
Another option to avoid the BC altogether would be to add the required
info to the stream name (similar to the `convert.iconv.*` filter).

For best BC, it might also be desireable to only set the `eof` flag
from the `dechunk` filter, if `dechunk` is explicitly requested.
This is tricky, since the test server closes the socket right after
sending the response.  For now, we simulate `Connection: keep-alive` by
sleeping for two seconds, and check on the client side whether the
request took less than two seconds.
@cmb69 cmb69 added the Bug label Apr 16, 2021
In several cases besides `Transfer-Encoding` headers, a server is not
supposed to send a `Content-Length` header; in other cases, the server
is not required to send a `Content-Length` header.  We need to cater to
that and must not apply the `consumed` filter.
@cmb69
Copy link
Member Author

cmb69 commented Apr 19, 2021

Given https://bugs.php.net/bug.php?id=80970, attaching filters might be too big a BC break. :(

@cmb69
Copy link
Member Author

cmb69 commented Apr 27, 2021

I wonder what to do with this (see my comment above). Postpone to "master"?

@nikic
Copy link
Member

nikic commented Apr 27, 2021

@cmb69 Is it possible to support select on streams with filters?

@cmb69
Copy link
Member Author

cmb69 commented Apr 28, 2021

Is it possible to support select on streams with filters?

I'm not sure, but I see no obvious reasons why PHP_STREAM_AS_FD_FOR_SELECT wouldn't work for filtered streams, so I submitted #6926.

@cmb69
Copy link
Member Author

cmb69 commented Apr 28, 2021

So, no, we cannot support select() on filtered streams. So this fix constitutes a BC break (although selecting on HTTP wrappers appears to be somewhat uncommon).

@bukka
Copy link
Member

bukka commented Dec 16, 2023

I just left comment in #6926. It might be potentially possible and potentially also tricky to do. I agree that filter solution is the probably right one here. We would need filters if we want to support trailers in the future so it will be most likely necessary anyway. Currently I see this just for master only as it has potentially some BC impact. It will also need better test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 participants