gsoc2023: net: http: culminating pull request#59669
gsoc2023: net: http: culminating pull request#59669Emna-Rekik wants to merge 41 commits intozephyrproject-rtos:mainfrom
Conversation
This commit adds an empty test suite for http server. No test cases have been added in this commit. However, creating the empty test suite provides a structure for adding tests in subsequent commits. Signed-off-by: Emna Rekik <emna.rekik007@gmail.com>
8feb4e8 to
287d7b1
Compare
There was a problem hiding this comment.
This is more of a sample than a library at the moment.
Having a sample is fine - and we should have one, but subsys/ is not where it belongs. And the sample should make actual use of the library we're working on.
Generally, we really need to think on how should the HTTP2 server library look like. For sure we'll need a public API (as I wrote on Discord, the public header should be defined in include/zephyr/net/http). You need to think on what public functions should be exposed to the application. The most basic ones that come up to me would be:
http2_server_init()http2_server_start()http2_server_stop()
For the resource (webpage) representation in the server, you should have a look at the API added some time ago by @cfriedt (which could likely provide more details about it): https://github.com/zephyrproject-rtos/zephyr/blob/main/include/zephyr/net/http/service.h#L19
Finally, the sample app should make use of the above to create a HTTP2 server. And we should cover those functionalities in tests, just as listed in the GSoC umbrella issue.
There was a problem hiding this comment.
Actually, to expand on this a bit, the http service description was specifically designed to be more-or-less independent of any actual implementation details.
What that allows us to do, for example, is to easily (for various definitions of easy) compare two or more different http server implementations in various sample applications and testsuites. There may be an implementation-specific / conditionally compiled source file here or there, but otherwise it should work.
Ultimately, this will allow us to quantitatively compare HTTP server implementations in Zephyr and provide graphs, evidence, etc to show that e.g. Emna's server implementation outperforms CivetWeb by a factor of 2 or something along those lines, or this implementation or that implementation is able to handle N concurrent clients while using M bytes of memory, and so on.
There was a problem hiding this comment.
I'll see if I can pull the CivetWeb implementation in for reference purposes, in a way that we can easily choose this or that based on Kconfig.
287d7b1 to
9f48219
Compare
rlubos
left a comment
There was a problem hiding this comment.
@Emna-Rekik Please familiarize yourself with https://docs.zephyrproject.org/latest/contribute/guidelines.html#coding-style, we should follow this standard for Zephyr code. We should use tabs for indentation (configured as 8 spaces/tab), use snake_case for function/variable names etc.
include/zephyr/net/http/server.h
Outdated
There was a problem hiding this comment.
I think it'd be easier for apps to work with binary IP representation (we have struct net_addr that covers family and binary IP address).
include/zephyr/net/http/server.h
Outdated
There was a problem hiding this comment.
nit, but I'd use uint8_t * here.
There was a problem hiding this comment.
@cfriedt I think we should settle whether we want to stick to POSIX socket APIs, or use zsock_* prefixed ones. The former approach proved to be problematic in the past (as we still have two ways to enable socket POSIX APIs). Not sure if we need the dependency to POSIX either in a networking component.
There was a problem hiding this comment.
I think this conceptually belong to the HTTP1 parsing part - update header is part of HTTP1 request. IMO we only need to reply for that and take no further action (i. e. expect HTTP2 request in the next step).
There was a problem hiding this comment.
As we discussed last week - we may lose stream data here (overwrite it). If the initial recv() call received more data than only preface, we lose it here.
Also, think we should rethink the design here. recv() should only be caleld after the poll() call reports there is new data to read - otherwise we may block inside handle_http2_request(), effectively preventing the server from handling other requests.
There was a problem hiding this comment.
Wasnt' this already done in http2_server_start()?
There was a problem hiding this comment.
yes, this was for an HTTP/1 client after it was upgraded to HTTP/2
There was a problem hiding this comment.
We should really avoid declaring large buffers on a stack. I think we should have one buffer statically allocated per client, and use only it for communication.
There was a problem hiding this comment.
And actually it should be even needed to copy the buffers, just offset the orignal buffer when providing the pointer to the function (so you don't provide buffer directly, but buffer + some_offset).
There was a problem hiding this comment.
We still should be prepared for this case. I. e. the actual paylaod size may exceed the buffer we're processing, so that in case of the next recv() call we should treat the remainig data as a payload for the previous frame, and not try to parse the new frame header.
a2a3452 to
4722b52
Compare
There was a problem hiding this comment.
The thing is - you cannot feed the data to the HTTP1 parser unless you're certain it's a HTTP1 request. Otherwise, you may get unexpected results. So the order of action should be:
- Read data on a socket until you have
strlen(preface) bytes - If preface matches HTTP2 request - proceed as with HTTP2 request.
- Otherwise, proceed with HTTP1 request. Only now you can feed that data to the HTTP1 parser, and search for the upgrade header. If there's a header, reply and await next request (HTTP2 this time).
26dcbd5 to
2101835
Compare
include/zephyr/net/http/server.h
Outdated
There was a problem hiding this comment.
What's the purpose for this variable? Looks a bit hacky considering how it's used.
There was a problem hiding this comment.
server_fd and event_fd sockets need to be handled separately. Currently, you would call `close_client_connection() on the listening socket, if it reported an error.
There was a problem hiding this comment.
This comment will be a little longer, but I'll try to give a general overview of how a stateful parsing could look like.
First of all, we want to get rid of all poll()/recv() calls outside of the main server loop (so the one here). The reason for this is that we don't want to block the server from handling other requests, in case recv() would block inside handle_http2_request(). But even with non-blocking calls it doesn't work great in the current design, as in case there are no new data, you'd stop the parsing and start over from scratch (i. e. preface handling) w/o serving the request.
Receving new data
What I think would be needed in the first place to achieve this goal is to redesign how we recv new data, I suggest to:
- Introduce a separate input buffer for each client (so make it a part of the client context).
- Introduce a new
offsetvariable to keep track of how many bytes are currently in the buffer. This would also affect how yourecv()new data:
recv(ctx->client_fds[i].fd, buffer + offset, sizeof(buffer) - offset, 0);As you don't want to overwrite the data that is already in the buffer.
Parsing the buffer
Now, recv() will be providing new data to the buffer, so we need handle_http_request to consume it. That's where the stateful handling kicks in. If you look at the existing HTTP1 parser implementation you'll see there are lots of states, the parser also works in a way that it counsumes the new data byte by byte:
https://github.com/zephyrproject-rtos/zephyr/blob/main/subsys/net/lib/http/http_parser.c#L784
I don't think we need it to get that complex with HTTP2 since it's binary, so easier to parse.
First of all, we need to figure out what states the server can be in. The very first thing we expect from a newly connected client is a preface - therfore, it makes sense to introduce a state that represents this stage (HTTP_PREFACE). Once we've received the preface, we can determine whether we deal with HTTP1 or HTTP2 request.
For HTTP1 I believe we don't need much state representation, as most of this is handled by the existing parser. HTTP1_REQUEST should do, but if you see a need for more states for HTTP1 parsing, don't hesitate to add more.
For HTTP2 it gets a bit more complex, as we need to implement the parser ourselves. So we need to figure out, what is the next data we expect to get, when we proceed with HTTP2 request. For me, it seems that we should expect frame header. So we can introduce HTTP2_FRAME_HEADER state. Once we're done with the frame header parsing, we know what type of frame we deal with. So for a header frame, we could have HTTP2_HEADER_FRAME in which we parse header fields one by one. Similarly, in HTTP2_SETTINGS_FRAME we would parse ("consume") settings, one by one (note that they all have predefined format/size). The list goes on and on, I hope that you see where it's going. If certain frame type requires more complex parsing - you can define substates as well.
Also note, that with each frame header, you know exactly how much bytes in that frame to expect. So when you're done with parsing of the current frame ("consumed" all of the frame data), you can get back to HTTP2_FRAME_HEADER state.
Some pseudocode to consider
We should consider now how could this look like in practice (note that it's just a pseudo-C code, I did not compile it, it's just for an overview).
The main input function could look like:
int handle_http_request(struct http2_client_ctx *ctx)
{
int ret = -EINVAL;
do {
switch (ctx->state) {
case HTTP_PREFACE:
ret = handle_http_preface(ctx);
break;
case HTTP1_REQUEST:
ret = handle_http1_request(ctx);
break;
case HTTP2_FRAME_HEADER:
ret = handle_http2_frame_header(ctx);
break;
case xxx:
etc..
}
} while (ret == 0 && ctx->offset > 0);
return ret;
}We need to establish some protocol for the return values. 0 could indicate successful parsing and that the function should proceed with the next state (there might still be some data in the buffer to process). -EAGAIN could indicate that we expect more data to come before we can proceed with parsing. Any other negative value - parsing error, we should terminate the connection.
Now for example how could handle_preface() look like:
int handle_http_preface(struct http2_client_ctx *ctx)
{
if (ctx->offset < strlen(preface)) {
/* We don't have full preface yet, get more data. */
return -EAGAIN;
}
if (strncmp(ctx->buffer, preface, strlen(preface)) != 0) {
ctx->state = HTTP1_REQUEST;
} else {
ctx->state = HTTP2_FRAME_HEADER;
}
/* Now we need to "consume" the already processed data from the buffer
* For simplicity, I suggest to use memmove for starters, we can
* optimize to some other solution if needed
*/
/* ctx->offset will indicate remaining data in the buffer */
ctx->offset -= strlen(preface);
memmove(ctx->buffer, ctx->buffer + strlen(preface), ctx->offset);
return 0;
}000aebf to
bf6809b
Compare
|
Say you had a uC with a single core, not even multi-threading capable. Say that the firmware on this uC had to do lots of things like read GPIO, read / write from flash memory, serve / respond to HTTP traffic, etc int http_client_input_work(client_context *ctx)
{
// how many bytes to read?
// depends on http parser state
// depends also on how much data is available on your transport
// in Linux:
int bytes_available_to_read;
int ret = ioctl(ctx->fd, FIONREAD, &bytes_available_to_read);
// in Zephyr?
}
int http_client_output_work(client_context *ctx)
{
// how many bytes to read?
// depends on http parser state
// depends also on how much data is available on your transport
// in Linux:
int bytes_available_to_write;
int ret = ioctl(ctx->fd, FIONWRITE, &bytes_available_to_read);
// in Zephyr?
// how many bytes we can write on ctx->fd WITHOUT blocking at all
int bytes_to_write = MIN(bytes_available_to_write, ctx->output_buffer_size);
// linear buffer (array)
int bytes_written = write(ctx->fd, ctx->output_buffer, bytes_to_write);
// check for errors...
__ASSERT_NO_MSG(bytes_written == bytes_to_write);
// downside of linear buffer is that we either keep track of offset or use memmove to move data to the front of the array
memmove( ... );
// struct ring_buffer approach
ret = write(ctx->fd, ctx->output_ring, /* MIN(bytes_to_write, ring_buffer_bytes_to_end_of_ring) */ );
// check for errors..
bytes_written += ret;
bytes_to_write -= ret;
// drain N bytes from ring buffer (essentially update internal counters)
if (bytes_to_write > 0) {
ret = write(ctx->fd, ctx->output_ring[offs], /* MIN(bytes_to_write, ring_buffer_bytes_to_end_of_ring) */ );
// check for errors
// drain bytes_to_write bytes from ring buffer
}
}
int do_http_work(void)
{
int server_fd;
int cancel_fd;
struct pollfd pfd[1 /* server_fd */ + 1 /* cancellation_fd */ + max_num_clients ];
// condition pfd array (POLLIN, POLLOUT)
int ret = poll(pfd, ARRAY_SIZE(pfd), 0);
switch(ret) {
case 0:
return;
case -1:
// report error to main as -EINTR
return -errno;
default:
break;
}
// ret contains the number of file descriptors in pfd that have events
for(int i = 0; ret > 0; --ret, ++i) {
struct pollfd *p = &pfd[i];
if (p->fd == cancel_fd) {
if (!(ctx->client_fds[i].revents & POLLIN)) {
http_server_shutdown();
return;
}
}
if (p->fd in client_fds) {
if ((ctx->client_fds[i].revents & POLLERR) == POLLERR) {
// handle error
close_client_connection();
if ((ctx->client_fds[i].revents & POLLHUP) == POLLHUP) {
// handle socket close
close_client_connection();
} else if ((ctx->client_fds[i].revents & POLLOUT) == POLLOUT) {
// there is room to write data on this socket
// even our outgoing traffic has to be stateful!
// so sendAll() cannot work in a fully async system
http_client_output_work(&client_context[p->fd]);
} else if ((ctx->client_fds[i].revents & POLLIN) == POLLIN) {
// there is room to write data on this socket
// even our outgoing traffic has to be stateful!
// so sendAll() cannot work in a fully async system
http_client_input_work(&client_context[p->fd]);
}
}
if (p->fd == server_fd) {
// check if server *can* accept new client (check if num clients < max)
}
}
}
int main()
{
do {
// all of these functions would need to be more or less instantaneous
do_gpio_work();
do_flash_memory_work();
do_http_work();
// sleep until some event happens
} while(1);
} |
c9cfd30 to
369f3ee
Compare
There was a problem hiding this comment.
I think I've mentioned this already, but 0 is still a valid file descriptor - it should not be used as a marker of unused slot. Use -1 instead (preferably defined as some symbol like INVALID_SOCK.
Also, please make sure you follow Zephyr CS:
| if (ctx->fds[j].fd != 0) | |
| continue; | |
| if (ctx->fds[j].fd != -1) { | |
| continue; | |
| } |
There was a problem hiding this comment.
You should break the loop if ret < 0. And handle -EAGAIN as a special case - it's not an error, but an indication that more data is expected.
There was a problem hiding this comment.
I think we should drop this idea of having parse_http2_frames() which will attempt to parse as many frames there is in the buffer, and store their contents in an enormous array of frames - it won't scale well for embedded systems.
We should be parsing frames one by one - i. e. start in a FRAME_HEADER state always, and based on the content switch to a state that will be responsible of parsing the payload of the frame. In case of SETTINGS frame, parse settings one be one, until we consume all of the data in the current frame.
Implement a library for hpack static table Fixes zephyrproject-rtos#59693 Signed-off-by: Emna Rekik <emna.rekik007@gmail.com>
Add Zephyr HTTP library Signed-off-by: Emna Rekik <emna.rekik007@gmail.com>
Replace net_http with net_http_client Signed-off-by: Emna Rekik <emna.rekik007@gmail.com>
Modify http Kconfig Fixes zephyrproject-rtos#59684 Signed-off-by: Emna Rekik <emna.rekik007@gmail.com>
Verifies that HTTP method and HTTP path are correctly parsed Signed-off-by: Emna Rekik <emna.rekik007@gmail.com>
Add sample.yaml file and adjust files after moving server.c to lib Signed-off-by: Emna Rekik <emna.rekik007@gmail.com>
Convert README.md to README.rst and update it. Signed-off-by: Emna Rekik <emna.rekik007@gmail.com>
This test ensures that the server supports Gzipped Compression. Fixes zephyrproject-rtos#59700 Signed-off-by: Emna Rekik <emna.rekik007@gmail.com>
Add test function for basic rest api Signed-off-by: Emna Rekik <emna.rekik007@gmail.com>
This test ensures that the server supports JSON Format Signed-off-by: Emna Rekik <emna.rekik007@gmail.com>
Add JSON format support Signed-off-by: Emna Rekik <emna.rekik007@gmail.com>
Establish a TLS communication between a client and Zephyr HTTP server. Signed-off-by: Emna Rekik <emna.rekik007@gmail.com>
Add service header file Signed-off-by: Emna Rekik <emna.rekik007@gmail.com>
Support Static Ressources Signed-off-by: Emna Rekik <emna.rekik007@gmail.com>
add this line ITERABLE_SECTION_ROM(http_resource_desc, 4) Signed-off-by: Emna Rekik <emna.rekik007@gmail.com>
Return them to their origin after we have done linux prototye Signed-off-by: Emna Rekik <emna.rekik007@gmail.com>
add section rom file Signed-off-by: Emna Rekik <emna.rekik007@gmail.com>
add socket support Signed-off-by: Emna Rekik <emna.rekik007@gmail.com>
Add static ressource support Signed-off-by: Emna Rekik <emna.rekik007@gmail.com>
Modify service header file Signed-off-by: Emna Rekik <emna.rekik007@gmail.com>
Update prototype test functions and fix issues Fixes zephyrproject-rtos#59685 Fixes zephyrproject-rtos#59686 Fixes zephyrproject-rtos#59688 Fixes zephyrproject-rtos#59690 Fixes zephyrproject-rtos#59670 Signed-off-by: Emna Rekik <emna.rekik007@gmail.com>
1b4cb6a to
ab21233
Compare
|
Hi all, what is the plan with this one (considering merging)? There are lot of commits (a bit too much imho), and reviewing becomes harder. Is there something major still missing from this? |
|
ping @cfriedt, any ideas / suggestions regarding my question? |
Yes, my nick is jukkar |
| break; | ||
| } | ||
| /* Create a socket */ | ||
| ctx->server_fd = socket(af, SOCK_STREAM, proto); |
There was a problem hiding this comment.
Isn't this an issue if there is more than 1 service? The server_fd member will have the value of the last socket, which is an issue for accepting new clients.
|
Moving the work to #64465, please add comments there. Closing this one. |
This PR includes all commits for the gsoc2023 HTTP project.
Please see each commit message for details.