
HttpRequest now handles gzipping response bodies #38944

Merged · 1 commit · Sep 7, 2020

Conversation

@Wavesonics (Contributor) commented May 22, 2020

HttpRequest will add an "Accept-Encoding" header for gzip (only if the user has not already specified that header), allowing servers to send compressed payloads.

HttpRequest will detect "Content-Encoding" when receiving data, and will decompress gzip payloads, delivering only the decompressed data to the "request_completed" signal.

HttpClient::request_raw is now available via HttpRequest::request_raw, giving users complete control over their request body. This PR uses it to send a gzipped body when the new property "gzip_request" is true. Not all servers support gzip request bodies, so "gzip_request" defaults to false.
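As a rough sketch of the behavior described above (Python for illustration rather than the engine's C++; `prepare_request` is a hypothetical helper, not API from this PR):

```python
import gzip

def prepare_request(headers: dict, body: bytes, gzip_request: bool = False):
    # Add Accept-Encoding only if the caller has not already set it,
    # mirroring the header check described in the PR.
    if not any(name.lower() == "accept-encoding" for name in headers):
        headers["Accept-Encoding"] = "gzip"
    # Optionally gzip the request body (off by default, since not all
    # servers accept compressed request bodies).
    if gzip_request:
        headers["Content-Encoding"] = "gzip"
        body = gzip.compress(body)
    return headers, body
```

On the response side, HttpRequest checks "Content-Encoding" and decompresses before emitting "request_completed", so the script only ever sees plain bytes.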

This solves #6433

For testing, I have a Godot project which talks to a locally running HTTP server written in Go. The Go server has two endpoints. The first, /plain, accepts uncompressed request payloads; it respects the Accept-Encoding header, so it sends back a gzipped response.

The Godot project will print the response to the terminal, proving it was decompressed successfully by this change.

The second endpoint, /gzip, accepts a gzipped payload and decompresses it on the server. It also respects the encoding header, so it re-compresses the body before sending it back to the client.
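The /gzip endpoint's behavior can be sketched as a pure function (Python here for illustration; the actual test server is written in Go, and the function name is hypothetical):

```python
import gzip

def gzip_echo(request_body: bytes, client_accepts_gzip: bool) -> bytes:
    # Decompress the gzipped request payload, as the /gzip endpoint does.
    plain = gzip.decompress(request_body)
    # Echo it back, re-compressing if the client advertised gzip support.
    return gzip.compress(plain) if client_accepts_gzip else plain
```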

Godot Test Project:
https://fugitivethegame.online/download/HttpGzipClient.zip

Go Http Server (Godot project talks to this when running locally):
https://fugitivethegame.online/download/HttpGzipServer.zip

@Calinou (Member) commented May 22, 2020

I made a project to test this locally, but I get the following errors in the console once the request to https://godotengine.org is completed:

[screenshot: console errors after the request completes]

The response is gzipped (you can check this in any browser's DevTools).

@Wavesonics (Author)

@Calinou Huh, yup, I have a repro; looking into it.

@Wavesonics (Author) commented May 24, 2020

@Calinou OK, I have fixed the problem. The initial PR was not allocating its output buffer correctly, which became apparent when decompressing larger response bodies than I had tested with. In fact, it turns out it never could have allocated the buffer correctly using Godot's existing decompress method.

HTTP does not provide a mechanism for transmitting the expected uncompressed size of the response/request body; you have to stream-decode it and allocate as you go. This is a supported pattern in zlib, it just takes a slightly different approach than what decompress implements.

So I added a new method to Compression called decompress_dynamic, which does just that: it takes an empty Vector as the output buffer and keeps resizing it during decompression.

I noticed that the other methods in Compression accept pointers directly to the bytes, but I need Vector's resize and copy functionality for the existing data. I wasn't sure, though, whether including Vector in Compression is acceptable; let me know and I can implement the copy/move myself if that's preferred.

I followed the best practices laid out here, but had to diverge a fair bit, as they only cover file-to-file decompression.
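The stream-decode-and-grow pattern can be sketched with zlib's streaming API (Python for illustration; the actual decompress_dynamic is C++ against zlib, and this helper name is an assumption borrowed from the PR):

```python
import zlib

def decompress_dynamic(data: bytes, chunk_size: int = 64) -> bytes:
    # wbits=31 tells zlib to expect a gzip wrapper (16 + MAX_WBITS).
    stream = zlib.decompressobj(wbits=31)
    out = bytearray()
    # Feed compressed input in chunks and grow the output buffer as
    # decompressed bytes arrive; the final size is never known up front.
    for i in range(0, len(data), chunk_size):
        out += stream.decompress(data[i:i + chunk_size])
    out += stream.flush()
    return bytes(out)
```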

@Wavesonics Wavesonics force-pushed the http-gzip branch 2 times, most recently from 647b0ec to 2515c14 Compare May 24, 2020 02:46
@Faless (Collaborator) commented May 25, 2020

I haven't had time to do a full review yet, but I'm unsure about adding a new request_raw method this way, as it causes even more code duplication.
I know HTTPClient also has request and request_raw methods with two separate implementations, but from what I can see they run the same code, just with a different parameter type.
I wonder if we should make request internally call request_raw(..., p_body.utf8().get_data()).
Am I forgetting something here?

EDIT: At a first glance the PR looks good, I'll try to review it ASAP 🏅 .

@Wavesonics (Author)

@Faless I had the same thought while implementing that in HttpRequest, but I just tried to follow the existing patterns as closely as possible.

I can certainly refactor it to all just go through request_raw.

One other question: I was thinking it'd be nice to add decompress_dynamic (whose name I'm also not in love with, so let me know if you can think of a better one) to the PackedByteArray interface for GDScript/GDNative, but I can't make heads or tails of variant_call.cpp. I can add a function there similar to _call_PackedByteArray_decompress, but I can't figure out how to actually register it.

@Faless (Collaborator) commented May 25, 2020

I can add a function there similar to _call_PackedByteArray_decompress, but I can't figure out how to actually register it.

See the ADDFUNC{N}R macros (N = number of parameters, R = has a return value) and their usage (e.g. for decompress).

@Wavesonics (Author) commented May 26, 2020

OK, I updated HttpRequest so it doesn't duplicate code between request and request_raw. Under the hood, everything now goes through request_raw.

I also added a switch, accept_gzip, that can disable all of this automagical compression behavior altogether, so users can keep the current behavior. Together with the addition of request_raw, this gives users complete control over their request/response bodies.

Lastly, I added decompress_dynamic to PackedByteArray as we discussed. However, while testing it, it looks like I may have uncovered a bug (or I could just be doing something wrong); this GDScript code should work as far as I understand things:

	var plain := "Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request Test Gzipped Request "
	var utf8Bytes := plain.to_utf8()
	print("utf8Bytes: %d" % utf8Bytes.size())

	var compressedBytes := utf8Bytes.compress(File.COMPRESSION_GZIP)
	print("compressedBytes: %d" % compressedBytes.size())

	var decompressedBytes := compressedBytes.decompress_dynamic(File.COMPRESSION_GZIP)
	print("decompressedBytes: %d" % decompressedBytes.size())
	print(decompressedBytes)

However, it appears that compression fails, returning a zero-length buffer:

utf8Bytes: 756
compressedBytes: 0

@Wavesonics (Author)

PackedByteArray::get_string_from_utf8 also appears to fail. I'm taking a look, but if anyone has any insight here I'd be grateful. :P

func _on_HTTPRequest_request_completed(result, response_code, headers, body: PackedByteArray):
	print("HTTP Response: %d" % response_code)
	print(headers)
	
	print("result: " + str(result))
	print("Body Len: %d" % body.size())
	
	print("Body to utf8")
	var bodyStr := body.get_string_from_utf8()
	print("Body to utf8 DONE")
	print("Body UTF8 Len: %d" % bodyStr.length())
	print("'%s'" % bodyStr)

Output:

result: 0
Body Len: 15516
Body to utf8
Body to utf8 DONE
Body UTF8 Len: 0
''

body.get_string_from_utf8() produces a zero length string.

@Wavesonics (Author)

Huh, I don't quite understand the most recent build failure:
https://travis-ci.org/github/godotengine/godot/jobs/691511953

Does anyone who understands the inner workings of variant_call.cpp have any insights?

@Wavesonics (Author)

While working on this PR, I discovered that the GDScript methods for PackedByteArray were broken due to the switch away from the Pooled Array types. I have another PR here that fixes those methods: #39100

@Wavesonics (Author)

@Faless I don't know if you've had a chance to take a look, but one of the earlier commits here does get rid of the redundant code in request() and request_raw(), as you requested. Whenever you have time, let me know if that satisfies what you were asking for.

@Wavesonics (Author)

I had some time to do some more reading, and it turns out that gzipping request bodies is quite rare, and most servers don't support it out of the box.

It is still done in some special cases (there is a game analytics service I use that tells you to do it), but it's rare enough that I don't think it makes sense to include in the base class.

As part of this PR I exposed request_raw() to users of HTTPRequest, which allows people who need to gzip their request body to do so manually. Previously you had to drop down to HttpClient to do that.

@Faless (Collaborator) left a review:

I'm a bit concerned about decompress_dynamic. I think it should still have a max_decompressed_size.

IIRC gzip has a maximum compression factor of about 1024, which means a server could send 5 MiB that gets unpacked into 5 GiB in memory, almost certainly crashing the client.
I know it's a very unlikely scenario, but maybe we could tie it to the body_size_limit of HTTPRequest (though by default it would still be -1, i.e. unbounded).

Review comments (now outdated/resolved) were left on core/io/compression.cpp, core/io/compression.h, scene/main/http_request.cpp, and scene/main/http_request.h.
@Wavesonics (Author)

Thanks for taking a look, @Faless! I fixed all of the trivial things.

For the max body size, I have two thoughts. Looking at line 377 of http_request.cpp:

				// No body len (-1) if chunked or no content-length header was provided.
				// Change your webserver configuration if you want body len.
				body_len = client->get_response_body_length();

				if (body_size_limit >= 0 && body_len > body_size_limit) {
					call_deferred("_request_done", RESULT_BODY_SIZE_LIMIT_EXCEEDED, response_code, response_headers, PackedByteArray());
					return true;
				}

Maybe add a check along the lines of: if gzip and body_len exceeds some maximum, fall into that error condition.

That has a big downside, though. While gzip has a maximum compression ratio of roughly 1:1000, a more typical ratio for HTML seems to be about 1:10, so I don't think any magic number we plug in here would handle the edge cases.

body_len is the transport size of the body (the compressed size, in this case), and it doesn't tell us anything useful about the potential uncompressed size, since the ratio could be anywhere between 10x and 1000x.

So my second idea is to enforce a limit during decompression: add a new limit argument to decompress_dynamic(), make the limit configurable on HTTPRequest, and if decompress_dynamic() hits it, have HTTPRequest return a new error code similar to RESULT_BODY_SIZE_LIMIT_EXCEEDED.
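The proposed limit check can be sketched like this (Python for illustration; the parameter name max_output_size comes from the discussion, the rest is hypothetical):

```python
import zlib

def decompress_with_limit(data: bytes, max_output_size: int) -> bytes:
    # wbits=31 expects a gzip wrapper. Ask for at most one byte more than
    # the cap: if we actually get it, the payload is over the limit and we
    # bail out instead of inflating a potential decompression bomb.
    stream = zlib.decompressobj(wbits=31)
    out = stream.decompress(data, max_output_size + 1)
    if len(out) > max_output_size:
        raise ValueError("body size limit exceeded")
    return out
```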

@Wavesonics (Author) commented Jun 24, 2020

OK, I implemented the second idea: decompress_dynamic() now accepts a new parameter, max_output_size. If the output exceeds max_output_size bytes, it stops processing and returns an error.

HTTPRequest now passes body_size_limit to decompress_dynamic(), and if it returns an error as a result, then HTTPRequest will return an error to the caller.

@Wavesonics Wavesonics requested review from Faless and removed request for a team June 24, 2020 00:29
@Faless (Collaborator) commented Jun 25, 2020

HTTPRequest now passes body_size_limit to decompress_dynamic(), and if it returns an error as a result, then HTTPRequest will return an error to the caller.

Yeah, that's good. There is still a styling issue reported by CI, and an integer comparison that needs a cast in compression.cpp (strm.total_out > p_max_dst_size): https://travis-ci.org/github/godotengine/godot/jobs/701471185

@Wavesonics (Author)

I'm not quite sure how to squash with merge commits in the history. I know GitHub can squash on merge; might that work?

@akien-mga (Member)

@Wavesonics (Author)

Thanks @akien-mga, guess I should read that before making future PRs. :P

Should be all squashed now.

@Wavesonics (Author)

It's a little difficult to test at the moment, because PackedByteArray's get_string methods need this to work: #39100

So the response body can't be easily printed to the terminal at the moment.

@Wavesonics Wavesonics changed the title HttpRequest now handles gzipping response and (optionally) request bodies HttpRequest now handles gzipping response bodies Jul 27, 2020
@Wavesonics Wavesonics force-pushed the http-gzip branch 2 times, most recently from 3f6f1d0 to 0b73d3a Compare September 2, 2020 19:34
Added request_raw to HttpRequest
Added decompress_dynamic to Compression class
Added decompress_dynamic to BytePoolArray

Merge doc fix

revert
@Wavesonics (Author) commented Sep 2, 2020

@Faless I fixed the merge issue from a recent docs change in master. Are there any remaining issues from your previous review and requested changes?

@Faless (Collaborator) commented Sep 5, 2020

@Wavesonics No, I think this is good now, but it depends on #39100 (and possible changes to GDScript's decompress_dynamic). Good work 👍! We'll try to speed up the decision on the blocker and then merge this as soon as possible.

@Wavesonics (Author)

#39100 should be good now (or at least I cleaned up the debug message that was there).

What sort of changes were you thinking for decompress_dynamic?

@Faless (Collaborator) left a review:

Looking good! Great job 🥇 !

@Faless Faless merged commit 2cb6b2a into godotengine:master Sep 7, 2020
@Wavesonics Wavesonics deleted the http-gzip branch December 2, 2020 17:32
@tavurth (Contributor) commented May 3, 2021

Was just looking for this, glad to see it implemented! 🎉

Any chance this is back-portable?

@Calinou (Member) commented May 3, 2021

Was just looking for this, glad to see it implemented! 🎉

Any chance this is back-portable?

It looks feasible while keeping backwards compatibility. The main difficulty is refactoring the copied code in variant_call.cpp, since that part has changed a lot in the master branch compared to 3.x. Feel free to give it a try 🙂
