Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aiohttp client #421

Merged
merged 8 commits into from
May 7, 2020
Merged

Conversation

joshuahlang
Copy link
Contributor

Initial attempt at an aiohttp-client integration. Still needs some work to document the usage at a minimum.

@joshuahlang joshuahlang requested a review from a team February 14, 2020 02:28
@joshuahlang joshuahlang force-pushed the aiohttp-client branch 3 times, most recently from 2d8b1ec to ff58602 Compare February 15, 2020 01:16
@GoPavel
Copy link
Contributor

GoPavel commented Feb 16, 2020

I think that trace_config's hooks on request start, request end, and request exception are good, but not enough. Your typical usage of aiohttp:

async with session.get(...) as response:
     # this code out of the span
     ...

Currently, trace hooks track the life-time of aiohttp.ClientRequest, but when errors occur with response(JsonError, from raise_for_status) your span not fail and have StatusCanonicalCode.OK.
So you should track the life-time of aiohttp.ClientResponse. It is also important for stream request, where any problem with stream connection should be detected by tracing:

async with session.get(...) as response:
    # now ClientRequest is closed.
   async for chunk in response.content:
       # this code out of the span
       ...

I tried to find a proper way to hook it, but on_response_chunk_received seems not to work.
So I can suggest only replace response_class with a wrapper.
My minimalistic response wrapper example:

class ClientResponseWithTracing(aiohttp.ClientResponse):
    def _start_span(self):
        ctx = tracer.start_as_current_span(self.url, kind=SpanKind.CLIENT)
        span_exit = ctx.__exit__
        span: Span = ctx.__enter__()
        setattr(self, '_span', span)
        setattr(self, '_exit_span', span_exit)

    def _release_span(self):
        if not hasattr(self, '_span'):
            return
        span: Span = getattr(self, '_span')
        exc_val: Optional[BaseException] = self.content.exception()
        if exc_val is not None:
            add_exception_to_events(exc_val, span)
            exc_type, exc_tb = type(exc_val), exc_val.__traceback__
            span.set_status(Status(StatusCanonicalCode.INTERNAL, exc_type.__name__))
            getattr(self, '_exit_span')(exc_type, exc_val, exc_tb)
        else:
            getattr(self, '_exit_span')(None, None, None)
        delattr(self, '_span')

    def close(self):
        self._release_span()
        super().close()

    def release(self) -> Any:
        self._release_span()
        super().release()

async def on_request_end_hook(session, trace_config_ctx, params):
    ...
    params.response._start_span()

@joshuahlang
Copy link
Contributor Author

Calling raise_for_status shouldn't prevent the trace status from being inferred from the HTTP status code. That happens prior to the aiohttp.ClientResponse object being returned to the caller. You're right that the on_response_chunk_received callback isn't very functional. It appears only to be triggered when calling aiohttp.ClientResponse.read here.

This raises an interesting question: What should be included in the span itself? E.g. if the server returns a non-JSON response, but the client tries to parse it as JSON (e.g. calling aiohttp.ClientResponse.json()) should that be reported as an error in the span for the HTTP request? It probably should for a connection related error, such as the server failing to send the complete response content. It becomes a bit more questionable if the server returns a content-type encoding that doesn't match the actual content encoding, and even more still if the client incorrectly calls aiohttp.ClientResponse.json() on a response

@GoPavel
Copy link
Contributor

GoPavel commented Feb 18, 2020

I want to pay attention to the stream queries. We can receive status 200 and after that, the stream can suddenly close with ClientPayloadError. I am certain that this exception related to the request span.

I have been a bit hasty when passing any exception to _exit_span in my ClientResponseWithTracing. Maybe the best-effort way is to try to decide whether the error related to request problems or related to business logic error.
For example:

if  isinstance(exc_val, aiohttp.ClientError) or isinstance(exc_val, json.JSONDecodeError):
   add_exception_to_events(exc_val, span)
   exc_type, exc_tb = type(exc_val), exc_val.__traceback__
   span.set_status(Status(StatusCanonicalCode.INTERNAL, exc_type.__name__))
   getattr(self, '_exit_span')(exc_type, exc_val, exc_tb)
else:
   getattr(self, '_exit_span')(None, None, None)

@GoPavel
Copy link
Contributor

GoPavel commented Feb 18, 2020

Calling raise_for_status shouldn't prevent the trace status from being inferred from the HTTP status code. That happens prior to the aiohttp.ClientResponse object being returned to the caller. You're right that the on_response_chunk_received callback isn't very functional. It appears only to be triggered when calling aiohttp.ClientResponse.read here.

Yes, you right. I tested with the default behavior when you call raise_for_status manually. Maybe, in that case, the user is responsible for HTTP status handling.

This raises an interesting question: What should be included in the span itself? E.g. if the server returns a non-JSON response, but the client tries to parse it as JSON (e.g. calling aiohttp.ClientResponse.json()) should that be reported as an error in the span for the HTTP request? It probably should for a connection related error, such as the server failing to send the complete response content. It becomes a bit more questionable if the server returns a content-type encoding that doesn't match the actual content encoding, and even more still if the client incorrectly calls aiohttp.ClientResponse.json() on a response

Maybe OT-spec will state something about that in the future.

@c24t
Copy link
Member

c24t commented Feb 27, 2020

@joshuahlang is this ready for review?

@joshuahlang joshuahlang force-pushed the aiohttp-client branch 2 times, most recently from 17f907f to 0384670 Compare February 28, 2020 02:45
@joshuahlang
Copy link
Contributor Author

There's a few open issues in the comments already, but does can be addressed in the formal review. I'll remove the WIP prefix once I finish the unit tests, which hopefully will be by EOD 2020-02-28 PST

@joshuahlang joshuahlang force-pushed the aiohttp-client branch 2 times, most recently from cda2d24 to 958aa88 Compare February 29, 2020 02:16
@joshuahlang joshuahlang changed the title WIP: aiohttp client aiohttp client Feb 29, 2020
@joshuahlang
Copy link
Contributor Author

@joshuahlang is this ready for review?

Ready now. Still an open need to update the eachdist.py/tox.ini to not attempt to install eggs on versions of python which they don't support. E.g. this module (ext-aiohttp-client) is not supported on Python 3.4 and causes the 3.4 coverage task to fail.

Copy link
Member

@toumorokoshi toumorokoshi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a couple minor discussions on response code standards, but looks good otherwise!

)
self.InMemoryExporter.clear()

def test_url_filter_option(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe query parameters should be a part of the atttribute:

https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/data-http.md#common-attributes

Full HTTP request URL in the form scheme://host[:port]/path?query[#fragment]. Usually the fragment is not transmitted over HTTP, but if it is known, it should be included nevertheless.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the client wants to strip them for whatever reason, then they should be removed from the span completely. This is merely testing that a client-provided callback is actually called and used to process the URL before adding it to the span. In this case it simply removes all query params, but likely clients will use it to selectively remove API keys or PII from query params.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah ok. a little nervous that the test is one that causes behavior which violates the spec, but sounds good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the spec should be updated to account for this? Some things still pass secrets or PII in query strings. May be good to acknowledge that in the spec?

@toumorokoshi
Copy link
Member

Currently, trace hooks track the life-time of aiohttp.ClientRequest, but when errors occur with response(JsonError, from raise_for_status) your span not fail and have StatusCanonicalCode.OK.
So you should track the life-time of aiohttp.ClientResponse. It is also important for stream request, where any problem with stream connection should be detected by tracing:

This is an interesting one. in that context, what does on_request_end really mean? docs seem a little sparse: https://docs.aiohttp.org/en/stable/tracing_reference.html#aiohttp.TraceConfig.on_request_end

This raises an interesting question: What should be included in the span itself? E.g. if the server returns a non-JSON response, but the client tries to parse it as JSON (e.g. calling aiohttp.ClientResponse.json()) should that be reported as an error in the span for the HTTP request? It probably should for a connection related error, such as the server failing to send the complete response content. It becomes a bit more questionable if the server returns a content-type encoding that doesn't match the actual content encoding, and even more still if the client incorrectly calls aiohttp.ClientResponse.json() on a response

Agree with @GoPavel that this is a good discussion for the spec. But today, HTTP instrumentation do not concern themselves with handling mismatched content types, or anything to do with processing the response, except for the timing.

One major implementation philosophy of OpenTelemetry is to not interfere with the application runtime as much as possible. I think content type validation would be significant overhead we don't want to be involved in.

This module is only supported on Python3.5, which is the oldest supported by
aiohttp.
@joshuahlang
Copy link
Contributor Author

@toumorokoshi Had some time this weekend to rebase this off the latest in OTel.

@toumorokoshi
Copy link
Member

@joshuahlang great! Looks like CI is still failing but seems unrelated to your changes, taking a look

scripts/jaeger.sh Outdated Show resolved Hide resolved
Copy link
Contributor

@codeboten codeboten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for coming back to this! I've added some minor comments but overall it looks great. I'm requesting changes because of the following missing files: CHANGELOG.md, LICENSE, MANIFEST.in.

Hopefully they'll be pretty quick to add, you can see an example here: https://github.com/open-telemetry/opentelemetry-python/tree/master/ext/opentelemetry-ext-requests

Copy link
Contributor

@ocelotl ocelotl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for the record, we are not adding instrumentations (previously known as integrations) as children of BaseInstrumentor. Nevertheless, this can be approved now and migrated later to an instrumentation if that is more convenient, I guess.

scripts/coverage.sh Show resolved Hide resolved


# TODO: refactor this code to some common utility
def http_status_to_canonical_code(status: int) -> StatusCanonicalCode:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code looks very similar to this one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does, hence the TODO. Not sure where it could be shared from though. opentelemetry-sdk perhaps? I'm not very familiar with the OTel project structure

Comment on lines +157 to +160
elif callable(trace_config_ctx.span_name):
request_span_name = str(trace_config_ctx.span_name(params))
else:
request_span_name = str(trace_config_ctx.span_name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really necessary to have an argument that can be either a callable or a string? Is it feasible that the span_name argument is only a string type argument and leave the processing of params (if necessary) to the caller of on_request_start who will have the responsibility of providing the span name to on_request_start?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on_request_start is called by aiohttp internally. aiohttp doesn't know the caller's context (e.g. that a request has a sensitive value in the query string). The higher-level caller instrumenting aiohttp.client doesn't have access to mutate the params directly. params is an internal construct used by aiohttp for tracing.

Copy link
Contributor

@codeboten codeboten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like only a couple of unused imports left to cleanup in the tests, otherwise this is looking good. Thanks!

@toumorokoshi toumorokoshi merged commit e800d26 into open-telemetry:master May 7, 2020
@toumorokoshi
Copy link
Member

@joshuahlang thanks so much for all the work! Exciting to get this in.

codeboten pushed a commit to codeboten/opentelemetry-python that referenced this pull request May 8, 2020
Adding initial aiohttp client.

This module is only supported on Python3.5, which is the oldest supported by
aiohttp.

Co-authored-by: Yusuke Tsutsumi <yusuke@tsutsumi.io>
srikanthccv pushed a commit to srikanthccv/opentelemetry-python that referenced this pull request Nov 1, 2020
* chore: update README

* chore: update README

* fix: review comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants