-
Notifications
You must be signed in to change notification settings - Fork 354
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: reuse HTTP sessions #9116
Conversation
✅ Deploy Preview for determined-ui canceled.
|
harness/determined/cli/_util.py
Outdated
@@ -99,6 +100,22 @@ def setup_session(args: argparse.Namespace) -> api.Session: | |||
) | |||
|
|||
|
|||
def session( | |||
fn: Callable[[argparse.Namespace, api.Session], None] | |||
) -> Callable[[argparse.Namespace, api.Session], None]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
) -> Callable[[argparse.Namespace, api.Session], None]: | |
) -> Callable[argparse.Namespace, None]: |
cause the wrapped function does not take 2nd api.Session
param anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
if self._http_session: | ||
self._http_session.close() | ||
self._http_session = self._make_http_session() | ||
return self |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this changes the semantics a bit. I think the old method and with_
naming and its docstring indicated you make a new session like the old one but with this one setting different. you can continue using the old session object with the old setting.
now, the underlying object is updated in place with the new setting. it does not seem like this method is ever used in a way where the old object is still preserved. so the change in semantics is fine. but maybe
- update the docstring to reflect the different behavior
- consider renaming it to
set_retry
or something to reflect it updates the thing itself
params=None, | ||
headers=None, | ||
timeout=None, | ||
stream=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aren't all these parameters redundant?
@@ -110,6 +111,26 @@ def read_paginated( | |||
offset = pagination.endIndex | |||
|
|||
|
|||
def read_paginated_with_session( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: add an optional session
to read_paginated
? What does having separate functions get us?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually... I don't get why this was necessary to do at all. get_with_offset
has a session
already referenced. Why would we need do give it a different one rather than the one that tends to be in scope when it's defined?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is partially because i initially implemented it in read_paginated
but in a later commit decided to separate it out, and in doing so, left out the code that actually mattered 🙃 i fixed it, so that now read_paginated_with_session
is responsible for the session context, and get_with_offset
uses the session it's given.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would we need do give it a different one rather than the one that tends to be in scope when it's defined?
because the scope is different. get_with_offset
is a singular API call that can't manage a persistent session. imagine:
with self._session as sess:
def get_with_offset(int):
return bindings.get_Trials(
session=sess
)
# if we call read_paginated here, it doesn't seem right that it doesn't take in a session
resp = read_paginated(get_with_offset)
# if we call it here outside the session context, the session context is useless
resp = read_paginated(get_with_offset)
it makes sense to me that read_paginated
has the responsibility of entering/exiting the session context since it knows the "context" of requesting multiple pages in succession. then get_with_offset
just uses the session read_paginated_with_session
gives it.
i agree it's not great, but given the way read_paginated
is implemented with the separate get_with_offset
, this is the cleanest way i could find to do it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: add an optional session to read_paginated? What does having separate functions get us?
tried this at some point too. this was v1 lol. main issue is that callables can't have optional default arguments, so every get_with_offset
would have to explicitly be written with in session=None
in their signature. that seems like a very annoying change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implicit-session interface to read_paginated
looks to me like:
- construct a method that maps an int to a list of things starting at that number
- exercise the method to get a complete list of things
The explicit-session interface is:
- construct a method that uses a session to map an int to a list of things starting at that number
- if the Session passed to
read_paginated
has an existing HTTP connection, close it - create a new persistent HTTP connection in the session passed to
read_paginated
- use the modified Session to call the master and get your complete list
- close the persistent HTTP connection in the passed Session
Modifying the Session seems bad to me. If it turns out we need an explicit-session interface, then I think it should either rely on persistent-HTTP management from the caller or create a brand-new session. Relying on the caller to manage it seems like the more natural interface from the point of view of a caller for the given signature.
Your intuition and mine differ about how
resp = read_paginated(get_with_offset)
looks without a passed session. I think read_paginated
/get_with_offset
interface is bad, but given the relationship between them, it feels more natural to me for the session to be implicit in get_with_offset
. Whether that's closed with read_paginated
or closed by the caller of read_paginated
, I don't so much care.
I don't think it's necessary to support the use case in the example above. I don't think a caller should expect that that use case works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following up on this, I think there's an analog in trying to manage SDK resources in a world where Sessions may be closed at any point.
with authentication.login(<...>) as sess:
exp = experiment.Experiment(id=1, session=sess)
exp.pause()
I don't think this is a use case we should try to support, either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed additional changes for read_paginated
pages=api.PageOpts.single if max_results else api.PageOpts.all, | ||
) | ||
with self._session as sess: | ||
resps = api.read_paginated( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still one of my least-favorite parts of the SDK. Would a future design want to build this into BaseSession
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't love it either, but i dunno. could also see read_paginated
belonging somewhere in bindings
. definitely out of scope for this PR
resps = api.read_paginated( | ||
get_with_offset=get_with_offset, | ||
pages=api.PageOpts.single if max_results else api.PageOpts.all, | ||
session=sess, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As currently written with separate read_paginated
and read_paginated_with_session
, I don't understand how this works. session
isn't an argument of read_paginated
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it doesn't lol, i initially implemented read_paginated
to take in session and just didn't change all the implementations.
raise errors.APIException(r) | ||
|
||
return r | ||
def _make_requests_session( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What difference should the reader make between a requests_session
and an http_session
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to me like a BaseSession._http_session
is better named a _requests_session
, FWIW. The word "http" reads to me like it's supposed to mean "not https".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok -- suggestion: replace all the _make_http_session
methods with calls to _make_requests_session
that pass explicit values for all these parameters. That way I don't have to cross-reference attributes and figure out what they may have been set to from elsewhere (with_retry
took a lot of thinking before I felt comfortable that it was correct)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, now I see that isn't possible. This seems like a smell. Thinking about how to fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well... this would be simpler if, like in requests, the resources were created during __init__
. In requests.sessions.Session
:
def __enter__(self):
return self
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any danger in doing it this way? I don't think a connection will be opened inside the requests Session until a GET, etc. is called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, i tried this at some point. i think the awkwardness here goes back to wanting to support the existing behavior of keeping direct calls single-use sessions.
but it was my first instinct as well to initialize the session in the init. it ended up being more awkward, because then there's non-obvious logic that would be required in _do_request
on when to close them.
in the end i felt like it was much more explicit to create resources as the behavior dictates: only on entering a context manager do we start a persistent session, and _do_request
will use a persistent session if it exists, otherwise it will create a one-time session that only exits for the scope of the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because then there's non-obvious logic that would be required in
_do_request
on when to close them.
Could we talk through this? I think it would be nice to know that resources are created rather than have to puzzle out context.
requests_session.mount( | ||
"https://", _HTTPSAdapter(server_hostname=server_hostname, max_retries=max_retries) | ||
) | ||
requests_session.mount("http://", adapters.HTTPAdapter(max_retries=max_retries)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The HTTPAdapter
for https://
has a server_hostname
and the one for http://
doesn't. Is that a problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NM. I get it now. Where did this pattern come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't need to worry about custom TLS certs if it's not HTTPS?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant the pattern for setting a hostname inside the subclassed HTTPAdapter
. I assume it's good practice, but when I went to look for examples out there on the internet to learn more about it I wasn't very successful. Where'd the pattern come from?
return r | ||
def _make_requests_session( | ||
server_hostname: Optional[str] = None, | ||
verify: Optional[Union[str, bool]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When verify
is optional, then I have to figure out what three states mean. I think it's slightly easier (and means the same thing) if it's just a Union[str, bool]
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we probably could default None to True, but it's not that obvious and i'm not sure that it's this method's responsibility to make that default decision.
verify
comes directly from cert.bundle
, which has Optional[Union[str, bool]]
, along with some not-super-trivial logic to determine which type it is:
if noverify:
self._bundle = False # type: Union[None, str, bool]
elif cert_pem is None:
self._bundle = None
else:
self._bundle = str
requests also takes in the same type and has some logic for handling None (ultimately it treats it as true AFAICT, but it's not immediately obvious).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm only now looking at the source for how requests.Sessions
are initialized.
...
ok. So a requests.Session
has a verify
attribute that's set to True
at init but which the docs say you can falsify later on. FWIW, I don't see mention of setting it to None
.
requests.Session.request
does take an Optional
verify
when constructing its request, and that logic is awful:
- if
request.verify \in (True | None)
, and a bundle-string exists in an environment variable, then overwriterequest.verify
with that bundle-string - if now
request.verify is None
, then use Session.verify for the request - if
request.verify is not None
, use whateverrequest.verify
now is for the request
ok. It's time to send the request. Now it's up to the Adapter to decide what to do with verify
. Looking at the source, the comment on the base class says that verify
should be either a string or a bool. This library is untyped so whatever, but says nothing about the possibility that it's a None
.
I think it's a type error at this point for verify
to be a None
unless the adapter we subclass it with explicitly handles that case. Ours does not, but having earlier set request.verify
to None
manually does mean that possibility is there.
My conclusion from this is that for our use case we should not set a requests.Session.verify
to None
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
defaulting this to true now
harness/determined/core/_context.py
Outdated
@@ -39,6 +39,7 @@ class Context: | |||
|
|||
def __init__( | |||
self, | |||
_session: api.Session, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like that I don't understand how this works. I can't tell how much of that is my fault, how much the PR's. When you're ready to talk about this part, could we chat?
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #9116 +/- ##
==========================================
+ Coverage 44.59% 44.61% +0.01%
==========================================
Files 1273 1273
Lines 155832 155888 +56
Branches 2439 2439
==========================================
+ Hits 69501 69555 +54
- Misses 86092 86094 +2
Partials 239 239
Flags with carried forward coverage won't be shown. Click here to find out more.
|
3ce4af4
to
906c21f
Compare
|
||
|
||
def test_session_non_persistent_http_sessions() -> None: | ||
with api_server.run_api_server(address=("localhost", 8080)) as master_url: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I think api_server.run_api_server
was a design mistake, and that we should either mock python endpoints or HTTP calls with responses
. I think this would be plenty easy to do here, and would be clearer to the reader, in addition.
I don't think you have to migrate, but I do think it'd be better for our test health to stop using the fake master.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
believe me i really didn't want to use this. tried both mocking endpoints and responses
. the underlying HTTP connection pools only exist for real connections
sortByMetric=sort_by if isinstance(sort_by, str) else None, | ||
states=[bindings.checkpointv1State.COMPLETED], | ||
) | ||
with self._session as sess: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm uncomfortable modifying a session as a side effect when list_checkpoints
is called. What if the session stored in self._session
already had a stored _http_session
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With list_checkpoints
coded in this way, is there any way to use the same HTTP connection for two successive calls?
with api.login(<details>) as sess:
persistent_det = Determined.from_session(sess)
persistent_det.get_experiment(<id>).list_trials() # Done with a single HTTPConnection (probably)
persistent_det.get_experiment(<id>).list_checkpoints() # closes the PoolManager in the session in persistent_det
persistent_det.get_experiment(<id>).list_trials() # persistent_det no longer persistent; new HTTPConnection per call to master```
eb249e4
to
5de8377
Compare
Description
Introduce the option of using persistent HTTP sessions in
api.Session
.Previously, each API call was wrapped in its own
requests.Session
, which meant making a new underlying TCP connection/TLS handshake every time. Sometimes this is necessary because we can't know when to close the session, and we don't want to leave too many dangling HTTP connections open. But in many cases, we make lots of API calls in quick succession within known bounds, and in these cases using a persistent HTTP session significantly improves performance:Reporting metrics in Core API/Detached mode:
Not using persistent HTTP sessions: 13-14s
With a persistent HTTP session: 6-7s
CLI calls that fetch multiple pages:
i.e.
list-experiments
,list-checkpoints
Before: 0.75s
After: 0.3s
api.Session
as a context managerThis PR exposes the option of re-using the same underlying HTTP session through
api.Session
as a context manager:When used directly, each HTTP call will create a new session every time. This is the current behavior, so existing code will continue to function the same way:
This functionality is implemented in Core API, where all internal API calls within a
core.Context
(with core.init(...) as core_context:
) will now share the same persistent HTTP connection which is closed when thecore_context
context exits.CLI
In the CLI,
api.Session
as a context manager is implemented as a decorator, and when used, exposes a session that is persistent within the decorated function:I only changed a few select APIs/CLIs to use the new functionality, mostly as examples of how it would be used. There are many places that would see performance benefits from persistent sessions, but I imagine that will probably be a separate effort/PR.
Other
api.Session
refactorsSome other small refactors made to
api.Session
:harness/determined/common/requests.py
was deleted, its logic consolidated intoharness/determined/common/api/_session.py
Test Plan
Commentary (optional)
Checklist
docs/release-notes/
.See Release Note for details.
Ticket