ModuleRouter: support paths in BASE #405

Open
wants to merge 11 commits into
base: master

Contributor

@bajnokk bajnokk commented Jul 20, 2022

If Satosa is installed under a path which is not the root of the
web server (e.g. "https://example.com/satosa"), then endpoint routing must
take the base path into consideration.

Modules registered some of their endpoints with the base path included
and others without it, which made the routing fail. Now all endpoint
registrations include the base path in their endpoint map.
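
To illustrate the intent, here is a minimal sketch of how an endpoint pattern could be built so that it always carries the base path. The helper names `endpoint_pattern` and `matches` are hypothetical and are not part of SATOSA; the sketch assumes that routes are matched against the request path without a leading slash.

```python
import re
from urllib.parse import urlparse


def endpoint_pattern(base_url, module_name, endpoint):
    """Build a route regex that includes the path component of base_url.

    Hypothetical helper for illustration only.
    """
    # "https://example.com/satosa" -> "satosa"; "https://example.com" -> ""
    base_path = urlparse(base_url).path.strip("/")
    # Skip empty components so a root installation gets no stray separator.
    parts = [p for p in (base_path, module_name, endpoint) if p]
    return "^{}$".format("/".join(parts))


def matches(base_url, module_name, endpoint, request_path):
    """Check a request path (leading slash stripped) against the pattern."""
    pattern = endpoint_pattern(base_url, module_name, endpoint)
    return re.match(pattern, request_path.lstrip("/")) is not None
```

With a BASE of "https://example.com/satosa" the pattern becomes `^satosa/Saml2/sso/redirect$`, so a request for `/Saml2/sso/redirect` without the base path no longer matches.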

Additionally, DEBUG logging was configured for the tests so that the
debug logs are accessible during testing.

Fixes #404

All Submissions:

  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?
  • Have you added an explanation of what problem you are trying to solve with this PR?
  • Have you added information on what your changes do and why you chose this as your solution?
  • Have you written new tests for your changes?
  • Does your submission pass tests?
  • This project follows PEP8 style guide. Have you run your code against the 'flake8' linter?

bajnokk and others added 2 commits March 8, 2023 12:38
If Satosa is installed under a path which is not the root of the
web server (e.g. "https://example.com/satosa"), then endpoint routing must
take the base path into consideration.

Some modules registered some of their endpoints with the base path
included, but other times the base path was omitted, thus it made the
routing fail. Now all endpoint registrations include the base path in
their endpoint map.

Additionally, DEBUG logging was configured for the tests so that the
debug logs are accessible during testing.

Rebased to current master. When composing the paths, use os.path.join
primarily, since it handles empty strings and duplicate separators
logically.

As long as we use the BASE_URL in the OpenID Connect frontend as an
issuer, it's not possible to create multiple provider discovery URLs.
Add documentation and a comment to explain this limitation.

Avoid messing up README.md with an unwanted line break.

# See https://openid.net/specs/openid-connect-discovery-1_0.html#ProviderConfig
# Unfortunately since the issuer is always `base_url` for all OIDC frontend instances,
# the discovery endpoint will be the same for every instance.
# This means that only one frontend will be usable for autodiscovery.
Member

OpenID Providers supporting Discovery MUST make a JSON document available at the path formed by concatenating the string /.well-known/openid-configuration to the Issuer.

The issuer is discovered through a WebFinger request for resources of http://openid.net/specs/connect/1.0/issuer relation. The response contains one or more href properties with the Issuer URL, which is allowed to contain a path.

What we define is that the frontend contains the frontend name as a path component and under that you can query the well-known documents.

With that in mind we can have multiple frontends each with its own discovery.
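
A small sketch of the rule quoted above, assuming that each frontend's issuer is the base URL plus the frontend name. The frontend names `OIDC1` and `OIDC2` and the helper `discovery_url` are made up for illustration.

```python
WELL_KNOWN = "/.well-known/openid-configuration"


def discovery_url(issuer):
    """Append the well-known suffix to the issuer.

    Any trailing slash on the issuer is removed before appending.
    """
    return issuer.rstrip("/") + WELL_KNOWN


base_url = "https://example.com/satosa"
frontends = ["OIDC1", "OIDC2"]  # hypothetical frontend names

# Issuer includes the frontend name: every frontend gets its own document.
per_frontend = {name: discovery_url(f"{base_url}/{name}") for name in frontends}

# Issuer is always base_url: all frontends collapse onto one document.
shared = {name: discovery_url(base_url) for name in frontends}
```

In the second mapping both frontends end up with the same discovery URL, which is the limitation described in the code comment under review.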

The problem is that atm, the base_url is used instead of endpoint_baseurl.

We can introduce a configuration option to select between the two behaviours, or (even better) introduce a configuration to set the discovery URL for a frontend.


At some point I would like to invert this logic; instead of a component defining paths of URLs internally that mapped to functionality (which the routing module has to match to based on some rules), there should be URLs as entrypoints mapped to functionality directly (as it happens within most web frameworks - flask, django, fastapi, etc).

Contributor Author

@bajnokk bajnokk Mar 15, 2023

> The problem is that atm, the base_url is used instead of endpoint_baseurl.
> We can introduce a configuration option to select between the two behaviours, or (even better) introduce a configuration to set the discovery URL for a frontend.

Would you agree to add a use_module_name_in_issuer option (default False for backward compatibility, but the examples changed to True)?

A more subtle, but also harder to document, alternative would be to make the assignment

provider_config["issuer"] = base_url
optional, so that one could set the issuer manually under the provider dict.

Is either of the two OK with you, or am I misunderstanding the problem?
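
A rough sketch of the second alternative, in which the issuer is only filled in when the provider config does not already set it. The function name `resolve_issuer` is hypothetical; in the frontend this would simply be a conditional around the assignment quoted above.

```python
def resolve_issuer(provider_config, base_url):
    """Return a copy of provider_config whose issuer defaults to base_url.

    Hypothetical helper: an explicitly configured issuer is left untouched.
    """
    config = dict(provider_config)
    # setdefault only assigns when the "issuer" key is missing.
    config.setdefault("issuer", base_url)
    return config


default_case = resolve_issuer({}, "https://example.com/satosa")
explicit_case = resolve_issuer(
    {"issuer": "https://example.com/satosa/OIDC"}, "https://example.com/satosa"
)
```

Configurations that never set an issuer keep the previous behaviour, while an explicit value now takes precedence.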

Member

Yes, both are ok with me. As long as we provide a way to configure things to work as before, it is fine to introduce such changes.

Contributor Author

The updated patchset contains a commit which skips the above assignment if the provider config already has the issuer set. I did some research in git log, and as far as I can tell it has always been this way.
I added a brief explanation to the example configuration, too.
Note that this could be a breaking change for those who had a lurking "issuer" in their provider config, but since it has never been supported, I'm inclined to go this way rather than adding a new "fix-something-but-dont-break-old-config" type of configuration option.

self.name = name
self.endpoint_baseurl = os.path.join(self.base_url, self.name)
Member

os.path.join will not work on certain platforms (e.g. Windows, where the path separator is a backslash).

To join URLs with paths using a function, use urllib.parse.urljoin; but it also has caveats (paths that begin with / will be considered the root, and bases that do not end with a / will be considered a file and will be truncated).

The simplest approach is to just concatenate with a / (ie, f"{self.base_url}/{self.name}").
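
The caveats can be reproduced with the standard library alone. This sketch uses `ntpath`, the Windows implementation behind `os.path`, which can be imported on any platform; the URL and module name are example values.

```python
import ntpath
from urllib.parse import urljoin

base_url = "https://example.com/satosa"
name = "Saml2"

# On Windows, os.path is ntpath, so the joined "URL" gets a backslash.
windows_join = ntpath.join(base_url, name)

# A base without a trailing "/" is treated like a file: "satosa" is dropped.
file_like_base = urljoin(base_url, name)

# A path starting with "/" is resolved against the root of the host.
rooted_path = urljoin(base_url + "/", "/" + name)

# urljoin only gives the intended result for "base/" plus a relative path.
well_formed = urljoin(base_url + "/", name)

# Plain concatenation has no such surprises as long as the inputs are clean.
concatenated = f"{base_url}/{name}"
```

Only the last two forms produce "https://example.com/satosa/Saml2".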

Contributor Author

I always forget about Windows, good catch, thanks.

The biggest advantage of os.path.join is that it handles double slashes and empty strings intelligently. I'm thinking about adding a path_join function to util.py which would save the work of working around the empty base_path with "/".join([foo, bar]) all the time. (And I didn't want to add a Python >=3.9 dependency with str.removesuffix())

Member

I would probably name the function join_url_paths (verb first).

Note that the latest pysaml2 already requires Python 3.9 and SATOSA will be updated to require it too. IdentityPython projects try to be compatible with the python that ships on the latest Debian stable release (which is now Python 3.9).
So, requiring Python 3.9 is fine; but no newer atm.

Contributor Author

The updated patch contains a join_paths implementation, which tries to handle separators a little bit more intelligently than a simple concatenation. I've replaced all erroneous os.path.join calls with join_paths, but didn't replace "{}/{}".format(foo, bar) all over the code, since this appears way too many times.
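
For readers following along, this is one way such a helper could look. It is a sketch of the behaviour described in this thread (skip empty components, avoid a doubled separator where two components meet) and not the implementation from the patch; the signature is an assumption.

```python
def join_paths(*parts, sep="/"):
    """Join path components with sep.

    Empty components are skipped, and no separator is added where one of
    the two neighbouring components already provides it. Only a single
    duplicate at each seam is collapsed. Illustrative sketch only.
    """
    result = ""
    for part in parts:
        if not part:
            continue  # empty strings contribute nothing, not even a separator
        if not result:
            result = part
        elif result.endswith(sep) and part.startswith(sep):
            result += part[len(sep):]  # drop one of the two separators
        elif result.endswith(sep) or part.startswith(sep):
            result += part  # exactly one separator is already present
        else:
            result += sep + part
    return result
```

For example, `join_paths("https://example.com/satosa/", "/Saml2")` and `join_paths("https://example.com/satosa", "Saml2")` both give "https://example.com/satosa/Saml2", while `join_paths("", "jwks")` gives "jwks" without a leading slash.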


if backend in self.backends:
backend = self._find_backend(context.path)
Member

there was an idea here to reuse the method that is invoked to find a frontend;

see #279

Contributor Author

I must say that I can't fully understand all the _find_* methods. My motive for yet another implementation is that it looks at one very specific part of the request path and tries to find the backend name there. I'm probably mistaken, but I think that the reused version would try to match the backend endpoint, which is not present in a frontend request.
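
To make the idea concrete, here is a sketch of a lookup that only inspects the path component directly after the base path. It is not the `_find_backend` method from the patch; the function name, its arguments and the backend name "Saml2SP" are assumptions for illustration.

```python
def find_backend_name(path, base_path, backend_names):
    """Return the backend name found right after base_path, or None.

    Hypothetical helper: assumes request paths of the form
    "<base_path>/<backend>/<rest of the frontend endpoint>".
    """
    parts = path.strip("/").split("/")
    base_parts = [p for p in base_path.strip("/").split("/") if p]
    # The request must start with every component of the base path.
    if parts[:len(base_parts)] != base_parts:
        return None
    rest = parts[len(base_parts):]
    if rest and rest[0] in backend_names:
        return rest[0]
    return None


backends = {"Saml2SP"}  # hypothetical backend name
```

The lookup never consults the backend's own endpoint patterns, which is the difference from reusing the frontend-matching method.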

If Satosa is installed under a path which is not the root of the
web server (e.g. "https://example.com/satosa"), then endpoint routing must
take the base path into consideration.

Some modules registered some of their endpoints with the base path
included, but other times the base path was omitted, thus it made the
routing fail. Now all endpoint registrations include the base path in
their endpoint map.

Provide a simple implementation for joining path components, since we
don't want to add the separator for empty strings and when any of the
path components already have it.

Additionally, DEBUG logging was configured for the tests so that the
debug logs are accessible during testing.

Even though the OIDC provider configuration has an element for setting
the issuer, for some reason it was rewritten to BASE unconditionally,
which broke provider endpoint discovery when multiple OIDC
frontends were in use.
src/satosa/backends/base.py (resolved)
"^{}$".format(join_paths(urlparse(issuer).path.lstrip("/"), autoconf_path)),
self.provider_config,
)
jwks_uri = ("^{}/jwks$".format(self.endpoint_basepath), self.jwks)
Member

Suggested change
- jwks_uri = ("^{}/jwks$".format(self.endpoint_basepath), self.jwks)
+ jwks_uri = ("^{}$".format(join_paths(self.endpoint_basepath, "jwks")), self.jwks)

Contributor Author

These are not equivalent. The equivalent form would be
jwks_uri = ("^{}$".format(join_paths("/", self.endpoint_basepath, "jwks")), self.jwks)
Or are you suggesting that the leading '/' should not be present when endpoint_basepath is the empty string?

Member

oh, I didn't think that self.endpoint_basepath would be empty.
Previously, self.name was used and was always filled; in general, we do not match routes that start with /.

So, I still think this is fine, but let me know if I'm skipping over anything.
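
The difference between the two forms only shows up when the base path is empty. In this sketch a simple `"/".join` over the non-empty components stands in for join_paths, and routes are assumed to be matched against request paths without a leading slash, as described above; the function names are illustrative.

```python
import re


def old_pattern(endpoint_basepath):
    """The original form: the separator is always emitted."""
    return "^{}/jwks$".format(endpoint_basepath)


def new_pattern(endpoint_basepath):
    """The suggested form, with a stand-in for join_paths that skips
    empty components before joining with "/"."""
    joined = "/".join(p for p in (endpoint_basepath, "jwks") if p)
    return "^{}$".format(joined)


# Non-empty base path: both forms produce the same regex.
same = old_pattern("satosa/OIDC") == new_pattern("satosa/OIDC")

# Empty base path: only the suggested form matches the route "jwks".
old_hit = re.match(old_pattern(""), "jwks") is not None
new_hit = re.match(new_pattern(""), "jwks") is not None
```

With an empty base path the original form yields `^/jwks$`, which cannot match a route that never starts with a slash.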

src/satosa/frontends/openid_connect.py (resolved)
src/satosa/micro_services/base.py (resolved)
src/satosa/frontends/saml2.py (resolved)
src/satosa/frontends/saml2.py (outdated, resolved)
src/satosa/frontends/saml2.py (outdated, resolved)
src/satosa/routing.py (outdated, resolved)
src/satosa/routing.py (outdated, resolved)
bajnokk added a commit to bajnokk/SATOSA that referenced this pull request Jun 12, 2023
Add base_path and endpoint_basepath to backend and micro_services

Co-authored-by: Ivan Kanakarakis <ivan.kanak@gmail.com>
Member

@c00kiemon5ter c00kiemon5ter left a comment

Simplify the join_paths function

src/satosa/util.py (outdated, resolved)
src/satosa/util.py (outdated, resolved)
Setting an alternative issuer should not be an encouraged setup,
although provider discovery should work either way. The recommended
setting is to use the BASE as the issuer, and we can leverage the
aggressive configuration value replacement logic, which rewrites all
occurrences of <base_url> to the value of BASE. The unit test was
modified to guarantee this behaviour, though.
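
As a rough picture of what such a replacement does, the sketch below walks a nested configuration and substitutes the placeholder in every string value. It is not SATOSA's actual replacement code; the function name and the sample configuration are assumptions.

```python
def replace_base_url(value, base_url, placeholder="<base_url>"):
    """Recursively replace placeholder with base_url in all string values.

    Illustrative sketch; returns a new structure and leaves the input as is.
    """
    if isinstance(value, str):
        return value.replace(placeholder, base_url)
    if isinstance(value, dict):
        return {k: replace_base_url(v, base_url, placeholder) for k, v in value.items()}
    if isinstance(value, list):
        return [replace_base_url(v, base_url, placeholder) for v in value]
    return value  # numbers, booleans, None are passed through


# Hypothetical provider section using the placeholder as the issuer.
config = {"provider": {"issuer": "<base_url>", "scopes_supported": ["openid"]}}
resolved = replace_base_url(config, "https://example.com/satosa")
```

Writing `issuer: <base_url>` in the provider config therefore ends up with the same issuer as before, without relying on an unconditional assignment in the frontend.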
bajnokk added a commit to bajnokk/SATOSA that referenced this pull request Jun 13, 2023
@c00kiemon5ter c00kiemon5ter self-assigned this Sep 26, 2023
bajnokk added a commit to OneIdentity/SATOSA that referenced this pull request Nov 24, 2023
Add base_path and endpoint_basepath to backend and micro_services

Co-authored-by: Ivan Kanakarakis <ivan.kanak@gmail.com>

Successfully merging this pull request may close these issues.

Can't use paths in BASE