Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Use oembed for url previews? #2752

Closed
fuzzy76 opened this issue Jan 4, 2018 · 13 comments · Fixed by #10822
Closed

Use oembed for url previews? #2752

fuzzy76 opened this issue Jan 4, 2018 · 13 comments · Fixed by #10822
Labels
T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements.

Comments

@fuzzy76
Copy link

fuzzy76 commented Jan 4, 2018

Wouldn't it be an idea to use oembed for url previews if the url announces support for it?

https://oembed.com

@fuzzy76
Copy link
Author

fuzzy76 commented Jan 8, 2018

Also suggesting https://noembed.com :)

@zeratax
Copy link

zeratax commented Apr 1, 2018

very interested in this, just want to notice that this definitely needs to be optional per room, since I'm sure there are lots of people that don't want iframes to arbitrary sites even if they are okay with url previews.

edit: replaced client with room

@fuzzy76 fuzzy76 closed this as completed Apr 1, 2018
@fuzzy76 fuzzy76 reopened this Apr 1, 2018
srdjan-catalyst added a commit to srdjan-catalyst/synapse that referenced this issue Aug 6, 2021
Set `od:iframe` in json when oembed returns `<iframe/>`

Signed-off-by: Srdjan <srdjan@catalyst.net.nz>
@richvdh
Copy link
Member

richvdh commented Aug 6, 2021

Using oembed more widely would be a good fix for #9733.

@richvdh
Copy link
Member

richvdh commented Aug 6, 2021

We have both #10392 and #10536 attempting to do this. Both do so by adding new configuration options so that people can provide their own list of URL/provider mappings.

I think that's problematic, because (a) it requires everyone to configure it themselves; (b) we have more than enough config options as it is, and we should decide if we really need more. It would be much nicer to automatically base use the list at https://oembed.com/providers.json.

So here's what I think we should do.

Step one: Embed https://oembed.com/providers.json in the synapse source code repo (it's MIT-licensed, so I think we're ok on that front), and use that to build the _oembed_patterns map at startup.

Everything else is then an optional extension that can come later:

  • Consider if we need to keep our copy of providers.json up to date, and if so, how.
  • Use the autodiscovery mechanism for those sites that support it, so that we can reduce providers.json to those with discovery != true.
  • Allow specification of a custom providers.json file for people with exotic requirements.

@fuzzy76
Copy link
Author

fuzzy76 commented Aug 7, 2021

Agreed. To quote oEmbed.com:

Providers and consumers are strongly encouraged to use the discovery mechanism, rather than the registry.

So this seems to be the recommended path.

@srdjan-catalyst
Copy link

I have no problems implementing sourcing providers.json file, however we are those people with exotic requirements and will have to do a custom-providers.json or something too. Any suggestions where to place those files?

@richvdh
Copy link
Member

richvdh commented Aug 9, 2021

Any suggestions where to place those files?

I'd recommend a config option which can be used to specify the path for a custom providers.json file; if the config option is left unset, the embedded providers.json is used.

@ShadowJonathan
Copy link
Contributor

#9877 is also relevant here, though idk if thats a dup of this

@srdjan-catalyst
Copy link

Not really a dup, this is the first step and #9877 would be the next step.

@erikjohnston erikjohnston added the T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements. label Aug 23, 2021
srdjan-catalyst added a commit to srdjan-catalyst/synapse that referenced this issue Aug 30, 2021
Moved twitter patterns from preview_url_resource to oembed_endpoints

Signed-off-by: Srdjan <srdjan@catalyst.net.nz>
srdjan-catalyst added a commit to srdjan-catalyst/synapse that referenced this issue Aug 30, 2021
Replaced oembed_endpoint full providers list with oembed_providers_dir,
and provided initial providers.json
srdjan-catalyst added a commit to srdjan-catalyst/synapse that referenced this issue Aug 30, 2021
srdjan-catalyst added a commit to srdjan-catalyst/synapse that referenced this issue Aug 30, 2021
srdjan-catalyst added a commit to srdjan-catalyst/synapse that referenced this issue Aug 30, 2021
srdjan-catalyst added a commit to srdjan-catalyst/synapse that referenced this issue Aug 30, 2021
srdjan-catalyst added a commit to srdjan-catalyst/synapse that referenced this issue Aug 30, 2021
@clokep
Copy link
Member

clokep commented Aug 31, 2021

Step one: Embed oembed.com/providers.json in the synapse source code repo (it's MIT-licensed, so I think we're ok on that front), and use that to build the _oembed_patterns map at startup.

This part was partially done (and is configurable which file(s) you might want to use). For now it still only includes the same Twitter patterns as before, so there's a step one-a here which is "expand providers.json to the full file in a performant manner".

I think the other main useful step that @richvdh suggested is:

Use the autodiscovery mechanism for those sites that support it, so that we can reduce providers.json to those with discovery != true.

I think keeping the providers.json up-to-date can be left to a future thought (and is probably #9877 anyway).

@richvdh
Copy link
Member

richvdh commented Sep 1, 2021

Something that came up in the development of #10536 is that if we ship Synapse with a big list of providers by default, we'll probably need to do something more efficient than testing each URL against every single pattern in the providers file. I'd suggest splitting up the fixed bits of the host parts of the URL patterns and storing them in a tree:

                   com
                  /   \
            twitter    youtube
            /               \
[ patterns for  ]     [ patterns for  ]
[ *.twitter.com ]     [ *.youtube.com ]
[and twitter.com]     [and youtube.com]

@srdjan-catalyst
Copy link

#10536 was working on the assumption that there will always be either a "clean" domain name - eg twitter.com, or www. name. A strong assumption I admit, but then it goes straight to "clean" domain name key, it is not doing sequential testing of patterns.

@clokep
Copy link
Member

clokep commented Oct 8, 2021

I've closed this with #10822, which implements autodiscovery. That's the last big block of this piece of work. We might need to improve some performance of searching oEmbed URLs (as mentioned above (#2752 (comment)).

There's likely other bits of work we can do here, but I think they belong in new issues.

srdjan-catalyst added a commit to srdjan-catalyst/synapse that referenced this issue Mar 18, 2022
Set `od:iframe` in json when oembed returns `<iframe/>`

Signed-off-by: Srdjan <srdjan@catalyst.net.nz>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements.
Projects
None yet
7 participants