maoritelevision.com (appears to use brightcove) #24552

rsfinlayson opened this issue Mar 31, 2020 · 2 comments
Labels
broken-IE problem with existing site extraction patch-available site-support-request Add extractor(s) for a new domain

Comments

@rsfinlayson

Checklist

  • I'm reporting a new site support request
  • I've verified that I'm running youtube-dl version 2020.03.24
  • I've checked that all provided URLs are alive and playable in a browser
  • I've checked that none of provided URLs violate any copyrights
  • I've searched the bugtracker for similar site support requests including closed ones

Example URLs

Description

A username+password is needed; the following will work:
username=rf-mtv-test@live555.com,password=testtest

@rsfinlayson rsfinlayson added the site-support-request Add extractor(s) for a new domain label Mar 31, 2020
github-actions bot added a commit to hellopony/youtube-dl that referenced this issue Apr 9, 2021
@mitchelltornquist

Hi,

The site maoritelevision.com has moved to maoriplus.co.nz. However, the URL schemes appear to be the same.

I was able to use the help here: 4fb25ff, and to use the same Brightcove URL with the Maoriplus ID at the end. Hopefully an easy fix.

Thanks!

@dirkf
Contributor

dirkf commented Apr 17, 2024

Generally, please open a new issue rather than necroposting in a closed issue. But as you suggest, this may be an easy fix.

URLs for shows on the old domain redirect to https://www.maoriplus.co.nz/ (sample of 1), so we can bin the processing for those URLs.

This current show https://www.maoriplus.co.nz/show/code-the-reunion/play/6349856999112 plays in the browser from the UK with no account/cookies.

So the show's Brightcove ID, which previously had to be extracted from the page, is now in the URL itself, and the extractor change is very simple:

...
-    _VALID_URL = r'https?://(?:www\.)?maoritelevision\.com/shows/(?:[^/]+/)+(?P<id>[^/?&#]+)'
+    _VALID_URL = r'https?://(?:www\.)?maoriplus\.co\.nz/show/(?P<series>[\w-]+)/play/(?P<id>\d+)'
...
     def _real_extract(self, url):
-        display_id = self._match_id(url)
-        webpage = self._download_webpage(url, display_id)
-        brightcove_id = self._search_regex(
-            r'data-main-video-id=["\'](\d+)', webpage, 'brightcove id')
+        brightcove_id = self._match_id(url)
         return self.url_result(
             self.BRIGHTCOVE_URL_TEMPLATE % brightcove_id,
             'BrightcoveNew', brightcove_id)

Then:

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'-vF', u'https://www.maoriplus.co.nz/show/code-the-reunion/play/6349856999112']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Git HEAD: f66e450bf
[debug] Python 2.7.15 (CPython i686 32bit) - Linux-6.1.0-18-686-pae-i686-with-debian-12.5 - OpenSSL 1.1.1a  20 Nov 2018 - glibc 2.1.3
[debug] exe versions: ffmpeg 5.1.4-0, ffprobe 5.1.4-0
[debug] Proxy map: {}
[brightcove:new] 6349856999112: Downloading JSON metadata
[brightcove:new] 6349856999112: Downloading m3u8 information
[brightcove:new] 6349856999112: Downloading m3u8 information
[brightcove:new] 6349856999112: Downloading MPD manifest
[brightcove:new] 6349856999112: Downloading MPD manifest
[brightcove:new] 6349856999112: Downloading MPD manifest
[brightcove:new] 6349856999112: Downloading MPD manifest
[info] Available formats for 6349856999112:
format code                                  extension  resolution note
hls-audio-0-en__Main_-0                      mp4        audio only [en] 
hls-audio-0-en__Main_-1                      mp4        audio only [en] 
dash-a41239e7-db53-4f4d-924c-54da159ba123-0  m4a        audio only [en] DASH audio   64k , m4a_dash container, mp4a.40.2 (48000Hz)
dash-a41239e7-db53-4f4d-924c-54da159ba123-1  m4a        audio only [en] DASH audio   64k , m4a_dash container, mp4a.40.2 (48000Hz)
dash-a41239e7-db53-4f4d-924c-54da159ba123-2  m4a        audio only [en] DASH audio   64k , m4a_dash container, mp4a.40.2 (48000Hz)
dash-a41239e7-db53-4f4d-924c-54da159ba123-3  m4a        audio only [en] DASH audio   64k , m4a_dash container, mp4a.40.2 (48000Hz)
dash-fe47bcc9-7840-47af-9ea6-727c9bcb8fa4-0  mp4        640x360    DASH video  699k , mp4_dash container, avc1.42001e, video only
dash-fe47bcc9-7840-47af-9ea6-727c9bcb8fa4-1  mp4        640x360    DASH video  699k , mp4_dash container, avc1.42001e, video only
dash-fe47bcc9-7840-47af-9ea6-727c9bcb8fa4-2  mp4        640x360    DASH video  699k , mp4_dash container, avc1.42001e, video only
dash-fe47bcc9-7840-47af-9ea6-727c9bcb8fa4-3  mp4        640x360    DASH video  699k , mp4_dash container, avc1.42001e, video only
hls-839-0                                    mp4        640x360     839k , avc1.42001e, video only
hls-839-1                                    mp4        640x360     839k , avc1.42001e, video only
dash-274d6d64-f208-4760-a7d2-b1cf121016ab-0  mp4        960x540    DASH video 1199k , mp4_dash container, avc1.4d001f, video only
dash-274d6d64-f208-4760-a7d2-b1cf121016ab-1  mp4        960x540    DASH video 1199k , mp4_dash container, avc1.4d001f, video only
dash-274d6d64-f208-4760-a7d2-b1cf121016ab-2  mp4        960x540    DASH video 1199k , mp4_dash container, avc1.4d001f, video only
dash-274d6d64-f208-4760-a7d2-b1cf121016ab-3  mp4        960x540    DASH video 1199k , mp4_dash container, avc1.4d001f, video only
hls-1389-0                                   mp4        960x540    1389k , avc1.4d001f, video only
hls-1389-1                                   mp4        960x540    1389k , avc1.4d001f, video only
dash-01a7d27c-dd8c-48d6-a16f-5be2420bfa34-0  mp4        1280x720   DASH video 1995k , mp4_dash container, avc1.4d001f, video only
dash-01a7d27c-dd8c-48d6-a16f-5be2420bfa34-1  mp4        1280x720   DASH video 1995k , mp4_dash container, avc1.4d001f, video only
dash-01a7d27c-dd8c-48d6-a16f-5be2420bfa34-2  mp4        1280x720   DASH video 1995k , mp4_dash container, avc1.4d001f, video only
dash-01a7d27c-dd8c-48d6-a16f-5be2420bfa34-3  mp4        1280x720   DASH video 1995k , mp4_dash container, avc1.4d001f, video only
hls-2264-0                                   mp4        1280x720   2264k , avc1.4d001f, video only
hls-2264-1                                   mp4        1280x720   2264k , avc1.4d001f, video only
dash-7e4ac127-f037-4a0b-a1cd-0cebb729a65b-0  mp4        1920x1080  DASH video 3496k , mp4_dash container, avc1.640028, video only
dash-7e4ac127-f037-4a0b-a1cd-0cebb729a65b-1  mp4        1920x1080  DASH video 3496k , mp4_dash container, avc1.640028, video only
dash-7e4ac127-f037-4a0b-a1cd-0cebb729a65b-2  mp4        1920x1080  DASH video 3496k , mp4_dash container, avc1.640028, video only
dash-7e4ac127-f037-4a0b-a1cd-0cebb729a65b-3  mp4        1920x1080  DASH video 3496k , mp4_dash container, avc1.640028, video only
hls-3916-0                                   mp4        1920x1080  3916k , avc1.640028, video only
hls-3916-1                                   mp4        1920x1080  3916k , avc1.640028, video only
http-3687k-1080p-0                           mp4        1920x1080  3687k , MP4 container, H264, 1.35GiB
http-3687k-1080p-1                           mp4        1920x1080  3687k , MP4 container, H264, 1.35GiB (best)
$

The show page as seen by yt-dl (with no JS) is essentially empty, so it's lucky that the ID is handed to us in the URL. With JS enabled in the browser, the page actually fetches JSON containing the Brightcove data (along with some data for other video hosts), but for this show that data happens to be the same as what is hard-coded in the old extractor's BRIGHTCOVE_URL_TEMPLATE. Maybe there are other shows that will need this JSON to be loaded and parsed.
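
For anyone who wants to check the URL handling without a yt-dl checkout, here is a minimal standalone sketch of the same idea: match the new URL shape and take the Brightcove ID straight from it. The helper names `parse_show_url` and `brightcove_url` are made up for illustration, and the template value is only a placeholder showing the general shape of a Brightcove player URL; the real account and player IDs are whatever the existing extractor's BRIGHTCOVE_URL_TEMPLATE contains.

```python
import re

# Equivalent to the _VALID_URL proposed in the patch above.
VALID_URL = (
    r'https?://(?:www\.)?maoriplus\.co\.nz'
    r'/show/(?P<series>[\w-]+)/play/(?P<id>\d+)'
)

# PLACEHOLDER, not the real value: ACCOUNT_ID and PLAYER_ID stand in for
# whatever the existing extractor's BRIGHTCOVE_URL_TEMPLATE hard-codes.
BRIGHTCOVE_URL_TEMPLATE = (
    'https://players.brightcove.net/ACCOUNT_ID/PLAYER_ID_default'
    '/index.html?videoId=%s'
)


def parse_show_url(url):
    """Return (series, brightcove_id) for a play URL, or None if no match."""
    m = re.match(VALID_URL, url)
    if m is None:
        return None
    return m.group('series'), m.group('id')


def brightcove_url(url):
    """Build the player URL that would be handed on to BrightcoveNew."""
    parsed = parse_show_url(url)
    if parsed is None:
        return None
    return BRIGHTCOVE_URL_TEMPLATE % parsed[1]


if __name__ == '__main__':
    sample = 'https://www.maoriplus.co.nz/show/code-the-reunion/play/6349856999112'
    print(parse_show_url(sample))
    print(brightcove_url(sample))
```

Note that this covers only the case where the ID is present in the URL; a show that needs the page's JSON to be loaded would return None here and need separate handling.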

There is also some schema of seasons/episodes/playlists, etc, that someone who cared could implement.

@dirkf dirkf reopened this Apr 17, 2024
@dirkf dirkf added broken-IE problem with existing site extraction patch-available labels Apr 17, 2024