Added translator for The Saturday Paper #3099

Open
wants to merge 3 commits into
base: master

jpwarren
Contributor

@jpwarren jpwarren commented Aug 5, 2023

This PR adds a translator for the Australian newspaper The Saturday Paper.

Added article section detection.
Added edition detection.
Added snapshot of article.
Adjusted test cases.
Comment on lines +83 to +89
item.publicationTitle = "The Saturday Paper";
item.title = attr(doc, 'meta[name="dcterms.title"]', 'content');
item.date = ZU.strToISO(attr(doc, 'meta[name="dcterms.date"]', 'content'));
item.abstractNote = attr(doc, 'meta[name="dcterms.description"]', 'content');
item.creators.push(ZU.cleanAuthor(attr(doc, 'meta[name="dcterms.creator"]', 'content'), "author"));
item.language = "en-AU";
item.url = url;
Collaborator

Here, I think it suffices to use the 'Embedded Metadata' (EM) translator for most of these fields, for example via the template code under Scaffold -> Tools menu -> Templates. The EM translator will take care of these <meta> fields, and any improvements to it will be picked up automatically by the translators that use it.
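
For illustration, a minimal sketch of what delegating to the EM translator might look like, modelled on the older callback-style Scaffold template. The translator ID is the one I believe the template uses for Embedded Metadata, so check it against the template itself. The fields set in the `itemDone` handler are simply the hard-coded ones from the diff above. This is not the code in the PR, and it assumes the `Zotero` global that is available inside a translator.

```javascript
// Sketch only: assumes the Zotero translator sandbox provides `Zotero`.
function scrape(doc, url) {
	var translator = Zotero.loadTranslator('web');
	// Embedded Metadata translator (ID assumed from the Scaffold template)
	translator.setTranslator('951c027d-74ac-47d4-a107-9c3069ab7b48');
	translator.setDocument(doc);

	translator.setHandler('itemDone', function (obj, item) {
		// Only the values EM cannot know are set by hand here.
		item.publicationTitle = "The Saturday Paper";
		item.language = "en-AU";
		item.complete();
	});

	translator.getTranslatorObject(function (trans) {
		// Set the item type before EM processes the page.
		trans.itemType = "newspaperArticle";
		trans.doWeb(doc, url);
	});
}
```

With this shape, title, date, abstract and creators would come from whatever EM extracts from the <meta> tags, and the handler only adjusts what is specific to this site.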

let editionRaw = doc.querySelector('p.issue-nav__current').textContent;
var em = editionRaw.match(/.*No\.\s(\d+)/);
if (em) {
item.edition = em[1];
Collaborator

@zoe-translates zoe-translates Aug 10, 2023

Are we sure "edition" is the appropriate field? I think that field refers to editions of a book or, in the context of newspapers, to the different "prints" or versions of a particular day's issue that result from intra-day updates (edits).
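
Whichever field ends up holding the number, the extraction itself could be kept separate from the field assignment. A small sketch, assuming the nav text contains something like "No. 460" (the sample strings are made up, and `parseIssueNumber` is a hypothetical helper, not part of the PR):

```javascript
// Hypothetical helper: returns the digits following "No." or null.
function parseIssueNumber(text) {
	// Guard against a missing element or empty text.
	if (!text) return null;
	const m = text.match(/No\.\s*(\d+)/);
	return m ? m[1] : null;
}

// Example with a made-up string in the assumed format:
console.log(parseIssueNumber("August 5 – 11, 2023 | No. 460")); // "460"
console.log(parseIssueNumber("no issue number here"));          // null
```

Returning null when nothing matches also covers the case where `doc.querySelector('p.issue-nav__current')` finds no element, provided the caller passes the text through optional chaining or a similar guard.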

2 participants