AustLII translator updates #2882

Merged on Aug 3, 2023 (7 commits)


jpwarren
Contributor

A few updates to the AustLII translator to more closely align with the Australian Guide to Legal Citation, 4th Edition.

  • Major jurisdictions are now supported and abbreviated.
  • Support for Report Series abbreviations, with some unauthorised report abbreviations supplied.
  • Statute names (called Acts here) more properly parsed, and the year put into dateEnacted where it should be.
  • Act names are Title Cased, with support for Act Name (Parenthetical Short Name) Acts.
  • Additional test cases and updated tests to match the added/changed features.
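As a rough illustration of the Act-name handling described above, the sketch below separates a trailing four-digit year from an Act title so the year could be stored in a field such as dateEnacted. The function name `splitActTitle` and the regular expression are assumptions made for illustration; they are not taken from the translator itself.

```javascript
// Hypothetical helper, not the translator's actual code: split a title such as
// "Privacy Act 1988" into the Act name and its four-digit year.
function splitActTitle(title) {
	const trimmed = title.trim();
	const match = /^(.*\S)\s+(\d{4})$/.exec(trimmed);
	if (!match) {
		// No trailing year found: return the title unchanged.
		return { name: trimmed, year: null };
	}
	return { name: match[1], year: match[2] };
}

splitActTitle('Privacy Act 1988'); // { name: 'Privacy Act', year: '1988' }
```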

Capitalise titles of Acts, including parenthetical ones.
Using lowercase 'code' as key.
Added jurisdiction abbreviations.
Added some informal report series abbreviations.
Added more tests to demonstrate functionality works.
Member

@AbeJellinek AbeJellinek left a comment

Thanks! Some comments.

Comment on lines 111 to 115
var courtMap = new Map();
courtMap.set('Federal Court of Australia', 'FCA');
courtMap.set('High Court of Australia', 'HCA');
courtMap.set('Family Court of Australia', 'FamCA');
courtMap.set('Australian Information Commissioner', 'AICmr');
Member

Keep this (and the jMap below) outside the function so we don't need to rebuild every time?
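A minimal sketch of what this suggestion might look like, assuming the map is declared at the top level of the translator file so it is built once at load time. The four entries are the ones quoted above; the body of `abbrevCourt` is an assumption modelled on the `abbrevJurisdiction` function quoted later in this thread.

```javascript
// Built once when the file is loaded, rather than on every function call.
const courtMap = new Map([
	['Federal Court of Australia', 'FCA'],
	['High Court of Australia', 'HCA'],
	['Family Court of Australia', 'FamCA'],
	['Australian Information Commissioner', 'AICmr'],
]);

// Returns the abbreviation, or the full name unchanged when there is no entry.
function abbrevCourt(fullname) {
	const abbrev = courtMap.get(fullname);
	return abbrev === undefined ? fullname : abbrev;
}
```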

newString += ZU.capitalizeTitle(words[i].toLowerCase(), true);
}
}
Zotero.debug(newString);
Member

No need to debug here

@zoe-translates
Collaborator

Hi, I wonder if there's any followup. This PR looks solid and it passes the tests.

In addition to the suggested changes, there are a few things we need to update as a result of changes made to the websites.

  • In the test cases, the number-suffixed "www[n].austlii.edu.au" domains are no longer responsive. The number can simply be removed ("www7." -> "www."), and the test cases should be updated accordingly.
  • Case law pages usually contain download links, in RTF and PDF/A. I think we should consider saving the PDF/A file as an attachment, in addition to or in preference to the HTML snapshot.
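The hostname change in the first point could be applied to test URLs with a small rewrite like the one below. This is only a sketch: `stripMirrorNumber` is a hypothetical helper name, and the URL path in the usage line is made up for illustration.

```javascript
// Hypothetical helper: rewrite "www7.austlii.edu.au"-style hosts to
// "www.austlii.edu.au". URLs without a numeric suffix are left unchanged.
function stripMirrorNumber(url) {
	return url.replace(/^(https?:\/\/)www\d+\.(austlii\.edu\.au)/i, '$1www.$2');
}

stripMirrorNumber('http://www7.austlii.edu.au/example/path.html');
// 'http://www.austlii.edu.au/example/path.html'
```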

@jpwarren if you're still on this and if you can make the changes, this can be accepted.

Updated test URLs now that AustLII (mostly) doesn't redirect site to a www\d hostname.
@jpwarren
Contributor Author

jpwarren commented Aug 3, 2023

Sorry, put it down for a moment and then… months went by somehow.

Made the requested changes.

The GitHub Action for CI appears to have failed partway through because of what looks like a misconfiguration of which Chrome driver it looks for, not anything to do with this PR specifically.

Comment on lines 142 to 149
function abbrevJurisdiction(fullname) {

var abbrev = jMap.get(fullname);
if (abbrev === undefined) {
abbrev = fullname;
}
return abbrev;
}
Collaborator

Hi, with the suggested changes now in place, we can observe that the functions abbrevJurisdiction and abbrevCourt are substantially similar. Each function is called only once. Therefore, I think we can further simplify them:

  • Instead of using Map, we can just use object literals
var jMap = {
	"Commonwealth": "Cth",
	/* ... */
};

because we're only ever going to use it as a static mapping. Here the advantage of Map() is not apparent, and an object literal is less verbose.

  • Consequently, the abbrevXXX functions can be eliminated. And to get an abbreviation, we can use, say
jMap[fullname] || fullname;

(this is not theoretically bulletproof but it should suffice).
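To make the "not theoretically bulletproof" caveat concrete, here is a small sketch. Only the "Commonwealth" entry comes from the comment above; the function names `lookupLoose` and `lookupStrict` are hypothetical. A plain object inherits keys such as "constructor" from Object.prototype, so the short form can return something other than a string for those inputs. Checking own properties avoids that.

```javascript
const jMap = {
	"Commonwealth": "Cth",
};

// The short form suggested in the review comment.
function lookupLoose(fullname) {
	return jMap[fullname] || fullname;
}

// A stricter variant (an assumption, not part of the PR) that only
// consults the object's own keys and ignores inherited ones.
function lookupStrict(fullname) {
	return Object.prototype.hasOwnProperty.call(jMap, fullname)
		? jMap[fullname]
		: fullname;
}

lookupLoose('Commonwealth'); // 'Cth'
typeof lookupLoose('constructor'); // 'function' (inherited Object constructor)
lookupStrict('constructor'); // 'constructor'
```

For jurisdiction names scraped from AustLII pages, an input like "constructor" is very unlikely, which is presumably why the short form should suffice in practice.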

Contributor Author

Okay.

I've kept them as separate maps/dictionaries to avoid key collisions, rather than having a single jMap object.

court and jurisdictions abbreviations.
@zoe-translates
Collaborator

Thank you!

@zoe-translates
Collaborator

zoe-translates commented Aug 3, 2023

Hello @jpwarren!

You don't have to merge the upstream main branch into your PR branch. Please rebase your PR branch instead, so that the PR contains only your changes.

@jpwarren
Contributor Author

jpwarren commented Aug 3, 2023

This is what I get for mis-clicking in the website.

I am unsure of how to do the rebase safely. Advice would be appreciated so I don't make things worse as I fumble about trying to figure it out.

@zoe-translates
Collaborator

OK, don't worry, we don't have to do it now. I'd be afraid that my instructions might make things worse, without knowing your configuration of the repo.

@dstillman dstillman merged commit b68ed12 into zotero:master Aug 3, 2023
0 of 2 checks passed
adam3smith pushed a commit to adam3smith/translators that referenced this pull request Aug 19, 2023
@LawData-user

Hi
Thanks for the work on this translator. I hope this is the appropriate place for this feedback.
Unfortunately I am not able to code (except by asking ChatGPT to do it); however, there are a few small functional changes I would like to suggest, which would really improve this translator:

  1. In the 'Extra' field it adds an entry 'Code: [jurisdiction abbreviation]'. I don't know what this is for, but it results in citations inserted in Word that don't comply with the Australian citation standard (AGLC), because it adds this information in front of the medium neutral court name, e.g. '[2023] Cth HCA 30', which should read '[2023] HCA 30'.
  2. If I also add the citation for the authorised version of the case (for the HCA, the Commonwealth Law Reports (CLR)), the citation also becomes messy, giving '(2023) 220 CLR HCA 29' rather than '(2023) 220 CLR 110; [2023] HCA 29'.
  3. The translator only works on AustLII and not Classic AustLII. It should be an easy fix, as the classic version just uses classic.austlii... instead of www.austlii...; otherwise, I think the site is identical. Lots of people use the classic interface as it makes better use of screen space, even if it isn't so pretty.
  4. It would be great if it could add the PDF or RTF file as an attachment, as many people read the case in Zotero and highlight it (and even add inked comments) before using the tool to extract the highlighted passages (if it is a PDF) to a Note field. If it is an RTF, at least it is downloaded and can be printed from Word to PDF and then added. I am unaware of an alternative workflow from a webpage that is convenient.

Thank you again for your work.
Regards
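For the Classic AustLII point (item 3), one possible approach is a URL test that accepts both hostnames. This is only a sketch under assumptions: the pattern and the name `isAustliiUrl` are hypothetical, this is not the translator's actual target pattern, and it assumes the classic site lives at classic.austlii.edu.au with otherwise identical pages, as the comment suggests.

```javascript
// Hypothetical host test: accepts "www", number-suffixed "www", and "classic"
// hostnames under austlii.edu.au.
const austliiHost = /^https?:\/\/(?:www\d*|classic)\.austlii\.edu\.au\//i;

function isAustliiUrl(url) {
	return austliiHost.test(url);
}

isAustliiUrl('https://classic.austlii.edu.au/example'); // true
isAustliiUrl('https://example.org/austlii.edu.au/'); // false
```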
