Datetime. Ranges: Unknowns, Opens, and Unspecifieds. #540

Closed
JohnLukeBentley opened this issue Mar 4, 2017 · 43 comments

@JohnLukeBentley

We might want to review ranges output.

See Gist-Biblatex-Ranges-Basic.tex, with the bib included as filecontents.

That spits out a 2 page pdf with three range sections: Unknown, Open, and Unspecified.

Note particular options that I've set as follows:

alldates=ymd, 
labeldate=ymd, 
mergedate=false
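
For reference, a minimal preamble sketch with these options set at load time. The style=authoryear choice is an assumption taken from the shortened MWE further down this thread, and ranges.bib is only a placeholder file name.

```latex
% Sketch only: load biblatex with the date options listed above.
\usepackage[
  style=authoryear,   % assumed; matches the MWE later in the thread
  alldates=ymd,       % print all dates in year-month-day form
  labeldate=ymd,      % print the label date in year-month-day form too
  mergedate=false     % authoryear option: keep date and labeldate separate
]{biblatex}
\addbibresource{ranges.bib} % placeholder file name
```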

When the date(time) is a range we probably want to output it as such:

  • For in-text citations. For example for unknown/2006 I get (Simpson n.d.[a]) when perhaps we should have (Simpson unknown/2006a), or (Simpson unknown/2006) - using 'unknown' or 'open' as appropriate.

  • For references. For example for "open/2004-01-01" I get

    Simpson, Lisa (n.d.[a]). Cool Book 0097. /2004-01-01.
    

    ... when we should perhaps get ...

    Simpson, Lisa (open/2004-01-01). Cool Book 0097. open/2004-01-01.
    

    ... or, if we changed the options such that labeldate=year ...

    Simpson, Lisa (open/2004). Cool Book 0097. open/2004-01-01.
    

All this is before considering mergedate issues (mergedate=false above).

@moewew
Collaborator

moewew commented Mar 5, 2017

One thing to notice is that again the output of \printlabeldate and the citations are different, this is probably unwanted here as well. See also #520 and ultimately #148. Regardless of how we stand towards unknown or open ranges, it certainly looks odd for the third case.

A conceptual question is whether unknown should behave differently from giving no date at all. Should both get 'n.d.'?

I have no idea about the real problem though, dateunknown and friends don't seem to be used anywhere(?) and are not documented(?).

What I noticed, though, is that some tests are set up in a way that they yield false if the (start) year is undefined and hence print nothing at all. This could be problematic for unknown/2006 range types, because we never get to printing the endyear.
Furthermore, labeldate would need to be more careful about assigning nodate with ranges, it should consider both start- and end-dates and act accordingly.

Unfortunately, I still haven't got the hang of this new date handling, so that's probably all I can say.

@JohnLukeBentley
Author

Thanks! That helps me better present the issues ....

dateunknown and friends don't seem to be used anywhere(?) and are not documented(?).

In biblatex.pdf, page 36, under "2.3.8 Date and Time Specifications" there's mention of support for ranges. In particular "Table 3: Date specifications" (p37) lists many EDTF date input and output formats, and most of the examples entail a range of some sort. There's also "Table 4: EDTF 5.2.2 Unspecified Date Parsing" (p38).

From biblatex.pdf, p36 ...

Date fields such as the default data model dates date, origdate, eventdate, and urldate adhere to EDTF (Extended Date/Time Format) specification levels 0 and 1. Also supported are the open-ended range specifications in section 4.5 of the current working draft of ISO8601-2

The spec upon which that is based: Extended Date/Time Format (EDTF) 1.0 ... which defines at least the input format for date ranges in biblatex.

I have no idea about the real problem though,

Well you've done a good job of enumerating the problems which I'll repeat below. But if you mean "is this a real world problem?" there's at least one real world example I have in a (not yet shown) test file:

Lorem (da Vinci 1487/1490).
References
Da Vinci, Leonardo (1487/1490). Codex Trivulzianus.

Presumably unknowns, opens, and unspecifieds are rare in the real world. But it seems right to go ahead and support them given the road taken to support ranges where both dates are provided.

A conceptual question is whether unknown should behave differently from giving no date at all. Should both get 'n.d.'?

Yes I think conceptually "unknown" and "open" are different from "no date supplied" (which I take "n.d." to stand for). "no date supplied" would indicate that the field hasn't been considered or processed. "unknown" and "open" are assertions about one of the dates, in the date range, of a work.

So all the problems which I think my Gist reveals (most of which you've identified) are:

  1. That we may want to make the output format to reflect the EDTF input format. As I mentioned, for something like

    Lorem (Simpson open/2004-01-01)
    Simpson, Lisa (open/2004-01-01). Cool Book 0097. open/2004-01-01.

    or, if we changed the options such that labeldate=year ...

    Lorem (Simpson open/2004)
    Simpson, Lisa (open/2004). Cool Book 0097. open/2004-01-01.

  2. One thing to notice is that again the output of \printlabeldate [in a reference] and the citations are different, this is probably unwanted here as well.

    Yes, for example, for 2004-06-01/unknown we get an undesirable

    (Simpson 2004a/)
    Simpson, Lisa (2004a). Cool Book 0093. 2004-06-01/.

  3. [From Datetime. Missing frills and datetime output, authoryear style, and mergedate. Bugs. #520] The problem is that even currently we are breaking backwards compatibility for dates with month precision (only) and the default mergedate=compact - the month is not shown.

    Just to prevent us from being confused:

    • 1999-01-uu means "Some day in January 1999";
    • 1999-01 means "January 1999".
    • 1999-uu means "Some month in 1999".

    ... it certainly looks odd for the third case.

    If by "third case" you mean the third case under Unspecifieds, that is, an input of 1999-uu, I think this is quite a bizarre case: an artifact of EDTF wanting to be complete about unspecifieds.

    @plk has cleverly recognized that EDTF unspecifieds can be conceptually mapped to date ranges (perhaps with some conceptual shoehorning). For example, "Some month in 1999", 1999-uu, can be expressed as a date range 1999-01/1999-12. But is "Published in some month in 1999" the same as "Published in 1999"? It's hard for me to say. At the moment @plk has decided the answer is yes. So EDTF 1999-uu gets output as

    (Simpson 1999a)
    Simpson, Lisa (1999a). Cool Book 0130. 1999-01/1999-12.

    But an alternative might be to be "range strict" (when labeldate=ymd) ...

    (Simpson 1999-01/1999-12a)
    Simpson, Lisa (1999-01/1999-12a). Cool Book 0130. 1999-01/1999-12.

  4. What I noticed, though, is that some tests are set up in a way that they yield false if the (start) year is undefined and hence print nothing at all. This could be problematic for unknown/2006 range types, because we never get to printing the endyear.

    Indeed unknown/2006 wrongly produces ....

    (Simpson n.d.[a])
    Simpson, Lisa (n.d.[a]). Cool Book 0092. /2006.

    ... when it might be better to do ...

    (Simpson unknown/2006)
    Simpson, Lisa (unknown/2006). Cool Book 0092. unknown/2006.

By pointing to #148 can I take it broadly: that labeldate=year is a complicating issue? That is, to throw in together with mergedate?

@moewew
Collaborator

moewew commented Mar 5, 2017

Ah, sorry, when I said dateunknown I was specifically referring to the toggle of the same name whose value is set in the .bbl, and not the concept of unknown dates and date ranges in general. It is an internal marker that I first assumed to have something to do with the output we get; on second glance, I'm not too sure about that any more. It probably is another symptom and not the cause.

When I said I had no idea about the real problem, I meant that I had (and indeed still have) no clue on how to go about solving the issue. It seems evident that the current output is sub-par and given that we want to support all date specifications that should be remedied.

OK, interesting point about the difference between 'unknown' and 'nodate'. I somehow assumed, without looking it up, that date = {unknown} on its own would also be valid, and that would in my book more or less translate to 'nodate', even though I could easily be persuaded to see that it is not in fact the same as 'nodate'.

I'm not sure about your point 3. I was saying in #520 that date = {2017-02} used to give '(February 2017)' with mergedate=compact, but now it only gives '(2017)' in the labeldate part of the bibliography entry.
When I was (clumsily) referring to the 'third case' I was trying to say that while I was not too sure about expected output of the first two sections in your example (that is to say, I could easily be persuaded to accept different things there), the third section looked unfortunate to me since the citation asserted start and end dates (of varying precision), yet the labeldate in the bibliography did not keep up with that by dropping the end date. (Again I don't want to get into details about how to handle 199u, I'm absolutely fine with how it is interpreted at the moment and don't have any feelings either way, but the output should be consistent).

#148 seems to me to have led to applying \printlabeldate with some mergedate settings, while others seem to use a copy of the .cbx's date printing. I can see that a separation between the two was desirable or even necessary, but it becomes a major headache for me now since we have to align the output of two different date printing commands that in many situations should give the same output anyway.

I am still struggling with how long authoryear.cbx's cite:labelyear+extrayear has become. And given that \printfield{labelyear} appears a few times in authoryear.bbx this is what made my authoryear.bbx in #520 so long and redundant due to heavy code duplication.

My preferred solution would have been that all macros (citation commands and mergedate in bibliography) use \printlabeldateextra instead of essentially rebuilding it in cite:labelyear+extrayear. But there was some argument against that, which I do not recall now. And the considerations in #148 could also indicate a few issues for that solution.

Sorry for the lengthy post. I had hoped we could keep the issues nice and short, but turns out I have a lot more opinions about this than I thought.

@JohnLukeBentley
Author

That corrects some of my misinterpretations of your prior comments, thanks. In lieu of my going over any further detail that your corrections might reveal, perhaps I should stick to sharing some general intuitions.

I'm assuming you are still wanting (I use the word with trepidation) to hold the baton and tackle the code here.

My head is more in the input/output than in the code base, given that I'm new to the code base and reluctant to "program" in it (it feels a little awkward as a programming language, like XSLT). But in the details you mention I recognize the general issue: it sounds like the code base (which might just mean the code in authoryear.*) could do with a refactor, and it sounds like it's not clear whether there are impediments to that refactor (e.g. in #148).

Refactor or not, the outstanding issues seem to entail taking care of:

  • Ranges (Unknowns, Opens, Unspecifieds) ... the issue of this thread.
  • Mergedates. Datetime. Missing frills and datetime output, authoryear style, and mergedate. Bugs. #520.
  • Datetimes with precision less than a year. E.g. 1999-06
  • Differences due to setting alldates and labeldate. E.g.
    • alldates=ymd; with labeldate=year; versus
    • alldates=ymd; with labeldate=ymd
  • (And also) handling the use of the pre-EDTF month field (see biblatex.pdf, p38, "2.3.9 Months and Journal Issues") in a way that doesn't conflict with an EDTF date field.

All these issues seem to intersect. Although, in coding terms it might be best to tackle them one at a time: the other issues might fall out as solved on that approach.

I might also suggest you keep in mind the possibility of jettisoning or deprecating features if this makes things simple, and subject to @plk's reflections. As a wild, if unlikely, example: perhaps the mergedate feature could be jettisoned or the options reduced.

@moewew
Collaborator

moewew commented Sep 22, 2017

@plk I have had another look at this and most things look good now. The only problem we have left is with open/unknown start dates, here the dates come out as 'n.d.' which is not so good. It seems that \DeclareLabeldate would need to accept an empty year as OK, but should reject an undefined year field. Not sure what Biber has to say about this. I'm also not sure what extrayear should do in this case.

Seems as though EDTF is going to be superseded by ISO 8601, see https://www.loc.gov/standards/datetime/. As a result 'several syntactic changes were necessary', I'm not sure what that means for us.

@plk
Owner

plk commented Sep 23, 2017

Oh great, they chose to use % as a replacement for ?~ ... worst possible choice for us!

@moewew
Collaborator

moewew commented Sep 24, 2017

Of course we can choose to ignore this. But the documentation mentions EDTF quite often, it would be rubbish if the standard we chose to implement disappears or becomes abandoned. Problem is that ISO 8601 is in discussion stage right now and it will take some time until it is finalised. So following it now is risky.

We (unlike the EDTF/ISO people) need to worry about backwards compatibility. Although we can possibly get away with changing things in the less often used date features.

@plk
Owner

plk commented Sep 24, 2017

I might submit a comment about the choice of '%' as it's rather silly since TeX is a widely used academic format for publications. I agree there is not much to do yet until 8601 is ratified but then, I think we have to switch to the latest version of it, yes.

@plk
Owner

plk commented Sep 24, 2017

I looked at the 8601 part 2 spec and since we only cover level 1 currently, the changes are not so much. The main problem is the choice of % instead of ?~ ...

@moewew
Collaborator

moewew commented Sep 25, 2017

But is % really that much of a problem? Could the field not be treated as verbose (like url)? The date field is special anyway since the input format is fixed by EDTF/ISO.

Do you have any thoughts about the open ranges, i.e. date = {unknown/2006}?

@plk
Owner

plk commented Sep 25, 2017

I don't think that's necessary as '%' wouldn't even reach the .bbl - that gets processed by biber - it's just that it would occur in .bib files.

@moewew
Collaborator

moewew commented Sep 25, 2017

Yes, but that is not too bad, as it can appear in verbose fields like URL as well and does not start a comment there either. So it is something people know already. It is still not a great choice, but it should not be a big problem.

@JohnLukeBentley
Author

JohnLukeBentley commented Oct 13, 2017

On the incorporation of EDTF into a draft ISO 8601 ....

@plk wrote to @moewew:

I agree there is not much to do yet until 8601 is ratified but then, I think we have to switch to the latest version of it, yes.

Yes that seems right. So, in the meantime, stick with EDTF (as at https://www.loc.gov/standards/datetime/).

On the problem of the proposed uncertain and approximate symbol: % @moewew's last comment seems right (of course, I'm not the one having to touch the biber parser. @plk you might have a keener sense of the hassle this will cause as a parser author).

From a user perspective % appears considerably less semantically intuitive than ?~. On the other hand it avoids having to remember that ?~ is legal, but ~? is not. Of course, all that is an issue over at ISO.

Somewhat closer to the topic of the current thread, and on a quick skim of ISO_DIS 8601 Part 2 (pdf), section 4.4 Enhanced time interval it looks like there are new conventions for Unknown and Open date ranges:

  • Unknown is now represented by a blank, instead of "unknown" or "*" [Edit: this is an error, see next post].
  • Open is now represented by a double dot (..), instead of "open" or "blank" [Edit: this is an error, see next post].

In particular the change in meaning of blank: a potential backwards compatibility nightmare?

I wonder if it might become necessary to specify in your .bib file the datetime conventions you are using e.g. "Edtf-Locgov" V "Edtf-ISO8601Part2".

Section 4.3 also introduces a change to unspecifieds, essentially using "X" in place of "u". That seems like a superior convention and one to easily adapt to. However, Section "4.10 Decade" introduces a way of expressing a decade with three digits, e.g. "the 1960s" can be represented by "196".

All that might warrant holding off on addressing the current bug (of this thread). However, if anyone wanted to plough on with it (working to the EDTF spec) then I only have to offer that @moewew you appear to be almost correct that ...

The only problem we have left is with open/unknown start dates ...

... unknown end dates also appear to be a problem. I get (Simpson 2004a-06-01/) but expect (under EDTF-locgov) (Simpson 2004a-06-01/unknown).

But unspecifieds (using the "u" symbol) look good to me against Gist-Biblatex-Ranges-Basic.tex.

Maybe all this is easier to tackle now that mergedate issues (#520) are largely solved (??).

@JohnLukeBentley
Author

JohnLukeBentley commented Oct 13, 2017

Sorry. In the prior comment I represented Unknown and Open intervals in EDTF under loc.gov (http://www.loc.gov/standards/datetime/pre-submission.html) as entailing the following:

  • Unknown: expressed by "unknown" or "*";
  • Open: expressed by "open" or blank;

Looking again at the spec it appears to be simpler:

  • Unknown: expressed by "unknown" only;
  • Open: expressed by "open" only;

http://www.loc.gov/standards/datetime/pre-submission.html#extendedintervall1

For level 1:

  • 'unknown' may be used for the start or end date when it is unknown.
  • 'open' may be used when no end date is specified, either because there is none or for any other reason.

http://www.loc.gov/standards/datetime/pre-submission.html#bnf

(* *** L1Interval *** *)

L1interval = L1Start "/" L1End

  L1Start = ( dateOrSeason UASymbol?) | "unknown"
  L1End   = L1Start | "open" 
  ...
UASymbol = ("?" | "~" | "?~")

But perhaps I'm forgetting some backwards compatibility reason for allowing "*" and blank?

Edit: Moreover the spec doesn't seem to permit Open start dates. I might be forgetting that we decided to ignore that strange omission. Open is permitted at either the start or end dates in the draft ISO 8601 part 2. So both for future ISO reasons and because it makes sense I think we should continue to permit Open ("open") at the start or end date.
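
To make the grammar concrete, here is a small sketch of .bib entries in the same filecontents form as the MWE used in this thread. The entry keys and titles are made up for illustration. The first three date values can be derived from the L1interval production quoted above; the fourth uses "open" as a start date, which the quoted BNF does not derive but which, as noted in the edit above, the draft ISO 8601 part 2 permits.

```latex
% Sketch only: hypothetical entries illustrating the level 1 interval forms.
\begin{filecontents}{\jobname.bib}
@book{example_unknown_start,
  author = {Simpson, Lisa},
  title  = {Example A},
  date   = {unknown/2006}, % L1Start = "unknown"
}

@book{example_unknown_end,
  author = {Simpson, Lisa},
  title  = {Example B},
  date   = {2004-06-01/unknown}, % L1End = L1Start = "unknown"
}

@book{example_open_end,
  author = {Simpson, Lisa},
  title  = {Example C},
  date   = {2004-01-01/open}, % L1End = "open"
}

@book{example_open_start,
  author = {Simpson, Lisa},
  title  = {Example D},
  date   = {open/2004-01-01}, % not derivable from the quoted BNF
}
\end{filecontents}
```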

@JohnLukeBentley
Author

So ... as you were. Opens and Unknowns are working in Biblatex.

  • Open can be input and output with "open" at either end of the date range, as makes sense and is supported by the draft ISO 8601 part 2. Blank is broken ... but that's ignored by EDTF.
  • Unknown can be input and output with "unknown" at either end of the date range, as EDTF requires. "*" is broken ... but that's ignored by EDTF.

That leaves the Unspecifieds. The current results look like ...

| Description | Detailed description | Input | In-text citation result |
|---|---|---|---|
| Within a decade. | Some unspecified year in the 1990s. | 199u | (Simpson ca. 1990) |
| Within a century. | Some unspecified year in the 1900s. | 19uu | (Simpson ca. 1900) |
| Within a year (to month level). | Some month in 1999. | 1999-uu | (Simpson ca. 1999a-01) |
| Within a month. | Some day in January 1999. | 1999-01-uu | (Simpson ca. 1999b-01-01) |
| Within a year (to day level). | Some day in 1999. | 1999-uu-uu | (Simpson ca. 1999c-01-01) |

... the current results don't really convey an "unspecified" date.

@moewew
Copy link
Collaborator

moewew commented Oct 24, 2017

The output I get with the development version is slightly different.
[Screenshot: rangebasic-3]
This seems to be more appropriate, but maybe it needs a 'ca.'? At the moment Biber sets dateunspecified that is never used, so we could use that.

The problem with open/unknown dates is what I described above. We would need to treat empty and undefined year fields differently.

@plk Maybe we need not only #1dateunknown, but also #1dateopen. These date combinations are also a nightmare for extrayear. extrayear suffers from a problem similar to mergedate's: it was devised when the labeldate was only a year and can give unintuitive output with full labeldates. I suppose addressing this would need major restructuring and I'm not sure if it is worth it.

@plk
Owner

plk commented Oct 24, 2017

Isn't dateopen inferable from having a defined but empty end date part?

@moewew
Collaborator

moewew commented Oct 24, 2017

Probably, I didn't check. But I don't think this will work for labeldate. 'Open' dates trigger nodate there.

@JohnLukeBentley
Author

On unspecifieds ... as you were and forgive my wrongly introducing a confusion. I've been testing zotero-better-bibtex and some errors crept into my *.bib file. That's corrected and I now have the same results as yourself (moewew) and those results are as they should be.

On open/unknown date ranges moewew wrote ...

The problem with open/unknown dates is what I described above ... [above:] The only problem we have left is with open/unknown start dates, here the dates come out as 'n.d.' which is not so good.

... but could you catch me up here? ....

Open and unknown start dates work fine if the date is specified with a "open" or "unknown" (right?). The problem you are pointing to is when blank or * is used (respectively). Given that EDTF doesn't support blank or * is this a problem just because you want to preserve backward compatibility? If there's no backward compatibility requirement the problem disappears.

I suppose addressing this would need major restructuring and I'm not sure if it is worth it.

If there is a backwards compatibility issue, rather than "requirement", a way to address that issue could be to formally deprecate blank and * support and issue relevant parsing error messages:

  • Open date ranges are no longer supported with blanks. Use the EDTF conformant 'open' string instead.
  • Unknown date ranges are no longer supported with a "*". Use the EDTF conformant 'unknown' string instead.

@moewew
Collaborator

moewew commented Oct 28, 2017

I'm mainly talking about output and not input here, and in particular about the output of labeldate with labeldate=year.

[Screenshot: ranfettestts]

While the output of the date with \printdate (at the end of the entry) looks OK to me, \printlabeldateextra goes for nodate (without any '/' to signify a start or end date) too quickly for my taste. I think this is because in those cases the year field is empty in the .bbl.

I fear this would require changes not only in biblatex, but also in Biber. The extrayear calculation would need to be aware of the possibility that a start year is empty, but the end year is given. And while we are at it it would be interesting to at least turn extrayear optionally into extradate aware of the entire date and not only the year.

@plk
Owner

plk commented Oct 28, 2017

This wouldn't necessarily be too difficult - currently extrayear is tracked by a combination of a name hash and labelyear. It wouldn't be so hard to change the tracking to use every available labeldate part - what did you have in mind? This could even go into 3.8 as it would be an enhancement rather than changing the current behaviour if there was another macro like \DeclareExtradateTemplate which allowed users to specify which information was used to track extrayear, defaulting to just labelyear?

@moewew
Collaborator

moewew commented Oct 28, 2017

Oh, interesting.

In that case an optional way to make extrayear also track month and day and maybe even time (if available) would be a nice touch. It makes sense to also take month and day into account with labeldate=ymd, for example. If you have an idea for \DeclareExtradateTemplate already, that would be great.

I don't quite understand how extrayear honours enddates at the moment (I think it does).

So once that is settled, we probably need to tweak the date output routines to add the extrayear in the right place.

@plk
Owner

plk commented Oct 28, 2017

I think the new macro would simply be an ordered list of labeldate parts which are used to track extrayear (which we’d rename extradate). It would output something in the .bcf which biber would use in the tracking routines. I can probably have this working today.

@moewew
Collaborator

moewew commented Oct 28, 2017

Great! For backwards compatibility reasons we probably need to keep extrayear alive.

plk added a commit that referenced this issue Oct 28, 2017
@plk
Owner

plk commented Oct 28, 2017

This is done in 3.8/2.8 DEV. extrayear is now extradate with backwards compat and a deprecation warning if that field is used.

@moewew is correct that extradate obeys endyear by default because the default definition looks at labelyear first and this is composed of year and endyear, even if endyear is empty due to an explicit 2000/ etc. The nice thing about the new solution is that you can choose to not do that now by tracking using year explicitly.

See \DeclareExtradate in the docs. The default definition is in biblatex.def as usual and all default styles have been changed to use extradate.
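
As a rough illustration, here is a preamble sketch assuming the \scope/\field syntax described for \DeclareExtradate in the biblatex 3.8 documentation; please check it against biblatex.def rather than trusting it verbatim. The first declaration is what the default is understood to be (labelyear first, falling back to year). The second is a variant that additionally tracks the month; the use of labelmonth there is an assumption based on the documentation's description of labeldate parts. Each declaration replaces the previous one, so a real document would keep only one of them.

```latex
% Sketch only. Understood default: track extradate on labelyear,
% falling back to year when labelyear is not available.
\DeclareExtradate{%
  \scope{%
    \field{labelyear}%
    \field{year}%
  }%
}

% Variant (assumption): also take the month into account, which
% may suit labeldate=ymd. This overrides the declaration above.
\DeclareExtradate{%
  \scope{%
    \field{labelyear}%
    \field{year}%
  }%
  \scope{%
    \field{labelmonth}%
  }%
}
```

To stop the end year from influencing the tracking, as mentioned above, the first scope would list only \field{year}.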

@moewew
Collaborator

moewew commented Oct 29, 2017

Just tested it and it works brilliantly. Thank you very much.

So the only thing left is the problem with open start dates yielding 'nodate'.

@plk
Owner

plk commented Oct 29, 2017

Can you give a quick MWE here? I need to see what biber outputs in the .bbl.

@moewew
Collaborator

moewew commented Oct 29, 2017

This is a shortened version of the MWE from the beginning

\documentclass{article}

\usepackage{filecontents}
\begin{filecontents}{\jobname.bib}
@book{simpson_cool_book_0092,
  author = {Simpson, Lisa},
  title = {Cool Book 0092},
  date = {unknown/2006}, % EDTF
  timestamp = {2016-10-31T05:55:17Z}
}

@book{simpson_cool_book_0093,
  author = {Simpson, Lisa},
  date = {2004-06-01/unknown}, % EDTF
  title = {Cool Book 0093},
  timestamp = {2016-10-31T05:55:37Z}
}

@book{simpson_cool_book_0094,
  author = {Simpson, Lisa},
  title = {Cool Book 0094},
  date = {*/2006}, % WD ISO 8601-2
  timestamp = {2016-10-31T05:56:44Z}
}

@book{simpson_cool_book_0095,
  author = {Simpson, Lisa},
  date = {2004-06-01/*},  % WD ISO 8601-2
  title = {Cool Book 0095},
  timestamp = {2016-10-31T05:57:12Z}
}

@book{simpson_cool_book_0096,
  author = {Simpson, Lisa},
  date = {2004-01-01/open}, % EDTF
  title = {Cool Book 0096},
  timestamp = {2016-10-31T05:57:31Z}
}

@book{simpson_cool_book_0097,
  author = {Simpson, Lisa},
  date = {open/2004-01-01}, % EDTF
  title = {Cool Book 0097},
  timestamp = {2016-10-31T05:57:56Z}
}

@book{simpson_cool_book_0098,
  author = {Simpson, Lisa},
  date = {2004-01-01/}, % WD ISO 8601-2
  title = {Cool Book 0098},
  timestamp = {2016-10-31T05:58:30Z}
}

@book{simpson_cool_book_0099,
  author = {Simpson, Lisa},
  date = {/2004-01-01}, % WD ISO 8601-2
  title = {Cool Book 0099},
  timestamp = {2016-10-31T05:58:56Z}
}
\end{filecontents}

\usepackage[%
    style=authoryear,
]{biblatex}
\addbibresource{\jobname.bib}
\begin{document}
\nocite{*}
\printbibliography
\end{document}

plk added a commit to plk/biber that referenced this issue Oct 29, 2017
@plk
Owner

plk commented Oct 29, 2017

Please try 3.8/2.8 now which are both uploaded with all recent changes. I think this was a biber bug which should now be fixed. The problem was that dates with only end of range information were not registered as having come from EDTF date parsing when they in fact had. The MWE now gives what looks like the correct output.

@plk plk assigned plk and moewew Oct 29, 2017
@JohnLukeBentley
Author

@plk thanks for the coding kung-fu.

@moewew I'm not sure why I wasn't seeing the 'nodate' problem you've illustrated. Something strange at my end. Anyway I did, finally, see the problem. With @plk's changes I now get, with the original Gist-Biblatex-Ranges-Basic.tex, with labeldate=ymd...

[Screenshot of the output with labeldate=ymd]

I think that's close to solving it (it at least gets rid of the 'nodate' issue). But note with labeldate=year I get ...

[Screenshot of the output with labeldate=year]

So looking at the first unknown citation result we have:

  • For labeldate=ymd: "(Simpson [a]/2006)"; and
  • For labeldate=year: "(Simpson /2006[a])"

I haven't been quite getting up to speed on the extradate/extrayear/ \DeclareExtradate details but the above would be an inconsistent and undesirable result (no?).

@moewew
Collaborator

moewew commented Oct 30, 2017

We probably should go with

\DeclareFieldFormat{extradate}{%
  \ifboolexpr{test {\iffieldnums{labelyear}} or test {\iffieldnums{labelendyear}}}
    {\mknumalph{#1}}
    {\mkbibparens{\mknumalph{#1}}}} 

now.

The difference between labeldate=ymd and labeldate=year is unfortunate, but I'm not sure if it can be solved.
With the default \DeclareExtradate settings I would expect what we get for year "(Simpson /2006a)". With labeldate=ymd we can't really move the extradate since that would make it detached from year, but with the default \DeclareExtradate settings it belongs only to the year, not the entire date.
Intuitively, I would simply move extradate to the very end of the entire date. But that is only good if \DeclareExtradate takes the entire date into account; if it only takes portions of the date into account, extradate should be as close as possible to that portion.

@plk
Owner

plk commented Oct 30, 2017

Yes, this has been an open topic for some time: where to print extradate for dates with more than year granularity and for ranges. I would also tend to put it at the very end, with a note in the docs about adjusting \DeclareExtradate if necessary?

@moewew
Collaborator

moewew commented Oct 30, 2017

That is a very good solution for most cases. But it could be problematic with the standard labeldate=year, mergedate=compact and a date in YYYY-MM-DD format. Then the extradate should go with the year at all times, but could be moved to the end with merged dates.

@JohnLukeBentley
Author

JohnLukeBentley commented Oct 31, 2017

I'm not across the code sufficiently to be of much aid. I note @moewew's "I'm not sure if it can be solved". However ...

... I'm understanding extradate to be the alphabetic index 'a', 'b', 'c', etc ...

On the issue of where to put extradate for datetimes more precise than a year (and displayed as such) that are not in ranges, and mergedate issues aside (let's assume mergedate is false), then I seem to recall you @plk (in discussions long ago) persuading me that it should go after the year.

And indeed looking at the documentation for \DeclareExtradate (which you've lately pointed us to) I see that while you have a (very flexible) facility for playing with datetime scope, by default you keep that scope to the conventional year. And so, by default, I get results (with your and @moewew's recent aid on another topic) like ...

[Screenshot of the output]

... with extradate after the year I think that looks good and I wouldn't, personally as a user, want to alter that scope. I think it looks good because it best facilitates, as a cognitive matter, a reader correlating citation and bibliography entries.

On the issue of where to put extradate in ranges, and again ignoring mergedate complications, consider a set of entries by the same author with (input) dates and titles like the following:

  • unknown/2006-06; Title Lorem
  • unknown/2006-01; Title Ispusm
  • unknown/2018; Title Dolor
  • 2006/unknown; Title Sit
  • 2007/unknown; Title Amet
  • 2007-05-10/unknown; Title Consectetur

I'd suggest we'd want to:

  • Group and sort based on start date (even if unknown or open), not the end date (choosing either date is somewhat arbitrary but I think a start date carries an intuitive privilege);
  • (By default at least) Put extradate after the start year if one is available, or in place of the blank for unknown or open start dates.

So the output would go like this:

  • Simpson, Lisa (a/2018). Title Dolor. /2018
  • Simpson, Lisa (b/2006-01). Title Ispusm. /2006-01
  • Simpson, Lisa (c/2006-06). Title Lorem. /2006-06
  • Simpson, Lisa (2006/unknown). Title Sit. 2006/
  • Simpson, Lisa (2007a/unknown). Title Amet. 2007/
  • Simpson, Lisa (2007b-05-10/unknown). Title Consectetur. 2007-05-10/unknown

... ignoring the issue of whether you want brackets "[]" around some of the extradates.

So, in other words, I'd favour the current results when labeldate=ymd and suggest that the results when labeldate=year should be changed to conform ... if the coding is not too difficult.
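
To make the proposal above concrete, here is a minimal Python sketch of it. It is an illustration only and not how biber or biblatex compute extradate. It assumes a single author, ISO-style dates with a four-digit year, an empty string for an unknown date, and a tie-break on title within each start-year group (the tie-break is an assumption inferred from the order of the sample output). The names `ENTRIES`, `start_year` and `label_by_start_year` are hypothetical.

```python
from collections import defaultdict
from string import ascii_lowercase

# Sample entries from the list above: (start, end, title); "" marks an unknown date.
ENTRIES = [
    ("", "2006-06", "Title Lorem"),
    ("", "2006-01", "Title Ispusm"),
    ("", "2018", "Title Dolor"),
    ("2006", "", "Title Sit"),
    ("2007", "", "Title Amet"),
    ("2007-05-10", "", "Title Consectetur"),
]


def start_year(date):
    """Return the four-digit year of an ISO-style date, or "" if unknown."""
    return date[:4]


def label_by_start_year(entries):
    """Give a letter to every entry that shares its start year with another."""
    groups = defaultdict(list)
    for entry in entries:
        groups[start_year(entry[0])].append(entry)

    labels = {}
    for _, members in sorted(groups.items()):
        members.sort(key=lambda e: e[2])  # assumed tie-break: title
        for letter, (start, end, title) in zip(ascii_lowercase, members):
            extra = letter if len(members) > 1 else ""
            # The letter follows the start year, or stands in for an unknown start.
            start_label = start[:4] + extra + start[4:]
            labels[title] = f"{start_label}/{end or 'unknown'}"
    return labels


if __name__ == "__main__":
    for title, label in label_by_start_year(ENTRIES).items():
        print(f"Simpson, Lisa ({label}). {title}.")
```

Grouping on the start year alone is what makes the three unknown-start entries compete for letters here, even though their end years differ.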

@moewew
Collaborator

moewew commented Oct 31, 2017

You'll have to correct me if I'm wrong, but at the moment (3.7 and the defaults of 3.8) extradate takes both the start and the end year into account for disambiguation. (For the rest of this I will assume alldates=ymd, labeldate=year and default \DeclareExtradate settings.) So /2006, 2006/, 2005/2006, 2006/2007 and 2006/2008 can all coexist without anyone getting any extradate. I think this is a sensible approach since it avoids throwing around extradates where they are not needed. That means that the extradate 'belongs' to both the start and end year. I would find it natural for it to come after the end year, because then it would come after the two years that govern it and not in between. With a YYYY-MM-DD formatted date I would not know where to put the extradate. It should not go after the entire date, since that would imply that it is decided by the entire date. But then it is not clear whether to put it after the start or end year.
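
The behaviour described here, disambiguating on the pair of start and end year, can be sketched in Python as follows. This is a simplified model for illustration, not biber's actual code. It assumes all ranges belong to the same author, that dates are ISO-style strings with a four-digit year, and that an empty string stands for an open or unknown side. The function name `assign_extradate` is hypothetical.

```python
from collections import defaultdict
from string import ascii_lowercase


def assign_extradate(ranges):
    """Return one letter per (start, end) range, or "" where none is needed.

    Two ranges collide only if both their start year and end year match.
    """
    groups = defaultdict(list)
    for index, (start, end) in enumerate(ranges):
        groups[(start[:4], end[:4])].append(index)

    letters = [""] * len(ranges)
    for indices in groups.values():
        if len(indices) > 1:  # a lone range needs no letter
            for letter, index in zip(ascii_lowercase, indices):
                letters[index] = letter
    return letters


if __name__ == "__main__":
    coexisting = [("", "2006"), ("2006", ""), ("2005", "2006"),
                  ("2006", "2007"), ("2006", "2008")]
    print(assign_extradate(coexisting))
    print(assign_extradate([("2003-06-06", "2004-06-03"),
                            ("2003-01-01", "2004-07-07")]))
```

Under these assumptions the five ranges from the comment get no letter at all, while two ranges that share both years get one each, whatever their months and days are.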

@plk
Owner

plk commented Oct 31, 2017

You are right about the initial assumption since the first item in the default extradate spec is labelyear which is composed of start+end year, even if open. I wonder if it would be worth having another field in the .bbl which tells you which of the extradate spec fields determined the value of extradate - would that help to place the actual extradate marker? This would be analogous to the new uniquename functionality.

@moewew
Collaborator

moewew commented Oct 31, 2017

I thought about that as well and it could help in certain situations (and could make things more flexible), but I don't think it helps with the current problem of where exactly to place the extradate in ranges.

@plk
Owner

plk commented Oct 31, 2017

I have added extradatescope to the .bbl which may be useful in this.

@JohnLukeBentley
Author

JohnLukeBentley commented Oct 31, 2017

Moewew wrote:

So /2006, 2006/, 2005/2006, 2006/2007 and 2006/2008 can all coexist without anyone getting any extradate. I think this is a sensible approach since it avoids throwing around extradates where they are not needed. That means that the extradate 'belongs' to both the start and end year.

That's better than my suggestion to have extradate belong to the start year.

And therefore, as you and @plk have suggested, when there is a need to display an extradate (because the author has multiple works with the same year range) and labeldate=year, the extradate ought to go at the end of the two years: (2006/2007a).

When labeldate=ymd, and displaying a YYYY-MM-DD (or more precise) formatted datetime, where should the extradate go (you ask)?

For the sake of being able to stare at example possibilities you folk already have in mind (and assuming that range years, not range datetimes, determine the extradate value) ...

Possibility 1: after the end date:

  • (2003-06-06/2004-06-03a)
  • (2003-01-01/2004-07-07b)

Possibility 2: after the end year:

  • (2003-06-06/2004a-06-03)
  • (2003-01-01/2004b-07-07)

As you @moewew correctly note, if extradate goes after the end date that wrongly implies that whole dates, rather than merely the years, determine the extradate value.

That, then, would seem to leave possibility 2, after the end year. This could be understood as follows:

It would be consistent with designating that the range years determine the extradate and so the extradate goes after the last year in that range. So possibility 2 looks suitably analogous to what happens if you toggled to labeldate=year ...

  • (2003/2004a)
  • (2003/2004b)

The understanding continues ... an extradate after a year, even with a YYYY-MM-DD output date, would be consistent with the look already decided upon in non-date range circumstances. See my "Barker, Anne" image a few posts up with output like ...

Barker, Anne (2016a-02-24). “China Sends ...
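
As an illustration of possibility 2, the placement rule can be written as a small Python function. This is only a sketch of the formatting idea and says nothing about how biblatex would implement it. It assumes ISO-style date strings whose first four characters are the year, and the name `place_extradate` is hypothetical.

```python
def place_extradate(start, end, extradate):
    """Insert the extradate letter directly after the governing year.

    For a range the letter follows the end year; for a single date
    (end is None) it follows that date's year.
    """
    def after_year(date):
        # Assumes the first four characters are the year.
        return date[:4] + extradate + date[4:]

    if end is None:
        return after_year(start)
    return f"{start}/{after_year(end)}"


if __name__ == "__main__":
    print(place_extradate("2003-06-06", "2004-06-03", "a"))  # 2003-06-06/2004a-06-03
    print(place_extradate("2003", "2004", "b"))              # 2003/2004b
    print(place_extradate("2016-02-24", None, "a"))          # 2016a-02-24
```

The same function covers the labeldate=year case, the labeldate=ymd case, and the non-range case, which is the consistency the argument above relies on.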

On the separate but related issue of the new extradatescope: if I understand the intention behind it, it reports on the \DeclareExtradate settings, revealing whether the extradate value is determined by a year (the default) or by datetime precisions finer than a year.

That would seem to be helpful down the track (perhaps quite soon) with certain edge cases, but moewew seems correct that it doesn't help with the default case: when extradate is determined by the years of the date range. But I hope my exemplifications above help with the default case.

Edit 01: "similar" to "suitably analogous"; "help" to "hope".

@JohnLukeBentley
Author

A testing file for this issue: Gist-Biblatex-Ranges-MutlipleWorksByAuthorInSameRange.tex .

@plk
Owner

plk commented Nov 3, 2017

@moewew - do you think we can/need to do anything with this? I think this is now the last outstanding thing before release.

@moewew
Collaborator

moewew commented Nov 3, 2017

At the moment I have no good idea what to do with this - neither conceptually nor technically. The status quo should be OK for most users and acceptable in some more advanced situations. I think it is more important to get the new release out than to solve this. At best no-one notices a problem, at worst someone complains, but then they may have intuitions about how things should look.

I suggest we close this ticket, as the discussion has become quite long again, and open a new one for the extradate issue in ranges.

@JohnLukeBentley
Author

At the basic level the original range issues, to do with Unknowns, Opens, and Unspecifieds, are now solved. That is, leaving aside anticipation of ISO changes and the "Extradate and multiple works by author in same range" issue.

So I've opened "Datetimes. Extradate and multiple works by author in same range. #644": essentially starting with information from my previous two posts ...

then they may have intuitions about how things should look.

In #644 you might like to answer: what of my (John Bentley's) intuitions about how things should look?

I'll leave the honours to you @moewew to close the current thread.


3 participants