Standardize format of TeX math #39

Open
Melissa37 opened this issue Feb 13, 2015 · 14 comments

Comments

@Melissa37

This should not have been listed in the recommendations yet, as it is still a question to be answered here.
Standardize insertion of TeX math. Each publisher wants it differently. Here are 3 examples:

<tex-math><![CDATA[...]]></tex-math>

<tex-math><![CDATA[$$...$$]]></tex-math>

<tex-math><?CDATA...?></tex-math>

Best to have an agreed form so that browsers can render the TeX natively.

@Melissa37 Melissa37 added the math label Feb 13, 2015
@Klortho
Member

Klortho commented Feb 18, 2015

What is that third one? Is that a processing instruction? Where did you get that example from? I don't think that's a good alternative -- we should remove it.

I'd like to see more real-world TeX examples.

@Melissa37
Author

It's in our recommendations :-)

@Klortho
Member

Klortho commented Feb 18, 2015

I know .. I saw it there. I'm just wondering where it came from originally.

@Melissa37
Author

I think Kaveh did some editing a little while ago

@kaveh1000

Sorry for the delay in replying to this. These are just 3 actual real-world examples. To be clear, we don't have to recommend any of these and can think of the best approach, but the suggestion is that there should be one recommended way to save TeX in JATS.

@Klortho Klortho changed the title from "Math recommendations point 5" to "Standardize format of TeX math" Mar 4, 2015
@Klortho
Member

Klortho commented Mar 4, 2015

I want to propose these recommendations for content:


The content of the <tex-math> element should be math-mode LaTeX, without the delimiters that are normally used to switch into / out of math mode (\[...\], \(...\), $$...$$, etc.). For example:

<disp-formula>
  <tex-math id='M1'>a = b</tex-math>
</disp-formula>

Rationale: the TeX enclosed in the element is only used within JATS articles to produce mathematics, so specifying that the content is in math-mode precludes ambiguities that could arise if more general-purpose TeX structures were included (such as, for example, commands to produce tables of contents).
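
As an illustration of the recommendation above, here is a minimal sketch of how a conversion tool might bring existing content into this form by removing one outer pair of math-mode delimiters. The function name `strip_math_delimiters` is hypothetical and not part of JATS or MathJax, and the sketch assumes the whole string is a single formula.

```python
# Delimiter pairs that switch into / out of math mode.
# "$$" is listed before "$" so the longer delimiter is tried first.
_DELIMITERS = [
    ("\\[", "\\]"),
    ("\\(", "\\)"),
    ("$$", "$$"),
    ("$", "$"),
]


def strip_math_delimiters(tex: str) -> str:
    """Return tex without one outer pair of math-mode delimiters, if present.

    Assumption: the input holds exactly one formula. A string such as
    "$a$ + $b$" is not handled correctly by this simple check.
    """
    body = tex.strip()
    for opener, closer in _DELIMITERS:
        long_enough = len(body) >= len(opener) + len(closer)
        if long_enough and body.startswith(opener) and body.endswith(closer):
            return body[len(opener):len(body) - len(closer)].strip()
    return body
```

Content that already has no delimiters passes through unchanged, so the function can be applied to mixed input.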

The only exception to this is that a subset of the LaTeX environments can be used to wrap the entire contents. (These are normally not permitted in math mode.) They are of the form \begin{XXX} ... \end{XXX} and are typically used to specify alignment for multi-line equations. Those environments that are permitted by MathJax are also allowed within the <tex-math> element; see MathJax TeX and LaTeX Support - Environments for a list. For example:

<disp-formula>
  <tex-math id='M1'>
    \begin{align*}
      x^2 + y^2 &amp; = 1\\
      x         &amp; = \sqrt{1-y^2}
    \end{align*}
  </tex-math>
</disp-formula>

Note that XML CDATA sections can be used to aid in embedding TeX markup within the XML of a JATS document. This obviates the need to use escape sequences such as &amp; for an ampersand. Using a CDATA section, the previous example would appear as follows.

<disp-formula>
  <tex-math id='M1'><![CDATA[
    \begin{align*}
      x^2 + y^2 & = 1\\
      x         & = \sqrt{1-y^2}
    \end{align*}
  ]]></tex-math>
</disp-formula>
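
To illustrate that the two encodings are equivalent to an XML parser, the following sketch parses both examples with Python's standard `xml.etree.ElementTree` and compares the resulting text. The helper name `tex_of` is made up for this example.

```python
import xml.etree.ElementTree as ET

# The entity-escaped form of the example.
escaped = r"""<disp-formula>
  <tex-math id='M1'>
    \begin{align*}
      x^2 + y^2 &amp; = 1\\
      x         &amp; = \sqrt{1-y^2}
    \end{align*}
  </tex-math>
</disp-formula>"""

# The same example wrapped in a CDATA section.
cdata = r"""<disp-formula>
  <tex-math id='M1'><![CDATA[
    \begin{align*}
      x^2 + y^2 & = 1\\
      x         & = \sqrt{1-y^2}
    \end{align*}
  ]]></tex-math>
</disp-formula>"""


def tex_of(xml_text: str) -> str:
    """Return the text content of the first <tex-math> child element."""
    root = ET.fromstring(xml_text)
    return root.find("tex-math").text


# Both forms yield the same TeX once parsed; surrounding whitespace is
# stripped so the comparison does not depend on indentation.
same = tex_of(escaped).strip() == tex_of(cdata).strip()
```

Because the parser resolves `&amp;` and unwraps the CDATA section, downstream tools see a plain `&` in both cases.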

@Klortho
Member

Klortho commented Mar 4, 2015

The above grew out of my experience since our last telecon of figuring out how to get PMC tex content to render using MathJax. I think MathJax provides a really good use-case for this, since for JATS content, we just want to allow a subset of all the various types of content that is possible with TeX.

Not included in the above is a specification for what packages can be assumed to be included in the environment -- i.e. what macros are permitted. I want to add something to the effect of, "the standard set of packages that comes with LaTeX", but I'm afraid that there isn't any such standard. From my limited investigations, I think that different distributions include different sets of packages.

@pkra, if you have a moment, could you review the above?

@kaveh1000

These are good ideas, Klortho, and I agree with you for inline math, but for display, what happens when we have multiple lines and we need equation numbers? And the environment could be "equation", "eqnarray" or other.

@pkra

pkra commented Mar 5, 2015

This seems reasonable to me, especially since it's based on your PMC experience (I think I don't see enough "wild" content).

Defining "basic LaTeX packages" is difficult since you can always use TeX primitives to do awful things. Saying "out-of-the-box LaTeX math-mode macros with AMS packages" (and perhaps mentioning TeX Live 2014 or the Debian texlive-base package as vague reference points) might get close (fwiw, MathJax supports a few more, such as mhchem, cancel, and extpfeil).

As @kaveh1000 pointed out elsewhere concerning MathML, a limitation of this recommendation will be that anything more graphical (tikz, xypic etc) gets thrown out. But if the primary recommendation is MathML anyway, then following your proposal makes perfect sense. Perhaps such graphical things should be dealt with differently in JATS (especially JATS4reuse) but I don't run into this enough to say much about it.

Again, Wikipedia's texvc might be better from a technical perspective since it is stricter and has a formal grammar. However, I admit I think it's too strict for publishing needs (i.e., not even trivial macros like \newcommand{\R}{\mathbb{R}} would be allowed).

A particular difficulty is probably macro definitions since "reducing" macros isn't really feasible in TeX (it's hard to say where to stop from a TeX perspective and even within MathJax). So some recommendation might be sensible.

Another one will be equation labels. I think we should come up with some recommendation on how to reconcile TeX equation labels and ids in JATS or MathML. (Seconding Kaveh who commented while I was writing ;-) )

@Klortho
Member

Klortho commented Mar 5, 2015

@kaveh1000 wrote:

what happens when we have multiple lines and we need equation numbers? And the environment could be "equation", "eqnarray" or other.

@pkra wrote:

Another one will be equation labels. I think we should come up with some recommendation on how to reconcile TeX equation labels and ids in JATS or MathML. (Seconding Kaveh who commented while I was writing ;-) )

I have been assuming that equation labels are handled by JATS, not by TeX. That's true of a few of the equations that I've seen in PMC content, but I haven't done an exhaustive review. I think it would be better if equation labels were not done in TeX.

I started to try to enumerate the allowed LaTeX environments, and to specify that only the starred ones ("equation*" and not "equation") are allowed, because the starred environments don't generate equation numbers. But then I thought it would be easier to just reference the MathJax documentation. But after your comment, I'm inclined to go back to enumerating them. What do you think?
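
If the environments were enumerated, a check along these lines could be used to validate content. This is only a sketch: the allowlist below is a hypothetical illustration, not an agreed set, and the names `ALLOWED_ENVIRONMENTS`, `wrapping_environment` and `is_allowed` are invented for the example.

```python
import re

# Hypothetical allowlist: starred (unnumbered) environments only, so that
# equation numbering is left to JATS rather than TeX.
ALLOWED_ENVIRONMENTS = {"equation*", "align*", "gather*", "multline*"}

# Matches content that starts with \begin{...} and ends with \end{...}.
_WRAPPER = re.compile(r"^\\begin\{([^}]+)\}(.*)\\end\{([^}]+)\}$", re.DOTALL)


def wrapping_environment(tex: str):
    """Return the name of the environment wrapping the content, or None.

    Limitation: two sibling environments of the same name placed side by
    side would be reported as one wrapper by this simple pattern.
    """
    match = _WRAPPER.match(tex.strip())
    if match and match.group(1) == match.group(3):
        return match.group(1)
    return None


def is_allowed(tex: str) -> bool:
    """Accept bare math-mode content or content wrapped in an allowed environment."""
    env = wrapping_environment(tex)
    return env is None or env in ALLOWED_ENVIRONMENTS
```

With this allowlist, content wrapped in `align*` is accepted while content wrapped in the numbered `equation` environment is rejected.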

@pkra

pkra commented Mar 5, 2015

I think it would be better if equation labels were not done in TeX.

I agree. Let's include that for TeX.

But at the risk of going off topic, what about MathML? Duplication is obviously not a good idea, but it's strange not to have the label in the MathML, especially for re-use (copy&paste / other forms of extraction need to retain a label, e.g., "First Law of Motion"). Of course, the burden could be put on the rendering/conversion side, e.g., JATS to HTML conversion could be expected to re-integrate the label into the MathML. But I'd be hesitant since it means twice the work in most cases.

@Klortho
Member

Klortho commented Mar 14, 2015

I just noticed the @notation and @version attributes that can be put on the <tex-math> element. Should we specify that these should not be used? Or, would they be useful? Clearly, if they are to be used, there needs to be better specification of allowed values, and exactly what they mean.

@pkra

pkra commented Mar 16, 2015

I forgot that I ran into these and was confused. The two values for @notation seem rather limited -- I can't imagine many authors writing clean LaTeX (e.g., technically $... $ is deprecated in LaTeX), and the implementor's note seems odd.

Similarly, @version seems of limited use. On the one hand, TeX and LaTeX are incredibly stable; on the other hand, we suggest "math mode only", and to me versions seem to make more sense with full documents (where each package would then have precise versions as well).

@rajesh2k8

rajesh2k8 commented Apr 24, 2019

Hi,
I am facing an issue while converting JATS to HTML or PDF. My JATS contains some TeX math tags like this one:

<disp-formula> <tex-math id="M7"><![CDATA[\documentclass[12pt]{minimal} \usepackage{wasysym} \usepackage[substack]{amsmath} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage[mathscr]{eucal} \usepackage{mathrsfs} \DeclareFontFamily{T1}{linotext}{} \DeclareFontShape{T1}{linotext}{m}{n} { &#x003C;-&#x003E; linotext }{} \DeclareSymbolFont{linotext}{T1}{linotext}{m}{n} \DeclareSymbolFontAlphabet{\mathLINOTEXT}{linotext} \begin{document} $$ {\mathrm{Acc/Acc:\hspace{.5em}}}\frac{{\mathit{ade2-202}}}{{\mathit{ADE2}}}\hspace{.5em}\frac{{\mathit{ura3-59}}}{{\mathit{ura3-59}}}\hspace{.5em}\frac{{\mathit{ADE1}}}{{\mathit{adel-201}}}\hspace{.5em}\frac{{\mathit{ter1-Acc}}}{{\mathit{ter1-Acc}}}\hspace{.5em}\frac{{\mathit{MATa}}}{{\mathit{MAT{\alpha}}}} $$ \end{document}]]></tex-math> </disp-formula>
After converting, when I load the page, the math formula is not displayed. For display purposes I tried the MathJax library.
It still does not work. Can anybody suggest a solution?
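
One possible approach, in line with the math-mode-only recommendation earlier in this thread, is to reduce such content to the formula itself before handing it to a renderer, since the example wraps the formula in a complete LaTeX document (preamble, \begin{document}, $$ delimiters). The sketch below is an assumption about how that preprocessing could look; `extract_math` is a hypothetical helper, not part of MathJax or any JATS tool, and it simply discards the preamble.

```python
import re

# Captures everything between \begin{document} and \end{document}.
_DOCUMENT_BODY = re.compile(
    r"\\begin\{document\}(.*?)\\end\{document\}", re.DOTALL
)


def extract_math(tex: str) -> str:
    """Return the math-mode content of a <tex-math> string.

    If the string is a full LaTeX document, the preamble is dropped and
    only the document body is kept. One outer pair of $$ delimiters is
    then removed, if present. Macros or fonts declared in the preamble
    are lost, which is an accepted simplification in this sketch.
    """
    match = _DOCUMENT_BODY.search(tex)
    body = match.group(1) if match else tex
    body = body.strip()
    if len(body) >= 4 and body.startswith("$$") and body.endswith("$$"):
        body = body[2:-2].strip()
    return body


# Shortened stand-in for the content in the example above.
sample = (
    r"\documentclass[12pt]{minimal} \usepackage{amsmath} "
    r"\begin{document} $$ \frac{a}{b} $$ \end{document}"
)
formula = extract_math(sample)
```

The resulting string can then be placed between whatever delimiters the renderer is configured to look for.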
