Move the DOMParser class to HTML #5190

Merged
merged 4 commits into from
Jan 28, 2020
150 changes: 148 additions & 2 deletions source
@@ -3248,7 +3248,8 @@ a.setAttribute('href', 'https://example.com/'); // change the content attribute
<dfn data-x="event listener type" data-x-href="https://dom.spec.whatwg.org/#event-listener-type">type</dfn> and
<dfn data-x="event listener callback" data-x-href="https://dom.spec.whatwg.org/#event-listener-callback">callback</dfn></li>

<li>The <dfn data-x="document's character encoding" data-x-href="https://dom.spec.whatwg.org/#concept-document-encoding">encoding</dfn> (herein the <i>character encoding</i>) and
<li>The <dfn data-x="document's character encoding" data-x-href="https://dom.spec.whatwg.org/#concept-document-encoding">encoding</dfn> (herein the <i>character encoding</i>),
<dfn data-x="concept-document-mode" data-x-href="https://dom.spec.whatwg.org/#concept-document-mode">mode</dfn>, and
<dfn data-x="concept-document-content-type" data-x-href="https://dom.spec.whatwg.org/#concept-document-content-type">content type</dfn> of a <code>Document</code></li>
<li>The distinction between <dfn data-x-href="https://dom.spec.whatwg.org/#xml-document">XML documents</dfn> and
<dfn data-x-href="https://dom.spec.whatwg.org/#html-document">HTML documents</dfn></li>
@@ -3333,7 +3334,6 @@ a.setAttribute('href', 'https://example.com/'); // change the content attribute
<p>The following features are defined in <cite>DOM Parsing and Serialization</cite>: <ref spec=DOMPARSING></p>

<ul class="brief">
<li><dfn data-x-href="https://w3c.github.io/DOM-Parsing/#the-domparser-interface"><code>DOMParser</code></dfn></li>
<li><dfn data-x="dom-innerHTML" data-x-href="https://w3c.github.io/DOM-Parsing/#dom-element-innerhtml"><code>innerHTML</code></dfn></li>
<li><dfn data-x="dom-outerHTML" data-x-href="https://w3c.github.io/DOM-Parsing/#dom-element-outerhtml"><code>outerHTML</code></dfn></li>
</ul>
@@ -94670,6 +94670,152 @@ document.body.appendChild(frame)</code></pre>
</div>


<!-- We anticipate adding XMLSerializer here in the future, thus the id="". When we do that, add
<h4 id="domparser">Parsing HTML or XML documents</h4> below this. outerHTML/innerHTML might
also live here? -->
<h3 id="dom-parsing-and-serialization">DOM parsing</h3>

<p>The <code>DOMParser</code> interface allows authors to create new <code>Document</code> objects
by parsing strings, as either HTML or XML.</p>

<dl class="domintro">
<dt><var>parser</var> = new <code subdfn data-x="dom-DOMParser-constructor">DOMParser</code>()</dt>
<dd>
<p>Constructs a new <code>DOMParser</code> object.</p>
</dd>

<dt><var>document</var> = <var>parser</var> . <code subdfn data-x="dom-DOMParser-parseFromString">parseFromString</code>( <var>string</var>, <var>type</var> )</dt>
<dd>
<p>Parses <var>string</var> using either the HTML or XML parser, according to <var>type</var>,
and returns the resulting <code>Document</code>. <var>type</var> can be "<code>text/html</code>"
(which will invoke the HTML parser), or any of "<code>text/xml</code>",
"<code>application/xml</code>", "<code>application/xhtml+xml</code>", or
"<code>image/svg+xml</code>" (which will invoke the XML parser).</p>

<p>For the XML parser, if <var>string</var> cannot be parsed, then the returned
<code>Document</code> will contain elements describing the resulting error.</p>

<p>Note that <code>script</code> elements are not evaluated during parsing, and the resulting
document's <span data-x="document's character encoding">encoding</span> will always be
<span>UTF-8</span>.</p>

<p>Values other than the above for <var>type</var> will cause a <code>TypeError</code> exception
to be thrown.</p>
</dd>
</dl>
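
A minimal sketch of how an author might act on the error behavior described above, assuming the error document has the shape the method steps in this diff produce (a root `parsererror` element in the Mozilla namespace). The helper name `parseXmlStrict` and its `ParserCtor` parameter are hypothetical and exist only for illustration; the parameter lets the sketch run where no global `DOMParser` is available.

```javascript
// Namespace the method steps in this diff use for the error root element.
const PARSER_ERROR_NS = "http://www.mozilla.org/newlayout/xml/parsererror.xml";

// Hypothetical helper (not part of the spec): parse XML and convert an
// error document into an exception.
// ParserCtor defaults to the global DOMParser, which browsers provide;
// pass another constructor in environments where that global is missing.
function parseXmlStrict(source, ParserCtor = globalThis.DOMParser) {
  const doc = new ParserCtor().parseFromString(source, "application/xml");
  const root = doc.documentElement;

  // Assumption: the error is reported as the document's root element.
  if (
    root &&
    root.localName === "parsererror" &&
    root.namespaceURI === PARSER_ERROR_NS
  ) {
    throw new SyntaxError(`XML parse error: ${root.textContent}`);
  }
  return doc;
}
```

Implementations may shape the error document differently from the steps in this diff, so a check used in production would likely need to be looser than the root-element test shown here.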

<p class="note">The design of <code>DOMParser</code>, as a class that needs to be constructed and
then have its <code data-x="dom-DOMParser-parseFromString">parseFromString()</code> method called,
is an unfortunate historical artifact. If we were designing this functionality today it would be a
standalone function.</p>

<pre><code class="idl" data-x="">[Exposed=Window]
interface <dfn>DOMParser</dfn> {
<span data-x="dom-DOMParser-constructor">constructor</span>();

[NewObject] <code>Document</code> <span data-x="dom-DOMParser-parseFromString">parseFromString</span>(DOMString <var>string</var>, <span>DOMParserSupportedType</span> <var>type</var>);
};

enum <dfn>DOMParserSupportedType</dfn> {
"<span data-x="dom-DOMParserSupportedType-texthtml">text/html</span>",
"<span data-x="dom-DOMParserSupportedType-otherwise">text/xml</span>",
"<span data-x="dom-DOMParserSupportedType-otherwise">application/xml</span>",
"<span data-x="dom-DOMParserSupportedType-otherwise">application/xhtml+xml</span>",
"<span data-x="dom-DOMParserSupportedType-otherwise">image/svg+xml</span>"
};</code></pre>
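
The enum above can be read as a lookup from `type` to the parser that gets invoked. The following is a rough model of that mapping, not the Web IDL binding code itself; `PARSER_FOR_TYPE` and `parserKindFor` are hypothetical names. It assumes the usual enumeration behavior: the argument is converted to a string and compared exactly, and any other value is rejected with a `TypeError`.

```javascript
// The five DOMParserSupportedType values, mapped to the parser each one
// selects in the method steps ("text/html" versus everything else).
const PARSER_FOR_TYPE = new Map([
  ["text/html", "html"],
  ["text/xml", "xml"],
  ["application/xml", "xml"],
  ["application/xhtml+xml", "xml"],
  ["image/svg+xml", "xml"],
]);

// Hypothetical helper: models the enum check that rejects unsupported
// values before the method steps run. Matching is exact and case-sensitive.
function parserKindFor(type) {
  const key = String(type);
  const kind = PARSER_FOR_TYPE.get(key);
  if (kind === undefined) {
    throw new TypeError(`Unsupported DOMParser type: ${key}`);
  }
  return kind;
}
```

Under this model, `parserKindFor("image/svg+xml")` selects the XML parser, while `parserKindFor("text/plain")` throws.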

<div w-nodev>

<p>The <dfn data-x="dom-DOMParser-constructor"><code>DOMParser()</code></dfn> constructor steps
are to do nothing.</p>

<p>The <dfn data-x="dom-DOMParser-parseFromString"><code>parseFromString(<var>string</var>,
<var>type</var>)</code></dfn> method steps are:</p>
Review comment from the pull request author (project member):

This choice matches Chromium and WebKit. Gecko seems to use the relevant global object's API base URL, i.e. the relevant global object's associated Document's base URL.

Gecko's choice is more consistent with many other APIs in the spec. However, for an old API like this, I think we should just go with the majority.


<ol>
<li>
<p>Let <var>document</var> be a new <code>Document</code>, whose <span
data-x="concept-document-content-type">content type</span> is <var>type</var> and <span
data-x="concept-document-URL">URL</span> is this's <span>relevant global object</span>'s <span
data-x="concept-document-window">associated <code>Document</code></span>'s <span
data-x="concept-document-URL">URL</span>.</p>
<!-- When https://github.com/whatwg/html/issues/4792 gets fixed we need to investigate which of
HTMLDocument vs. XMLDocument gets returned, and when, with tests. In particular we'll want
to check the application/xhtml+xml case as even for navigation it seems like browsers
diverge there: https://github.com/whatwg/html/issues/5157#issuecomment-573052638. -->

<p class="note">The document's <span data-x="document's character encoding">encoding</span> will
be left as its default, of <span>UTF-8</span>. In particular, any XML declarations or
<code>meta</code> elements found while parsing <var>string</var> will have no effect.</p>
</li>

<li>
<p>Switch on <var>type</var>:</p>

<dl class="switch">
<dt>"<dfn data-x="dom-DOMParserSupportedType-texthtml"><code>text/html</code></dfn>"</dt>
<dd>
<ol>
<li><p>Set <var>document</var>'s <span data-x="concept-document-type">type</span> to "<code
data-x="">html</code>".</p></li>

<li><p>Create an <span>HTML parser</span> <var>parser</var>, associated with
<var>document</var>.</p></li>

<li><p>Place <var>string</var> into the <span>input stream</span> for <var>parser</var>. The
encoding <span data-x="concept-encoding-confidence">confidence</span> is
<i>irrelevant</i>.</p></li>

<li>
<p>Start <var>parser</var> and let it run until it has consumed all the characters just
inserted into the input stream.</p>

<p class="note">This might mutate the document's <span
data-x="concept-document-mode">mode</span>.</p>
</li>
</ol>

<p class="note">Since <var>document</var> does not have a <span
data-x="concept-document-bc">browsing context</span>, <span
data-x="concept-n-script">scripting is disabled</span>.</p>
</dd>

<dt><dfn data-x="dom-DOMParserSupportedType-otherwise">Otherwise</dfn></dt>
<dd>
<ol>
<li><p>Create an <span>XML parser</span> <var>parser</var>, associated with
<var>document</var>, and with <span>XML scripting support disabled</span>.</p></li>

<li><p>Parse <var>string</var> using <var>parser</var>.</p></li>

<li>
<p>If the previous step resulted in an XML well-formedness or XML namespace well-formedness
error, then:</p>

<ol>
<li><p>Assert: <var>document</var> has no child nodes.</p></li>

<li><p>Let <var>root</var> be the result of <span data-x="create an element">creating an
element</span> given <var>document</var>, "<code data-x="">parsererror</code>", and "<code
data-x="">http://www.mozilla.org/newlayout/xml/parsererror.xml</code>".</p></li>

<li><p>Optionally, add attributes or children to <var>root</var> to describe the nature of
the parsing error.</p></li>

<li><p><span data-x="concept-node-append">Append</span> <var>root</var> to
<var>document</var>.</p></li>
</ol>
</li>
</ol>
</dd>
</dl>
</li>

<li><p>Return <var>document</var>.</p></li>
</ol>

</div>
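
The method steps above can be sketched as a toy model. This is an illustration under stated assumptions, not browser code: plain objects stand in for `Document` and element nodes, and `callerURL`, `htmlParse`, and `xmlParse` are hypothetical stand-ins for the caller's document URL and the two parsers. `xmlParse` is assumed to return `false` on a well-formedness error without leaving children behind.

```javascript
const PARSER_ERROR_NS = "http://www.mozilla.org/newlayout/xml/parsererror.xml";

// Toy model of parseFromString(string, type). All names are hypothetical.
function parseFromStringModel(string, type, { callerURL, htmlParse, xmlParse }) {
  // Step 1: a new document with the requested content type and the URL of
  // the caller's associated Document. Encoding stays at the UTF-8 default,
  // and the document type starts as the DOM default, "xml".
  const document = {
    contentType: type,
    URL: callerURL,
    encoding: "UTF-8",
    type: "xml",
    children: [],
  };

  if (type === "text/html") {
    // Step 2, "text/html" branch: mark as an HTML document, run the HTML parser.
    document.type = "html";
    htmlParse(string, document);
  } else {
    // Step 2, "Otherwise" branch: run the XML parser.
    const wellFormed = xmlParse(string, document);
    if (!wellFormed) {
      // Mirrors the "Assert: document has no child nodes" step.
      if (document.children.length !== 0) {
        throw new Error("assertion failed: document has child nodes");
      }
      // Create and append the parsererror root element.
      document.children.push({
        localName: "parsererror",
        namespaceURI: PARSER_ERROR_NS,
        children: [],
      });
    }
  }

  // Step 3: return the document in every case; errors never throw here.
  return document;
}
```

The model makes one property of the steps easy to see: a malformed XML string still yields a document, so callers have to inspect the result rather than catch an exception.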


<h3 split-filename="timers-and-user-prompts" id="timers">Timers</h3>
