DOC: Implement script to validate list indentation in docs #21520

Open
datapythonista opened this issue Jun 18, 2018 · 5 comments
Labels
CI Continuous Integration Code Style Code style, linting, code_checks Docs

Comments

@datapythonista
Member

In #21518 it's been identified that several lists in the documentation don't follow the restructuredText standard (no indentation for top-level lists, 4 space indentation for sublists).

In that issue, (hopefully) all the formatting has been fixed manually. But it'd be useful to have a script that validates all the documentation pages and makes sure no wrong formatting exists. Adding this script to lint.sh will also prevent lists with the wrong formatting from being added in the future.

@anich003

Hi @datapythonista,

I'm new to the open source community and was thinking this could be a good first place to contribute.

Would this just involve writing a shell command that greps for any line in the .rst files that looks like a list item and is indented by something other than 0 or 4 spaces?
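As a rough illustration of that idea, here is a minimal Python sketch of the same line-by-line check (Python rather than grep, so it can be reused in a script). The function name `find_suspicious_bullets` and the `allowed` parameter are hypothetical, and the default of 0 or 4 spaces only mirrors the convention described in this issue. It is deliberately naive: it looks at each line in isolation, so asterisks inside code blocks will be reported as well.

```python
import re

# A line that starts a bullet item: optional indentation, a bullet
# character (*, + or -), at least one space, then some text.
BULLET = re.compile(r"^(?P<indent> *)[*+-] +\S")


def find_suspicious_bullets(text, allowed=(0, 4)):
    """Return (line_number, indent) for bullets whose indent is not in `allowed`.

    `allowed` is an assumption taken from this issue (0 for top-level
    lists, 4 for sublists); adjust it to the convention actually wanted.
    """
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = BULLET.match(line)
        if match:
            indent = len(match.group("indent"))
            if indent not in allowed:
                hits.append((number, indent))
    return hits
```

Calling `find_suspicious_bullets(open(path).read())` on each .rst file would give the candidate lines to review by hand.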

@datapythonista
Member Author

@anich003 that sounds great. The main challenge here is to correctly detect all lists with wrong indentation while avoiding false positives.

I see 2 possible approaches: processing the raw .rst files for patterns, or reusing the sphinx functions that do the parsing to infer whether the indentation was wrong.

Let me know if you need help with it.

@anich003

@datapythonista it seems the .rst files mostly have "well-formed" sublists (no spaces or 4 spaces before each *), but there are also some files that have 3-space lines (dsintro.rst, comparison_with_stata.rst) and 5-space lines (io.rst, install.rst).

The stata lines are, for example, comment lines within stata code, while the io examples might be sublists, but it's not clear.

It seems we'd want to ignore the code block asterisks and detect the lines in io.rst. I'm not sure how I'd accomplish this with a series of piped greps (my original plan) so I'll have to look more into the sphinx functions. Thoughts?
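One way to ignore the code-block asterisks without the sphinx functions is a small state machine over the raw lines. The sketch below is only an approximation of reStructuredText's rules, under these assumptions: a literal block is opened either by a line ending in `::` or by one of the directives listed in `LITERAL_DIRECTIVES` (a hypothetical, incomplete list), and it ends at the first non-blank line that is not indented deeper than the opening line. The function name and the `allowed` default (0 or 4 spaces, as described in this issue) are also illustrative.

```python
import re

BULLET = re.compile(r"^( *)[*+-] +\S")
# Assumed set of directives whose body is literal text, not parsed markup.
LITERAL_DIRECTIVES = re.compile(r"^ *\.\. (code-block|sourcecode|code|ipython)::")


def bullets_outside_literal_blocks(text, allowed=(0, 4)):
    """Return (line_number, indent) for badly indented bullets, skipping literal blocks."""
    hits = []
    block_indent = None  # indent of the line that opened the current literal block
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        indent = len(line) - len(line.lstrip(" "))
        if block_indent is not None:
            if not stripped or indent > block_indent:
                continue  # blank or deeper-indented line: still literal text
            block_indent = None  # a dedented line closes the literal block
        match = BULLET.match(line)
        if match and len(match.group(1)) not in allowed:
            hits.append((number, len(match.group(1))))
        if stripped.endswith("::") or LITERAL_DIRECTIVES.match(line):
            block_indent = indent
    return hits
```

With this, the 3-space stata comment lines inside a code block would be skipped, while a 5-space bullet in ordinary text would still be reported.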

@datapythonista
Member Author

Thanks for looking at it @anich003. There shouldn't be many lists with wrong indentation, as I reviewed them manually recently. So it's normal that you found just a few.

I think it's not a simple problem to detect them automatically, so it makes sense that the piped greps are not good enough. I think we'll have to parse the files ourselves (probably with a Python script). Or reuse sphinx parsing and detect when they are not being rendered correctly (for indented lists sphinx creates a block quote around the list).

@FHaase
Contributor

FHaase commented Dec 6, 2018

I think sublists require 2 instead of 4 spaces to indent:
From Sphinx Documentation

* this is
* a list

  * with a nested list
  * and some subitems

* and here the parent list continues

From A ReStructuredText Primer

* a bullet point using "*"

  - a sub-list using "-"

    + yet another sub-list

  - another item

with-blockquote

<ul>
<li><p class="first">if the dtype is unsupported (e.g. <code class="docutils literal notranslate"><span class="pre">np.complex</span></code>) then the <code class="docutils literal notranslate"><span class="pre">default_handler</span></code>, if provided, will be called
for each value, otherwise an exception is raised.</p>
</li>
<li><p class="first">if an object is unsupported it will attempt the following:</p>
<blockquote>
<div><ul class="simple">
<li>check if the object has defined a <code class="docutils literal notranslate"><span class="pre">toDict</span></code> method and call it.
A <code class="docutils literal notranslate"><span class="pre">toDict</span></code> method should return a <code class="docutils literal notranslate"><span class="pre">dict</span></code> which will then be JSON serialized.</li>
<li>invoke the <code class="docutils literal notranslate"><span class="pre">default_handler</span></code> if one was provided.</li>
<li>convert the object to a <code class="docutils literal notranslate"><span class="pre">dict</span></code> by traversing its contents. However this will often fail
with an <code class="docutils literal notranslate"><span class="pre">OverflowError</span></code> or give unexpected results.</li>
</ul>
</div></blockquote>
</li>
</ul>

vs

without-blockquote

<ul class="simple">
<li>if the dtype is unsupported (e.g. <code class="docutils literal notranslate"><span class="pre">np.complex</span></code>) then the <code class="docutils literal notranslate"><span class="pre">default_handler</span></code>, if provided, will be called
for each value, otherwise an exception is raised.</li>
<li>if an object is unsupported it will attempt the following:<ul>
<li>check if the object has defined a <code class="docutils literal notranslate"><span class="pre">toDict</span></code> method and call it.
A <code class="docutils literal notranslate"><span class="pre">toDict</span></code> method should return a <code class="docutils literal notranslate"><span class="pre">dict</span></code> which will then be JSON serialized.</li>
<li>invoke the <code class="docutils literal notranslate"><span class="pre">default_handler</span></code> if one was provided.</li>
<li>convert the object to a <code class="docutils literal notranslate"><span class="pre">dict</span></code> by traversing its contents. However this will often fail
with an <code class="docutils literal notranslate"><span class="pre">OverflowError</span></code> or give unexpected results.</li>
</ul>
</li>
</ul>
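
The difference between the two outputs above suggests a check on the rendered HTML: flag any list whose nearest enclosing element (ignoring `div` wrappers) is a `blockquote`. Below is a minimal sketch using only Python's standard `html.parser`; the class and function names are hypothetical, and a block quote that legitimately contains a list would be reported too, so the results are candidates to review rather than confirmed errors.

```python
from html.parser import HTMLParser

# Elements without a closing tag; they are never pushed on the stack.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class BlockquotedListFinder(HTMLParser):
    """Collect the positions of <ul>/<ol> elements wrapped in a <blockquote>."""

    def __init__(self):
        super().__init__()
        self.stack = []      # tags that are currently open
        self.positions = []  # (line, column) of each flagged list

    def handle_starttag(self, tag, attrs):
        if tag in ("ul", "ol"):
            parents = [t for t in self.stack if t != "div"]
            if parents and parents[-1] == "blockquote":
                self.positions.append(self.getpos())
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def find_blockquoted_lists(html):
    """Return (line, column) positions of lists rendered inside a block quote."""
    finder = BlockquotedListFinder()
    finder.feed(html)
    finder.close()
    return finder.positions
```

Run over the built documentation pages, a non-empty result from `find_blockquoted_lists` would point at lists worth checking in the corresponding .rst source.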

@simonjayhawkins simonjayhawkins added this to the Contributions Welcome milestone Dec 11, 2019
@mroeschke mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022