Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REF: DataFrame.to_latex directs to Styler.to_latex #41648

Closed
wants to merge 121 commits into from

Conversation

attack68
Copy link
Contributor

@attack68 attack68 commented May 24, 2021

Closes #41649.
A lot of the audit trail for the features to get this to work are documented in the issue.

For Reviewers

These are the key changes to note in DataFrame.to_latex tests

  • The expected output remains as close as possible to the original, altering the arg input in to_latex to demonstrate the new combinations needed.
  • No tests have been removed, and all pass with often minor alterations to the test formatting .
  • hrules=True is added to all tests to get \toprule \midrule \bottomrule. This is a new option and the default in Styler is not to have these.
  • \cline (for multirow) has gone. This feature is not implemented in Styler.
  • col_space is gone, meaning there is not a nice console alignment of text. Of course the LaTeX rendering is unaffected by this, and when styling LaTeX commands are added this needs to be removed anyway.
  • There are some nice new features added here, but all currently exists from Styler anyway, and that is where testing resides.

Follow-Ups

  • Make jinja2 a hard dependency (CI: add jinja2 as a hard dependency for DataFrame.to_latex/to_html #43423) (this PR actually contains the jinja2 PR to help with tests passing)
  • Deprecate all pandas.options.display.latex documenting the replacement styler.latex options instead.
  • Remove all redundant code for python LatexFormatter.
  • Reimplement cline? Im not a huge fan of this and I think it (maybe?) quite complicated to implement

Revised Docs

Screen Shot 2021-09-06 at 23 50 02
Screen Shot 2021-09-06 at 23 50 20
Screen Shot 2021-09-06 at 23 50 32
Screen Shot 2021-09-06 at 23 50 44

@attack68 attack68 marked this pull request as draft May 24, 2021 20:48
@attack68 attack68 added Deprecate Functionality to remove in pandas IO LaTeX to_latex labels May 24, 2021
@attack68 attack68 added the Styler conditional formatting using DataFrame.style label Jun 29, 2021
@github-actions
Copy link
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Aug 18, 2021
@attack68 attack68 changed the title WIP: DEPR: DataFrame.to_latex() in favour of Styler.to_latex() WIP: REF: DataFrame.to_latex() directs to Styler.to_latex() Aug 28, 2021
@attack68 attack68 marked this pull request as ready for review September 17, 2021 05:11
@attack68 attack68 changed the title WIP: REF: DataFrame.to_latex directs to Styler.to_latex REF: DataFrame.to_latex directs to Styler.to_latex Sep 17, 2021
@attack68
Copy link
Contributor Author

This is now at a stage where it can be reviewed.

@lithomas1
Copy link
Member

If this requires jinja2 to work now as opposed to previously, would we need a deprecation cycle first? (Argument names also seem to be changed, but I'm not too familiar with this section of the code base, so idk whether the old args would still work)

This also probably needs a whatsnew.

@attack68
Copy link
Contributor Author

If this requires jinja2 to work now as opposed to previously, would we need a deprecation cycle first? (Argument names also seem to be changed, but I'm not too familiar with this section of the code base, so idk whether the old args would still work)

This also probably needs a whatsnew.

What would the deprecation cycle look like, since this is a kind of an all or none change therefore:

  • a) we do nothing except warn users that the implementation will change? This doesn't seem to offer any benefit since the new implementation can't be tested by users?
  • b) we warn the implementation will change and offer a DataFrame.to_html_test method so that users can become accustomed to the new method before its officially added, still using the old.
  • c) we adopt the details contained in this PR whereby much of the implementation remains intact with warnings for the deprecations and what are the alternatives, but users will have to make changes.

I'm not completely convinced of the need for a deprecation cycle, other than whats included here, but interested in the opinions.

Currently the deprecated args just give warnings they are no longer used. It might be possible to squeeze some of them into the new format, however, e.g. old bold_rows=True triggers new bold_header="both" for example.

Copy link
Member

@ivanovmg ivanovmg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see my comments below.
Should there be a what's new section on the breaking changes?

Comment on lines +3413 to +3414
f"`{k}` is deprecated after transition to `Styler.to_latex()` "
f"signature." + ("" if v is None else f" Consider {v} instead."),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest that this message is a bit refactored by extracting msg and recommendation, so that it is slightly more readable.

msg = (
    f"`{k}` is deprecated "
    "after transition to `Styler.to_latex()` signature."
)
recommendation = f"Consider {v} instead" if v else ""
warnings.warn(
    ' '.join([msg, recommendation]),
    FutureWarning,
    stacklevel=2,
)

Comment on lines +3421 to +3423
if (
escape is None and get_option("styler.format.escape") == "latex"
) or escape is True:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it be reasonable to extract a variable here?

is_latex_escape = (
    (escape is None and get_option("styler.format.escape") == "latex")
    or escape is True
)
escape = "latex" if is_latex_escape else None

)
return DataFrameRenderer(formatter).to_latex(

for ax in [0, 1]: #
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything below this line contains quite complex logic.
I suppose that extracting a whole new class, like StylerLatexAdapter or something like this, may be reasonable here.
It will then be possible to encapsulate kwargs deprecation, handling headers, index, etc there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a gap in Styler currently since format_index is so new, that a Styler class can be instantiated with formatting variable for data objects, but not for index objects. Of course it opens up a query how best to do this since there are quite a lot of formatting args and if applied to data values and each index separately it introduces a potential host of new args.

With regards to the general complexity, yes I agree the logic has been to squeeze one method into the other and an adapter might be useful. Can we wait on feedback from others too?

@@ -61,15 +61,15 @@ def test_to_latex_to_file_utf8_without_encoding(self):

def test_to_latex_tabular_with_index(self):
df = DataFrame({"a": [1, 2], "b": ["b1", "b2"]})
result = df.to_latex()
result = df.to_latex(hrules=True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it OK, that in the previous implementations \toprule and the like were drawn by default, but now it is necessary to specify hrules=True explicitly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeh I wondered about that as well, but the Styler default is to exclude them, and I do wonder if a better default is not to add extraneous items from an additional latex package requirement. (disregarding the current pandas implementation()

Comment on lines -202 to +200
result = df.to_latex(longtable=True)
result = df.to_latex(environment="longtable")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like longtable=True will not be working now.
I mean you issue a warning if longtable is not None, but would it not be more reasonable to also ensure that if a user supplied longtable=True the longtable is actually created.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically I could try to work around the existing args feeding the new ones, but for this first review I wanted to get some feedback and the cleanest at the start was to simply raise a warning that it has been replaced.

"bold_rows",
],
)
def test_deprecation_warning(deprecated):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test for the recommendation to consider new kwarg?

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

main question is why are we caring about any new options at all here? personally i would make the existing options work (maybe with a deprecation warning if needed), and actually deprecate the entire function in place of df.styler.to_latex()

column_format=None,
longtable=None,
position=None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to add all of these options? e.g. since we can already do this with styler

self,
columns=columns,
col_space=col_space,
kwargs = locals()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't want to do any of this here. better in io/format/latext.py (or you can make a latex_deprecated or similar).

@attack68
Copy link
Contributor Author

main question is why are we caring about any new options at all here? personally i would make the existing options work (maybe with a deprecation warning if needed), and actually deprecate the entire function in place of df.styler.to_latex()

Yes OK I see your point. I will try a second proposal which attempts to reuse all the existing args and does not anything new args (since you are right: all available in styler), but adds a deprecation warning.

@lithomas1
Copy link
Member

lithomas1 commented Sep 17, 2021

main question is why are we caring about any new options at all here? personally i would make the existing options work (maybe with a deprecation warning if needed), and actually deprecate the entire function in place of df.styler.to_latex()

+1 on this. Would this mean #43423 would be pushed backed to 2.0, if the current functionality is just being deprecated and not removed?

@attack68
Copy link
Contributor Author

main question is why are we caring about any new options at all here? personally i would make the existing options work (maybe with a deprecation warning if needed), and actually deprecate the entire function in place of df.styler.to_latex()

+1 on this. Would this mean #43423 would be pushed backed to 2.0, if the current functionality is just being deprecated and not removed?

If we target 2.0 to use jinja2 for to_latex to_html and to_string, then perhaps the simplest thing to do for 1.4.0 is just add a FutureWarning and thats it? I thought this might be annoying for jupyter notebook users but Future warnings don't show up by default when rendering the cell, but do if the method is called directly (see below)

This would of course push the jinja2 dependency to 2.0 as well? And in 2.0 the major version change would probably allow me the backdrop to then overhaul these methods as per this PR?

Is this the preferred route - I'm happy to work more on this but prefer an agreed spec before engaging
Screen Shot 2021-09-18 at 13 22 23

@attack68
Copy link
Contributor Author

closing this temporarily in favour of #44411

@attack68 attack68 closed this Nov 16, 2021
@attack68 attack68 deleted the depr_df_to_latex branch March 6, 2022 07:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Deprecate Functionality to remove in pandas IO LaTeX to_latex Styler conditional formatting using DataFrame.style
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DEPR: LatexFormatter in DataFrame.to_latex in favour of Styler
4 participants