valid parameters check while reading data #22189

Closed
shantanuo opened this issue Aug 3, 2018 · 8 comments

@shantanuo

pandas seems to silently ignore extra (invalid) keyword arguments. For example:

Code Sample, a copy-pastable example if possible

import pandas as pd
df=pd.read_excel('myfile.xlsx', some_dummy_param=True)

Note that some_dummy_param does not throw an error.

Problem description

Is there any way to make sure that only valid parameters are passed to the read_excel method?

Expected Output

Since there is no such parameter called "some_dummy_param", I should get an error:

TypeError: __init__() got an unexpected keyword argument 'some_dummy_param'
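
To illustrate the behavior being asked for, here is a minimal sketch built on two hypothetical stand-in functions (`strict_reader` and `lenient_reader` are not pandas APIs). A function with an explicit signature rejects unknown keywords on its own, while one that accepts `**kwds` swallows them unless something inspects the dict.

```python
def strict_reader(io, sheet_name=0):
    """Hypothetical reader with an explicit signature (not a pandas API)."""
    return (io, sheet_name)


def lenient_reader(io, sheet_name=0, **kwds):
    """Hypothetical reader that collects extras in **kwds (not a pandas API)."""
    return (io, sheet_name, kwds)


def call_and_report(func, **kwargs):
    """Call func and return 'ok', or the TypeError message if one is raised."""
    try:
        func("myfile.xlsx", **kwargs)
        return "ok"
    except TypeError as exc:
        return str(exc)


# The explicit signature raises; the **kwds version accepts the typo silently.
strict_result = call_and_report(strict_reader, some_dummy_param=True)
lenient_result = call_and_report(lenient_reader, some_dummy_param=True)
print(strict_result)
print(lenient_result)
```

The second call is the situation described in this report: the unknown keyword ends up in `kwds` and nothing complains.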

@TomAugspurger
Contributor

I thought we had another issue for this (#17994 was about sheet_name vs. sheetname).

@TomAugspurger TomAugspurger added the IO Excel read_excel, to_excel label Aug 3, 2018
@TomAugspurger TomAugspurger added this to the Contributions Welcome milestone Aug 3, 2018
@TomAugspurger
Contributor

The most difficult part is that kwds is passed through to the underlying engine. We would want to reconcile the common keyword arguments, and then perhaps provide an engine_kwargs parameter that's a dict of engine-specific kwargs.
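
A rough sketch of that idea, under stated assumptions: `read_excel_sketch` and `fake_engine` are hypothetical names used only for illustration, the three common keywords are an arbitrary subset, and `on_demand` is just an example of an engine-specific option. This is not how pandas implements it.

```python
def fake_engine(io, **engine_kwargs):
    """Hypothetical engine standing in for a real Excel backend."""
    return {"io": io, "engine_options": engine_kwargs}


def read_excel_sketch(io, sheet_name=0, header=0, usecols=None,
                      engine_kwargs=None):
    """Hypothetical reader: common keywords are explicit, extras go in a dict."""
    engine_kwargs = dict(engine_kwargs or {})
    # Common keywords must not be smuggled in through engine_kwargs.
    clash = {"sheet_name", "header", "usecols"} & set(engine_kwargs)
    if clash:
        raise TypeError(
            "pass {} directly, not via engine_kwargs".format(sorted(clash))
        )
    book = fake_engine(io, **engine_kwargs)
    return {"book": book, "sheet_name": sheet_name,
            "header": header, "usecols": usecols}


# Engine-specific options are passed explicitly as a dict ...
result = read_excel_sketch("myfile.xlsx", engine_kwargs={"on_demand": True})

# ... and a misspelled top-level keyword now fails, because there is no **kwds.
try:
    read_excel_sketch("myfile.xlsx", some_dummy_param=True)
    typo_error = None
except TypeError as exc:
    typo_error = str(exc)
```

With no `**kwds` in the signature, Python itself rejects unknown keywords, and anything meant for the engine has to be spelled out in `engine_kwargs`.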

@shantanuo
Author

That thread ends with your note "the goal is to remove the **kwargs from the signature", and yet the issue is closed! Maybe a decorator should be considered.

@TomAugspurger
Contributor

TomAugspurger commented Aug 3, 2018 via email

@shantanuo
Author

I just saw a decorator called "deprecate_kwarg" that deprecates a keyword argument of a function. I thought a similar concept could be used for a "validate_kwarg" decorator. Ignore this comment if that is not the case.

@shantanuo
Author

shantanuo commented Aug 3, 2018

Maybe this function in pandas/pandas/util/_decorators.py:

from functools import wraps

def validate_kwarg():
    # Keywords that read_excel is expected to accept.
    expected_keys = {
        'io', 'sheet_name', 'header', 'names', 'index_col', 'usecols',
        'squeeze', 'dtype', 'engine', 'converters', 'true_values',
        'false_values', 'skiprows', 'nrows', 'na_values', 'verbose',
        'parse_dates', 'date_parser', 'thousands', 'comment', 'skipfooter',
        'convert_float',
    }

    def _validate_kwarg(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Reject any keyword that is not in the expected set.
            unexpected = sorted(set(kwargs) - expected_keys)
            if unexpected:
                # TypeError matches what Python raises for unknown keywords.
                raise TypeError(
                    '{}() got an unexpected keyword argument {!r}'
                    .format(func.__name__, unexpected[0]))
            return func(*args, **kwargs)
        return wrapper
    return _validate_kwarg

and a decorator added in the file pandas/io/excel.py:

@Appender(_read_excel_doc)
@deprecate_kwarg("parse_cols", "usecols")
@deprecate_kwarg("skip_footer", "skipfooter")
@validate_kwarg()  # innermost, so deprecated names are translated before validation
def read_excel(io,
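
As a variation on the proposal above, the list of valid names could be derived from the function signature instead of being hard-coded. The following is only a sketch: `validate_kwargs_from_signature` and `read_table_sketch` are hypothetical names, not part of pandas.

```python
import inspect
from functools import wraps


def validate_kwargs_from_signature(func):
    """Reject keywords that are not named parameters of func (hypothetical helper)."""
    named_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    # Collect only explicitly named parameters; **kwds itself is excluded.
    allowed = {
        name
        for name, param in inspect.signature(func).parameters.items()
        if param.kind in named_kinds
    }

    @wraps(func)
    def wrapper(*args, **kwargs):
        unexpected = sorted(set(kwargs) - allowed)
        if unexpected:
            raise TypeError(
                "{}() got an unexpected keyword argument {!r}".format(
                    func.__name__, unexpected[0]
                )
            )
        return func(*args, **kwargs)

    return wrapper


@validate_kwargs_from_signature
def read_table_sketch(io, sheet_name=0, header=0, **kwds):
    """Hypothetical reader that still has **kwds in its signature."""
    return {"io": io, "sheet_name": sheet_name, "header": header, "extra": kwds}


ok = read_table_sketch("myfile.xlsx", sheet_name="Sheet1")

try:
    read_table_sketch("myfile.xlsx", some_dummy_param=True)
    message = None
except TypeError as exc:
    message = str(exc)
```

The trade-off is the same as with the hard-coded list: any engine-specific option that used to travel through `**kwds` is rejected too, so it would need a separate, explicit channel.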

@TomAugspurger
Contributor

TomAugspurger commented Aug 3, 2018 via email

@mroeschke
Member

engine_kwargs is getting added in 1.3 to read_excel which I think fits the spirit of this issue.

https://pandas.pydata.org/pandas-docs/dev/whatsnew/v1.3.0.html#other-api-changes

Closing, but happy to reopen if there is more to this issue
