Shape Mismatch error for new data set #32

Open
rudra0713 opened this issue Oct 21, 2019 · 6 comments


@rudra0713

Hey, I have been trying to use a sentiment analysis dataset together with the imdb class (mentioned in the notebook) in a multitask setup.

This is the sample format of the sentiment data:
train_data = [['I', 'am', 'going', 'to', 'school', '.'], ['I', 'am', 'not', 'feeling', 'good', '.']]
train_labels = [0, 1]
test_data = [['I', 'wass', 'so', 'sick', 'yesterday', '.']]
test_labels = [1]

Unfortunately, this fails with the following error:

ValueError: generator yielded an element of shape (48,) where an element of shape () was expected.

Can you kindly help me solve this issue?

@JayYip
Owner

JayYip commented Oct 22, 2019

It seems the data and labels are getting mixed up. Did you use exactly the same preprocessing function as in the notebook?
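
The error text describes a shape check on generator output: an element of shape (48,) arrived where a scalar of shape () was declared. As a rough illustration only, here is a plain-Python sketch of that kind of check. `shape_of` and `check_element` are hypothetical helpers written for this example, not functions from TensorFlow or from this repository.

```python
def shape_of(value):
    """Return the shape of a scalar or a regularly nested list as a tuple."""
    if isinstance(value, (list, tuple)):
        # Shape of a sequence: its length, followed by the shape of its first item.
        return (len(value),) + (shape_of(value[0]) if value else ())
    return ()  # anything that is not a list/tuple is treated as a scalar


def check_element(value, expected_shape):
    """Raise ValueError if `value` does not have `expected_shape`."""
    actual = shape_of(value)
    if actual != expected_shape:
        raise ValueError(
            f"generator yielded an element of shape {actual} "
            f"where an element of shape {expected_shape} was expected."
        )


# A single label id passes a scalar check.
check_element(1, ())

# A length-48 sequence placed in a scalar slot fails the same check.
try:
    check_element([0] * 48, ())
except ValueError as err:
    message = str(err)
    print(message)
```

If a 48-element sequence shows up where a single label id is expected, one possible cause is that the input and label slots are swapped somewhere upstream, which is why the preprocessing function is the first thing to compare against the notebook.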

@rudra0713
Author

Thanks for your response. This is my preprocessing function:
`@preprocessing_fn
def sentiment_cls(params, mode):
    # train_data = pickle.load(open("data/sentiment_train_data.p", "rb"))
    # train_labels = pickle.load(open("data/sentiment_train_label.p", "rb"))
    # test_data = pickle.load(open("data/sentiment_test_data.p", "rb"))
    # test_labels = pickle.load(open("data/sentiment_test_label.p", "rb"))

    train_data = [['I', 'am', 'going', 'to', 'school', '.'], ['I', 'am', 'going', 'to', 'college', '.']]
    train_labels = [0, 1]
    test_data = [['I', 'am', 'going', 'to', 'university', '.']]
    test_labels = [0]

    label_encoder = get_or_make_label_encoder(params, 'sentiment_cls', mode, train_labels + test_labels)

    if mode == TRAIN:
        input_list = train_data
        target_list = train_labels
    else:
        input_list = test_data
        target_list = test_labels
    return input_list, target_list
`
The four commented-out lines at the top load the actual dataset. Since that was not working, I tried toy examples, which do not work either.

This is the new problem dictionary:
new_problem_type = {'imdb_cls': 'cls', 'sentiment_cls': 'cls'}
new_problem_process_fn_dict = {'imdb_cls': imdb_cls, 'sentiment_cls': sentiment_cls}

Please let me know if I am missing something very simple.
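
One check that is independent of the library is to confirm that the inputs and labels line up one-to-one and that every label is a single value rather than a sequence. A minimal sketch, assuming the toy data from the function above; `check_pairs` is a hypothetical helper and not part of the library:

```python
def check_pairs(input_list, target_list):
    """Sanity-check the (inputs, labels) pair for a single-label classification task."""
    if len(input_list) != len(target_list):
        raise ValueError(
            f"{len(input_list)} inputs but {len(target_list)} labels"
        )
    for i, label in enumerate(target_list):
        # A classification label should be one value, not a list of values.
        if isinstance(label, (list, tuple)):
            raise ValueError(f"label {i} is a sequence: {label!r}")
    return True


train_data = [['I', 'am', 'going', 'to', 'school', '.'],
              ['I', 'am', 'going', 'to', 'college', '.']]
train_labels = [0, 1]

print(check_pairs(train_data, train_labels))
```

Calling it with the arguments swapped, `check_pairs(train_labels, train_data)`, raises a `ValueError` because each "label" is then a token list.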

@JayYip
Owner

JayYip commented Oct 23, 2019

Could you please try changing the input data to ['I am going to school .', 'I am going to college .']?
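
A minimal sketch of that conversion, assuming the tokenized toy inputs from your function; `join_tokens` is only an illustrative helper name:

```python
def join_tokens(tokenized_sentences):
    """Turn a list of token lists into a list of space-separated strings."""
    return [" ".join(tokens) for tokens in tokenized_sentences]


train_data = [['I', 'am', 'going', 'to', 'school', '.'],
              ['I', 'am', 'going', 'to', 'college', '.']]

print(join_tokens(train_data))
# ['I am going to school .', 'I am going to college .']
```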

@rudra0713
Author

I tried that, but the error does not change.

@JayYip
Owner

JayYip commented Oct 25, 2019 via email

@rudra0713
Author

Thanks. Please let me know if you find anything.
