
TypeError: Can't instantiate abstract class MultiLabelPipeline with abstract methods _forward, _sanitize_parameters, postprocess, preprocess #7

dujiaxin opened this issue Dec 3, 2021 · 5 comments


@dujiaxin

dujiaxin commented Dec 3, 2021

Thanks for this great work.
I am using transformers v4.
I know the README requests transformers v2, but I can no longer install v2.11.0 because of dependency errors in that version, and v2.4.1 raises other errors.

In transformers v4, it raises:

```
Traceback (most recent call last):
  goemotions = MultiLabelPipeline(
TypeError: Can't instantiate abstract class MultiLabelPipeline with abstract methods _forward, _sanitize_parameters, postprocess, preprocess
```

Do you think you could update your code so it works with the latest Hugging Face transformers (v4)?

@santimarro

santimarro commented Jan 13, 2022

Hey, I had the same issue but I managed to make it work with this:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from pprint import pprint
from multilabel_pipeline import MultiLabelPipeline

tokenizer = AutoTokenizer.from_pretrained(
    "monologg/bert-base-cased-goemotions-original"
)
model = AutoModelForSequenceClassification.from_pretrained(
    "monologg/bert-base-cased-goemotions-original"
)

def tokenize_text(text):
    # Replace "text" with whatever column name has your text inputs
    return tokenizer(text, truncation=True)

texts = [
    "Hey that's a thought! Maybe we need [NAME] to be the celebrity vaccine endorsement!",
    "it’s happened before?! love my hometown of beautiful new ken 😂😂",
    "I love you, brother.",
    "Troll, bro. They know they're saying stupid shit. The motherfucker does nothing but stink up libertarian subs talking shit",
]

goemotions = MultiLabelPipeline(
    model=model,
    tokenizer=tokenizer,
    threshold=0.3,
)
pprint(goemotions(texts))
```

Just make sure to import MultiLabelPipeline from the multilabel_pipeline.py file provided in this repo!

@1shershah

> Hey, I had the same issue but I managed to make it work with this: […]
>
> Just make sure to import MultiLabelPipeline from the multilabel_pipeline.py file provided in this repo!

This will still not work: if you create a custom pipeline by subclassing the abstract Pipeline class, you have to override the abstract methods _forward, _sanitize_parameters, postprocess, and preprocess.
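
For reference, here is a minimal sketch of what such a subclass could look like against the transformers v4 `Pipeline` API, assuming a model whose output exposes `logits` (e.g. `AutoModelForSequenceClassification`). The `threshold` handling and sigmoid postprocessing are illustrative assumptions modeled on the repo's examples, not its actual code:

```python
import torch
from transformers import Pipeline

class MultiLabelPipeline(Pipeline):
    """Illustrative v4-compatible multi-label pipeline (a sketch, not the repo's code)."""

    def __init__(self, *args, threshold=0.3, **kwargs):
        self.threshold = threshold  # assumed cutoff, mirroring the repo's examples
        super().__init__(*args, **kwargs)

    def _sanitize_parameters(self, **kwargs):
        # Split call-time kwargs into preprocess/forward/postprocess params; none used here.
        return {}, {}, {}

    def preprocess(self, inputs):
        # One text in, one batch of input tensors out.
        return self.tokenizer(inputs, return_tensors="pt", truncation=True)

    def _forward(self, model_inputs):
        return self.model(**model_inputs)

    def postprocess(self, model_outputs):
        # Multi-label: an independent sigmoid per class instead of a softmax.
        scores = torch.sigmoid(model_outputs["logits"])[0]
        kept = [(self.model.config.id2label[i], score.item())
                for i, score in enumerate(scores) if score > self.threshold]
        return {"labels": [label for label, _ in kept],
                "scores": [score for _, score in kept]}
```

With those four methods overridden, the `MultiLabelPipeline(model=model, tokenizer=tokenizer, threshold=0.3)` call from the earlier comment instantiates without the `TypeError`.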

@1shershah

1shershah commented Jan 30, 2022

Solution:

```python
import torch
from transformers import BertTokenizer
# model.py from this repo
from model import BertForMultiLabelClassification

tokenizer = BertTokenizer.from_pretrained("monologg/bert-base-cased-goemotions-ekman")
model = BertForMultiLabelClassification.from_pretrained("monologg/bert-base-cased-goemotions-ekman")

texts = [
    "Hey that's a thought! Maybe we need [NAME] to be the celebrity vaccine endorsement!",
    "it’s happened before?! love my hometown of beautiful new ken 😂😂",
    "I love you, brother.",
    "Troll, bro. They know they're saying stupid shit. The motherfucker does nothing but stink up libertarian subs talking shit",
]

threshold = 0.3
results = []
for txt in texts:
    inputs = tokenizer(txt, return_tensors="pt")
    outputs = model(**inputs)
    scores = 1 / (1 + torch.exp(-outputs[0]))  # sigmoid over the logits
    for item in scores:
        labels = []
        item_scores = []  # renamed so it no longer shadows the scores tensor above
        for idx, s in enumerate(item):
            if s > threshold:
                labels.append(model.config.id2label[idx])
                item_scores.append(s)
        results.append({"labels": labels, "scores": item_scores})
```
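
A side note on the scores: the manual formula is equivalent to `torch.sigmoid`, and since this is inference only, running the forward pass under `torch.no_grad()` and unwrapping with `.tolist()` yields plain floats instead of tensors carrying a `grad_fn`. A minimal variant of the loop body:

```python
with torch.no_grad():               # inference only, no autograd graph
    outputs = model(**inputs)
scores = torch.sigmoid(outputs[0])  # same as 1 / (1 + torch.exp(-outputs[0]))
plain_scores = scores[0].tolist()   # plain Python floats, no grad_fn
```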

@bubbazz

bubbazz commented Apr 2, 2022

The implementation above is nearly identical to the pipeline.py, but I get different results. Can somebody explain the reason for this?

results:

```
[{'labels': ['joy', 'neutral'],
  'scores': [tensor(0.3892, grad_fn=<UnbindBackward0>),
   tensor(0.5499, grad_fn=<UnbindBackward0>)]},
 {'labels': ['joy', 'surprise'],
  'scores': [tensor(0.9277, grad_fn=<UnbindBackward0>),
   tensor(0.4548, grad_fn=<UnbindBackward0>)]},
 {'labels': ['joy'], 'scores': [tensor(0.9889, grad_fn=<UnbindBackward0>)]},
 {'labels': ['anger'], 'scores': [tensor(0.7580, grad_fn=<UnbindBackward0>)]}]
```

@hannahburkhardt

@bubbazz if you mean that you aren't getting outputs for all labels, but only the main labels, try this.

```python
from transformers import BertTokenizer, AutoModelForSequenceClassification, pipeline

model_name = 'original'  # or 'ekman'

tokenizer = BertTokenizer.from_pretrained(f"monologg/bert-base-cased-goemotions-{model_name}")
model = AutoModelForSequenceClassification.from_pretrained(f"monologg/bert-base-cased-goemotions-{model_name}", num_labels=28)

goemotions = pipeline(
    model=model,
    tokenizer=tokenizer,
    task="text-classification",
    return_all_scores=True,
    function_to_apply='sigmoid',
)

goemotions(texts)  # texts as defined in the earlier comments
```
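
With `return_all_scores=True`, each input yields the full list of `{'label': ..., 'score': ...}` dicts (all 28 labels for the original taxonomy), so thresholding is left to the caller. A minimal post-filter, assuming that standard text-classification output shape:

```python
# Keep only labels whose sigmoid score clears the 0.3 threshold
# used in the earlier comments.
threshold = 0.3
filtered = [
    [pred for pred in preds if pred["score"] > threshold]
    for preds in goemotions(texts)
]
```

Note that newer transformers releases deprecate `return_all_scores` in favor of `top_k=None`.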
