Describe the bug
Loading the TACRED dataset raises the following ValueError:
Traceback (most recent call last):
File "/glusterfs/dfs-gfs-dist/dobbersc/.local/bin/miniconda3/envs/flair/lib/python3.9/code.py", line 90, in runcode
exec(code, self.locals)
File "<input>", line 1, in <module>
File "/glusterfs/dfs-gfs-dist/dobbersc/PyCharmProjects/flair/flair/datasets/relation_extraction.py", line 260, in __init__
super(RE_ENGLISH_TACRED, self).__init__(
File "/glusterfs/dfs-gfs-dist/dobbersc/PyCharmProjects/flair/flair/datasets/sequence_labeling.py", line 403, in __init__
super(ColumnCorpus, self).__init__(
File "/glusterfs/dfs-gfs-dist/dobbersc/PyCharmProjects/flair/flair/datasets/sequence_labeling.py", line 296, in __init__
[
File "/glusterfs/dfs-gfs-dist/dobbersc/PyCharmProjects/flair/flair/datasets/sequence_labeling.py", line 297, in <listcomp>
ColumnDataset(
File "/glusterfs/dfs-gfs-dist/dobbersc/PyCharmProjects/flair/flair/datasets/sequence_labeling.py", line 501, in __init__
sentence = self._convert_lines_to_sentence(
File "/glusterfs/dfs-gfs-dist/dobbersc/PyCharmProjects/flair/flair/datasets/sequence_labeling.py", line 696, in _convert_lines_to_sentence
key, value = comment_row.split("=", 2)
ValueError: too many values to unpack (expected 2)
The .conllu files generated from the original TACRED corpus contain comments that the current comment parser does not support. Only a single example sentence is shown here, since TACRED is a private dataset:
# text = `` Market conditions became more challenging through November and December , '' said Sir Stuart Rose , the company 's chief executive .
# sentence_id = e7798fb926d91a16cd93
# relations = 15;16;22;22;per:title
1 `` O
2 Market O
3 conditions O
4 became O
5 more O
6 challenging O
7 through O
8 November B-DATE
9 and O
10 December B-DATE
11 , O
12 '' O
13 said O
14 Sir O
15 Stuart B-PERSON
16 Rose I-PERSON
17 , O
18 the O
19 company O
20 's O
21 chief O
22 executive B-TITLE
23 . O
Additional context
This error is caused by the changes in #3006 to how comments are handled in the ColumnDataset. The relevant code is at:
flair/flair/datasets/sequence_labeling.py, line 694 in 5a13598
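The unpacking failure can be reproduced outside of flair. The sketch below copies only the `split("=", 2)` call from the traceback; the function names and the sample comment are made up for illustration, and the sample is not taken from TACRED. It assumes the failure comes from a comment whose value itself contains `=`, since `maxsplit=2` lets `str.split` return up to three parts. Splitting on the first `=` only (here via `str.partition`) is one possible way to avoid the error, not necessarily the fix flair should adopt.

```python
def split_as_in_traceback(comment_row: str):
    # Same call as in the traceback: maxsplit=2 allows up to three parts,
    # so unpacking into two names fails when the row has two or more "=".
    key, value = comment_row.split("=", 2)
    return key.strip(), value.strip()


def split_on_first_equals(comment_row: str):
    # Hypothetical alternative: partition() splits on the first "=" only
    # and never raises, even if the row contains no "=" at all.
    key, _, value = comment_row.lstrip("#").partition("=")
    return key.strip(), value.strip()


# Made-up comment line whose value contains "=" (assumption, not TACRED data).
comment = "# text = a = b"

try:
    split_as_in_traceback(comment)
except ValueError as err:
    print(f"ValueError: {err}")

print(split_on_first_equals(comment))
print(split_on_first_equals("# relations = 15;16;22;22;per:title"))
```

Using `comment_row.split("=", 1)` would also limit the result to two parts, but it would still raise on a comment without any `=`, which is why the sketch uses `partition`.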