fix parsing of trecxml topics #414

Merged
merged 6 commits into from
Nov 9, 2023

lukaszett
Contributor

The current implementation of _read_topics_trecxml expects the topic number to be supplied as an attribute of the topic tag. However, this does not seem to be how TREC currently formats topics in XML. See, for example, the health misinfo topics, where the number is a child tag instead:

<topic>
<number>101</number>
<query>ankle brace achilles tendonitis</query>
<description>
Will wearing an ankle brace help heal achilles tendonitis?
</description>
<narrative>
Achilles tendonitis is a condition where one experiences pain in the Achilles tendon located near the heel. An ankle brace is usually worn around the ankles to protect and limit movement. A very useful document would discuss the effectiveness of using ankle braces to help heal Achilles tendonitis. A useful document would help a user make a decision about the use of ankle braces for treating tendonitis by providing information about recommended treatments for Achilles tendonitis, ankle braces, or both.
</narrative>
<disclaimer>
We do not claim to be providing medical advice, and medical decisions should never be made based on the stance we have chosen. Consult a medical doctor for professional advice.
</disclaimer>
<stance>unhelpful</stance>
<evidence>
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3134723/
</evidence>
</topic>

My change should not break backwards compatibility. I also added a test to confirm this (see topic 3 still using an attribute to set the number).
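
For readers who want to see the fallback idea in isolation, here is a minimal sketch, not the actual PyTerrier code. It assumes the older format stores the id in an attribute named `number` (the attribute name is not stated in this thread), and the helper names `topic_number` and `read_topics` are hypothetical. It uses only the standard library's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Two illustrative topics: one in the newer tag style, one in the older
# attribute style. The attribute name "number" is an assumption.
SAMPLE = """<topics>
  <topic>
    <number>101</number>
    <query>ankle brace achilles tendonitis</query>
  </topic>
  <!-- Use attribute rather than tag -->
  <topic number="3">
    <query>older attribute style</query>
  </topic>
</topics>"""


def topic_number(topic):
    """Return the topic id from a `number` attribute or, failing that, a <number> child tag."""
    number = topic.get("number")  # older style: <topic number="3">
    if number is None:
        number = topic.findtext("number")  # newer style: <number>101</number>
    if number is None:
        raise ValueError("topic has neither a number attribute nor a <number> tag")
    return number.strip()


def read_topics(xml_text):
    """Parse XML text into a list of (topic id, query text) pairs."""
    root = ET.fromstring(xml_text)
    return [
        (topic_number(topic), (topic.findtext("query") or "").strip())
        for topic in root.iter("topic")
    ]


if __name__ == "__main__":
    for qid, query in read_topics(SAMPLE):
        print(qid, query)
```

Checking the attribute first keeps existing files working unchanged, and the child tag is consulted only when the attribute is absent.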

@cmacdonald
Contributor

Nice job. Could you add your name and affiliation to the README.md list please?

(see topic 3 still using an attribute to set the number).

Could you add an XML comment to the test file that says this? <!-- Use attribute rather than tag --> or something similar.

@cmacdonald cmacdonald merged commit f7391ae into terrier-org:master Nov 9, 2023
13 checks passed
@cmacdonald
Contributor

super, thanks @lukaszett

2 participants