Add pythainlp.wsd for Thai Word Sense Disambiguation #818

Merged: wannaphong merged 8 commits into dev from add-wsd on Jul 14, 2023

@wannaphong (Member) commented Jul 12, 2023

pythainlp.wsd is the module for Thai Word Sense Disambiguation (WSD).

  • pythainlp.wsd.get_sense - Get the senses of a word as used in a sentence. The function returns each candidate definition together with its distance from the sentence context.
  • pythainlp.corpus.thai_wsd_dict - Get the Thai Word Sense Disambiguation dictionary, with definitions from Wiktionary.
  • pythainlp.corpus.thai_dict - Get the Thai dictionary, with definitions from Wiktionary.
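
The description above does not say how the distance is computed. Purely as an illustration of the general idea (scoring each definition by its distance from the sentence context), here is a minimal sketch over hand-made toy vectors. The vectors, the helper names `cosine_distance` and `rank_senses`, and the choice of cosine distance are all assumptions made for this example; none of this is taken from the pythainlp implementation.

```python
import math


def cosine_distance(u, v):
    """Cosine distance (1 - cosine similarity) between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def rank_senses(context_vec, sense_vecs):
    """Return (definition, distance) pairs, nearest definition first.

    Hypothetical helper for illustration; not part of pythainlp.
    """
    pairs = [
        (definition, cosine_distance(context_vec, vec))
        for definition, vec in sense_vecs.items()
    ]
    return sorted(pairs, key=lambda pair: pair[1])


# Toy, hand-made vectors (assumed): a real system would obtain them
# from some sentence-embedding model.
context = [0.9, 0.1, 0.2]  # stands in for a sentence about baking
senses = {
    "cookie (baked snack)": [1.0, 0.0, 0.1],
    "cookie (web browser data)": [0.1, 1.0, 0.3],
}

ranked = rank_senses(context, senses)
print(ranked[0][0])
```

In this sketch a smaller distance means the definition is closer to the context, so the first element of the ranked list is the preferred sense.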

Example

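from pythainlp.wsd import get_sense

# Each result is a (definition, distance) pair. The two definitions of "คุกกี้" (cookie) are:
#   1. a computer program used on the internet to store user data
#   2. a kind of cake-like snack, made in small flat pieces and baked until crisp

# Sentence: "He is baking cookies."
print(get_sense("เขากำลังอบขนมคุกกี้", "คุกกี้"))
# output: [('โปรแกรมคอมพิวเตอร์ใช้ในทางอินเทอร์เน็ตสำหรับเก็บข้อมูลของผู้ใช้งาน', 0.0974416732788086), ('ชื่อขนมชนิดหนึ่งจำพวกขนมเค้ก แต่ทำเป็นชิ้นเล็ก ๆ แบน ๆ แล้วอบให้กรอบ', 0.09319090843200684)]

# Sentence: "This website requires cookies to work."
print(get_sense("เว็บนี้ต้องการคุกกี้ในการทำงาน", "คุกกี้"))
# output: [('โปรแกรมคอมพิวเตอร์ใช้ในทางอินเทอร์เน็ตสำหรับเก็บข้อมูลของผู้ใช้งาน', 0.1005704402923584), ('ชื่อขนมชนิดหนึ่งจำพวกขนมเค้ก แต่ทำเป็นชิ้นเล็ก ๆ แบน ๆ แล้วอบให้กรอบ', 0.12473666667938232)]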
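
In both outputs the list keeps the same order, so the caller still has to pick a sense. A minimal sketch of that step follows, assuming that a smaller distance means a closer match (which the two outputs suggest: the snack sense is nearer for the baking sentence, the computing sense for the website sentence). `closest_sense` is a hypothetical helper, not part of pythainlp, and the data is copied from the outputs above so the snippet runs without the library installed.

```python
def closest_sense(senses):
    """Return the (definition, distance) pair with the smallest distance.

    Hypothetical helper; returns None when the list is empty.
    """
    if not senses:
        return None
    return min(senses, key=lambda pair: pair[1])


# Output of the first call above (sentence about baking cookies).
baking = [
    ("โปรแกรมคอมพิวเตอร์ใช้ในทางอินเทอร์เน็ตสำหรับเก็บข้อมูลของผู้ใช้งาน", 0.0974416732788086),
    ("ชื่อขนมชนิดหนึ่งจำพวกขนมเค้ก แต่ทำเป็นชิ้นเล็ก ๆ แบน ๆ แล้วอบให้กรอบ", 0.09319090843200684),
]

# Output of the second call above (sentence about a website).
website = [
    ("โปรแกรมคอมพิวเตอร์ใช้ในทางอินเทอร์เน็ตสำหรับเก็บข้อมูลของผู้ใช้งาน", 0.1005704402923584),
    ("ชื่อขนมชนิดหนึ่งจำพวกขนมเค้ก แต่ทำเป็นชิ้นเล็ก ๆ แบน ๆ แล้วอบให้กรอบ", 0.12473666667938232),
]

print(closest_sense(baking)[0])   # the snack definition
print(closest_sense(website)[0])  # the computing definition
```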

Google Colab notebook for testing: https://colab.research.google.com/drive/1YvTSrJ4lghFMxB5XuZMHKP-TNjTw0u5w?usp=sharing

@wannaphong wannaphong added the enhancement enhance functionalities label Jul 12, 2023
@wannaphong wannaphong added this to the 4.1 milestone Jul 12, 2023
@wannaphong wannaphong changed the title Add pythainlp.wsd Add pythainlp.wsd for Thai Word Sense Disambiguation Jul 12, 2023
@wannaphong wannaphong marked this pull request as ready for review July 12, 2023 08:35
@coveralls commented Jul 12, 2023

coverage: 83.522% (-8.1%) from 91.616% when pulling f38c51a on add-wsd into 2e7dc23 on dev.

@sonarcloud bot commented Jul 13, 2023

SonarCloud Quality Gate passed.

  • Bugs: 0 (rating A)
  • Vulnerabilities: 0 (rating A)
  • Security Hotspots: 0 (rating A)
  • Code Smells: 0 (rating A)
  • Coverage: no coverage information
  • Duplication: 0.0%

@wannaphong wannaphong merged commit bf74de6 into dev Jul 14, 2023
12 of 19 checks passed
This was referenced Jun 21, 2023
@wannaphong wannaphong deleted the add-wsd branch December 4, 2023 13:01