Skip to content

nikita-moor/etymological-dictionary

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Offline etymological dictionary

It is not a real application, but a DEMO I prepared for one (unaccomplished) employment.

Introduction

Simple Python application for demonstration of typical processing of RDF data:

  • data import
  • standard (RDFS) and custom vocabularies
  • RDF graph building
  • storing data into Berkeley DB (Sleepycat)
  • RDF graph querying (SPARQL)

Was created for demonstration of proof concept, but can be used for real inquiry of words' etymology:

> python ./dictionary.py --lang=eng scholar

Result:

scholar — English
 < scoler — Middle English (1100-1500)
  < scolere — Old English (ca. 450-1100)
   < scholaris — Latin
    < schola — Latin
     < σχολή — Ancient Greek (to 1453)
      < σχολεῖον — Ancient Greek (to 1453)

Pre-requisites

  1. Install Python 3 libraries: rdflib, bsddb.

  2. Download "Etymological Wordnet 2013-02-08" dataset extracted by Gerard de Melo from English Wiktionary (License: CC-BY-SA 3.0). Direct download link:

    etymwn-20130208.zip (26.2 Mb)

  3. Extract zip-file into the project folder (etymological-dictionary/etymwn.tsv).

  4. Run python ./import.py to import data into the internal database. It will take about 1 hour of time and 3.8 Gb of space.

Usage

Run dictionary.py with two parameters: a word and the corresponding ISO 639-3 language code:

> python ./dictionary.py --lang=eng muscular

To integrate the etymology dictionary into the GoldenDict application:

  1. Open Edit|Dictionaries...

  2. On the "Programs" tab add new item:

    • type: Plain text
    • command: python /path/to/dictionary.py --lang=eng %GDWORD%
  3. Add the newly created source to your dictionary set on the "Groups" tab

About

Offline etymological dictionary based on Wiktionary data

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages