Converts well-known public datasets (dictionaries, lists, etc.) into SQLite database format

amq5/sqlitify

SQLITify

This project is a collection of standalone scripts and patches for converting different pieces of data into SQLite database format. Right now it concentrates on dictionaries that exist as ad hoc text files or are purely web-based (which severely limits the ability to query them).

urban-dictionary.py

When run from the command line, it creates the file urban-dict.db in the current directory. The process is safe to interrupt, either by pressing Ctrl-C or programmatically (this is necessary because it takes a very long time to complete). On the next run it continues from the point where it was stopped.
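The resume-after-interrupt behaviour described above can be sketched as follows. This is a minimal illustration, not the actual code of urban-dictionary.py: the tables `entries` and `progress`, the functions `run` and `fetch_entries`, and the idea of paging through the source are all assumptions made for the example. The key point is that the downloaded rows and the checkpoint are written in the same transaction, so an interrupt can never leave them out of sync.

```python
import sqlite3


def fetch_entries(page):
    """Hypothetical stand-in for the real download step (no network access)."""
    return [(f"word-{page}-{i}", f"definition {i}") for i in range(3)]


def run(db_path, last_page, fetch=fetch_entries):
    """Process pages [0, last_page) and return the next page still to be done."""
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE IF NOT EXISTS entries (word TEXT, definition TEXT)")
    # Single-row checkpoint table: remembers which page to process next.
    con.execute(
        "CREATE TABLE IF NOT EXISTS progress ("
        "id INTEGER PRIMARY KEY CHECK (id = 1), next_page INTEGER NOT NULL)"
    )
    con.execute("INSERT OR IGNORE INTO progress (id, next_page) VALUES (1, 0)")
    con.commit()

    page = con.execute("SELECT next_page FROM progress WHERE id = 1").fetchone()[0]
    try:
        while page < last_page:
            rows = fetch(page)
            # One transaction per page: the rows and the checkpoint are
            # committed together, or rolled back together on an interrupt.
            with con:
                con.executemany("INSERT INTO entries VALUES (?, ?)", rows)
                con.execute(
                    "UPDATE progress SET next_page = ? WHERE id = 1", (page + 1,)
                )
            page += 1
    except KeyboardInterrupt:
        # Ctrl-C: everything committed so far stays in the database.
        pass
    finally:
        con.close()
    return page
```

Calling `run("demo.db", 100)` a second time after an interrupt picks up at the stored `next_page` instead of starting over, and no page is inserted twice.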

hagen-full.py

Command-line utility. Usage:

python hagen-full.py "path/to/Полная парадигма. Морфология.txt" path/to/sqlite.db

The first argument is the Russian morphology text file, which can be extracted from here (RAR archive). The second argument is the resulting database. It will contain the table parsed_morpho with the following structure:

| Column | Possible values |
| --- | --- |
| new_group | True if first row of grouped words |
| main_word | True if this word is default form (like infinitive for verbs, etc.) |
| optional | True if this form is optional |
| word | Word itself |
| part_of_speech | 'сущ':1, 'прл':2, 'гл':3, 'мест':4, 'союз':5, 'предик':6, 'част':7, 'межд':8, 'предл':9, 'числ':10, 'прч':11, 'дееп':12, 'нар':13, 'ввод':14 |
| gender | 'муж':1, 'жен':2, 'ср':3, 'общ':4 |
| number | 'ед':1, 'мн':2 |
| plural | 'им':1, 'род':2, 'дат':3, 'вин':4, 'тв':5, 'пр':6, 'зват':7, 'счет':8, 'мест':8, 'парт':10 |
| tense | 'буд':3, 'наст':2, 'прош':1 |
| declension | '1-е':1, '2-е':2, '3-е':3 |
| transitive | 'перех':1, 'пер/не':2, 'непер':3 |
| spirit | 'одуш':1, 'неод':2 |
| adverb_type | 'вопр':1, 'обст':2, 'опред':3, 'сравн':4 |
| circumstance_type | 'врем':1, 'места':2, 'напр':3, 'причин':4, 'цель':5 |
| definition_type | 'степ':1, 'кач':2, 'спос':3 |
| perfect_type | 'сов':1, 'несов':2, '2вид':3 |
| number_type | 'кол':1, 'поряд':2, 'собир':3, 'неопр':4 |
| pronoun_type | 'прил':1, 'сущ':2, 'нар':3 |
| infinitive | 1 if true |
| pledge | 1 if 'страд' |
| impersonal | 1 if 'безл' |
| shortened | 1 if 'крат' |
| immutable | 1 if 'неизм' |
| reflexive | 1 if 'воз' |
| superlative | 1 if 'прев' |
| imperative | 1 if 'пов' |
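Because the grammatical attributes are stored as integer codes, a query usually has to translate labels to codes on the way in and codes back to labels on the way out. The sketch below shows one way to do that. It makes several assumptions: the helper names (`invert`, `make_demo_db`, `forms_of`) are hypothetical, the demo table contains only four of the columns listed above, the three sample rows are made up for illustration, and NULL is assumed to mark an attribute that does not apply. Only the code mappings themselves are taken from the table.

```python
import sqlite3

# Code mappings copied from the table above.
PART_OF_SPEECH = {
    'сущ': 1, 'прл': 2, 'гл': 3, 'мест': 4, 'союз': 5, 'предик': 6, 'част': 7,
    'межд': 8, 'предл': 9, 'числ': 10, 'прч': 11, 'дееп': 12, 'нар': 13, 'ввод': 14,
}
GENDER = {'муж': 1, 'жен': 2, 'ср': 3, 'общ': 4}
NUMBER = {'ед': 1, 'мн': 2}


def invert(mapping):
    """Turn a label -> code mapping into a code -> label mapping."""
    return {code: label for label, code in mapping.items()}


def make_demo_db():
    """Build an in-memory stand-in for parsed_morpho (reduced column set)."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE parsed_morpho ("
        "word TEXT, part_of_speech INTEGER, gender INTEGER, number INTEGER)"
    )
    # Made-up sample rows, for illustration only.
    con.executemany(
        "INSERT INTO parsed_morpho VALUES (?, ?, ?, ?)",
        [
            ('дом', PART_OF_SPEECH['сущ'], GENDER['муж'], NUMBER['ед']),
            ('дома', PART_OF_SPEECH['сущ'], GENDER['муж'], NUMBER['мн']),
            ('читать', PART_OF_SPEECH['гл'], None, None),
        ],
    )
    con.commit()
    return con


def forms_of(con, pos_label):
    """Return (word, gender label, number label) for one part of speech."""
    gender_names = invert(GENDER)
    number_names = invert(NUMBER)
    rows = con.execute(
        "SELECT word, gender, number FROM parsed_morpho "
        "WHERE part_of_speech = ? ORDER BY rowid",
        (PART_OF_SPEECH[pos_label],),
    )
    # .get() leaves NULL (None) values as None instead of raising KeyError.
    return [(w, gender_names.get(g), number_names.get(n)) for w, g, n in rows]
```

Against a database produced by hagen-full.py, the same `forms_of` function could be pointed at a connection opened with `sqlite3.connect("path/to/sqlite.db")`, provided the real column types match the assumptions above.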
