trenslow/thesis

Modeling Mutual Intelligibility between Slavic Language Pairs

by Tyler Renslow

This repository contains all scripts and data used for my master's thesis.

The repository structure loosely follows that of the Team Data Science Process developed by Microsoft. More info can be found here.

Software Dependencies

All scripts were written in Python 3, with additional packages used for different tasks.

Packages for data processing:

Packages for modeling:

  • TensorFlow (the code was written when v1.7 was the latest release and may not run on current versions)
  • For training TensorFlow models on NVIDIA GPUs, follow the instructions at this link.
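Because the scripts target the TensorFlow 1.x series, it can help to check the installed version before running them. The following is a minimal sketch using only the Python standard library. The function names `installed_major_version` and `written_for_this_release` are hypothetical and are not part of this repository.

```python
from importlib import metadata


def installed_major_version(package="tensorflow"):
    """Return the installed major version of `package`, or None if it is not installed."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    # "1.7.0" -> 1
    return int(version.split(".")[0])


def written_for_this_release(major, expected_major=1):
    """True when the installed major version matches the series the scripts targeted.

    The default of 1 is taken from the v1.7 note above.
    """
    return major == expected_major


if __name__ == "__main__":
    major = installed_major_version()
    if major is None:
        print("TensorFlow is not installed.")
    elif written_for_this_release(major):
        print("TensorFlow 1.x found; the scripts were written against this series.")
    else:
        print(f"TensorFlow {major}.x found; the scripts targeted 1.x and may need changes.")
```

A matching major version does not guarantee that every script runs unchanged; it only flags the most likely source of breakage.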

TODO:

  • refactor all paths in scripts
  • check compatibility with latest TensorFlow version
  • find more efficient way to store scraped Wikipedia articles, with the goal of making them easier to share and process
  • store all large data in compressed files to save disk space
  • reformat log files so that they reflect which feature set was used to train each model
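
One possible approach to the two storage items above is to keep scraped articles in a gzip-compressed JSON Lines file, with one article per line. This is only a sketch of the idea, not how the repository currently stores its data. The function names and the `lang`, `title`, and `text` fields are assumptions made for illustration.

```python
import gzip
import json
import os
import tempfile


def write_articles(path, articles):
    """Write an iterable of dicts to a gzip-compressed JSON Lines file."""
    with gzip.open(path, "wt", encoding="utf-8") as handle:
        for article in articles:
            # ensure_ascii=False keeps non-ASCII characters readable in the file
            handle.write(json.dumps(article, ensure_ascii=False) + "\n")


def read_articles(path):
    """Yield articles one at a time, so the whole file never has to fit in memory."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            yield json.loads(line)


if __name__ == "__main__":
    # Hypothetical sample records; the field names are illustrative only.
    sample = [
        {"lang": "cs", "title": "Praha", "text": "Praha je hlavní město."},
        {"lang": "pl", "title": "Warszawa", "text": "Warszawa jest stolicą."},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "articles.jsonl.gz")
        write_articles(path, sample)
        for article in read_articles(path):
            print(article["lang"], article["title"])
```

A single compressed file per language or language pair is easy to share, and reading line by line lets processing scripts stream the data instead of loading it all at once.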
