Skip to content

A large-scale approach to multifactorial analysis of morphosyntactic alternations

Notifications You must be signed in to change notification settings

kayaulai/morphosyn-alternation-largescale

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

34 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

morphosyn-alternation-largescale

This is the GitHub repository for my current project on a large-scale approach to multifactorial analysis in morphosyntactic alternations.

The files are supposed to be run in order (1-extract.py is the first to be run, and so on.)

There are currently five steps in the process (the files at each process are subject to change):

-Step 1: Extract the data from the treebank and turn them into a Python-readable form -Step 2: Extract clausal information from the treebank, focusing on within-sentence features -Step 2.5-3: Extract clausal information from the treebank, focusing on inter-sentence features -Step 4: Manually correct errors in automatic extraction -Step 5: Encode and analyse data

The code files are those with the step ID at the beginning. In addition, there are CSV files generated from those code files. Currently, 'dec25table.csv' is used as the main source of data in Step 5. Correction files are those containing extra columns which correct the values of the automatically-extracted columns of steps 2-3; the currently used correction files are:

-dec22-table-withintersentence-modified-3000clauses.csv

Attributions:

Universal Dependencies - English gold standard treebank

https://github.com/UniversalDependencies/UD_English-EWT

Syllable count:

https://github.com/eaydin/sylco

About

A large-scale approach to multifactorial analysis of morphosyntactic alternations

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published