Northeastern Neo-Aramaic Corpus Data

Northeastern Neo-Aramaic consists of a very diverse group of Aramaic dialects that were spoken until modern times in northern Iraq, northwestern Iran and southeastern Turkey by Christian and Jewish communities. These dialects are among the last living vestiges of the Aramaic language, which was one of the major languages of the region in antiquity.

This text corpus consists of transcribed and recorded texts gathered by Prof. Geoffrey Khan and his team in their efforts to preserve these increasingly endangered languages. This repository contains raw and corrected source texts written in Northeastern Neo-Aramaic (NENA) dialects. All source material is converted to a standard mark-up format, regardless of its original format.

The purpose of this repository is to curate source texts that will be used for building a complete text corpus in Text-Fabric. The corpus will in turn be analyzed and annotated for linguistic features.


Contents

nena_format - description of the NENA mark-up format

standards - standards for the NENA alphabet and linguistic codes, including regex patterns

texts - contains the NENA texts by version and dialect in NENA mark-up

parsed_texts - contains all NENA texts parsed into JSON hierarchies

text_parser - a SLY parser that produces the parsed NENA JSON from NENA mark-up

sources - original source material used for generating NENA texts
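
As a rough illustration of how the parsed_texts folder might be consumed, the sketch below walks a directory and loads every JSON file it finds. The function name load_parsed_texts, the .json file extension, and the folder layout are assumptions made for this example; the actual structure of the JSON hierarchies is not described here, so the code treats each file as opaque JSON.

```python
import json
from pathlib import Path


def load_parsed_texts(root):
    """Return {relative_path: parsed_json} for every .json file under root.

    Hypothetical helper: assumes parsed texts are stored as UTF-8 .json
    files somewhere below the given directory.
    """
    root = Path(root)
    texts = {}
    for path in sorted(root.rglob("*.json")):
        with path.open(encoding="utf-8") as handle:
            # Key each text by its path relative to root, using forward slashes.
            texts[path.relative_to(root).as_posix()] = json.load(handle)
    return texts


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for name, data in load_parsed_texts("parsed_texts").items():
        print(name, type(data).__name__)
```

Keying the result by relative path keeps any version or dialect subfolders visible in the key, without assuming a particular directory depth.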

Repository: CambridgeSemiticsLab/nena_corpus
