Nexdata-AI/200475-Sentences-Chinese-Text-Normalization-Data

200475-Sentences-Chinese-Text-Normalization-Data

Description

A Chinese text normalization dataset of 200,475 sentences. The special symbols and Arabic numerals in each sentence are annotated as Chinese characters.

For more details, please refer to the link: https://www.nexdata.ai/datasets/tts/1102?source=Github

Specifications

Data content

200,475 sentences of text, transcribed in Chinese characters;

Data scale

200,475 original texts with 457,832 annotations;

Content source

Sentences extracted from various types of news, articles, novels, etc.

Language

Chinese;

Annotation

Special symbols and Arabic numerals in the sentences are annotated as Chinese characters;

Applications

Text-to-speech (TTS), text normalization;
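
To make the annotation task concrete, here is a minimal Python sketch of the kind of rewriting that text normalization involves: Arabic numerals and a percent sign are replaced by Chinese characters. It is an illustration only. The dataset's file format and annotation conventions are not described here, so the function names (`cardinal`, `normalize`), the rules they apply, and the sample sentences are all assumptions rather than the dataset's actual method.

```python
import re

DIGITS = "零一二三四五六七八九"
UNITS = ["", "十", "百", "千"]  # units for the ones, tens, hundreds, thousands places


def cardinal(n: int) -> str:
    """Read an integer in 0..9999 as a Chinese cardinal (hypothetical helper)."""
    if not 0 <= n <= 9999:
        raise ValueError("this sketch only covers 0..9999")
    if n == 0:
        return "零"
    s = str(n)
    parts = []
    pending_zero = False
    for i, ch in enumerate(s):
        d = int(ch)
        pos = len(s) - 1 - i
        if d == 0:
            # Remember the gap; a single 零 is emitted only if a digit follows.
            pending_zero = True
            continue
        if pending_zero and parts:
            parts.append("零")
        pending_zero = False
        parts.append(DIGITS[d] + UNITS[pos])
    result = "".join(parts)
    # 10..19 are conventionally read 十, 十一, ... rather than 一十, 一十一, ...
    if result.startswith("一十"):
        result = result[1:]
    return result


def normalize(text: str) -> str:
    """Replace ASCII digit runs (optionally followed by %) with Chinese characters."""

    def repl(m: re.Match) -> str:
        num, percent = m.group(1), m.group(2)
        if len(num) > 4:
            # Assumption: long digit strings are read digit by digit.
            reading = "".join(DIGITS[int(c)] for c in num)
        else:
            reading = cardinal(int(num))
        return ("百分之" + reading) if percent else reading

    return re.sub(r"([0-9]+)(%?)", repl, text)


print(normalize("增长了15%"))
print(normalize("共105人"))
```

A real normalizer has to resolve context that this sketch ignores (years, phone numbers, dates, decimals, units), which is why sentence-level annotated data is useful for this task.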

Licensing Information

Commercial License