core_CR
Johnsd11 edited this page Mar 6, 2023 · 5 revisions
Reads document texts from text files in a directory tree.
Parameter name | Parameter description | Example Values | Default | Mandatory |
---|---|---|---|---|
WriteBanner | Write a large banner at each major step of the pipeline. | | | false |
InputDirectory | Directory for all input files. | | | true |
Encoding | The character encoding used by the input files. | | | false |
Extensions | The extensions of the files that the collection reader will read. | | | false |
KeepCR | Keep Windows-format carriage return characters at line endings. This only keeps existing characters. | | | false |
CRtoSpace | Change Windows-format CR + LF character sequences to LF + space. | | | false |
PatientLevel | The level in the directory hierarchy at which patient identifiers exist. Level 1 is directly under the root input directory. | | 1 | false |
StripQuotes | Replace document-enclosing quote characters with space characters. | | | false |
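
The `KeepCR` and `CRtoSpace` descriptions above can be illustrated with a small sketch. This is not cTAKES code: the function name, the precedence of `keep_cr` over `cr_to_space`, and the behavior when both are off (dropping the CR) are assumptions made for illustration only.

```python
def normalize_line_endings(text: str, keep_cr: bool = False,
                           cr_to_space: bool = False) -> str:
    """Hypothetical helper mimicking the KeepCR / CRtoSpace descriptions."""
    if keep_cr:
        # Keep existing CR characters as they are; none are added.
        return text
    if cr_to_space:
        # Swap each CR+LF pair for LF+space: two characters in, two out.
        return text.replace("\r\n", "\n ")
    # Assumption: with both options off, the CR of each CR+LF pair is dropped.
    return text.replace("\r\n", "\n")


windows_text = "Line one\r\nLine two"
as_space = normalize_line_endings(windows_text, cr_to_space=True)
as_unix = normalize_line_endings(windows_text)
```

In this sketch the LF + space replacement leaves the text length unchanged, so character offsets computed on the original file would still line up, while dropping the CR shortens the text by one character per line ending.
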
Reads document texts from database text fields.
Parameter name | Parameter description | Example Values | Default | Mandatory |
---|---|---|---|---|
SqlStatement | SQL statement to retrieve the document. | | | true |
DocTextColName | Name of the result set column that contains the document text. | | | true |
DbConnResrcName | Name of the external resource for the database connection. | | | true |
DocIdColNames | Column names whose values are used to form a document ID. | | | false |
DocIdDelimiter | Delimiter used when the document ID is built. | | | false |
ValueFileResrcName | Name of the external resource for the prepared statement value file. | | | false |
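
How `SqlStatement`, `DocTextColName`, `DocIdColNames` and `DocIdDelimiter` might fit together can be sketched with Python's built-in `sqlite3` module. This is an illustration under stated assumptions, not the reader's implementation: the `read_documents` function, the `notes` table and its columns are all hypothetical, and the sketch assumes the document ID is simply the listed column values joined by the delimiter.

```python
import sqlite3


def read_documents(conn, sql_statement, doc_text_col,
                   doc_id_cols, doc_id_delimiter):
    """Hypothetical sketch: yield (document ID, document text) pairs."""
    cursor = conn.cursor()
    cursor.row_factory = sqlite3.Row  # allows access to columns by name
    for row in cursor.execute(sql_statement):
        # Assumption: the ID is the chosen column values joined by the delimiter.
        doc_id = doc_id_delimiter.join(str(row[col]) for col in doc_id_cols)
        yield doc_id, row[doc_text_col]


# In-memory example data; the table and column names are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE notes (patient_id INTEGER, note_id INTEGER, note_text TEXT)"
)
conn.execute("INSERT INTO notes VALUES (7, 42, 'Patient denies chest pain.')")

docs = list(read_documents(
    conn,
    sql_statement="SELECT patient_id, note_id, note_text FROM notes",
    doc_text_col="note_text",
    doc_id_cols=["patient_id", "note_id"],
    doc_id_delimiter="_",
))
```

Here the single row would produce the ID `7_42` paired with the note text. The prepared statement value file (`ValueFileResrcName`) is left out of the sketch because the table does not describe its format.
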
Reads document texts from Lucene text fields.
Parameter name | Parameter description | Example Values | Default | Mandatory |
---|---|---|---|---|
IndexDirectory | Location of the Lucene index. | | | true |
FieldName | Field to look in for document text. | | | false |
MaxWords | Maximum number of words to process (approximate; actually based on characters). | | | true |
Reads document texts from text files specified in a provided list.
Parameter name | Parameter description | Example Values | Default | Mandatory |
---|---|---|---|---|
files | The text files to be loaded. | | | true |
Reads document texts and annotations from XMI files specified in a provided list.
Parameter name | Parameter description | Example Values | Default | Mandatory |
---|---|---|---|---|
files | The XMI files to be loaded. | | | true |
Reads document texts and annotations from XMI files in a directory tree.
Parameter name | Parameter description | Example Values | Default | Mandatory |
---|---|---|---|---|
WriteBanner | Write a large banner at each major step of the pipeline. | | | false |
InputDirectory | Directory for all input files. | | | true |
Encoding | The character encoding used by the input files. | | | false |
Extensions | The extensions of the files that the collection reader will read. | | | false |
KeepCR | Keep Windows-format carriage return characters at line endings. This only keeps existing characters. | | | false |
CRtoSpace | Change Windows-format CR + LF character sequences to LF + space. | | | false |
PatientLevel | The level in the directory hierarchy at which patient identifiers exist. Level 1 is directly under the root input directory. | | 1 | false |
StripQuotes | Replace document-enclosing quote characters with space characters. | | | false |
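
One way to read the `PatientLevel` description is that the patient identifier is the name of the directory found that many levels below the input directory. The sketch below follows that reading. It is an assumption for illustration only, and the function name and example paths are hypothetical.

```python
from pathlib import PurePosixPath


def patient_id_for(input_dir: str, file_path: str, patient_level: int = 1) -> str:
    """Hypothetical helper: pick the directory name at the given level."""
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(input_dir))
    # Directories between the input directory and the file itself.
    dirs = relative.parts[:-1]
    if patient_level < 1 or patient_level > len(dirs):
        raise ValueError(f"no directory at level {patient_level} for {file_path}")
    # Level 1 is the directory directly under the input directory.
    return dirs[patient_level - 1]


level_one = patient_id_for("/data/notes", "/data/notes/patient01/note.xmi")
level_two = patient_id_for("/data/notes", "/data/notes/siteA/patient02/note.xmi", 2)
```

With the default level of 1 the first call picks `patient01`. The second call uses level 2 to skip the intermediate `siteA` directory and picks `patient02`.
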