ctakes relation extractor

The relation extractor annotates relations between certain Event, Entity and Modifier annotations.
Models are currently provided for detecting body site and severity relations. They were trained with machine learning on manually annotated clinical data.

Collection Readers
Annotation Engines
Utilities
Piper Files

Collection Readers

XMI Reader (3)

Reads document texts and annotations from XMI files specified in a provided list.

Source class: XMIReader
Source package: org.apache.ctakes.relationextractor.eval
Parent class: org.apache.uima.fit.component.JCasCollectionReader_ImplBase
Products: Document Id

Parameter	Description	Class	Required	Default
files	The XMI files to be loaded	List	Yes

Annotation Engines

Causal Relation Annotator

Annotates Causal relations in sentences.

Source class: CausesBringsAboutRelationExtractorAnnotator
Source package: org.apache.ctakes.relationextractor.ae
Parent class: org.apache.ctakes.relationextractor.ae.RelationExtractorAnnotator
Dependencies: Sentence, Identified Annotation
Products: Generic Relation

Parameter	Description	Class	Required	Default
classifierFactoryClassName	provides the full name of the ClassifierFactory class to be used.	String	No	org.cleartk.ml.jar.JarClassifierFactory
dataWriterFactoryClassName	provides the full name of the DataWriterFactory class to be used.	String	No	org.cleartk.ml.jar.DefaultDataWriterFactory
isTraining	determines whether this annotator is writing training data or using a classifier to annotate. Normally inferred automatically based on whether or not a DataWriterFactory class has been set.	Boolean	No
ProbabilityOfKeepingANegativeExample	probability that a negative example should be retained for training	double	No
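
The `ProbabilityOfKeepingANegativeExample` parameter describes random downsampling of negative training examples. The following is a minimal sketch of that idea in plain Java, not the cTAKES implementation: the class `NegativeExampleSampler`, its constructor arguments and the fixed seed are hypothetical names chosen for illustration.

```java
import java.util.Random;

/** Hypothetical illustration of negative-example downsampling; not a cTAKES class. */
final class NegativeExampleSampler {
    private final double probabilityOfKeepingANegativeExample;
    private final Random random;

    NegativeExampleSampler(double probabilityOfKeepingANegativeExample, long seed) {
        // Written as a negated range check so that NaN is rejected as well.
        if (!(probabilityOfKeepingANegativeExample >= 0.0
                && probabilityOfKeepingANegativeExample <= 1.0)) {
            throw new IllegalArgumentException("probability must be within [0, 1]");
        }
        this.probabilityOfKeepingANegativeExample = probabilityOfKeepingANegativeExample;
        this.random = new Random(seed);
    }

    /** Positive examples are always kept; a negative is kept with the configured probability. */
    boolean keep(boolean isPositiveExample) {
        if (isPositiveExample) {
            return true;
        }
        // nextDouble() is uniform in [0, 1), so 1.0 keeps every negative and 0.0 keeps none.
        return random.nextDouble() < probabilityOfKeepingANegativeExample;
    }

    public static void main(String[] args) {
        NegativeExampleSampler sampler = new NegativeExampleSampler(0.5, 42L);
        int kept = 0;
        for (int i = 0; i < 1000; i++) {
            if (sampler.keep(false)) {
                kept++;
            }
        }
        System.out.println("Kept " + kept + " of 1000 negative examples");
    }
}
```

Lower values shrink the share of negative examples written to the training data, which can help when most candidate pairs in a sentence carry no relation.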

Degree of Annotator

Annotates Degree Of relations.

Source class: DegreeOfRelationExtractorAnnotator
Source package: org.apache.ctakes.relationextractor.ae
Parent class: org.apache.ctakes.relationextractor.ae.RelationExtractorAnnotator
Dependencies: Sentence, Identified Annotation
Products: Degree Relation

Parameter	Description	Class	Required	Default
classifierFactoryClassName	provides the full name of the ClassifierFactory class to be used.	String	No	org.cleartk.ml.jar.JarClassifierFactory
dataWriterFactoryClassName	provides the full name of the DataWriterFactory class to be used.	String	No	org.cleartk.ml.jar.DefaultDataWriterFactory
isTraining	determines whether this annotator is writing training data or using a classifier to annotate. Normally inferred automatically based on whether or not a DataWriterFactory class has been set.	Boolean	No
ProbabilityOfKeepingANegativeExample	probability that a negative example should be retained for training	double	No

Degree of Annotator 1

Annotates Degree Of relations in sentences containing a single entity mention of a valid degree_of type and a single modifier.

Source class: Baseline1DegreeOfRelationExtractorAnnotator
Source package: org.apache.ctakes.relationextractor.ae.baselines
Parent class: org.apache.ctakes.relationextractor.ae.DegreeOfRelationExtractorAnnotator
Dependencies: Sentence, Identified Annotation
Products: Degree Relation

Parameter	Description	Class	Required	Default
classifierFactoryClassName	provides the full name of the ClassifierFactory class to be used.	String	No	org.cleartk.ml.jar.JarClassifierFactory
dataWriterFactoryClassName	provides the full name of the DataWriterFactory class to be used.	String	No	org.cleartk.ml.jar.DefaultDataWriterFactory
isTraining	determines whether this annotator is writing training data or using a classifier to annotate. Normally inferred automatically based on whether or not a DataWriterFactory class has been set.	Boolean	No
ProbabilityOfKeepingANegativeExample	probability that a negative example should be retained for training	double	No
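
The rule described for this baseline can be sketched without any UIMA types. The snippet below assumes the entity mentions and modifiers of one sentence have already been collected into lists of strings, and that the entity mentions have already been filtered to valid degree_of types. `SingleMentionBaseline` and `candidatePair` are hypothetical names, not cTAKES APIs.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration of the "one entity, one modifier" baseline rule. */
final class SingleMentionBaseline {

    /**
     * Returns [entity, modifier] when the sentence holds exactly one of each,
     * and an empty list otherwise (no relation is proposed).
     */
    static List<String> candidatePair(List<String> entityMentions, List<String> modifiers) {
        if (entityMentions.size() == 1 && modifiers.size() == 1) {
            return Arrays.asList(entityMentions.get(0), modifiers.get(0));
        }
        return Collections.emptyList();
    }
}
```

A sentence with two modifiers, or with none, yields no candidate under this rule.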

Degree of Annotator 2

Annotates Degree Of relations between two shortest-distance entities in sentences with multiple modifiers.

Source class: Baseline2DegreeOfRelationExtractorAnnotator
Source package: org.apache.ctakes.relationextractor.ae.baselines
Parent class: org.apache.ctakes.relationextractor.ae.RelationExtractorAnnotator
Dependencies: Sentence, Identified Annotation
Products: Degree Relation

Parameter	Description	Class	Required	Default
classifierFactoryClassName	provides the full name of the ClassifierFactory class to be used.	String	No	org.cleartk.ml.jar.JarClassifierFactory
dataWriterFactoryClassName	provides the full name of the DataWriterFactory class to be used.	String	No	org.cleartk.ml.jar.DefaultDataWriterFactory
isTraining	determines whether this annotator is writing training data or using a classifier to annotate. Normally inferred automatically based on whether or not a DataWriterFactory class has been set.	Boolean	No
ProbabilityOfKeepingANegativeExample	probability that a negative example should be retained for training	double	No

Degree of Annotator 3

Annotates Degree Of relations between two shortest-distance entities in sentences as long as there is no intervening modifier.

Source class: Baseline3DegreeOfRelationExtractorAnnotator
Source package: org.apache.ctakes.relationextractor.ae.baselines
Parent class: org.apache.ctakes.relationextractor.ae.RelationExtractorAnnotator
Dependencies: Sentence, Identified Annotation
Products: Degree Relation

Parameter	Description	Class	Required	Default
classifierFactoryClassName	provides the full name of the ClassifierFactory class to be used.	String	No	org.cleartk.ml.jar.JarClassifierFactory
dataWriterFactoryClassName	provides the full name of the DataWriterFactory class to be used.	String	No	org.cleartk.ml.jar.DefaultDataWriterFactory
isTraining	determines whether this annotator is writing training data or using a classifier to annotate. Normally inferred automatically based on whether or not a DataWriterFactory class has been set.	Boolean	No
ProbabilityOfKeepingANegativeExample	probability that a negative example should be retained for training	double	No

Degree of Annotator 4

Annotates Degree Of relations between two entities whenever they are enclosed within the same noun phrase.

Source class: Baseline4DegreeOfRelationExtractorAnnotator
Source package: org.apache.ctakes.relationextractor.ae.baselines
Parent class: org.apache.ctakes.relationextractor.ae.DegreeOfRelationExtractorAnnotator
Dependencies: Sentence, Identified Annotation
Products: Degree Relation

Parameter	Description	Class	Required	Default
classifierFactoryClassName	provides the full name of the ClassifierFactory class to be used.	String	No	org.cleartk.ml.jar.JarClassifierFactory
dataWriterFactoryClassName	provides the full name of the DataWriterFactory class to be used.	String	No	org.cleartk.ml.jar.DefaultDataWriterFactory
isTraining	determines whether this annotator is writing training data or using a classifier to annotate. Normally inferred automatically based on whether or not a DataWriterFactory class has been set.	Boolean	No
ProbabilityOfKeepingANegativeExample	probability that a negative example should be retained for training	double	No
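
Enclosure within the same noun phrase reduces to a containment test on character offsets. The sketch below illustrates that test under the assumption that noun phrase spans come from an upstream chunker or parser. `NounPhraseBaseline`, `Span` and `sharesNounPhrase` are hypothetical names, not cTAKES types.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of the "same noun phrase" baseline rule. */
final class NounPhraseBaseline {

    /** Minimal stand-in for an annotation: begin (inclusive) and end (exclusive) offsets. */
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }

        boolean covers(Span other) {
            return begin <= other.begin && other.end <= end;
        }
    }

    /** True when at least one noun phrase covers both spans. */
    static boolean sharesNounPhrase(Span first, Span second, List<Span> nounPhrases) {
        for (Span nounPhrase : nounPhrases) {
            if (nounPhrase.covers(first) && nounPhrase.covers(second)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Offsets for the text "severe chest pain": the whole text is one noun phrase.
        List<Span> nounPhrases = Arrays.asList(new Span(0, 17));
        Span severe = new Span(0, 6);
        Span pain = new Span(13, 17);
        System.out.println(sharesNounPhrase(severe, pain, nounPhrases));
    }
}
```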

Location of Annotator

Annotates Location Of relations.

Source class: LocationOfRelationExtractorAnnotator
Source package: org.apache.ctakes.relationextractor.ae
Parent class: org.apache.ctakes.relationextractor.ae.RelationExtractorAnnotator
Dependencies: Sentence, Identified Annotation
Products: Location Relation

Parameter	Description	Class	Required	Default
classifierFactoryClassName	provides the full name of the ClassifierFactory class to be used.	String	No	org.cleartk.ml.jar.JarClassifierFactory
dataWriterFactoryClassName	provides the full name of the DataWriterFactory class to be used.	String	No	org.cleartk.ml.jar.DefaultDataWriterFactory
isTraining	determines whether this annotator is writing training data or using a classifier to annotate. Normally inferred automatically based on whether or not a DataWriterFactory class has been set.	Boolean	No
ProbabilityOfKeepingANegativeExample	probability that a negative example should be retained for training	double	No

Thread safe Location of Annotator

Annotates Location Of relations.

Source class: ThreadSafeLocationExtractor
Source package: org.apache.ctakes.relationextractor.concurrent
Parent class: org.apache.ctakes.relationextractor.ae.LocationOfRelationExtractorAnnotator
Dependencies: Sentence, Identified Annotation
Products: Location Relation

Parameter	Description	Class	Required	Default
classifierFactoryClassName	provides the full name of the ClassifierFactory class to be used.	String	No	org.cleartk.ml.jar.JarClassifierFactory
dataWriterFactoryClassName	provides the full name of the DataWriterFactory class to be used.	String	No	org.cleartk.ml.jar.DefaultDataWriterFactory
isTraining	determines whether this annotator is writing training data or using a classifier to annotate. Normally inferred automatically based on whether or not a DataWriterFactory class has been set.	Boolean	No
ProbabilityOfKeepingANegativeExample	probability that a negative example should be retained for training	double	No

Location of Annotator 1

Annotates Location Of relations in sentences containing exactly two entities (where the entities are of the correct types).

Source class: Baseline1EntityMentionPairRelationExtractorAnnotator
Source package: org.apache.ctakes.relationextractor.ae.baselines
Parent class: org.apache.ctakes.relationextractor.ae.RelationExtractorAnnotator
Dependencies: Sentence, Identified Annotation
Products: Location Relation

Parameter	Description	Class	Required	Default
classifierFactoryClassName	provides the full name of the ClassifierFactory class to be used.	String	No	org.cleartk.ml.jar.JarClassifierFactory
dataWriterFactoryClassName	provides the full name of the DataWriterFactory class to be used.	String	No	org.cleartk.ml.jar.DefaultDataWriterFactory
isTraining	determines whether this annotator is writing training data or using a classifier to annotate. Normally inferred automatically based on whether or not a DataWriterFactory class has been set.	Boolean	No
ProbabilityOfKeepingANegativeExample	probability that a negative example should be retained for training	double	No

Location of Annotator 2

Annotates Location Of relations in sentences containing multiple anatomical sites.

Source class: Baseline2EntityMentionPairRelationExtractorAnnotator
Source package: org.apache.ctakes.relationextractor.ae.baselines
Parent class: org.apache.ctakes.relationextractor.ae.RelationExtractorAnnotator
Dependencies: Sentence, Identified Annotation
Products: Location Relation

Parameter	Description	Class	Required	Default
classifierFactoryClassName	provides the full name of the ClassifierFactory class to be used.	String	No	org.cleartk.ml.jar.JarClassifierFactory
dataWriterFactoryClassName	provides the full name of the DataWriterFactory class to be used.	String	No	org.cleartk.ml.jar.DefaultDataWriterFactory
isTraining	determines whether this annotator is writing training data or using a classifier to annotate. Normally inferred automatically based on whether or not a DataWriterFactory class has been set.	Boolean	No
ProbabilityOfKeepingANegativeExample	probability that a negative example should be retained for training	double	No

Location of Annotator 3

Links each anatomical site with the closest entity of a type that's suitable for location_of, as long as there is no intervening anatomical site.

Source class: Baseline3EntityMentionPairRelationExtractorAnnotator
Source package: org.apache.ctakes.relationextractor.ae.baselines
Parent class: org.apache.ctakes.relationextractor.ae.RelationExtractorAnnotator
Dependencies: Sentence, Identified Annotation
Products: Location Relation

Parameter	Description	Class	Required	Default
classifierFactoryClassName	provides the full name of the ClassifierFactory class to be used.	String	No	org.cleartk.ml.jar.JarClassifierFactory
dataWriterFactoryClassName	provides the full name of the DataWriterFactory class to be used.	String	No	org.cleartk.ml.jar.DefaultDataWriterFactory
isTraining	determines whether this annotator is writing training data or using a classifier to annotate. Normally inferred automatically based on whether or not a DataWriterFactory class has been set.	Boolean	No
ProbabilityOfKeepingANegativeExample	probability that a negative example should be retained for training	double	No
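
The linking rule for this baseline (closest suitable entity, no anatomical site in between) can be sketched on a sentence's mentions ordered by offset. The code below is an illustration only. It assumes mentions are sorted, non-overlapping and already filtered to suitable types, and that distance is measured in characters between spans. `ClosestEntityBaseline`, `Mention` and `link` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the "closest entity, no intervening site" rule. */
final class ClosestEntityBaseline {

    /** Minimal stand-in for an annotation with a flag marking anatomical sites. */
    static final class Mention {
        final int begin;
        final int end;
        final boolean anatomicalSite;

        Mention(int begin, int end, boolean anatomicalSite) {
            this.begin = begin;
            this.end = end;
            this.anatomicalSite = anatomicalSite;
        }
    }

    /**
     * Returns {siteIndex, entityIndex} pairs. In a sorted list only the direct
     * neighbours of a site can qualify: anything farther away on the same side
     * is either not the closest entity or lies behind another anatomical site.
     */
    static List<int[]> link(List<Mention> sortedMentions) {
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < sortedMentions.size(); i++) {
            Mention site = sortedMentions.get(i);
            if (!site.anatomicalSite) {
                continue;
            }
            int best = -1;
            int bestGap = Integer.MAX_VALUE;
            if (i > 0 && !sortedMentions.get(i - 1).anatomicalSite) {
                best = i - 1;
                bestGap = site.begin - sortedMentions.get(i - 1).end;
            }
            if (i + 1 < sortedMentions.size() && !sortedMentions.get(i + 1).anatomicalSite) {
                int gap = sortedMentions.get(i + 1).begin - site.end;
                if (gap < bestGap) {
                    best = i + 1;
                }
            }
            if (best >= 0) {
                pairs.add(new int[] {i, best});
            }
        }
        return pairs;
    }
}
```

When the left and right neighbours are equally distant, this sketch prefers the left one. That tie-break is an arbitrary choice made for the example.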

Location of Annotator 4

Annotates Location Of relations between two entities whenever they are enclosed within the same noun phrase.

Source class: Baseline4EntityMentionPairRelationExtractorAnnotator
Source package: org.apache.ctakes.relationextractor.ae.baselines
Parent class: org.apache.ctakes.relationextractor.ae.RelationExtractorAnnotator
Dependencies: Sentence, Identified Annotation
Products: Location Relation

Parameter	Description	Class	Required	Default
classifierFactoryClassName	provides the full name of the ClassifierFactory class to be used.	String	No	org.cleartk.ml.jar.JarClassifierFactory
dataWriterFactoryClassName	provides the full name of the DataWriterFactory class to be used.	String	No	org.cleartk.ml.jar.DefaultDataWriterFactory
isTraining	determines whether this annotator is writing training data or using a classifier to annotate. Normally inferred automatically based on whether or not a DataWriterFactory class has been set.	Boolean	No
ProbabilityOfKeepingANegativeExample	probability that a negative example should be retained for training	double	No

Manages / Treats Annotator

Annotates Manages / Treats relations.

Source class: ManagesTreatsRelationExtractorAnnotator
Source package: org.apache.ctakes.relationextractor.ae
Parent class: org.apache.ctakes.relationextractor.ae.RelationExtractorAnnotator
Dependencies: Sentence, Identified Annotation
Products: Generic Relation

Parameter	Description	Class	Required	Default
classifierFactoryClassName	provides the full name of the ClassifierFactory class to be used.	String	No	org.cleartk.ml.jar.JarClassifierFactory
dataWriterFactoryClassName	provides the full name of the DataWriterFactory class to be used.	String	No	org.cleartk.ml.jar.DefaultDataWriterFactory
isTraining	determines whether this annotator is writing training data or using a classifier to annotate. Normally inferred automatically based on whether or not a DataWriterFactory class has been set.	Boolean	No
ProbabilityOfKeepingANegativeExample	probability that a negative example should be retained for training	double	No

Manifestation of Annotator

Annotates Manifestation Of relations.

Source class: ManifestationOfRelationExtractorAnnotator
Source package: org.apache.ctakes.relationextractor.ae
Parent class: org.apache.ctakes.relationextractor.ae.RelationExtractorAnnotator
Dependencies: Sentence, Identified Annotation
Products: Generic Relation

Parameter	Description	Class	Required	Default
classifierFactoryClassName	provides the full name of the ClassifierFactory class to be used.	String	No	org.cleartk.ml.jar.JarClassifierFactory
dataWriterFactoryClassName	provides the full name of the DataWriterFactory class to be used.	String	No	org.cleartk.ml.jar.DefaultDataWriterFactory
isTraining	determines whether this annotator is writing training data or using a classifier to annotate. Normally inferred automatically based on whether or not a DataWriterFactory class has been set.	Boolean	No
ProbabilityOfKeepingANegativeExample	probability that a negative example should be retained for training	double	No

Modifier Extractor

Annotates Modifiers and Chunks.

Source class: ModifierExtractorAnnotator
Source package: org.apache.ctakes.relationextractor.ae
Parent class: org.cleartk.ml.CleartkAnnotator
Dependencies: Base Token, Sentence
Products: Identified Annotation, Chunk

Parameter	Description	Class	Required	Default
classifierFactoryClassName	provides the full name of the ClassifierFactory class to be used.	String	No	org.cleartk.ml.jar.JarClassifierFactory
dataWriterFactoryClassName	provides the full name of the DataWriterFactory class to be used.	String	No	org.cleartk.ml.jar.DefaultDataWriterFactory
isTraining	determines whether this annotator is writing training data or using a classifier to annotate. Normally inferred automatically based on whether or not a DataWriterFactory class has been set.	Boolean	No
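
The `isTraining` description implies a simple fallback: an explicit value wins, otherwise the mode follows from whether a DataWriterFactory class was supplied. The sketch below illustrates only that reading of the table. How ClearTK resolves the parameter internally may differ, and `TrainingModeResolver` is a hypothetical name. "Set" is taken here to mean explicitly supplied by the caller.

```java
/** Hypothetical illustration of inferring training mode from the configuration. */
final class TrainingModeResolver {

    /**
     * @param explicitIsTraining the isTraining parameter, or null when it was not given
     * @param dataWriterFactoryClassName the explicitly supplied class name, or null
     */
    static boolean isTraining(Boolean explicitIsTraining, String dataWriterFactoryClassName) {
        if (explicitIsTraining != null) {
            return explicitIsTraining;
        }
        // No explicit flag: training mode only when a data writer factory was supplied.
        return dataWriterFactoryClassName != null && !dataWriterFactoryClassName.isEmpty();
    }
}
```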

Thread safe Degree of Annotator

Annotates Degree Of relations.

Source class: ThreadSafeDegreeExtractor
Source package: org.apache.ctakes.relationextractor.concurrent
Parent class: org.apache.ctakes.relationextractor.ae.DegreeOfRelationExtractorAnnotator
Dependencies: Sentence, Identified Annotation
Products: Degree Relation

Parameter	Description	Class	Required	Default
classifierFactoryClassName	provides the full name of the ClassifierFactory class to be used.	String	No	org.cleartk.ml.jar.JarClassifierFactory
dataWriterFactoryClassName	provides the full name of the DataWriterFactory class to be used.	String	No	org.cleartk.ml.jar.DefaultDataWriterFactory
isTraining	determines whether this annotator is writing training data or using a classifier to annotate. Normally inferred automatically based on whether or not a DataWriterFactory class has been set.	Boolean	No
ProbabilityOfKeepingANegativeExample	probability that a negative example should be retained for training	double	No

Thread safe Modifier Extractor

Annotates Modifiers and Chunks.

Source class: ThreadSafeModifierExtractor
Source package: org.apache.ctakes.relationextractor.concurrent
Parent class: org.apache.ctakes.relationextractor.ae.ModifierExtractorAnnotator
Dependencies: Base Token, Sentence
Products: Identified Annotation, Chunk

Parameter	Description	Class	Required	Default
classifierFactoryClassName	provides the full name of the ClassifierFactory class to be used.	String	No	org.cleartk.ml.jar.JarClassifierFactory
dataWriterFactoryClassName	provides the full name of the DataWriterFactory class to be used.	String	No	org.cleartk.ml.jar.DefaultDataWriterFactory
isTraining	determines whether this annotator is writing training data or using a classifier to annotate. Normally inferred automatically based on whether or not a DataWriterFactory class has been set.	Boolean	No

Utilities

Anafora XML Reader (Metastasis)

Reads annotations from DeepPhe schema Anafora XML files in a directory.

Source class: MetastasisAnaforaXMLReader
Source package: org.apache.ctakes.relationextractor.metastasis
Parent class: org.apache.uima.fit.component.JCasAnnotator_ImplBase
Products: Identified Annotation, Location Relation

No available configuration parameters.

Gold Annotation Copier

Copies an annotation type from the Gold view to the System view.

Source class: CopyFromGold
Source package: org.apache.ctakes.relationextractor.eval
Parent class: org.apache.uima.fit.component.JCasAnnotator_ImplBase

Parameter	Description	Class	Required	Default
AnnotationClasses		Class[]	Yes
GoldViewName		String	Yes

Gold Stats Calculator

Counts various statistics, such as token and relation counts, based on the gold standard data.

Source class: GoldAnnotationStatsCalculator
Source package: org.apache.ctakes.relationextractor.data
Parent class: org.apache.uima.fit.component.JCasAnnotator_ImplBase
Dependencies: Sentence, Base Token, Identified Annotation, Generic Relation, Location Relation, Degree Relation

No available configuration parameters.

Identified Annotation Expander

Enlarges the text span of an identified annotation based upon part of speech.

Source class: IdentifiedAnnotationExpander
Source package: org.apache.ctakes.relationextractor.ae
Parent class: org.apache.uima.fit.component.JCasAnnotator_ImplBase
Dependencies: Identified Annotation

No available configuration parameters.

Piper Files

Default Relation Pipeline

Clinical Pipeline with degree-of and location-of relations.

$\textcolor{gray}{\textsf{// Clinical Pipeline with degree-of and location-of relations. }}$

$\textcolor{gray}{\textsf{// Default Clinical Pipeline }}$
$\textcolor{magenta}{\textbf{load}}$ DefaultFastPipeline

$\textcolor{gray}{\textsf{// degree-of, relation-of }}$
$\textcolor{magenta}{\textbf{load}}$ RelationSubPipe

Relation Sub Pipe

Commands and parameters to create a default relation extraction sub-pipeline.

$\textcolor{gray}{\textsf{// Commands and parameters to create a default relation extraction sub-pipeline. }}$
$\textcolor{gray}{\textsf{// This is not a full pipeline. }}$

$\textcolor{gray}{\textsf{// Modifiers. Use addLogged to log start and finish of processing. There aren't default models, so set specifically }}$
$\textcolor{green}{\textbf{add}}$ ModifierExtractorAnnotator $\textcolor{purple}{\textbf{classifierJarPath}}$= $\textcolor{violet}{\textsf{/org/apache/ctakes/relation/extractor/models/modifier\_extractor/model.jar}}$

$\textcolor{gray}{\textsf{// Degree of severity, etc. }}$
$\textcolor{green}{\textbf{add}}$ DegreeOfRelationExtractorAnnotator $\textcolor{purple}{\textbf{classifierJarPath}}$= $\textcolor{violet}{\textsf{/org/apache/ctakes/relation/extractor/models/degree\_of/model.jar}}$

$\textcolor{gray}{\textsf{// Location. }}$
$\textcolor{green}{\textbf{add}}$ LocationOfRelationExtractorAnnotator $\textcolor{purple}{\textbf{classifierJarPath}}$= $\textcolor{violet}{\textsf{/org/apache/ctakes/relation/extractor/models/location\_of/model.jar}}$

Sectioned Relation Pipeline

Clinical Pipeline with section, paragraph and list detection and degree-of and location-of relations

$\textcolor{gray}{\textsf{// Clinical Pipeline with section, paragraph and list detection and degree-of and location-of relations }}$

$\textcolor{gray}{\textsf{// Default Clinical Pipeline with section, paragraph and list detection }}$
$\textcolor{magenta}{\textbf{load}}$ SectionedFastPipeline

$\textcolor{gray}{\textsf{// degree-of, relation-of }}$
$\textcolor{magenta}{\textbf{load}}$ RelationSubPipe

Ts Default Relation Pipeline

Thread Safe Default Clinical Pipeline with degree-of and location-of relations

$\textcolor{gray}{\textsf{// Thread Safe Default Clinical Pipeline with degree-of and location-of relations }}$

$\textcolor{gray}{\textsf{// Default Clinical Pipeline }}$
$\textcolor{magenta}{\textbf{load}}$ TsDefaultFastPipeline

$\textcolor{gray}{\textsf{// degree-of, relation-of }}$
$\textcolor{magenta}{\textbf{load}}$ TsRelationSubPipe

Ts Relation Sub Pipe

Commands and parameters to create a relation extraction sub-pipeline.

$\textcolor{gray}{\textsf{// Commands and parameters to create a relation extraction sub-pipeline. }}$
$\textcolor{gray}{\textsf{// This is not a full pipeline. }}$

$\textcolor{gray}{\textsf{// Modifiers. Use addLogged to log start and finish of processing. There aren't default models, so set specifically }}$
$\textcolor{green}{\textbf{add}}$ $\textcolor{blue}{\textsf{concurrent.ThreadSafeModifierExtractor}}$ $\textcolor{purple}{\textbf{classifierJarPath}}$= $\textcolor{violet}{\textsf{/org/apache/ctakes/relation/extractor/models/modifier\_extractor/model.jar}}$

$\textcolor{gray}{\textsf{// Degree of severity, etc. }}$
$\textcolor{green}{\textbf{add}}$ $\textcolor{blue}{\textsf{concurrent.ThreadSafeDegreeExtractor}}$ $\textcolor{purple}{\textbf{classifierJarPath}}$= $\textcolor{violet}{\textsf{/org/apache/ctakes/relation/extractor/models/degree\_of/model.jar}}$

$\textcolor{gray}{\textsf{// Location. }}$
$\textcolor{green}{\textbf{add}}$ $\textcolor{blue}{\textsf{concurrent.ThreadSafeLocationExtractor}}$ $\textcolor{purple}{\textbf{classifierJarPath}}$= $\textcolor{violet}{\textsf{/org/apache/ctakes/relation/extractor/models/location\_of/model.jar}}$

Ts Sectioned Relation Pipeline

Thread Safe Clinical Pipeline with section, paragraph and list detection and degree-of and location-of relations.

$\textcolor{gray}{\textsf{// Thread Safe Clinical Pipeline with section, paragraph and list detection and degree-of and location-of relations. }}$

$\textcolor{gray}{\textsf{// Default Clinical Pipeline with section, paragraph and list detection }}$
$\textcolor{magenta}{\textbf{load}}$ TsSectionedFastPipeline

$\textcolor{gray}{\textsf{// degree-of, relation-of }}$
$\textcolor{magenta}{\textbf{load}}$ TsRelationSubPipe
