Annotations
UCE is compatible with a variety of annotations, provided they are available in the UIMA format. Depending on its type, an annotation either powers a dedicated feature in UCE or enhances search.
Below is a growing list of annotations that UCE can import, ranging from standard Named-Entity annotations to more specialized taxon or time annotations. All of them can be generated for a corpus through the Docker Unified UIMA Interface (DUUI).
OCR
Since much of the literature has yet to be digitized, UCE provides support for corpora containing documents that have undergone Optical Character Recognition (OCR) extraction. These annotations assist in reconstructing the physical layout of the pages within UCE.
Sentence
Splits each document into its individual sentences.
Named-Entity
Extracts named entities from a document, categorizing them into four types: organization (ORG), person (PER), location (LOC), and miscellaneous (MISC).
Lemma, POS & Morphological Features
Lemmatization reduces inflected words to their base form (lemma). Within UCE, searches are enhanced by also matching on these base forms.
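To illustrate why matching on base forms broadens a search, here is a minimal Python sketch. It does not reflect UCE's actual search implementation: the `Token` type, the `lemma_search` function, and the hand-written sample lemmas are all hypothetical stand-ins for lemma annotations that would come from an imported corpus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    surface: str  # the word as it appears in the text
    lemma: str    # the base form, as a lemma annotation would provide it


def lemma_search(tokens, query_lemma):
    """Return the surface forms whose lemma equals the query (case-insensitive)."""
    wanted = query_lemma.lower()
    return [t.surface for t in tokens if t.lemma.lower() == wanted]


# Hypothetical, hand-written annotations (not produced by a real model).
tokens = [
    Token("birds", "bird"),
    Token("were", "be"),
    Token("flying", "fly"),
    Token("bird", "bird"),
    Token("flew", "fly"),
]

# A query for the lemma "fly" finds both inflected forms.
matches = lemma_search(tokens, "fly")
print(matches)
```

A plain string search for "fly" would miss "flew" entirely, which is the kind of gap lemma annotations are meant to close.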
Semantic Role Labels (SRL)
SRL identifies semantic relations between the lexical constituents of a sentence, assigning labels to words or phrases that indicate their semantic roles, such as agent, goal, or result.
Time
Extracts temporal expressions, such as times and dates, from a document, analogous to Named-Entity Recognition tasks.
UceDynamicMetadata
Offers a dynamic and easy way to annotate key-value filters, which are then imported and used within UCE for the creation of custom filters.
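As a rough illustration of how key-value metadata can drive custom filters, consider the following Python sketch. The document structure, the keys (`language`, `decade`), and the `filter_documents` helper are assumptions made for this example only; they do not describe the actual UceDynamicMetadata type or UCE's filter logic.

```python
def matches_filters(metadata, filters):
    """True if every requested key is present in the metadata with the requested value."""
    return all(metadata.get(key) == value for key, value in filters.items())


def filter_documents(documents, filters):
    """Return the titles of documents whose metadata satisfies all filters."""
    return [d["title"] for d in documents if matches_filters(d["metadata"], filters)]


# Hypothetical documents with key-value metadata attached during annotation.
documents = [
    {"title": "Doc A", "metadata": {"language": "de", "decade": "1890s"}},
    {"title": "Doc B", "metadata": {"language": "en", "decade": "1890s"}},
    {"title": "Doc C", "metadata": {"language": "de", "decade": "1920s"}},
]

print(filter_documents(documents, {"language": "de"}))
print(filter_documents(documents, {"language": "de", "decade": "1890s"}))
```

Because the keys are not fixed in advance, each corpus can expose whatever filter dimensions its annotations provide.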
Taxon
A taxon annotation marks the unambiguous name of a biological entity. UCE supports importing taxon annotations produced by multiple models, such as GNFinder or Gazetteer.
WikiLinks
Maps words and phrases, where a match exists, to their corresponding Wikidata URLs, making it easier to retrieve additional information.
UnifiedTopic
Extracts topics from a document in the form of a list of keywords or categories, which can be used to summarize the content or identify its main theme. The list of categories depends on the model used for annotation.
GeoNames
Recognizes locations within texts and annotates them with hierarchical data, alternate and historical names, and unique identifiers.
Logical Links
Links documents, annotations, and even texts to other entities, connecting them through a defined edge with a weight. UCE thus enables the grouping and linking of any entity with any other entity.
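The idea of weighted, typed edges between arbitrary entities can be sketched in Python as follows. This is only an illustration under assumed names: the `Link` type, its fields, the entity identifiers, and the `linked_entities` helper are hypothetical and do not mirror the real annotation type or how UCE stores links.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    source: str    # identifier of the entity the link starts from
    target: str    # identifier of the entity the link points to
    edge: str      # label describing the kind of connection
    weight: float  # strength of the connection


def linked_entities(links, entity):
    """Return (target, edge, weight) for links leaving `entity`, strongest first."""
    outgoing = [l for l in links if l.source == entity]
    outgoing.sort(key=lambda l: l.weight, reverse=True)
    return [(l.target, l.edge, l.weight) for l in outgoing]


# Hypothetical links between documents and an annotation.
links = [
    Link("doc:1", "doc:2", "cites", 0.4),
    Link("doc:1", "annotation:17", "mentions", 0.9),
    Link("doc:2", "doc:3", "cites", 0.7),
]

for target, edge, weight in linked_entities(links, "doc:1"):
    print(f"doc:1 --{edge} ({weight})--> {target}")
```

Keeping source and target as generic identifiers is what allows any entity to be linked to any other in this sketch.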
Negation
Identifies and marks negation cues, their scopes, and affected events or concepts within text. This helps to determine when statements express the absence, denial, or opposite of something mentioned.
Emotion
Detects emotional expressions or affective states (such as joy, anger, fear, or sadness) conveyed within text segments. These annotations can be used to study emotional tone and affective communication.
Sentiment
Analyzes the overall sentiment polarity of text (positive, negative, or neutral), enabling corpus-wide mood analysis or evaluation of opinions expressed in documents.
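A corpus-wide mood analysis could, for instance, aggregate per-document polarity labels into a distribution. The Python sketch below shows one simple way to do that; the `polarity_distribution` function and the sample labels are hypothetical and are not taken from UCE.

```python
from collections import Counter

POLARITIES = ("positive", "negative", "neutral")


def polarity_distribution(labels):
    """Return the share of each polarity among the given labels (empty dict if none)."""
    counts = Counter(labels)
    total = sum(counts[p] for p in POLARITIES)
    if total == 0:
        return {}
    return {p: counts[p] / total for p in POLARITIES}


# Hypothetical sentiment labels, one per document in a small corpus.
labels = ["positive", "negative", "positive", "neutral"]

print(polarity_distribution(labels))
```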