Configurations
UCE has three configuration levels, each with its own config file. These levels are:
- INSTANCE
- CORPUS
- DEVELOPER
If you're a user only setting up UCE via Docker, then the DEVELOPER level is of no interest to you. In the following, we outline the different configurations and their usage within UCE.
UCE Configuration
(INSTANCE)
UCE is customizable in a variety of ways, including color scheme, corporate identity, metadata, and more. You configure your UCE instance through the uceConfig.json file: describe the instance in JSON, then pass that uceConfig.json file to the Web Portal through the command line.
You can copy the example uceConfig.json below and create your own configuration from it.
uceConfig.json
{
  "meta": {
    "name": "John Doe Lab",
    "version": "1.0.0",
    "description": "The John Doe Lab works in the field of finance analysis and, in this context, gathers large amounts of data for their sentiment or entailment tasks. This data is made available through the <b>Finance</b> corpus. Herein, ..."
  },
  "corporate": {
    "team": {
      "description": "The team behind the Finance corpus is part of the <a target='_blank' href='https://www.john-doe-lab.org/'>John Doe Lab</a> of the Doe-University.",
      "members": [
        {
          "name": "Prof. John Doe",
          "role": "Supervisor",
          "description": "Mr. Doe is the supervisor of the lab.",
          "contact": {
            "name": "Prof. Dr. John Doe",
            "email": "doe@doe-university.de",
            "website": "https://john-doe.org/team/john-doe/",
            "address": "Doe-Street 10<br/>11111 Doe"
          },
          "image": "FILE::https://upload.wikimedia.org/wikipedia/commons/9/99/Sample_User_Icon.png"
        }
      ]
    },
    "contact": {
      "name": "John Doe Lab",
      "email": "doe@doe-university.de",
      "website": "https://www.john-doe-lab.org/contact",
      "address": "Doe-Street 10<br/>11111 Doe"
    },
    "website": "https://www.john-doe-lab.org",
    "logo": "FILE::https://upload.wikimedia.org/wikipedia/commons/9/99/Sample_User_Icon.png",
    "name": "John Doe Lab",
    "primaryColor": "#00618f",
    "secondaryColor": "rgba(35, 35, 35, 1)",
    "imprint": "<p>No imprint set.</p>"
  },
  "settings": {
    "rag": {
      "models": [
        {
          "model": "ollama/gemma3:latest",
          "url": "http://your.ollama.server.com/",
          "apiKey": "",
          "displayName": "Gemma3 (4.3B - Google)"
        },
        {
          "model": "openai/gpt-4o-mini",
          "url": "",
          "apiKey": "YOUR_OPENAI_API_KEY",
          "displayName": "GPT-4o-mini (OpenAI)"
        }
      ]
    },
    "analysis": {
      "enableAnalysisEngine": false
    },
    "authentication": {
      "isActivated": false,
      "publicUrl": "http://localhost:8080",
      "redirectUrl": "http://localhost:4567/auth"
    }
  }
}
| Property (meta) | Description |
|---|---|
| name | Name of your project or your lab, shown on the front page of the web portal. | 
| version | Your own version number for this instance. |
| description | A description shown on the front page of the portal. Use it to describe your UCE instance. | 
| Property (corporate) | Description |
|---|---|
| team | Outline and display your team in a dedicated Teams-Tab within your UCE instance. | 
| team.description | Describe the team working on this project. | 
| team.members[] | A list of member objects that model your team, one object per member. |
| contact | The contact information is shown in the footer of the web portal. Provide details such as name, website and email so that others can contact you through the UCE instance. |
| website | The website of your lab or corporation. | 
| logo | The logo is shown in the top left of the web portal. You can provide it via a file path, FILE::{PATH} (online paths work as well), or directly as a Base64-encoded image, BASE64::data:image/png;base64,{BASE64}. |
| name | The name of your lab or corporation. | 
| primaryColor | Set the primary color for the UCE web portal and model your color scheme. | 
| secondaryColor | Set the secondary color for the UCE web portal and model your color scheme. | 
| imprint | The full HTML of your imprint page, which is then available via a button in the footer of your UCE instance. |
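
If you want to embed a logo instead of referencing a file, the BASE64:: form described above can be produced with a few lines of Python. This is a minimal sketch, not part of UCE: the helper name `logo_from_file` is hypothetical, and the MIME type defaults to image/png because that is the only type the table shows.

```python
import base64
from pathlib import Path


def logo_from_file(path: str, mime_type: str = "image/png") -> str:
    """Build a BASE64:: logo value from an image file on disk.

    Hypothetical helper: it only follows the
    BASE64::data:image/png;base64,{BASE64} pattern from the table above.
    """
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"BASE64::data:{mime_type};base64,{encoded}"


# Usage (the file name is an example):
#   value = logo_from_file("logo.png")
#   then paste `value` as the "logo" entry of the "corporate" object.
```

For large images, the FILE:: form keeps the config file easier to read.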
| Property (settings) | Description |
|---|---|
| rag | Settings for the RAGBot (if applicable). |
| rag.models[] | A list of LLMs that can power the RAGBot. All listed models are offered to the user in a dropdown. |
| rag.models.model | The id of a language model that powers the RAGBot. Currently, Ollama and OpenAI are supported out of the box, so this must be the model's actual id (e.g. ollama/gemma3:latest or openai/gpt-4o-mini). |
| rag.models.url | Needed if a local Ollama server is used. This is the base URL of that server, which the RAG Service uses to communicate with it. |
| rag.models.apiKey | Needed if the OpenAI API is used. Your own OpenAI API key, which the RAG Service uses for communication. |
| rag.models.displayName | The name the user sees for this model in the UCE web portal. |
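
The rules in the table above (Ollama entries need a url, OpenAI entries need an apiKey) can be checked before starting UCE. The following is a minimal sketch under stated assumptions: `check_rag_models` is a hypothetical helper, not a UCE function, and it assumes the provider is the part of the model id before the first slash, as in the examples.

```python
def check_rag_models(models):
    """Return a list of problems found in the settings.rag.models entries.

    Hypothetical pre-flight check; UCE itself may validate differently.
    """
    problems = []
    for index, entry in enumerate(models):
        model_id = entry.get("model", "")
        # Assumption: the provider prefix is everything before the first "/".
        provider = model_id.split("/", 1)[0] if "/" in model_id else ""
        if provider == "ollama":
            if not entry.get("url"):
                problems.append(f"models[{index}] ({model_id}): missing url")
        elif provider == "openai":
            if not entry.get("apiKey"):
                problems.append(f"models[{index}] ({model_id}): missing apiKey")
        else:
            problems.append(
                f"models[{index}]: {model_id!r} is neither an ollama/ nor an openai/ id"
            )
        if not entry.get("displayName"):
            problems.append(f"models[{index}] ({model_id}): missing displayName")
    return problems
```

An empty list means that every entry carries the fields its provider needs according to the table.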
| Property (analysis) | Description |
|---|---|
| enableAnalysisEngine | Enable or disable UCE's built-in analysis engine, which is powered by DUUI. |
| Property (authentication) | Description |
|---|---|
| isActivated | Enable or disable the authentication server for UCE, which is powered by Keycloak and allows login and user access. |
| publicUrl | The base URL under which UCE can reach the Keycloak authentication server. Adjust it if the default URL or port of the Keycloak service was changed. |
| redirectUrl | The base URL of the running UCE web portal, which is passed to Keycloak. Keycloak needs this URL to call back to its client, in this case UCE. |
The source code also contains a defaultUceConfig.json that you can use as a template. This is also the configuration UCE uses if no explicit config is provided. Pass your uceConfig.json to the UCE web portal via command line arguments, as outlined in earlier sections.
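
Hand-edited JSON easily picks up trailing commas or unbalanced braces, so it can help to parse the file once before passing it to the web portal. The sketch below assumes only what this page shows: the three top-level objects meta, corporate and settings. The function names are hypothetical, and whether UCE itself tolerates a missing section is not covered here.

```python
import json

# Top-level objects used in the example uceConfig.json on this page.
EXPECTED_SECTIONS = ("meta", "corporate", "settings")


def load_uce_config(text):
    """Parse the config text strictly; raises json.JSONDecodeError on
    syntax errors such as trailing commas."""
    config = json.loads(text)
    if not isinstance(config, dict):
        raise ValueError("uceConfig.json must contain a JSON object")
    return config


def missing_sections(config):
    """Return the expected top-level sections that the config lacks."""
    return [name for name in EXPECTED_SECTIONS if name not in config]


# Usage (the path is an example):
#   with open("uceConfig.json", encoding="utf-8") as handle:
#       config = load_uce_config(handle.read())
#   print(missing_sections(config))
```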
Corpus Configuration
(CORPUS)
As the name suggests, the corpusConfig.json holds metadata about a single corpus within UCE. Unlike the uceConfig.json, the corpus config is obligatory and needs to be imported by the Corpus-Importer.
You can copy the example corpusConfig.json below and create your own configuration from it.
corpusConfig.json
{
  "name": "Corpus_Name",
  "author": "University Doe",
  "language": "de-DE/en-EN/...",
  "description": "The corpus was gathered as part of the John Doe project.",
  "addToExistingCorpus": true,
  "annotations": {
    "annotatorMetadata": false,
    "uceMetadata": false,
    "logicalLinks": false,
    "OCRPage": false,
    "OCRParagraph": false,
    "OCRBlock": false,
    "OCRLine": false,
    "srLink": false,
    "lemma": false,
    "namedEntity": false,
    "sentence": false,
    "sentiment": false,
    "emotion": false,
    "time": false,
    "geoNames": false,
    "taxon": {
      "annotated": false,
      "//comment": "[Are the taxons annotated with biofid ontologies through the 'identifier' property?]",
      "biofidOnthologyAnnotated": false
    },
    "wikipediaLink": false,
    "completeNegation": false,
    "cue": false,
    "event": false,
    "focus": false,
    "scope": false,
    "xscope": false,
    "unifiedTopic": false
  },
  "other": {
    "//comment": "[Is this corpus also available on https://sammlungen.ub.uni-frankfurt.de/? Either true or false]",
    "availableOnFrankfurtUniversityCollection": false,
    "includeKeywordDistribution": false,
    "enableEmbeddings": false,
    "enableRAGBot": false
  }
}
| Property | Description | 
|---|---|
| name | The name assigned to the corpus. | 
| author | The entity or institution that created the corpus. | 
| language | Languages included in the corpus, specified in locale format (e.g., "de-DE", "en-EN"). | 
| description | A brief overview of the corpus and its purpose. | 
| addToExistingCorpus | Boolean flag indicating whether to append this data to an existing corpus *(looked up by name)*, or whether a new corpus should be created. | 
| annotations | Object outlining how the corpus was annotated and which annotation layers are available. | 
| annotations.annotatorMetadata | Boolean flag indicating if metadata about the annotator (e.g., name, date, or tool used) is included. | 
| annotations.uceMetadata | Boolean flag indicating if metadata per document is included (e.g. publishers, author etc.), which is done through its own UIMA-Typesystem. | 
| annotations.logicalLinks | Boolean flag indicating if logical or structural links between annotation layers (e.g., reference chains or document relations) are included. This is also done through its own UIMA-Typesystem. | 
| annotations.OCRPage | Boolean flag indicating if OCR data at the page level is included. | 
| annotations.OCRParagraph | Boolean flag indicating if OCR data at the paragraph level is included. | 
| annotations.OCRBlock | Boolean flag indicating if OCR data at the block level is included. | 
| annotations.OCRLine | Boolean flag indicating if OCR data at the line level is included. | 
| annotations.srLink | Boolean flag indicating if semantic role links (verb-argument structures) are annotated. | 
| annotations.lemma | Boolean flag indicating if lemmatization (base forms of words) is performed. | 
| annotations.namedEntity | Boolean flag indicating if named entities (e.g., persons, locations, organizations) are annotated. | 
| annotations.sentence | Boolean flag indicating if sentence boundaries are annotated. | 
| annotations.sentiment | Boolean flag indicating if sentiment analysis annotations (positive, neutral, negative) are included. | 
| annotations.emotion | Boolean flag indicating if emotion annotations (e.g., anger, joy, sadness) are included. | 
| annotations.time | Boolean flag indicating if temporal expressions (e.g., dates, time spans) are annotated. | 
| annotations.geoNames | Boolean flag indicating if geographic names are annotated and linked to GeoNames identifiers. | 
| annotations.taxon | Object containing details about taxon annotations. | 
| annotations.taxon.annotated | Boolean flag indicating if taxons are annotated in the corpus. | 
| annotations.taxon.biofidOnthologyAnnotated | Boolean flag indicating if taxons are annotated with BioFID ontologies through the identifier property. |
| annotations.wikipediaLink | Boolean flag indicating if Wikipedia links are included for entities or terms. | 
| annotations.completeNegation | Boolean flag indicating if complete negation structures (negation cues and scopes) are annotated. | 
| annotations.cue | Boolean flag indicating if linguistic cues (e.g., trigger words for negation or modality) are annotated. | 
| annotations.event | Boolean flag indicating if event annotations (occurrences, actions, or states) are included. | 
| annotations.focus | Boolean flag indicating if focus annotations (focus elements or highlighted text segments) are included. | 
| annotations.scope | Boolean flag indicating if negation or modality scopes are annotated. | 
| annotations.xscope | Boolean flag indicating if extended scopes (cross-sentence or multi-event) are annotated. | 
| annotations.unifiedTopic | Boolean flag indicating if unified topic annotations (global thematic categories) are included. | 
| other | Object containing additional corpus-related properties. The following flags require the setup of the RAG-Service. | 
| other.availableOnFrankfurtUniversityCollection | Boolean flag indicating if the corpus is also available via the Frankfurt University Collection. | 
| other.includeKeywordDistribution | Boolean flag indicating if keyword distribution data should be generated and cached upon import. | 
| other.enableEmbeddings | Boolean flag indicating if embeddings (vector representations of texts) should be created and cached upon import. | 
| other.enableRAGBot | Boolean flag indicating if the RAGBot feature (retrieval-augmented chatbot for corpus interaction) should be enabled. | 
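
With this many flags, it is easy to lose track of which annotation layers a corpusConfig.json actually enables. The following sketch lists them, including the nested taxon object. It is a hypothetical helper based only on the structure shown above: keys starting with // are treated as comments, and only a literal true counts as enabled.

```python
def enabled_annotations(corpus_config):
    """Return the names of all annotation flags set to true.

    Nested objects such as "taxon" are reported as "taxon.<flag>".
    """
    enabled = []
    for key, value in corpus_config.get("annotations", {}).items():
        if key.startswith("//"):
            continue  # comment entry, as used in the example config
        if isinstance(value, dict):
            for sub_key, flag in value.items():
                if not sub_key.startswith("//") and flag is True:
                    enabled.append(f"{key}.{sub_key}")
        elif value is True:
            enabled.append(key)
    return enabled


# Usage (the path is an example):
#   import json
#   with open("corpusConfig.json", encoding="utf-8") as handle:
#       print(enabled_annotations(json.load(handle)))
```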
Common Configuration
(DEVELOPER)
In the source code's uce.common module, you'll find a common.conf file. In it, you can adjust and edit any configurations needed to run the application, such as DB connection strings, API endpoints, and the like. To properly run UCE in a development setting, you need to ensure that all the local connection strings match your setup. For that, the most relevant ones are:
| Property | Description | 
|---|---|
| rag.webserver.base.url | The base URL of the RAG service's webserver (if set up), e.g.: http://localhost:5678/. |
| sparql.host | The base URL of the SPARQL service's webserver (if set up), e.g.: http://localhost:3030/ |
| sparql.endpoint | The endpoint of the SPARQL service's webserver, e.g.: my-ontology/sparql |
| postgresql.hibernate.connection.url | The connection string for the PostgreSQL DB service, e.g.: jdbc:postgresql://localhost:5433/uce |
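
When a local setup does not connect, it helps to see exactly which host, port and path these values resolve to. The sketch below uses only the Python standard library. Both function names are hypothetical, and joining sparql.host and sparql.endpoint with a single slash is an assumption about how the two values combine, not a description of UCE's code.

```python
from urllib.parse import urlsplit


def sparql_url(host, endpoint):
    """Join sparql.host and sparql.endpoint with exactly one slash (assumed)."""
    return host.rstrip("/") + "/" + endpoint.lstrip("/")


def parse_jdbc_postgres(url):
    """Split a jdbc:postgresql:// connection string into host, port, database."""
    prefix = "jdbc:"
    if not url.startswith(prefix + "postgresql://"):
        raise ValueError(f"not a PostgreSQL JDBC URL: {url!r}")
    parts = urlsplit(url[len(prefix):])
    return {
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


# Usage with the example values from the table:
#   sparql_url("http://localhost:3030/", "my-ontology/sparql")
#   parse_jdbc_postgres("jdbc:postgresql://localhost:5433/uce")
```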
You'll also find one more file, called common-release.conf. For the release, most of the connections differ from the local setup (specifically within the Docker Compose network), so common-release.conf is used when building UCE with Docker. It has the same properties as its debug counterpart.