RAG Service
The RAG service is a Python web server built on Flask that acts as the gateway to modern AI, NLP, and ML technologies. It handles the creation of embeddings, querying large language models (LLMs) for the RAGBot, and similar operations.
High Resources
Depending on the use case and the available machine, the RAG service may be resource-intensive, particularly in terms of RAM and GPU usage.
User Setup
This section assumes the repository has already been cloned in a prior step. Afterwards, simply start the service via Docker Compose:
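A minimal invocation might look like the following; the exact location of the Compose file and the service name depend on the repository layout, so treat this as a sketch rather than the definitive command:

```shell
# Run from the directory containing the docker-compose file.
# Builds the image if necessary and starts the service in the background.
docker compose up -d
```

To follow the service logs afterwards, `docker compose logs -f` can be used.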
The RAG service should now be up and running on the port mapped in the Docker Compose file. See further down for information about model usage and customizable settings of the service.
Developer Setup
After cloning the repository, navigate to the `rag` folder. There, create a new Python environment using a tool of your choice:
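With the standard library's `venv` module, for example, this could look as follows (other tools such as Conda work just as well):

```shell
# Create a virtual environment named "env" inside the rag folder
python -m venv env
```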
The name `env` is already included in `.gitignore`, so it's recommended to use that name if possible.
Activate the environment:
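On Linux and macOS, assuming the environment was created under the name `env` as suggested above:

```shell
# Activate the virtual environment (Linux/macOS)
source env/bin/activate
```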
Activate on Windows
On Windows, activate the environment with the following command:
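For a `venv`-style environment named `env`, the standard Windows activation scripts are:

```shell
:: In cmd.exe
env\Scripts\activate.bat

:: In PowerShell
env\Scripts\Activate.ps1
```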
Then, install the dependencies and start the service:
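A typical sequence could look like this; the dependency file name and the entry point are assumptions based on a standard Flask layout, so check the `rag` folder for the actual files:

```shell
# Install the Python dependencies (assuming they are listed
# in a requirements.txt in the rag folder)
pip install -r requirements.txt

# Start the Flask server; the actual entry-point module may differ
flask run
```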
Settings
As of now, the only language model that works out of the box without modifying the source code is ChatGPT. When enabling the RAGBot in the Corpus Configuration, you must provide your OpenAI API key in the `settings` section of the UCE Configuration.
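Purely as an illustration, such a `settings` entry might be shaped like the fragment below; the key names here are assumptions, so consult the UCE Configuration reference for the actual schema:

```json
{
  "settings": {
    "openai_api_key": "<your OpenAI API key>"
  }
}
```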
However, it is easy to adjust the RAG service to query locally hosted or alternative language models instead of the OpenAI API by modifying the `rag` source code. We plan to add out-of-the-box configuration options for this functionality as soon as possible.