AgentAI Deployment Guide
Use your own data for full control and privacy.
This setup enables you to deploy AgentAI with support for the following services:
- OpenWebUI
- Ollama Server
- N8N Workflow
- Pipelines Server
- Example pipeline script (aiagent.py)
- AIAgent: a tech support chatbot; an example Rust application (based on LangChain)
- Qdrant vector database
With this configuration, you can create custom AI agents using your own infrastructure (on-premises, private cloud, or VPS) with providers like:
Hetzner Cloud | AWS | Google Cloud | Azure
All applications are packaged in Docker containers.
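If you prefer to check the stack from a script rather than the Docker CLI, here is a minimal sketch using the Docker SDK for Python (`pip install docker`); it simply lists the running containers with their image and status.

```python
# List the stack's running containers via the local Docker socket.
import docker

client = docker.from_env()

for container in client.containers.list():
    tags = container.image.tags or ["<untagged>"]
    print(f"{container.name}: {tags[0]} ({container.status})")
```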
Accessing Deployed Services
Once the package is deployed, you can access the services at the following URLs:
- OpenWebUI: http://<your-server-ip>:3001 or http://chat.<yourdomain>.<tld>
- N8N Workflow: http://<your-server-ip>:5678 or http://n8n.<yourdomain>.<tld>
- Ollama Server: http://<your-server-ip>:11434
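As a quick post-deployment smoke test, the sketch below probes the URLs above using only Python's standard library. The `<your-server-ip>` placeholder is kept as-is; substitute your own address. `/api/tags` is Ollama's standard model-listing endpoint.

```python
import json
import urllib.request

SERVER = "<your-server-ip>"  # placeholder from the URLs above; substitute yours

# OpenWebUI and N8N serve plain HTTP on their web ports.
for name, url in [
    ("OpenWebUI", f"http://{SERVER}:3001"),
    ("N8N", f"http://{SERVER}:5678"),
]:
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            print(f"{name}: HTTP {resp.status}")
    except OSError as exc:
        print(f"{name}: unreachable ({exc})")

# Ollama's /api/tags endpoint returns the locally installed models as JSON.
with urllib.request.urlopen(f"http://{SERVER}:11434/api/tags", timeout=5) as resp:
    models = json.load(resp).get("models", [])
    print("Ollama models:", [m["name"] for m in models])
```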
All Docker files and configurations are located in /root/ on your server.
Custom RAG App Integration with OpenWebUI
To build your own RAG (Retrieval-Augmented Generation) application and connect it to OpenWebUI:
- The Pipelines server is pre-configured and running in Docker.
- Add your pipeline script to /root/openwebui/pipelines/, or use the example file provided: aiagent.py (a minimal skeleton is sketched below).
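The bundled aiagent.py is the fuller reference; for orientation, a custom pipeline script follows the skeleton used by the open-webui/pipelines examples: a `Pipeline` class with startup/shutdown hooks and a `pipe()` method that handles each chat message. A minimal sketch (the file name is illustrative):

```python
"""/root/openwebui/pipelines/my_pipeline.py - minimal custom pipeline sketch."""
from typing import Generator, Iterator, List, Union


class Pipeline:
    def __init__(self):
        # The name shown in OpenWebUI's model dropdown.
        self.name = "My RAG Pipeline"

    async def on_startup(self):
        # Called when the Pipelines server starts: open connections here
        # (e.g. to the Qdrant vector database).
        pass

    async def on_shutdown(self):
        # Called when the Pipelines server stops: release resources here.
        pass

    def pipe(
        self, user_message: str, model_id: str, messages: List[dict], body: dict
    ) -> Union[str, Generator, Iterator]:
        # Called for every chat message routed to this pipeline. A real RAG
        # pipeline would retrieve context here and call an LLM; this sketch
        # just echoes the input.
        return f"Received: {user_message}"
```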
Connecting the Pipelines Server to OpenWebUI
- Go to OpenWebUI → Account Settings → Admin Settings and add the following:
  - URL: http://pipelines:9099
  - API Key: provided in the PDF attachment in your welcome email
- Navigate to "Pipelines" in the left menu.
- Select your custom script from the dropdown.
- Save the configuration.
- Your script will now appear in the model selection dropdown in the chat interface.
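To confirm the connection outside the UI, you can also query the Pipelines server directly. This sketch assumes the upstream Pipelines project's OpenAI-compatible `/v1/models` endpoint, where each registered script appears as a model; the API key placeholder stands in for the one from your welcome email.

```python
import json
import urllib.request

PIPELINES_URL = "http://pipelines:9099"   # or http://<your-server-ip>:9099
API_KEY = "<api-key-from-welcome-email>"  # placeholder

req = urllib.request.Request(
    f"{PIPELINES_URL}/v1/models",
    headers={"Authorization": f"Bearer {API_KEY}"},
)
with urllib.request.urlopen(req, timeout=5) as resp:
    # Each pipeline script registered with the server is listed as a "model".
    for model in json.load(resp)["data"]:
        print(model["id"])
```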
AIAgent - Tech Support Chatbot
The data source file for the app can be edited at:
/root/aiagent/datasource/csv/example.csv
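The column layout of example.csv is specific to the app, so the sketch below stays generic: it prints the header row and the number of data rows, which is a safe first look before you edit the file.

```python
import csv

PATH = "/root/aiagent/datasource/csv/example.csv"

with open(PATH, newline="", encoding="utf-8") as f:
    rows = list(csv.reader(f))

print("columns:", rows[0])          # header row
print("data rows:", len(rows) - 1)  # rows after the header
```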
Server Requirements
To run AgentAI and related services efficiently, we recommend the following hardware specifications:
- GPU (recommended): greatly improves LLM response speed and model performance.
- CPU: more cores = better performance. Minimum 8 vCPUs recommended.
- RAM: at least 16 GB for baseline functionality. More is better, especially when running multiple containers.
We tested the package on an 8 vCPU / 16 GB RAM server without a GPU, and it ran slowly. While a GPU is not strictly required, it is strongly recommended.
You can install this package on your own server, which may be more cost-effective and customizable than using public cloud services.
⚠️ Disclaimer
Experimental version.
This release is for testing and development purposes only. It is not production-ready.
Companion Apps
- N8N: workflow automation tool that combines AI capabilities with business process automation.
- Open WebUI: a self-hosted AI platform designed to operate entirely offline; it's extensible, feature-rich, and user-friendly.
- Ollama: makes it easy to get up and running with large language models on your own server.

Get DevOps Support Hourly
By selecting this service, you get priority access to the development team for optimizing and upgrading your stack. Contact info@try.direct for more information, or hire an expert directly to make post-installation adjustments on an hourly basis. Browse other available services.