In this blog post, we'll dive into setting up a powerful AI development environment using Docker Compose. The setup includes running the Ollama language model server and its corresponding web interface, Open-WebUI, both containerized for ease of use.
Ollama is an open-source platform for running large language models locally, and it can leverage GPU acceleration to speed up inference. It serves as a framework that lets users deploy and manage AI models efficiently, particularly those requiring significant computational resources.

Open-WebUI is a user-friendly, browser-based interface that connects to Ollama. It lets you interact with the AI models hosted by Ollama without technical expertise or direct command-line access. The web interface itself is lightweight; the computationally intensive work happens in the Ollama container, which is why the GPU configuration lives there rather than in the WebUI service.

The provided Docker Compose file sets up two services:
- ollama: An instance of the Ollama language model server.
- open-webui: A web-based interface that interacts with the Ollama server.
This setup allows you to run AI models locally and access them through a browser, all while leveraging Docker's containerization for easy management.
Let's Break Down the Compose File
1. The ollama Service
```yaml
services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 2
              capabilities: [gpu]
    environment:
      - OLLAMA_HOST=0.0.0.0:11434
    restart: unless-stopped
```
Key Configurations
- Image: Uses the official `ollama/ollama` image in its latest version.
- Container Name: The container will be named `ollama`.
- Port Mapping: Maps port `11434` on the host to port `11434` in the container. This is necessary because Ollama serves its API on this port (see the check after this list).
- Volumes: Persists data using a Docker volume named `ollama_data`, mounted at `/root/.ollama`. This ensures that any downloaded models or configurations are retained even if the container stops or restarts.
- GPU Resources: The `deploy.resources.reservations.devices` section configures NVIDIA GPU usage. Specifically, it requests two GPUs with `gpu` capabilities. This is essential for running GPU-accelerated AI models.
- Environment Variable: Sets `OLLAMA_HOST=0.0.0.0:11434`, which tells Ollama to listen on all interfaces (i.e., not just localhost) and the specified port.
- Restart Policy: The container will restart unless explicitly stopped (`restart: unless-stopped`).
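Once the stack is up, you can sanity-check that the Ollama API is reachable through this port mapping. A minimal check from the host, assuming the default mapping above (the endpoint is part of Ollama's standard REST API):

```bash
# Lists the models known to the Ollama server (empty on a fresh install)
curl http://localhost:11434/api/tags
```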
2. The open-webui Service
```yaml
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    ports:
      - "3000:8080"
    volumes:
      - openwebui_data:/app/backend/data
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    extra_hosts:
      - "host.docker.internal:host-gateway"
    restart: always
```
Key Configurations
- Image: Uses the `ghcr.io/open-webui/open-webui` image from the GitHub Container Registry.
- Container Name: The container will be named `open-webui`.
- Port Mapping: Maps port `3000` on the host to port `8080` in the container. This is where the web interface will be accessible.
- Volumes: Persists data using a Docker volume named `openwebui_data`, mounted at `/app/backend/data`. This allows for persistent storage of any user configurations or generated content.
- Environment Variable: Sets `OLLAMA_BASE_URL=http://ollama:11434`, which configures the web interface to communicate with the Ollama server running in its own container.
- Extra Hosts: Adds an entry to `/etc/hosts` inside the container, mapping `host.docker.internal` to the Docker host's IP address. This is useful for connecting to services on the host machine from within a container (see the snippet after this list).
- Restart Policy: The container will always be restarted if it stops, regardless of exit status (`restart: always`).
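The `extra_hosts` entry only becomes important if you ever point Open-WebUI at an Ollama instance running directly on the Docker host instead of at the ollama container. A minimal sketch of that variation (an alternative, not part of the setup described above):

```yaml
  open-webui:
    # ...same service definition as above, but targeting an Ollama
    # process that runs on the host rather than in the compose stack
    environment:
      - OLLAMA_BASE_URL=http://host.docker.internal:11434
    extra_hosts:
      - "host.docker.internal:host-gateway"
```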
3. Volumes

```yaml
volumes:
  ollama_data:
  openwebui_data:
```

This section defines two named volumes:
- `ollama_data`: Stores data for the Ollama service.
- `openwebui_data`: Stores data for the Open-WebUI service.

Named volumes are recommended as they provide better control over data persistence and separation compared to bind mounts.
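Once the stack has been started at least once, you can inspect these volumes with the standard Docker CLI. Note that Compose normally prefixes volume names with the project (folder) name, so adjust the names below to match your setup:

```bash
# List all named volumes; look for entries like <project>_ollama_data
docker volume ls

# Show metadata, including the host mountpoint, for the Ollama data volume
docker volume inspect <project>_ollama_data
```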

How It All Works
- Ollama Service:
  - Starts with GPU support, allowing it to run compute-intensive AI models.
  - Listens on port `11434` for incoming API requests.
  - Data is persisted in the `ollama_data` volume.
- Open-WebUI Service:
  - Provides a web interface accessible at `http://localhost:3000`.
  - Communicates with the Ollama server via the specified base URL (`http://ollama:11434`).
  - Data is persisted in the `openwebui_data` volume.
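Before the web interface has anything to chat with, the Ollama container needs at least one model. You can pull one from Open-WebUI's model settings, or directly against the running container; a quick sketch (the model name is only an example):

```bash
# Download a model into the ollama container; it is stored in the ollama_data volume
docker compose exec ollama ollama pull llama3

# Optional: test it from the command line before using the web interface
docker compose exec ollama ollama run llama3 "Say hello"
```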
Getting Started
To use this setup:
- Install Docker: Ensure Docker and Docker Compose are installed on your system.
- NVIDIA Drivers: Make sure you have NVIDIA drivers and CUDA installed for GPU support.
- Docker Permissions: Grant Docker permission to access your GPUs, typically via the NVIDIA Container Toolkit (see the check after this list).
- Run the Setup: Save the provided compose file as `docker-compose.yml` and run `docker compose up` to start both services.
- Access the Web Interface: Open a browser and navigate to `http://localhost:3000`.
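As mentioned under Docker Permissions, containers need access to the GPUs, which is usually provided by the NVIDIA Container Toolkit. A quick way to verify this works before starting the stack (the CUDA image tag is only an example):

```bash
# Should print the same GPU table as running nvidia-smi directly on the host
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```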
Key Considerations
- GPU Usage: The current configuration requests two GPUs. Adjust the `count` based on your system's capabilities.
- Ports: Ensure that ports `11434` and `3000` are not being used by other services.
- Volumes: Named volumes can be managed using Docker commands, allowing you to back up or restore data as needed (see the sketch below).
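For example, a common pattern for backing up a named volume is to mount it read-only into a temporary container and archive its contents; a minimal sketch, assuming the volume is named `ollama_data` (adjust for your Compose project prefix):

```bash
# Write ollama_data.tar.gz into the current directory
docker run --rm \
  -v ollama_data:/data:ro \
  -v "$(pwd)":/backup \
  alpine tar czf /backup/ollama_data.tar.gz -C /data .
```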
This setup provides a robust environment for working with AI models locally. By leveraging Docker Compose, you can manage both the Ollama server and its web interface efficiently. Whether you're developing new AI applications or experimenting with existing ones, this configuration offers flexibility and scalability, especially when combined with GPU acceleration.
Let me know if you have any questions or need further clarification. I will explain why I set up this instance in an upcoming blog post!