Putting ilab in a container AND making it go fast
Containerizing ilab provides portability and ease of setup. With it, users can
run ilab on OpenShift to test the speed of ilab model train and generate
using dedicated GPUs. This guide shows you how to put the ilab CLI, all of its
dependencies, and your GPU into a container for an isolated and easily reproducible
experience.
Build the ilab container image
We encourage you to read the Containerfile to understand the following explanation.
To reduce the size of the final image, we use a multi-stage build: the build stage
uses the CUDA development image and installs everything into a virtual environment.
The final stage is based on the CUDA runtime image, which ships no build tools, and
we copy the site-packages folder from the first stage.
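As a rough illustration of the multi-stage pattern described above (a hypothetical sketch, not the actual Containerfile; the image tags and paths are assumptions):

```dockerfile
# Hypothetical sketch of the multi-stage pattern; see the real Containerfile in the repository.
# Build stage: CUDA development image with compilers, installing into a virtual environment.
FROM nvidia/cuda:12.4.1-devel-ubi9 AS builder
RUN python3 -m venv /opt/app-root && \
    /opt/app-root/bin/pip install instructlab

# Final stage: CUDA runtime image with no build tools; only the virtual
# environment (including site-packages) is copied over.
FROM nvidia/cuda:12.4.1-runtime-ubi9
COPY --from=builder /opt/app-root /opt/app-root
ENTRYPOINT ["/opt/app-root/bin/ilab"]
```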
You will also notice that the entry point of the container image is /opt/app-root/bin/ilab.
This lets you call the ilab command without opening a new shell. See below how this
is beneficial when combined with an alias.
For convenience, we have created a cuda target in the Makefile, so building is
as simple as make cuda.
The default image name is localhost/instructlab, but you can override it with
the CONTAINER_PREFIX environment variable. The following command line will build an
image tagged quay.io/ai-labs/instructlab:cuda:
make cuda CONTAINER_PREFIX=quay.io/ai-labs/instructlab
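The cuda target boils down to a podman build invocation along these lines (a hypothetical sketch; the Containerfile path is an assumption, see the repository Makefile for the real rule):

```makefile
# Hypothetical sketch of the cuda target; see the repository Makefile for the actual rule.
CONTAINER_PREFIX ?= localhost/instructlab

cuda:
	podman build -f containers/cuda/Containerfile -t $(CONTAINER_PREFIX):cuda .
```

Because CONTAINER_PREFIX is set with ?=, a value passed on the command line or in the environment overrides the localhost/instructlab default.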
Configure Podman with NVIDIA container runtime
To configure your machine running RHEL 9.4+, you can follow NVIDIA's documentation to install the NVIDIA Container Toolkit and to configure the Container Device Interface (CDI) to expose the GPUs to Podman.
Here is a quick procedure if you haven't done so yet.
curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
sudo dnf config-manager --enable nvidia-container-toolkit-experimental
sudo dnf install -y nvidia-container-toolkit
Then, you can verify that the NVIDIA Container Toolkit can see your GPUs:
sudo nvidia-ctk cdi list
Example output:
INFO[0000] Found 2 CDI devices
nvidia.com/gpu=0
nvidia.com/gpu=all
Finally, you can generate the Container Device Interface configuration for the NVIDIA devices:
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
Run the GPU-accelerated ilab container
When running our model, we want to create some paths that will be mounted in
the container to provide data persistence. As an unprivileged user, you will
run a rootless container, so you need to let the container's internal user
access the files on the host. For that, we use podman unshare chown.
mkdir -p ${HOME}/.ilab
podman unshare chown 1001:1001 -R ${HOME}/.ilab
Then, we can run the container, mounting the above folder and using the first NVIDIA GPU.
podman run --rm -it --user 1001 --device nvidia.com/gpu=0 --volume ${HOME}/.ilab:/opt/app-root/ilab:Z localhost/instructlab:cuda
The above command prints the help text, as we didn't pass any arguments.
Let's initialize our configuration, download the model, and start the chat bot.
podman run --rm -it --device nvidia.com/gpu=0 --volume ${HOME}/.ilab:/opt/app-root/ilab:Z localhost/instructlab:cuda init
podman run --rm -it --device nvidia.com/gpu=0 --volume ${HOME}/.ilab:/opt/app-root/ilab:Z localhost/instructlab:cuda download
podman run --rm -it --device nvidia.com/gpu=0 --volume ${HOME}/.ilab:/opt/app-root/ilab:Z localhost/instructlab:cuda chat
Creating an alias
Now that you know how to run the container, you will probably find it cumbersome
to type the long podman run command, so we provide an alias definition in the
containers/cuda/instructlab-cuda.alias file.
Simply put it in your ${HOME}/.bashrc.d folder and restart your bash shell to be
able to call just instructlab or ilab.
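Such an alias essentially wraps the podman run command from the previous section. The definition below is a sketch under that assumption; the real one lives in containers/cuda/instructlab-cuda.alias:

```shell
# Sketch of the alias; the actual definition is in containers/cuda/instructlab-cuda.alias.
# It wraps the full podman run invocation so that arguments are forwarded to ilab.
alias ilab='podman run --rm -it --device nvidia.com/gpu=0 --volume ${HOME}/.ilab:/opt/app-root/ilab:Z localhost/instructlab:cuda'
```

With the alias in place, typing ilab chat runs the containerized chat command directly.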
Container file dependencies
Currently, our e2e CI tests do not use these containers, and the main user documentation (README.md) instructs users to install from PyPI rather than build from these container files. However, other projects, such as ai-lab-recipes, may depend on these in-tree container files.