The first step
- Launch an AWS EC2 instance of type `g4dn.xlarge`
```bash
# Initialize the extra EBS volume
sudo file -s /dev/nvme2n1
lsblk -f
sudo mkfs -t xfs /dev/nvme2n1
sudo mount /dev/nvme2n1 /mnt

# Persist the mount info in `/etc/fstab`
# View the UUID
sudo blkid

# Write the entry (note UUID=, and tee because the redirect needs root)
echo "UUID=xxxxx-3047-437a-81f0-xxxxx /mnt xfs defaults,nofail 0 2" | sudo tee -a /etc/fstab
```
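Before rebooting, it is worth checking that the new `fstab` entry parses correctly; a quick sketch:

```bash
# Unmount, then remount everything listed in /etc/fstab;
# an error here means the new entry is wrong
sudo umount /mnt
sudo mount -a
# confirm the volume is back on /mnt
findmnt /mnt
```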
- Extend the root EBS volume
```bash
# Modify the EBS volume size in the AWS console first
# Extend the partition
sudo growpart /dev/nvme0n1 1
# Extend the filesystem (ext4 root)
sudo resize2fs /dev/nvme0n1p1
```
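To confirm the extra space is visible (assuming the root filesystem is mounted at `/`):

```bash
# The partition and the filesystem should both report the new size
lsblk /dev/nvme0n1
df -h /
```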
- Install Docker
```bash
# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

sudo docker run hello-world
```
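Optionally, running `docker` without `sudo` is more convenient for the rest of the steps; adding the current user to the `docker` group does that:

```bash
# Allow the current user to talk to the Docker daemon without sudo
sudo usermod -aG docker $USER
# log out and back in (or run `newgrp docker`) for this to take effect
```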
- Install the NVIDIA driver
```bash
sudo apt-get install -y nvidia-driver-525 nvidia-dkms-525
# a reboot may be needed before the driver loads
# View the GPU information
nvidia-smi
```
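If `nvidia-smi` fails right after the install, checking whether the kernel module is loaded usually tells you why:

```bash
# The nvidia kernel modules should appear once the driver is active
lsmod | grep nvidia
# if nothing shows up, reboot and run nvidia-smi again
# sudo reboot
```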
- Enable the Docker GPU runtime
```bash
# Add the NVIDIA repo to the system
distribution=ubuntu22.04 && \
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg && \
curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
# Install the NVIDIA Container Toolkit
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
# Configure the Docker runtime
sudo nvidia-ctk runtime configure --runtime=docker
# Restart the Docker daemon
sudo systemctl restart docker
# `nvidia` should now be listed among the runtimes
sudo docker info | grep Runtimes
```
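A quick smoke test that containers can actually see the GPU; the CUDA base image tag here is an assumption, any recent `nvidia/cuda` tag works:

```bash
# Should print the same table as nvidia-smi on the host
sudo docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
```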
- Prepare the PyTorch environment
```bash
# Pull the Docker image with PyTorch, CUDA and cuDNN preinstalled
docker pull pytorch/pytorch:2.4.0-cuda12.1-cudnn9-runtime
# Create a host path to mount into the container
sudo mkdir -p /mnt/models
# Run the container with GPU access, the model volume and the API port
docker run --gpus=all -it -v /mnt/models:/models -p 8000:8000 pytorch/pytorch:2.4.0-cuda12.1-cudnn9-runtime bash
# Install the Hugging Face CLI inside the container
pip install "huggingface_hub[cli]"
```
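Before downloading a 7B model it is worth confirming that PyTorch inside the container can see the GPU; a one-line sanity check:

```bash
# Run inside the container; should print True and the GPU name
python -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0))"
```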
- Download the models from Hugging Face inside the container
```bash
# Run these under /models inside the container
huggingface-cli download Qwen/Qwen2.5-7B-Instruct --local-dir=./Qwen2.5-7B-Instruct/ --cache-dir=./cache --local-dir-use-symlinks=False --resume-download
huggingface-cli download facebook/opt-125m --local-dir=./opt-125m/ --cache-dir=./cache --local-dir-use-symlinks=False --resume-download
```
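A quick check that the weights landed where vLLM expects them:

```bash
# Each model directory should contain config.json and *.safetensors shards
du -sh Qwen2.5-7B-Instruct/ opt-125m/
ls Qwen2.5-7B-Instruct/
```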
- Launch the LLM
```bash
# Inside the container, run from the /models directory
# vLLM is not included in the PyTorch image, so install it first
pip install vllm
# Make sure the model folder is under the current directory;
# on older GPUs without bfloat16 support (e.g. the T4 in g4dn), set --dtype explicitly
vllm serve Qwen2.5-7B-Instruct/ --dtype float
vllm serve opt-125m/ --dtype float
```
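`vllm serve` exposes an OpenAI-compatible API on port 8000, which was published when the container started. A minimal request from the host; note the model name must match the path passed to `vllm serve`:

```bash
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Qwen2.5-7B-Instruct/",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'
```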
- Deploy Dify
```bash
git clone https://github.com/langgenius/dify.git
cd dify/docker
cp .env.example .env
docker compose up -d

docker compose ps
```
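Once all the containers report healthy, a quick check that the web entry point responds (the bundled `.env` exposes nginx on port 80 by default):

```bash
curl -I http://localhost/install
```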
- Access Dify at `http://<your-ip>/install`