About a week ago, I was asked to build a new server. It will be used for research purposes, so the spec is quite high: 16 dedicated CPU cores, 110 GB of RAM and an NVIDIA Tesla T4 GPU. It's running on Azure, and the applications needed on it are a little different from my usual work, so this was a lot of fun.
First, the VM type: it's a Standard_NC16as_T4_v3 server. You can't just go and buy one of these. You must create a support request with Microsoft so that they can release the number of cores required for this specific VM type, and that is a painful process. There were 200 processor cores available in that subscription, but not of the right type. There is a very useful category when creating a support request in the Azure Portal for requesting additional cores; what isn't so useful is that the portal didn't understand that I already had enough cores overall and that what I needed was cores of this specific family for the research server. I spoke to an HPC (High Performance Computing) specialist about something unrelated during the week and he knew what I was talking about right away, but it took over a week for Azure Support to understand what I was looking for and then make the required changes.
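As an aside, if you want to see how many vCPUs of each VM family your subscription can actually use in a region before raising a request, the Azure CLI will show the current quota and usage (assuming the CLI is installed and you are logged in; the location below is just an example):

az vm list-usage --location eastus --output table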
Moving on: once Microsoft did what they needed to, setting up the new server wasn't difficult. It was created within about 10 minutes of my finishing the VM creation wizard.
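For anyone who prefers the CLI to the portal wizard, the same kind of VM can be created with something along these lines. The resource group, VM name, admin user and image URN here are placeholders I've chosen for illustration, not values from my build:

az vm create --resource-group research-rg --name research-gpu-01 --size Standard_NC16as_T4_v3 --image Canonical:0001-com-ubuntu-server-focal:20_04-lts-gen2:latest --admin-username azureuser --generate-ssh-keys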
The main requirements for this server are CUDA and KenLM, and that is really what this post is about. I don't spend every day in a Linux environment, so when I need to install something like this that I won't use often, I rely heavily on documentation. It's not that I couldn't hunt down all the installation sources and dependencies myself, but that would be a waste of time, and time is not something I like to waste.
I took notes during this process. These include the commands that I used to install everything and the various sources I read through to learn a bit more about what I was installing and how it could and should be done.
- I was given the option of quickly installing CUDA using Docker, but I didn't want any performance issues to slow down this research project. In the cloud, time is money. To be sure, I investigated sources online to see whether there were major performance issues, and I quickly found many. Here's just one.
- Install the NVIDIA drivers.
- This documentation regarding installing CUDA on Ubuntu 20.04 is out of date and initially took me in the wrong direction. It resulted in the installation of CUDA 10.1, not the preferred version, 11.6.
- This describes the purposes and benefits of the Kaldi library.
- Getting started with Kaldi and speech to text. This is an NVIDIA-created document.
- While researching this, I found a lot of documentation indicating that I would run into trouble with NVIDIA drivers and potential conflicts with kernel and/or module versions (see the diagnostic commands after this list).
- Here’s another document regarding trouble people have encountered when trying to install specific NVIDIA drivers. Many people seem to downgrade the driver, but I didn’t want to consider that an option if I could avoid it.
- Kaldi is updated regularly, but I see there’s a competing project.
- It’s called Coqui-ai
- I thought I was going to be installing this from source at one point. So here’s the Kaldi Git repository.
- I ran into problems several times trying to find the right versions or fix unmet dependencies. There was a command in this post that I referred to frequently. It’s a no-brainer really, but I always prefer copying and pasting when I can avoid typing, even when it’s something simple.
That command is:
apt clean; apt update; apt purge cuda; apt purge nvidia-*; apt autoremove
- A poster on this forum topic provided the list of unmet dependencies for the 470 version of the NVIDIA GPU driver. So this saved me from dependency recursion hell.
- I finally found this fantastic NVIDIA source that gives you several choices and then provides a resulting list of commands that basically does everything. I could have saved myself over an hour of trial and error if I had found this first.
- I then needed to install KenLM. This was easy thanks to a guide found here.
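Since a couple of the sources above warn about conflicts between the kernel and the NVIDIA kernel module, these are the generic checks I used to see which kernel I was running and whether an NVIDIA module was registered and loaded. They are standard commands rather than something taken from any one of those guides (the last one only works once a driver is actually loaded):

uname -r
dkms status
lsmod | grep nvidia
cat /proc/driver/nvidia/version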
In case anyone copies and pastes the following lines, I am going to precede my comments with #.
# First you need to determine the GPU that you have and the suggested driver. Fortunately, this is way easier than it used to be.
apt install ubuntu-drivers-common
ubuntu-drivers devices
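# You can also confirm the GPU hardware directly:
lspci | grep -i nvidia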
# Do not use this next command. It installs far too much and will result in massive dependency issues when you go to install CUDA.
# ubuntu-drivers autoinstall
# After installing the GPU driver, you must reboot.
reboot now
# The following command will install the NVIDIA GPU driver along with the packages that resolve its unmet dependencies.
apt install nvidia-driver-470 libnvidia-gl-470 libnvidia-compute-470 libnvidia-decode-470 libnvidia-encode-470 libnvidia-ifr1-470 libnvidia-fbc1-470
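# Once the driver is installed and you have rebooted, you can verify that it loaded and can see the Tesla T4:
nvidia-smi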
# The following commands add the NVIDIA CUDA repository and then install CUDA and its dependencies.
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/7fa2af80.pub
add-apt-repository "deb https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/ /"
apt-get update
apt-get -y install cuda
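# Note: the repository also provides versioned meta-packages if you want to pin a specific release
# rather than taking the latest; for example (adjust the version to whatever you actually need):
# apt-get -y install cuda-11-6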
# Add the CUDA binaries to your path:
echo 'export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}' >> ~/.bashrc
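# You may also want the CUDA libraries on your library path, and the exports only take effect
# in new shells unless you reload your profile:
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}' >> ~/.bashrc
source ~/.bashrc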
# You can check that CUDA is installed and that the installed version is as expected:
nvcc --version
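# If you want to go a step further than nvcc --version, you can compile and run a tiny CUDA program
# that asks the runtime how many devices it can see. The file name check_cuda.cu is just something
# I picked for this example:
cat << 'EOF' > check_cuda.cu
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaError_t err = cudaGetDeviceCount(&count);
    if (err != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    std::printf("CUDA devices found: %d\n", count);
    return 0;
}
EOF
nvcc check_cuda.cu -o check_cuda
./check_cuda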
# If, at some point, you need to start again, this one-liner will remove all the NVIDIA and CUDA packages that you might have installed using apt / apt-get.
# apt clean; apt update; apt purge cuda; apt purge nvidia-*; apt autoremove
# The following lines will install KenLM on Ubuntu 20.04.
apt-get update
apt-get install -y build-essential libboost-all-dev cmake zlib1g-dev libbz2-dev liblzma-dev
git clone https://github.com/kpu/kenlm
cd kenlm/
mkdir build
cd build
cmake ..
make -j 4
make install
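# After make install, the KenLM binaries (lmplz, build_binary, query) should be on your path.
# As a quick smoke test, you can train a small 3-gram model and convert it to KenLM's binary format.
# corpus.txt is just a placeholder name for whatever plain-text training file you have:
lmplz -o 3 < corpus.txt > model.arpa
build_binary model.arpa model.binary
echo "this is a test sentence" | query model.binary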