If you're struggling to install CUDA on your Linux machine, rest assured you're not alone. Pretty much every CUDA installation tutorial I found was either incomplete or not for the faint of heart. I have managed to break my Ubuntu system twice already (the second time was while upgrading from CUDA 7.5 to 8.0, though by then I was a bit more careful and had all my files backed up).
This tutorial will hopefully help those of you who are still struggling with the installation, in case all else fails. It's probably not as exhaustive as I'd hoped, but at least it's out there for your reference.
The most common problem when installing CUDA is the NVIDIA driver. Uninstalling or reinstalling the driver during the CUDA installation process is a big no-no. You only need to install the toolkit, NOT the NVIDIA driver itself, because the driver already on your Linux machine is the stable version for your system.
Before we start, let's update the packages and make sure you have up-to-date NVIDIA drivers for Ubuntu. So, from the terminal, run:
Assuming you're installing CUDA on a freshly installed Linux system, or on one without a pre-existing CUDA installation, here are the steps:
1) From your terminal, download the latest CUDA 8 repo from the NVIDIA website:
2) Add the repo to your system:
3) Update package list:
4) Then, we install the toolkit:
The command above is the real trick to installing CUDA without touching the NVIDIA driver. If you run 'sudo apt-get install cuda' instead, it will try to install the driver as well, and the installation will most likely fail.
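Steps 2 to 4 boil down to something like the sketch below. Treat it as an illustration rather than the exact commands: the .deb file name is a placeholder for whichever repo package you downloaded in step 1, and the 'run' helper with its DRY_RUN switch is a hypothetical convenience that only prints the commands unless you set DRY_RUN=0. The package 'cuda-toolkit-8-0' is the toolkit-only package from NVIDIA's CUDA 8 repo.

```shell
#!/bin/sh
# Sketch of steps 2-4. By default nothing is executed: commands are only
# printed. Re-run with DRY_RUN=0 once the output looks right.
DRY_RUN="${DRY_RUN:-1}"

# ASSUMPTION: placeholder name. Point this at the repo .deb from step 1.
CUDA_REPO_DEB="${CUDA_REPO_DEB:-cuda-repo-placeholder.deb}"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

run sudo dpkg -i "$CUDA_REPO_DEB"           # step 2: add the repo
run sudo apt-get update                      # step 3: update the package list
run sudo apt-get install cuda-toolkit-8-0    # step 4: toolkit only, no driver
```

Printing first is handy here because a single wrong package name ('cuda' instead of 'cuda-toolkit-8-0') is exactly what pulls in the driver.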
5) The next step is to update the .bashrc configuration:
6) Then append the following lines:
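A minimal sketch of those lines, assuming the toolkit landed in the default /usr/local/cuda-8.0 (adjust the path if yours differs). The pattern follows NVIDIA's post-installation instructions: put the CUDA 'bin' folder on PATH and the 'lib64' folder on LD_LIBRARY_PATH.

```shell
# ASSUMPTION: default install location /usr/local/cuda-8.0.
# Prepend the CUDA binaries (nvcc and friends) to PATH.
export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
# Prepend the CUDA libraries; the ${VAR:+...} form avoids a stray
# trailing colon when LD_LIBRARY_PATH was empty to begin with.
export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
```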
7) Save the file and do 'source':
8) Now, we need to set up the environment:
9) Then add this line and save the file:
10) Source it:
11) Next is this one:
12) Source again:
13) Run 'ldconfig':
Now, you might see a warning that says '/sbin/ldconfig.real: /usr/lib32/nvidia-375/libEGL.so.1 is not a symbolic link' but we'll fix that later.
14) Do the same for the cuda-8-0.conf:
17) To fix the '/sbin/ldconfig.real: /usr/lib32/nvidia-375/libEGL.so.1 is not a symbolic link' warning, we need to write this in the terminal:
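One common way to clear this warning is to replace the regular file libEGL.so.1 with a symlink to the versioned library sitting next to it. The sketch below assumes that layout. The function name 'relink_libegl' is made up for illustration, and the exact version suffix (libEGL.so.375.xx) depends on your driver, so the script looks it up instead of hard-coding it. Take it as one possible approach rather than the only fix.

```shell
#!/bin/sh
# Replace a regular-file libEGL.so.1 with a symlink to the versioned library
# in the same directory, keeping the original as libEGL.so.1.org.
relink_libegl() {
    dir="$1"
    link="$dir/libEGL.so.1"
    # Nothing to do if the file is absent or is already a symlink.
    if [ ! -e "$link" ] || [ -L "$link" ]; then
        return 0
    fi
    # Find the versioned file, e.g. libEGL.so.375.xx (two or more digits,
    # so libEGL.so.1 itself never matches).
    target=$(ls "$dir"/libEGL.so.[0-9][0-9]* 2>/dev/null | head -n 1)
    if [ -z "$target" ]; then
        echo "no versioned libEGL found in $dir" >&2
        return 1
    fi
    mv "$link" "$link.org"
    ln -s "$target" "$link"
}

# Needs root on a real system (run the script with sudo).
relink_libegl /usr/lib32/nvidia-375
```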
18) Run 'ldconfig' again; hopefully the warning has disappeared.
19) The moment of truth: let's reboot the system, and let's also hope I didn't miss anything important. Once rebooted, run 'nvidia-smi' in your terminal; hopefully it will look something like this:
20) Finally, to verify that CUDA 8.0 has been installed, run 'nvcc --version', and you should see this info:
For those who managed to get the same result as mine: Congratulations, you've got CUDA 8.0 up and running! All of these steps were done without dropping to a TTY or stopping the X session. Pretty neat, right?
For those who still have problems with the installation, well, I'm so sorry. CUDA is very finicky to install and, depending on your system's configuration, there are quite a few potential errors that I did not cover in this tutorial, so you may have to search for the solutions yourself. Let me reiterate that this tutorial is for freshly installed Linux systems or machines that do not have any pre-existing CUDA installation.
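If you're in that second group, a quick first check is whether the two tools used for verification are reachable at all. The sketch below is only a starting point; 'report_tool' is a made-up helper name, and the hints in the comments are the likely causes rather than a full diagnosis.

```shell
#!/bin/sh
# Report whether a command is reachable through PATH.
report_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: $(command -v "$1")"
    else
        echo "$1: not found on PATH"
    fi
}

report_tool nvidia-smi   # ships with the driver; if missing, look at the driver
report_tool nvcc         # ships with the toolkit; if missing, re-check .bashrc
```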
Now, moving on to the cuDNN installation:
First, download cuDNN from the NVIDIA website. Installation is very simple: copy cudnn.h to the 'include' folder of your system's CUDA 8.0 installation, then copy all the files in the cuDNN lib64 folder to the corresponding lib64 folder.
However, you won't be able to copy and paste those files straight away, because the target folders are owned by root. You need to open the file manager as a superuser, so type:
A new window pops up and you may proceed to copy those files from there.
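If you'd rather stay in the terminal, the same copy can be sketched as a small function. The name 'install_cudnn' is hypothetical, and the default target /usr/local/cuda-8.0 is an assumption about where your toolkit lives; the first argument is the folder you extracted the cuDNN archive into (the one containing 'include' and 'lib64').

```shell
#!/bin/sh
# Copy cuDNN's header and libraries from an extracted archive into a CUDA tree.
#   $1: extracted cuDNN folder (contains include/ and lib64/)
#   $2: CUDA install folder (ASSUMPTION: /usr/local/cuda-8.0 by default)
install_cudnn() {
    src="$1"
    dest="${2:-/usr/local/cuda-8.0}"
    cp "$src/include/cudnn.h" "$dest/include/" || return 1
    # -P keeps the libcudnn.so symlinks as symlinks instead of duplicating files.
    cp -P "$src"/lib64/libcudnn* "$dest/lib64/" || return 1
    # Make the copied files readable for all users.
    chmod a+r "$dest/include/cudnn.h" "$dest"/lib64/libcudnn*
}
```

Writing into the default target needs root, so call the function from a script run with sudo, passing the path to your extracted cuDNN folder as the first argument.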
Thanks for reading and please feel free to drop a message if you have any questions. Cheers!