Background#
Front-end development does not demand much from a computer, so some time ago I spent less than 1,000 yuan on an I5-6600 and a Mini ITX motherboard in a small case, and that has been my main development machine ever since. After upgrading to 32G of RAM, even the integrated graphics are more than sufficient for daily development tasks, and I have a MacBook available for iOS development. Recently, AI has become incredibly popular, and coincidentally some of the company's projects use Stable Diffusion for small painting applications. After trying it out, I found it quite interesting. Since the company's development environment runs on shared computing resources, I thought about adding a graphics card to my old hardware so that it would reach the entry threshold for playing with AI.
Component Selection#
AI painting needs a large amount of video memory to hold the model weights and the intermediate data produced during generation or training. More video memory does not by itself make the images better, but it determines how large a model, resolution, and batch size the card can handle before running out of memory, so capacity was a very important factor in my selection. From the many posts I read online, it seems that at least 8GB of video memory is needed for AI painting tasks, and 16GB or more is recommended for drawing higher-resolution images without running out of memory.
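To get a rough feel for these numbers, the memory taken by the model weights alone can be estimated as parameter count times bytes per parameter. The sketch below is only back-of-the-envelope arithmetic: the function name `weights_gib` and the parameter count of one billion are illustrative assumptions, not measurements of any particular model, and real usage is higher because intermediate data grows with image resolution and batch size.

```python
def weights_gib(num_params: int, bytes_per_param: int = 2) -> float:
    """Memory needed just to hold the weights, in GiB."""
    return num_params * bytes_per_param / 1024**3


# Illustrative parameter count only (an assumption, not a measured model size).
example_params = 1_000_000_000

# fp32 stores each weight in 4 bytes, fp16 in 2 bytes.
for label, width in (("fp32", 4), ("fp16", 2)):
    print(f"{label}: {weights_gib(example_params, width):.2f} GiB")
```

For this assumed size the script prints about 3.73 GiB for fp32 and 1.86 GiB for fp16, which is why half precision is a common way to fit a model into a smaller card.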
On the second-hand market, many older computing cards such as the P102, P104, P106, P40, M40, and P100 offer great value for money, with 8GB or even 16GB of video memory available for less than a thousand yuan, and in some cases for only two to three hundred yuan. I initially planned to use the P100 for this upgrade. However, the Pascal architecture is somewhat outdated, and the V100 is the next generation after the P100, with better performance at not much higher a price, so I decided on the V100 SXM2 16GB server computing card (the reason for choosing SXM2 is explained later).
Build Configuration#
Here is the configuration for this build:
- Motherboard: Gigabyte GA-B250M-DS3H
- CPU: Intel Core I5-6600
- Graphics Card: NVIDIA V100 SXM2 16G
- Memory: 2 x 16G 2666MHz from Guangwei
- Hard Drive: Seagate FireCuda 520 1TB SSD
The motherboard had to accept the old 6th generation I5 and leave room for a large graphics card, so I opted for an M-ATX board, which can be found for just over a hundred yuan. The old machine is compact and DC-powered and is not worth much, so I kept it, bought a low-power G4400T chip on the second-hand market for a little over twenty yuan, and installed Debian on it as a NAS device. I chose cheaper memory because the old motherboard does not support frequencies above 2400MHz anyway. The hard drive came from the old machine. I bought it for 299 yuan before the price increase, and now I regret not buying a couple more, as prices have skyrocketed.
The highlight of this build is the graphics card, a Tesla V100 in the SXM2 form factor, which needs an SXM2-to-PCIe adapter to be used on a consumer PC motherboard. The adapter costs roughly the same as the card itself, around 1300+ yuan. I did not choose the PCIe version of the card because it costs about as much as the SXM2 card and adapter together, if not a bit more. In addition, if this card becomes outdated in the future, I can keep the adapter and swap in another SXM2 card, and SXM2 cards are generally cheaper than their PCIe versions.
Why Not Choose Modified Graphics Cards#
It is well known that the modified 2080 Ti 22GB currently offers better cost performance for tasks like model training (often jokingly called "alchemy") and drawing, and its Turing architecture is more advanced. However, its memory chips have been replaced, and judging by many online reviews its stability is not very good. After weighing everything, I still chose the server computing card.
Modifying BIOS to Support Above 4G#
After assembling the components above and powering on, I was surprised to find that the machine would not boot. It went straight to the BIOS interface and displayed a long English message saying that it had detected insufficient PCI resources and could not drive the PCI device.
I hurriedly searched online for similar cases, and there was indeed a solution: find the switch called Above 4G in the motherboard BIOS and set it to enabled. If your motherboard BIOS can also enable Resizable BAR, even better, as that can improve performance.
Additionally, it is important to note that enabling Above 4G means that the system boot mode must also be changed to UEFI, and the CSM option in the motherboard BIOS needs to be set to disabled. You can find many tutorials online on how to reinstall the system via UEFI; they are plentiful and simple, so I won't elaborate further.
In summary, three options need to be set:
- Enable Above 4G
- Enable Resizable BAR (if available)
- Disable CSM compatibility for system boot
The catch was that, to my surprise, my Gigabyte B250M motherboard does not offer the Above 4G option at all. I searched for other cases involving the same motherboard and found no solution, so I had to unlock the hidden option in the BIOS myself.
Using AMIBCP to Enable Hidden Options in the Motherboard#
First, remove the graphics card from the motherboard, because the machine will not boot into the system while it is installed.
Once you have booted into the system, download the latest BIOS file for the motherboard from the official website to use as the base for modification.
Download the AMIBCP software and open the original BIOS file you just downloaded. Note that the file dialog filters by a limited set of file formats by default, so select the option for all file types to make the BIOS file show up.
Then, as shown in the image, find the Above 4G option, change Access/Use to User, and set the last two items to Enabled.
After making the changes, save the file, preferably under a new name, so that it is clearly distinguished from the original BIOS and you can flash the original back later without confusion.
Using AFUWINGUI to Flash the BIOS#
Taking the Gigabyte motherboard as an example, the modified BIOS file cannot be installed through the official update method, so it has to be flashed with a third-party method. There are many ways to do this, such as using a hardware programmer, but I chose the most convenient one, which is to flash it directly from Windows.
First, download and open AFUWINGUI.
Click the start button, select the modified BIOS file from above, then click the flash button on the right and wait for the flashing to complete. When it shows Done, the modified BIOS has been successfully flashed.
You can then open GPU-Z and, following the options shown in the image below, confirm that Above 4G has been enabled.
At this point, you can power down, install the graphics card, and restart.
Postscript#
As the "bumpy journey" in the title suggests, it was not that simple. After I installed the graphics card it was recognized normally, and after I installed NVIDIA's drivers, the video memory, frequencies, and so on all showed up as normal.
What puzzled me was that after rebooting, running nvidia-smi in the command line reported no devices. In Device Manager, the graphics card had a yellow triangle icon. I rebooted several times with no improvement, and it looked like a failure. I then uninstalled the driver, rebooted, and reinstalled it, and to my surprise the card was recognized again. However, after another reboot, the yellow triangle icon reappeared.
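Since this check had to be repeated after every reboot, it can be scripted. The following is a minimal sketch, assuming Python is installed and nvidia-smi is on the PATH. The helper names `parse_gpu_list` and `visible_gpus` are made up for illustration, and the script only reports what nvidia-smi sees; it does not fix anything.

```python
import shutil
import subprocess


def parse_gpu_list(csv_text: str) -> list[tuple[str, str]]:
    """Parse 'name, memory.total' CSV lines into (name, memory) pairs."""
    gpus = []
    for line in csv_text.splitlines():
        # Skip blank lines and plain messages that are not CSV rows.
        if "," not in line:
            continue
        name, _, memory = line.rpartition(",")
        gpus.append((name.strip(), memory.strip()))
    return gpus


def visible_gpus() -> list[tuple[str, str]]:
    """Ask nvidia-smi which GPUs the driver can currently see."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []  # driver tools not installed or not on PATH
    result = subprocess.run(
        [exe, "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []  # nvidia-smi could not talk to any device
    return parse_gpu_list(result.stdout)


if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus:
        for name, memory in gpus:
            print(f"Found {name} with {memory}")
    else:
        print("No NVIDIA GPU visible to the driver")
```

Running it right after login gives a quick yes or no on whether the card survived the reboot, without opening Device Manager.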
Could there be some missing settings? I was completely baffled and spent several days without finding the cause, and there were no related cases online.
Until one day, while looking back at the photos of the graphics card I took, I noticed that a capacitor had fallen off the upper right corner of the graphics card. I immediately contacted the seller for a replacement, suspecting that this might be the cause.
After a long wait for the replacement, I was surprised to find that the new graphics card had the same issue. After every reboot of Windows 10, I had to uninstall the driver and reinstall it before the graphics card would work properly. I suspected that the motherboard was too old or that something was wrong with the system drivers. However, since I rarely shut down my computer, I just made do with it, planning to try Windows 11 or look into the issue in other ways when I had time.
Later, I found that after installing the driver successfully once, I only needed to go into the NVIDIA driver management and disable ECC detection, and I verified that the driver was then no longer lost after rebooting.
A few days later, this method stopped working. I kept searching and eventually settled on a compromise: run a script that uninstalls the graphics card before shutting down. On the next power-on, the card is detected and driven normally. The reason is that the motherboard is a consumer-grade platform rather than a server platform, so it has fewer PCIe lanes, and once lanes are tight the graphics card can end up with insufficient resources. Uninstalling the card before shutdown lets the system skip detection of it during boot, and when the card is then detected after startup, it can be driven normally. We can create a one-click script for this:
Create a directory named Scripts on the C drive, create a .txt file in it, paste in the following content (make sure to choose ANSI encoding when saving), and save it as Uninstall-NVIDIA.ps1:
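# Friendly name of the GPU exactly as shown in Device Manager
$deviceName = "NVIDIA Tesla V100-SXM2-16GB"
# Find every PnP device with this name (there may be none)
$devices = Get-PnpDevice | Where-Object { $_.FriendlyName -eq $deviceName }
foreach ($device in $devices) {
    # Remove the device so that it is detected again on the next boot
    # (this may need to be run from an administrator prompt)
    pnputil.exe /remove-device $device.InstanceId
}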
In the same directory, create another .txt file, paste in the following content, and save it as Uninstall-GPU.bat:
@echo off
powershell.exe -ExecutionPolicy Bypass -File "C:\Scripts\Uninstall-NVIDIA.ps1"
Now, before each shutdown or reboot, just run the uninstall script Uninstall-GPU.bat, and the graphics card will be recognized normally the next time the system starts. This is currently the best solution I have found. If you do not want to do this, the only alternative is to switch to a motherboard with more PCIe lanes, such as X99 or X299.