Automatically adjusting GPU clock by temperature

2011-07-12 09:34 by Ian

My friend and I have built a BTC mining rig named hurrMiner. It contains two Radeon 6990 graphics cards that run very hot (~95°C). The box is very loud and doesn’t require regular user interaction, so we opted to build an exhaust system into the garage and put hurrMiner outside to contend with the Arizona summer heat.

Apparently there is a thermal shutoff built into the card, but we really don’t want to let the GPUs reach that point. What follows is my documentation for a script that monitors the temperature of each GPU and adjusts its clock to approach a given target temperature.

Before we begin the setup and doc… Source (tempmonitor.php).

Overview:
This script was designed to do two things: regulate each GPU’s clock according to its temperature, and report statistics.

The stats reporting is a secondary function that you can ignore. The core function of the script is to regulate the clocks to avoid baking the GPUs, and that requires no computer other than the mining rig it runs on. If you have another server someplace running PHP and Apache, you can get more out of the statistics reports, but that will be covered in another post. For now, I will describe the temperature monitor.

Setup:
Prerequisites:

The script uses the exec() function to call aticonfig, so the first requirement is a php.ini that allows PHP to run external programs through exec() (for example, exec must not be listed in disable_functions). You also need the ATI drivers installed and able to see all of your GPUs. If you are reading about this script, I would hope you have already come this far.
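To give a feel for what calling aticonfig and reading its output involves, here is a rough sketch. It is written in Python rather than the script’s PHP, and it is not the code from tempmonitor.php. The sample text is an assumption about what `aticonfig --odgt --adapter=all` prints; check it against your own driver version. The function names are made up for illustration.

```python
import re
import subprocess

# Assumed shape of `aticonfig --odgt --adapter=all` output (not taken from the post).
SAMPLE_OUTPUT = """
Adapter 0 - AMD Radeon HD 6900 Series
            Sensor 0: Temperature - 94.50 C

Adapter 1 - AMD Radeon HD 6900 Series
            Sensor 0: Temperature - 91.00 C
"""

ADAPTER_RE = re.compile(r"\s*Adapter\s+(\d+)\s+-")
TEMP_RE = re.compile(r"Temperature\s+-\s+([\d.]+)\s*C")


def parse_temperatures(text):
    """Return {adapter_index: temperature_celsius} from aticonfig-style text."""
    temps = {}
    current = None
    for line in text.splitlines():
        adapter = ADAPTER_RE.match(line)
        if adapter:
            # Remember which adapter the following sensor line belongs to.
            current = int(adapter.group(1))
            continue
        temp = TEMP_RE.search(line)
        if temp and current is not None:
            temps[current] = float(temp.group(1))
    return temps


def read_temperatures():
    """Run aticonfig and parse its output (the Python analogue of PHP's exec())."""
    result = subprocess.run(
        ["aticonfig", "--odgt", "--adapter=all"],
        capture_output=True, text=True, check=True,
    )
    return parse_temperatures(result.stdout)


print(parse_temperatures(SAMPLE_OUTPUT))  # {0: 94.5, 1: 91.0}
```

Keeping the parsing separate from the call means the parser can be tried against pasted output on a machine that has no ATI hardware at all.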

We are using the Phoenix miner, so our function that parses hashrates reflects this. We switched from poclbm, so that will work as well, but you will have to make small changes to the log parser to get a hashrate from it. Again, none of this is required to adjust the GPUs automatically; it is only used to collect performance statistics.
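For the curious, pulling a hashrate out of a miner log can be as small as one regular expression. The sketch below is Python, not the PHP from the script, and the status-line format in it is an assumption about what Phoenix writes; the pattern and the function name are hypothetical and would need adjusting for your log or for poclbm.

```python
import re

# Assumed Phoenix status-line format (an illustration, not copied from a real log).
SAMPLE_LOG = (
    "[348.12 Mhash/sec] [12 Accepted] [0 Rejected] [RPC (+LP)]\n"
    "[351.40 Mhash/sec] [13 Accepted] [0 Rejected] [RPC (+LP)]\n"
)

HASHRATE_RE = re.compile(r"\[([\d.]+)\s*Mhash/sec\]")


def last_hashrate_mhash(log_text):
    """Return the most recent hashrate in Mhash/sec, or None if none was found."""
    matches = HASHRATE_RE.findall(log_text)
    if not matches:
        return None
    return float(matches[-1])


print(last_hashrate_mhash(SAMPLE_LOG))  # 351.4
```

Supporting a different miner should then only mean swapping the pattern, which is the kind of small change described above.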

Features:
Adapters do not need to be manually defined. The script will parse all adapter information from aticonfig and store a representation of each GPU as an object. This allows for clean encapsulation and easy individual control over each GPU. If the temperature really starts to fly away (as it occasionally does), the script will start cutting the clock more aggressively to bring the temperature under control.
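The idea of one object per GPU, with harder clock cuts when the temperature runs away, can be sketched roughly as follows. This is Python rather than the script’s PHP, and every number in it (target temperature, thresholds, step sizes, clock limits) is a made-up placeholder, not a value from tempmonitor.php. It only computes the next clock; actually applying it through aticonfig is left out.

```python
from dataclasses import dataclass


@dataclass
class Gpu:
    """One adapter and its clock limits (all limits here are hypothetical)."""
    adapter: int
    clock_mhz: int
    min_clock_mhz: int = 500
    max_clock_mhz: int = 920

    def next_clock(self, temp_c, target_c=85.0):
        """Pick the next core clock from the distance to the target temperature."""
        error = temp_c - target_c
        if error > 5:
            step = -20   # running away: cut the clock aggressively
        elif error > 0:
            step = -5    # a little too hot: back off gently
        elif error < -2:
            step = 5     # comfortably cool: creep back up
        else:
            step = 0     # close enough to the target: hold
        wanted = self.clock_mhz + step
        # Never leave the allowed clock range.
        return max(self.min_clock_mhz, min(self.max_clock_mhz, wanted))


gpu = Gpu(adapter=0, clock_mhz=880)
print(gpu.next_clock(95.0))  # 860
print(gpu.next_clock(87.0))  # 875
print(gpu.next_clock(80.0))  # 885
```

The small dead band just below the target keeps the clock from flapping up and down every time the reading moves by a fraction of a degree.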

The script reports these logs to derpServ once per second for each GPU in the system:

Results:
Here is a page with graphical output from the stats collector. At the time of this writing, GPU_3 is getting the best ventilation, and so runs at 920 MHz all through the hottest part of the day. The other GPUs tend to hover in the low 800 MHz range until a few hours after the sun sets.

Update 2016.02.26:

The mining rig was taken apart when it stopped being profitable, and its parts were sold off to friends. A friend who bought one of the 6990 cards just informed me that it died today, and won’t be coming back. I don’t know how much of the card’s lifespan can be attributed to my software, but I’d like to think it was most of it. I beat the hell out of those cards.
