GPUtil
GPUtil is a Python module for getting the GPU status from NVIDIA GPUs using nvidia-smi.
GPUtil locates all GPUs on the computer, determines their availability and returns an ordered list of available GPUs.
Availability is based upon the current memory consumption and load of each GPU.
The module is written with GPU selection for Deep Learning in mind, but it is not task/library specific, and it can be applied to any task where it may be useful to identify available GPUs.
Requirements
NVIDIA GPU with latest NVIDIA driver installed.
GPUtil uses the program nvidia-smi to get the GPU status of all available NVIDIA GPUs. nvidia-smi should be installed automatically when you install your NVIDIA driver.
Supports both Python 2.X and 3.X.
Python libraries:
- subprocess (The Python Standard Library)
- distutils (The Python Standard Library)
- math (The Python Standard Library)
- random (The Python Standard Library)
- time (The Python Standard Library)
- os (The Python Standard Library)
- sys (The Python Standard Library)
- platform (The Python Standard Library)
Tested on CUDA driver version 390.77 with Python 2.7 and 3.5.
Installation
- Open a terminal (Ctrl+Shift+T)
- Type:
      pip install gputil
- Test the installation
  - Open a terminal in a folder other than the GPUtil folder
  - Start a python console by typing python in the terminal
  - In the newly opened python console, type:
        import GPUtil
        GPUtil.showUtilization()
  - Your output should look something like the following, depending on your number of GPUs and their current usage:
         ID  GPU  MEM
        --------------
          0   0%   0%
Old way of installation
- Download or clone repository to your computer
- Add the GPUtil folder to ~/.bashrc
  - Open a new terminal (Press Ctrl+Alt+T)
  - Open bashrc:
        gedit ~/.bashrc
  - Add your GPUtil folder to the environment variable PYTHONPATH (replace <path_to_gputil> with your folder path):
        export PYTHONPATH="$PYTHONPATH:<path_to_gputil>"
    Example:
        export PYTHONPATH="$PYTHONPATH:/home/anderskm/github/gputil"
  - Save ~/.bashrc and close gedit
  - Restart your terminal
- Test the installation
  - Open a terminal in a folder other than the GPUtil folder
  - Start a python console by typing python in the terminal
  - In the newly opened python console, type:
        import GPUtil
        GPUtil.showUtilization()
  - Your output should look something like the following, depending on your number of GPUs and their current usage:
         ID  GPU  MEM
        --------------
          0   0%   0%
Usage
To include GPUtil in your Python code, all you have to do is include it at the beginning of your script:
import GPUtil
Once imported, all functions are available. The functions, along with a short description of their inputs, outputs and functionality, can be found in the following two sections.
Main functions
deviceIDs = GPUtil.getAvailable(order = 'first', limit = 1, maxLoad = 0.5, maxMemory = 0.5, includeNan=False, excludeID=[], excludeUUID=[])
Returns a list of ids of available GPUs. Availability is determined based on current memory usage and load. The order, maximum number of devices, their maximum load and maximum memory consumption are determined by the input arguments.
- Inputs
  - order - Determines the order in which the available GPU device ids are returned. order should be specified as one of the following strings:
    - 'first' - orders available GPU device ids by ascending id (default)
    - 'last' - orders available GPU device ids by descending id
    - 'random' - orders the available GPU device ids randomly
    - 'load' - orders the available GPU device ids by ascending load
    - 'memory' - orders the available GPU device ids by ascending memory usage
  - limit - limits the number of GPU device ids returned to the specified number. Must be a positive integer. (default = 1)
  - maxLoad - Maximum current relative load for a GPU to be considered available. GPUs with a load larger than maxLoad are not returned. (default = 0.5)
  - maxMemory - Maximum current relative memory usage for a GPU to be considered available. GPUs with a current memory usage larger than maxMemory are not returned. (default = 0.5)
  - includeNan - True/false flag indicating whether to include GPUs where either load or memory usage is NaN (indicating usage could not be retrieved). (default = False)
  - excludeID - List of IDs which should be excluded from the list of available GPUs. See GPU class description. (default = [])
  - excludeUUID - Same as excludeID except it uses the UUID. (default = [])
- Outputs
  - deviceIDs - list of all available GPU device ids. A GPU is considered available if its current load and memory usage are less than maxLoad and maxMemory, respectively. The list is ordered according to order. The maximum number of returned device ids is limited by limit.
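As a sketch of how getAvailable might be used to pick GPUs for a job, the snippet below requests up to two lightly loaded GPUs and exposes them to a deep learning framework via CUDA_VISIBLE_DEVICES. The try/except fallback is only there so the snippet also runs on machines without GPUtil or an NVIDIA GPU installed.

```python
import os

try:
    import GPUtil
    # Ask for up to 2 GPUs, least-used memory first,
    # that are below 80% load and 80% memory usage.
    device_ids = GPUtil.getAvailable(order='memory', limit=2,
                                     maxLoad=0.8, maxMemory=0.8)
except Exception:
    device_ids = []  # GPUtil not installed, or nvidia-smi not available

if device_ids:
    # Restrict frameworks such as TensorFlow or PyTorch to the chosen GPUs.
    os.environ['CUDA_VISIBLE_DEVICES'] = ','.join(str(i) for i in device_ids)
else:
    print('No available GPU found; falling back to CPU.')
```

Setting CUDA_VISIBLE_DEVICES before the framework initializes CUDA is what makes the selection stick; the framework then only sees the listed devices.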
deviceID = GPUtil.getFirstAvailable(order = 'first', maxLoad=0.5, maxMemory=0.5, attempts=1, interval=900, verbose=False)
Returns the first available GPU. Availability is determined based on current memory usage and load, and the ordering is determined by the specified order.
If no available GPU is found, an error is thrown.
When using the default values, it is the same as getAvailable(order = 'first', limit = 1, maxLoad = 0.5, maxMemory = 0.5).
- Inputs
  - order - See the description for GPUtil.getAvailable(...)
  - maxLoad - Maximum current relative load for a GPU to be considered available. GPUs with a load larger than maxLoad are not returned. (default = 0.5)
  - maxMemory - Maximum current relative memory usage for a GPU to be considered available. GPUs with a current memory usage larger than maxMemory are not returned. (default = 0.5)
  - attempts - Number of attempts the function should make before giving up on finding an available GPU. (default = 1)
  - interval - Interval in seconds between each attempt to find an available GPU. (default = 900 --> 15 mins)
  - verbose - If True, prints the attempt number before each attempt and the GPU id if an available one is found.
  - includeNan - See the description for GPUtil.getAvailable(...). (default = False)
  - excludeID - See the description for GPUtil.getAvailable(...). (default = [])
  - excludeUUID - See the description for GPUtil.getAvailable(...). (default = [])
- Outputs
  - deviceID - list with 1 element containing the first available GPU device id. A GPU is considered available if its current load and memory usage are less than maxLoad and maxMemory, respectively. The order and limit are fixed to 'first' and 1, respectively.
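A sketch of waiting for a GPU to free up using the attempts and interval arguments (the retry values here are deliberately small for illustration; the guard lets the snippet run on machines without GPUtil or a GPU):

```python
try:
    import GPUtil
    # Retry every 5 seconds, up to 3 times, before giving up.
    first = GPUtil.getFirstAvailable(order='load', maxLoad=0.5, maxMemory=0.5,
                                     attempts=3, interval=5, verbose=True)
    device_id = first[0]  # getFirstAvailable returns a 1-element list
except Exception:
    device_id = None  # GPUtil missing, no GPU, or none became available

print('Selected GPU:', device_id)
```

In a cluster setting you would typically use larger values (e.g. the default interval of 900 seconds) so a queued job can wait for a GPU to become free.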
GPUtil.showUtilization(all=False, attrList=None, useOldCode=False)
Prints the current status (id, memory usage, uuid, load) of all GPUs.
- Inputs
  - all - True/false flag indicating if all info on the GPUs should be shown. Overwrites attrList.
  - attrList - List of lists of GPU attributes to display. See code for more information/example.
  - useOldCode - True/false flag indicating if the old code to display GPU utilization should be used.
- Outputs
- None
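A minimal sketch of the two display modes (guarded so it also runs without GPUtil or an NVIDIA driver; on a GPU machine each call prints a utilization table):

```python
try:
    import GPUtil
    GPUtil.showUtilization()          # compact view: ID, GPU load %, MEM %
    GPUtil.showUtilization(all=True)  # verbose view with all GPU attributes
    shown = True
except Exception:
    shown = False  # GPUtil not installed or nvidia-smi not available

print('Utilization shown:', shown)
```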
Helper functions
class GPU
Helper class that handles the attributes of each GPU. Quoted descriptions are copied from the corresponding descriptions by nvidia-smi.
- Attributes for each GPU
  - id - "Zero based index of the GPU. Can change at each boot."
  - uuid - "This value is the globally unique immutable alphanumeric identifier of the GPU. It does not correspond to any physical label on the board. Does not change across reboots."
  - load - Relative GPU load. 0 to 1 (100%, full load). "Percent of time over the past sample period during which one or more kernels was executing on the GPU. The sample period may be between 1 second and 1/6 second depending on the product."
  - memoryUtil - Relative memory usage from 0 to 1 (100%, full usage). "Percent of time over the past sample period during which global (device) memory was being read or written. The sample period may be between 1 second and 1/6 second depending on the product."
  - memoryTotal - "Total installed GPU memory."
  - memoryUsed - "Total GPU memory allocated by active contexts."
  - memoryFree - "Total free GPU memory."
  - driver - "The version of the installed NVIDIA display driver."
  - name - "The official product name of the GPU."
  - serial - "This number matches the serial number physically printed on each board. It is a globally unique immutable alphanumeric value."
  - display_mode - "A flag that indicates whether a physical display (e.g. monitor) is currently connected to any of the GPU's connectors. "Enabled" indicates an attached display. "Disabled" indicates otherwise."
  - display_active - "A flag that indicates whether a display is initialized on the GPU's
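The attributes above can be read directly via GPUtil.getGPUs(), which returns a list of GPU objects, one per detected NVIDIA GPU (guarded here so the sketch also runs on machines without GPUtil or a GPU; on such machines the list is simply empty):

```python
try:
    import GPUtil
    gpus = GPUtil.getGPUs()  # one GPU object per detected NVIDIA GPU
except Exception:
    gpus = []  # GPUtil not installed or nvidia-smi not available

for gpu in gpus:
    # Each attribute mirrors a field reported by nvidia-smi.
    print(f'GPU {gpu.id} ({gpu.name}): '
          f'load {gpu.load * 100:.0f}%, '
          f'memory {gpu.memoryUsed:.0f}/{gpu.memoryTotal:.0f} MB, '
          f'driver {gpu.driver}')
```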
