Unable to run PyTorch with GPU with message – no kernel image is available for execution on the device

The details are as follows –

Output of nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.40.04      Driver Version: 418.40.04      CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K40c          Off  | 00000000:42:00.0 Off |                    0 |
| 23%   35C    P0    65W / 235W |      0MiB / 11441MiB |     77%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Exception received –

Traceback (most recent call last):
  File "MetaWSD/train_wsd.py", line 108, in <module>
    meta_learner.training(train_episodes, val_episodes)
  File "/mnt/lustre/users/k21036268/wsd/MetaWSD/models/maml.py", line 121, in training
    ls, acc, prec, rcl, f1 = self.meta_model(epts, self.updates)
  File "/mnt/lustre/users/k21036268/wsd/lib64/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/lustre/users/k21036268/wsd/MetaWSD/models/seq_meta.py", line 109, in forward
    self.initialize_output_layer(episode.n_classes)
  File "/mnt/lustre/users/k21036268/wsd/MetaWSD/models/seq_meta.py", line 231, in initialize_output_layer
    device=self.device) + stdv
RuntimeError: CUDA error: no kernel image is available for execution on the device

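The failure does not appear to be specific to the MetaWSD code: any CUDA kernel launch raises the same error. A minimal check along the following lines (run in the same environment; the values in the comments are what I would expect given the setup above, not confirmed output) should reproduce it:

import torch

# Basic environment sanity checks (all of these work without launching a kernel)
print(torch.__version__)                    # 1.4.0
print(torch.version.cuda)                   # CUDA version the wheel was built against, e.g. 10.1
print(torch.cuda.is_available())            # True
print(torch.cuda.get_device_name(0))        # Tesla K40c
print(torch.cuda.get_device_capability(0))  # compute capability of the card, e.g. (3, 5)

# The first actual kernel launch is where the RuntimeError should appear
x = torch.ones(1, device="cuda")
print(x)
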
Output of env_collection.py

Collecting environment information...
PyTorch version: 1.4.0
Is debug build: False
CUDA used to build PyTorch: 10.1
ROCM used to build PyTorch: N/A

OS: CentOS Linux release 7.6.1810 (Core) (x86_64)
GCC version: (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39)
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.3.4

Python version: 3.6.8 (default, Apr 2 2020, 13:34:55) [GCC 4.8.5 20150623 (Red Hat 4.8.5-39)] (64-bit runtime)
Python platform: Linux-3.10.0-1062.9.1.el7.x86_64-x86_64-with-centos-7.6.1810-Core
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: Tesla K40c
Nvidia driver version: 418.40.04
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.19.5
[pip3] numpydoc==1.1.0
[pip3] pytorch-pretrained-bert==0.6.2
[pip3] pytorch-transformers==1.1.0
[pip3] torch==1.4.0
[pip3] torchtext==0.6.0
[pip3] torchvision==0.5.0
[conda] Could not collect

I have already tried setting export TORCH_CUDA_ARCH_LIST=All before installing PyTorch, but I am still receiving the same exception. As I am working on a cluster, I am not able to build PyTorch from source (gcc version 4.8.5 is not supported for manual builds).
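
To double-check whether the installed wheel actually ships a kernel image for this card, I was thinking of something along these lines. Note that torch.cuda.get_arch_list() only exists in newer PyTorch releases, so on 1.4 the hasattr guard will probably skip it; this is only a sketch of the comparison I would like to make (card capability vs. architectures compiled into the binary):

import torch

# Sketch: compare the card's compute capability with the architectures
# compiled into the installed binary. get_arch_list() may not exist on 1.4.
if hasattr(torch.cuda, "get_arch_list"):
    print(torch.cuda.get_arch_list())        # e.g. ['sm_37', 'sm_50', 'sm_60', ...]
else:
    print("torch.cuda.get_arch_list not available in this version")
print(torch.cuda.get_device_capability(0))   # (major, minor) of the Tesla K40c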
