About This Page
This page is part of the Azure documentation. It contains code examples and configuration instructions for working with Azure services.
Bias Analysis
Bias Types:
- powershell_heavy
- windows_first
- windows_tools
- missing_linux_example
Summary:
The documentation demonstrates a strong Windows bias. All command-line examples are shown in PowerShell, with Windows paths and prompts. Device management commands (e.g., Get-HcsGpuNvidiaSmi, Start-HcsGpuMPS) are specific to PowerShell and Windows. There are no Linux or cross-platform command examples, and Linux tools or shell usage are not mentioned. The workflow assumes a Windows client environment throughout.
Recommendations:
- Provide equivalent Linux/bash command examples alongside PowerShell, using standard Linux shell prompts and paths.
- Document how to connect to and manage the Azure Stack Edge Pro GPU device from a Linux client, including any required tools or differences.
- Clarify whether device management commands (e.g., Get-HcsGpuNvidiaSmi, Start-HcsGpuMPS) are available or have equivalents on Linux, or if they are Windows-only.
- Update the prerequisites and instructions to explicitly support Linux clients, not just Windows.
- Where possible, use cross-platform tools and neutral language (e.g., 'terminal' instead of 'PowerShell interface') and avoid assuming C:\ paths.
- If certain features are Windows-only, clearly state this and provide alternative guidance for Linux users.
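One way to follow the path-neutral recommendation above is to build the kubectl invocation from a script instead of hard-coding a C:\ path. The sketch below is only an illustration: the helper name `kubectl_args` and the relative `gpu-sharing` folder are assumptions, not part of the original documentation. The namespace and manifest file name are taken from the flagged snippets.

```python
from pathlib import Path
from typing import List


def kubectl_args(action: str, manifest: Path, namespace: str) -> List[str]:
    """Build the argv list for `kubectl -n <namespace> <action> -f <manifest>`."""
    if action not in {"apply", "delete"}:
        raise ValueError(f"unsupported action: {action}")
    # str(manifest) renders with the separator of the current platform,
    # so the same script works from a Windows or a Linux client.
    return ["kubectl", "-n", namespace, action, "-f", str(manifest)]


# Hypothetical manifest location, relative to the current working directory.
manifest = Path("gpu-sharing") / "k8-gpusharing.yaml"
args = kubectl_args("apply", manifest, "mynamesp1")
print(" ".join(args))

# To actually run the command (needs kubectl and a configured kubeconfig):
# import subprocess
# subprocess.run(args, check=True)
```

Passing the arguments as a list avoids shell quoting differences between PowerShell and bash.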
Flagged Code Snippets
PS C:\WINDOWS\system32> kubectl logs -n mynamesp1 cuda-sample2-db9vx
Run "nbody -benchmark [-numbodies=<numBodies>]" to measure performance.
===========// CUT //===================// CUT //=====================
> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
GPU Device 0: "Turing" with compute capability 7.5
> Compute 7.5 CUDA device: [Tesla T4]
40960 bodies, total time for 10000 iterations: 170368.859 ms
= 98.476 billion interactions per second
= 1969.517 single-precision GFLOP/s at 20 flops per interaction
PS C:\WINDOWS\system32>
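The throughput figures in the log above can be cross-checked from the body count, iteration count, and total time. The sketch below assumes the benchmark counts one interaction per ordered pair of bodies per iteration (numBodies squared) and uses the 20 flops per interaction stated in the log. The function name `nbody_stats` is made up for this illustration.

```python
def nbody_stats(num_bodies: int, iterations: int, total_ms: float,
                flops_per_interaction: int = 20):
    """Return (billion interactions per second, GFLOP/s) for an nbody run."""
    seconds = total_ms / 1000.0
    # Assumption: every body interacts with every body once per iteration.
    interactions_per_s = num_bodies ** 2 * iterations / seconds
    gflops = interactions_per_s * flops_per_interaction / 1e9
    return interactions_per_s / 1e9, gflops


# Values copied from the log above.
billions, gflops = nbody_stats(40960, 10000, 170368.859)
print(f"{billions:.3f} billion interactions per second")
print(f"{gflops:.3f} single-precision GFLOP/s")
```

Under that assumption the computed values agree with the logged 98.476 billion interactions per second and 1969.517 GFLOP/s to the printed precision.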
[10.100.10.10]: PS>Get-HcsGpuNvidiaSmi
K8S-1HXQG13CL-1HXQG13:
Wed Mar 3 12:32:52 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00002C74:00:00.0 Off |                    0 |
| N/A   38C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
[10.100.10.10]: PS>
PS C:\WINDOWS\system32> kubectl get pods -n mynamesp1
NAME READY STATUS RESTARTS AGE
cuda-sample1-vcznt 0/1 Completed 0 5m44s
cuda-sample2-zkx4w 0/1 Completed 0 5m44s
PS C:\WINDOWS\system32> kubectl logs -n mynamesp1 cuda-sample1-vcznt
Run "nbody -benchmark [-numbodies=<numBodies>]" to measure performance.
===========// CUT //===================// CUT //=====================
> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
GPU Device 0: "Turing" with compute capability 7.5
> Compute 7.5 CUDA device: [Tesla T4]
40960 bodies, total time for 10000 iterations: 154979.453 ms
= 108.254 billion interactions per second
= 2165.089 single-precision GFLOP/s at 20 flops per interaction
PS C:\WINDOWS\system32> kubectl logs -n mynamesp1 cuda-sample2-zkx4w
Run "nbody -benchmark [-numbodies=<numBodies>]" to measure performance.
===========// CUT //===================// CUT //=====================
> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
GPU Device 0: "Turing" with compute capability 7.5
> Compute 7.5 CUDA device: [Tesla T4]
40960 bodies, total time for 10000 iterations: 154986.734 ms
= 108.249 billion interactions per second
= 2164.987 single-precision GFLOP/s at 20 flops per interaction
PS C:\WINDOWS\system32>
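The two logs above (pods cuda-sample1-vcznt and cuda-sample2-zkx4w) belong to the run that follows Start-HcsGpuMPS elsewhere on this page, while the cuda-sample1-27srm and cuda-sample2-db9vx logs come from the run in Default compute mode. Assuming that grouping is right, the following sketch compares the two configurations using the total times printed in the logs. It is a back-of-the-envelope comparison of four data points, not a benchmark methodology.

```python
from statistics import mean

# Total time for 10000 iterations, in milliseconds, copied from the logs.
default_ms = [170398.766, 170368.859]  # cuda-sample1-27srm, cuda-sample2-db9vx
mps_ms = [154979.453, 154986.734]      # cuda-sample1-vcznt, cuda-sample2-zkx4w

speedup = mean(default_ms) / mean(mps_ms)
saved_pct = 100.0 * (1.0 - mean(mps_ms) / mean(default_ms))

print(f"speedup with MPS: {speedup:.3f}x")
print(f"wall-clock time saved: {saved_pct:.1f}%")
```

With these figures the MPS run finishes roughly 9 percent sooner per job.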
PS>Get-HcsGpuNvidiaSmi
K8S-1HXQG13CL-1HXQG13:
Wed Mar 3 21:59:55 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 0000E00B:00:00.0 Off |                    0 |
| N/A   37C    P8     9W /  70W |     28MiB / 15109MiB |      0%   E. Process |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A    144443      C   nvidia-cuda-mps-server             25MiB |
+-----------------------------------------------------------------------------+
PS C:\WINDOWS\system32> kubectl get pods -n mynamesp1
No resources found.
[10.100.10.10]: PS>Start-HcsGpuMPS
K8S-1HXQG13CL-1HXQG13:
Set compute mode to EXCLUSIVE_PROCESS for GPU 00002C74:00:00.0.
All done.
Created nvidia-mps.service
[10.100.10.10]: PS>
PS C:\WINDOWS\system32> kubectl -n mynamesp1 delete -f C:\gpu-sharing\k8-gpusharing.yaml
job.batch "cuda-sample1" deleted
job.batch "cuda-sample2" deleted
PS C:\WINDOWS\system32> kubectl get pods -n mynamesp1
No resources found.
PS C:\WINDOWS\system32> kubectl -n mynamesp1 apply -f C:\gpu-sharing\k8-gpusharing.yaml
job.batch/cuda-sample1 created
job.batch/cuda-sample2 created
PS C:\WINDOWS\system32> kubectl get pods -n mynamesp1
NAME READY STATUS RESTARTS AGE
cuda-sample1-vcznt 1/1 Running 0 21s
cuda-sample2-zkx4w 1/1 Running 0 21s
PS C:\WINDOWS\system32> kubectl -n mynamesp1 describe job.batch/cuda-sample1; kubectl -n mynamesp1 describe job.batch/cuda-sample2
Name: cuda-sample1
Namespace: mynamesp1
Selector: controller-uid=ed06bdf0-a282-4b35-a2a0-c0d36303a35e
Labels: controller-uid=ed06bdf0-a282-4b35-a2a0-c0d36303a35e
job-name=cuda-sample1
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"batch/v1","kind":"Job","metadata":{"annotations":{},"name":"cuda-sample1","namespace":"mynamesp1"},"spec":{"backoffLimit":1...
Parallelism: 1
Completions: 1
Start Time: Wed, 03 Mar 2021 21:51:51 -0800
Pods Statuses: 1 Running / 0 Succeeded / 0 Failed
Pod Template:
Labels: controller-uid=ed06bdf0-a282-4b35-a2a0-c0d36303a35e
job-name=cuda-sample1
Containers:
cuda-sample-container1:
Image: nvidia/samples:nbody
Port: <none>
Host Port: <none>
Command:
/tmp/nbody
Args:
-benchmark
-i=10000
Environment:
NVIDIA_VISIBLE_DEVICES: 0
Mounts: <none>
Volumes: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 46s job-controller Created pod: cuda-sample1-vcznt
Name: cuda-sample2
Namespace: mynamesp1
Selector: controller-uid=6282b8fa-e76d-4f45-aa85-653ee0212b29
Labels: controller-uid=6282b8fa-e76d-4f45-aa85-653ee0212b29
job-name=cuda-sample2
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"batch/v1","kind":"Job","metadata":{"annotations":{},"name":"cuda-sample2","namespace":"mynamesp1"},"spec":{"backoffLimit":1...
Parallelism: 1
Completions: 1
Start Time: Wed, 03 Mar 2021 21:51:51 -0800
Pods Statuses: 1 Running / 0 Succeeded / 0 Failed
Pod Template:
Labels: controller-uid=6282b8fa-e76d-4f45-aa85-653ee0212b29
job-name=cuda-sample2
Containers:
cuda-sample-container2:
Image: nvidia/samples:nbody
Port: <none>
Host Port: <none>
Command:
/tmp/nbody
Args:
-benchmark
-i=10000
Environment:
NVIDIA_VISIBLE_DEVICES: 0
Mounts: <none>
Volumes: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 47s job-controller Created pod: cuda-sample2-zkx4w
PS C:\WINDOWS\system32>
PS>Get-HcsGpuNvidiaSmi
K8S-1HXQG13CL-1HXQG13:
Wed Mar 3 21:54:50 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 0000E00B:00:00.0 Off |                    0 |
| N/A   45C    P0    68W /  70W |    242MiB / 15109MiB |    100%   E. Process |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A    144377    M+C   /tmp/nbody                        107MiB |
|    0   N/A  N/A    144379    M+C   /tmp/nbody                        107MiB |
|    0   N/A  N/A    144443      C   nvidia-cuda-mps-server             25MiB |
+-----------------------------------------------------------------------------+
[10.100.10.10]: PS>Get-HcsGpuNvidiaSmi
K8S-1HXQG13CL-1HXQG13:
Wed Mar 3 12:24:27 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00002C74:00:00.0 Off |                    0 |
| N/A   34C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
[10.100.10.10]: PS>
PS C:\WINDOWS\system32> kubectl -n mynamesp1 apply -f C:\gpu-sharing\k8-gpusharing.yaml
job.batch/cuda-sample1 created
job.batch/cuda-sample2 created
PS C:\WINDOWS\system32>
PS C:\WINDOWS\system32> kubectl get pods -n mynamesp1
NAME READY STATUS RESTARTS AGE
cuda-sample1-27srm 1/1 Running 0 28s
cuda-sample2-db9vx 1/1 Running 0 27s
PS C:\WINDOWS\system32>
PS C:\WINDOWS\system32> kubectl -n mynamesp1 describe job.batch/cuda-sample1; kubectl -n mynamesp1 describe job.batch/cuda-sample2
Name: cuda-sample1
Namespace: mynamesp1
Selector: controller-uid=22783f76-6af1-490d-b6eb-67dd4cda0e1f
Labels: controller-uid=22783f76-6af1-490d-b6eb-67dd4cda0e1f
job-name=cuda-sample1
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"batch/v1","kind":"Job","metadata":{"annotations":{},"name":"cuda-sample1","namespace":"mynamesp1"},"spec":{"backoffLimit":1...
Parallelism: 1
Completions: 1
Start Time: Wed, 03 Mar 2021 12:25:34 -0800
Pods Statuses: 1 Running / 0 Succeeded / 0 Failed
Pod Template:
Labels: controller-uid=22783f76-6af1-490d-b6eb-67dd4cda0e1f
job-name=cuda-sample1
Containers:
cuda-sample-container1:
Image: nvidia/samples:nbody
Port: <none>
Host Port: <none>
Command:
/tmp/nbody
Args:
-benchmark
-i=10000
Environment:
NVIDIA_VISIBLE_DEVICES: 0
Mounts: <none>
Volumes: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 60s job-controller Created pod: cuda-sample1-27srm
Name: cuda-sample2
Namespace: mynamesp1
Selector: controller-uid=e68c8d5a-718e-4880-b53f-26458dc24381
Labels: controller-uid=e68c8d5a-718e-4880-b53f-26458dc24381
job-name=cuda-sample2
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"batch/v1","kind":"Job","metadata":{"annotations":{},"name":"cuda-sample2","namespace":"mynamesp1"},"spec":{"backoffLimit":1...
Parallelism: 1
Completions: 1
Start Time: Wed, 03 Mar 2021 12:25:35 -0800
Pods Statuses: 1 Running / 0 Succeeded / 0 Failed
Pod Template:
Labels: controller-uid=e68c8d5a-718e-4880-b53f-26458dc24381
job-name=cuda-sample2
Containers:
cuda-sample-container2:
Image: nvidia/samples:nbody
Port: <none>
Host Port: <none>
Command:
/tmp/nbody
Args:
-benchmark
-i=10000
Environment:
NVIDIA_VISIBLE_DEVICES: 0
Mounts: <none>
Volumes: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 60s job-controller Created pod: cuda-sample2-db9vx
PS C:\WINDOWS\system32>
[10.100.10.10]: PS>Get-HcsGpuNvidiaSmi
K8S-1HXQG13CL-1HXQG13:
Wed Mar 3 12:26:41 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00002C74:00:00.0 Off |                    0 |
| N/A   64C    P0    69W /  70W |    221MiB / 15109MiB |    100%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A    197976      C   /tmp/nbody                        109MiB |
|    0   N/A  N/A    198051      C   /tmp/nbody                        109MiB |
+-----------------------------------------------------------------------------+
[10.100.10.10]: PS>
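When the Get-HcsGpuNvidiaSmi output is captured as text, the status row carries the three values that change between the snapshots on this page: memory in use, GPU utilization, and compute mode. The sketch below pulls them out with a regular expression. The function name `parse_status` is hypothetical, and the pattern only recognizes the two compute modes that appear on this page (Default and E. Process).

```python
import re

# Matches e.g. "221MiB / 15109MiB | 100% Default"; whitespace runs are flexible.
STATUS_ROW = re.compile(
    r"(?P<used>\d+)MiB\s*/\s*(?P<total>\d+)MiB\s*\|"
    r"\s*(?P<util>\d+)%\s+(?P<mode>Default|E\. Process)"
)


def parse_status(line: str) -> dict:
    """Extract memory usage, utilization, and compute mode from a status row."""
    match = STATUS_ROW.search(line)
    if match is None:
        raise ValueError("not an nvidia-smi status row")
    return {
        "used_mib": int(match["used"]),
        "total_mib": int(match["total"]),
        "util_pct": int(match["util"]),
        "compute_mode": match["mode"],
    }


# Status row from the snapshot above, taken while both nbody jobs were running.
row = "| N/A 64C P0 69W / 70W | 221MiB / 15109MiB | 100% Default |"
print(parse_status(row))
```

The same function accepts the E. Process rows shown after Start-HcsGpuMPS, which makes it easy to confirm which mode a given snapshot was taken in.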
PS C:\WINDOWS\system32> kubectl get pods -n mynamesp1
NAME READY STATUS RESTARTS AGE
cuda-sample1-27srm 1/1 Running 0 70s
cuda-sample2-db9vx 1/1 Running 0 69s
PS C:\WINDOWS\system32>
PS C:\WINDOWS\system32> kubectl get pods -n mynamesp1
NAME READY STATUS RESTARTS AGE
cuda-sample1-27srm 0/1 Completed 0 2m54s
cuda-sample2-db9vx 0/1 Completed 0 2m53s
PS C:\WINDOWS\system32>
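The workflow polls `kubectl get pods` until both pods report Completed before reading their logs. A minimal sketch of that check on captured text output follows. The helper name `parse_pods` is an assumption, and the whitespace split only works while no column value contains spaces. `kubectl get pods -o json` is the more robust source for scripted checks.

```python
def parse_pods(text: str) -> list:
    """Turn the plain-text table from `kubectl get pods` into a list of dicts."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    header = lines[0].split()
    # Assumes each column value is a single whitespace-free token.
    return [dict(zip(header, ln.split())) for ln in lines[1:]]


# Output copied from the snippet above.
sample = """NAME READY STATUS RESTARTS AGE
cuda-sample1-27srm 0/1 Completed 0 2m54s
cuda-sample2-db9vx 0/1 Completed 0 2m53s"""

pods = parse_pods(sample)
all_done = all(pod["STATUS"] == "Completed" for pod in pods)
print([pod["NAME"] for pod in pods], "all completed:", all_done)
```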
PS C:\WINDOWS\system32> kubectl logs -n mynamesp1 cuda-sample1-27srm
Run "nbody -benchmark [-numbodies=<numBodies>]" to measure performance.
===========// CUT //===================// CUT //=====================
> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
GPU Device 0: "Turing" with compute capability 7.5
> Compute 7.5 CUDA device: [Tesla T4]
40960 bodies, total time for 10000 iterations: 170398.766 ms
= 98.459 billion interactions per second
= 1969.171 single-precision GFLOP/s at 20 flops per interaction
PS C:\WINDOWS\system32>
PS C:\WINDOWS\system32> kubectl delete -f 'C:\gpu-sharing\k8-gpusharing.yaml' -n mynamesp1
job.batch "cuda-sample1" deleted
job.batch "cuda-sample2" deleted
PS C:\WINDOWS\system32>