@jasondavies
Created November 13, 2023 14:34
M3 Max clpeak
Platform: Apple
  Device: Apple M3 Max
    Driver version  : 1.2 1.0 (Macintosh)
    Compute units   : 40
    Clock frequency : 1000 MHz

    Global memory bandwidth (GBPS)
      float   : 361.17
      float2  : 380.13
      float4  : 385.89
      float8  : 377.79
      float16 : 374.10

    Single-precision compute (GFLOPS)
      float   : 6942.33
      float2  : 6944.58
      float4  : 7003.88
      float8  : 7008.85
      float16 : 6984.37

    No half precision support! Skipped

    No double precision support! Skipped

    Integer compute (GIOPS)
      int   : 3524.74
      int2  : 3524.80
      int4  : 3524.73
      int8  : 3523.65
      int16 : 3524.47

    Integer compute Fast 24bit (GIOPS)
      int   : 2414.25
      int2  : 2891.56
      int4  : 2895.91
      int8  : 2892.89
      int16 : 2889.45

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 56.08
      enqueueReadBuffer               : 57.99
      enqueueWriteBuffer non-blocking : 64.46
      enqueueReadBuffer non-blocking  : 64.75
      enqueueMapBuffer(for read)      : 1481023.12
        memcpy from mapped ptr        : 43.59
      enqueueUnmap(after write)       : 273564.81
        memcpy to mapped ptr          : 50.76

    Kernel launch latency : 1.11 us