Debian Linux GCC 8 Benchmark -mindirect-branch=thunk

GCC 8 benchmarking of user-space with -mindirect-branch=thunk and -mindirect-branch=thunk-inline for retpolines. Tests by Michael Larabel for a future article on Phoronix.com.

HTML result view exported from: https://openbenchmarking.org/result/1801161-PTS-DEBIANTE65&rdt&grr.

Debian Linux GCC 8 Benchmark -mindirect-branch=thunk

Configurations: -mindirect-branch=thunk, Stock, -mindirect-branch=thunk-inline

Processor: Intel Core i9-7980XE @ 4.40GHz (18 Cores / 36 Threads)
Motherboard: ASUS PRIME X299-A (1004 BIOS)
Chipset: Intel Device 2020
Memory: 16384MB
Disk: 120GB Force MP500
Graphics: LLVMpipe
Audio: Realtek ALC1220
Monitor: Acer B286HK
Network: Intel Connection
OS: Debian 9.3
Kernel: 4.15.0-rc8-retpo-underflow (x86_64) 20180115
Desktop: GNOME Shell 3.22.3
Display Driver: modesetting 1.19.2
OpenGL: 3.3 Mesa 13.0.6 Gallium 0.4 (LLVM 3.9 256 bits)
Compiler: GCC 8.0.1 20180115
File-System: ext4
Screen Resolution: 3840x2160

Environment Details
- -mindirect-branch=thunk: CXXFLAGS="-O3 -march=native -mindirect-branch=thunk" CFLAGS="-O3 -march=native -mindirect-branch=thunk"
- Stock: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
- -mindirect-branch=thunk-inline: CXXFLAGS="-O3 -march=native -mindirect-branch=thunk-inline" CFLAGS="-O3 -march=native -mindirect-branch=thunk-inline"

Compiler Details
- --disable-multilib --enable-checking=release

Disk Details
- -mindirect-branch=thunk, Stock: NONE / data=ordered,errors=remount-ro,relatime,rw

Processor Details
- Scaling Governor: intel_pstate powersave

Python Details
- -mindirect-branch=thunk, Stock: Python 2.7.13 + Python 3.5.3

Security Details
- KPTI Full retpoline with underflow protection Protection

Debian Linux GCC 8 Benchmark -mindirect-branch=thunk: Results Overview

Test                                                  -mindirect-branch=thunk  Stock       -mindirect-branch=thunk-inline
redis: SET                                            1280528.47               1399046.62  1457026.92
redis: GET                                            2187123.33               2222262.58  2160702.13
pgbench: Buffer Test - Heavy Contention - Read Write  11147.89                 11290.86    10540.60
pgbench: Buffer Test - Normal Load - Read Write       11104.86                 11387.09    11460.43
ffmpeg: H.264 HD To NTSC DV                           13.89                    13.29       14.13
bullet: Convex Trimesh                                1.23                     1.08        1.28
bullet: Prim Trimesh                                  1.00                     0.92        1.03
bullet: 1000 Convex                                   4.65                     4.37        4.93
bullet: 1000 Stack                                    4.57                     4.44        4.83
bullet: 3000 Fall                                     4.11                     3.88        6.53
bullet: Raytests                                      2.78                     2.53        2.89
stockfish: Total Time                                 3074                     2904        3228
tscp: AI Chess Performance                            1185357                  1386794     1116421
hpcg                                                  1.35                     1.38        1.38
hpcc: G-Ffte                                          5.58564                  5.88475     5.56022
hpcc: G-HPL                                           86.04620                 85.93090    85.97547
mpcbench: Multi-Precision Benchmark                   9643                     10013       9830
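The overview figures above mix throughput metrics (more is better) with timings (fewer is better), so the cost of each retpoline mode is easier to read as a percentage relative to Stock. The following is a minimal sketch of that arithmetic, not part of the exported result: the dictionary holds a handful of values copied from the overview, and the function names and the sign convention (positive means faster than Stock) are illustrative assumptions.

```python
# Values copied from the results overview: (thunk, stock, thunk_inline).
# The boolean marks "more is better" metrics; the others are timings.
RESULTS = {
    "redis: SET": ((1280528.47, 1399046.62, 1457026.92), True),
    "redis: GET": ((2187123.33, 2222262.58, 2160702.13), True),
    "ffmpeg: H.264 HD To NTSC DV": ((13.89, 13.29, 14.13), False),
    "bullet: 3000 Fall": ((4.11, 3.88, 6.53), False),
    "stockfish: Total Time": ((3074, 2904, 3228), False),
    "tscp: AI Chess Performance": ((1185357, 1386794, 1116421), True),
}


def relative_change(value, stock, higher_is_better):
    """Percent change versus Stock, signed so that positive means faster."""
    if higher_is_better:
        return (value / stock - 1.0) * 100.0
    # For timings, compare speeds: a longer time gives a negative result.
    return (stock / value - 1.0) * 100.0


def summarize(results):
    """Map each test name to (thunk %, thunk-inline %) relative to Stock."""
    rows = {}
    for name, ((thunk, stock, inline), higher_is_better) in results.items():
        rows[name] = (
            relative_change(thunk, stock, higher_is_better),
            relative_change(inline, stock, higher_is_better),
        )
    return rows


if __name__ == "__main__":
    for name, (thunk_pct, inline_pct) in summarize(RESULTS).items():
        print(f"{name:30s} thunk {thunk_pct:+6.1f}%  thunk-inline {inline_pct:+6.1f}%")
```

These percentages ignore the run-to-run variation reported with each individual result, so small differences should be read together with the SE figures further down.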

Redis

Test: SET

Redis 3.0.1 (Requests Per Second, More Is Better)
- -mindirect-branch=thunk: 1280528.47 (SE +/- 160590.96, N = 6)
- Stock: 1399046.62 (SE +/- 34299.85, N = 6)
- -mindirect-branch=thunk-inline: 1457026.92 (SE +/- 2554.75, N = 3)
1. (CC) gcc options: -ggdb -rdynamic -lm -pthread

Redis

Test: GET

Redis 3.0.1 (Requests Per Second, More Is Better)
- -mindirect-branch=thunk: 2187123.33 (SE +/- 40180.95, N = 6)
- Stock: 2222262.58 (SE +/- 43689.11, N = 3)
- -mindirect-branch=thunk-inline: 2160702.13 (SE +/- 41655.36, N = 6)
1. (CC) gcc options: -ggdb -rdynamic -lm -pthread

PostgreSQL pgbench

Scaling: Buffer Test - Test: Heavy Contention - Mode: Read Write

PostgreSQL pgbench 10.0 (TPS, More Is Better)
- -mindirect-branch=thunk: 11147.89 (SE +/- 326.63, N = 6)
- Stock: 11290.86 (SE +/- 264.90, N = 6)
- -mindirect-branch=thunk-inline: 10540.60 (SE +/- 43.65, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -fPIC -lpgcommon -lpgport -lpthread -lrt -lcrypt -ldl -lm

PostgreSQL pgbench

Scaling: Buffer Test - Test: Normal Load - Mode: Read Write

PostgreSQL pgbench 10.0 (TPS, More Is Better)
- -mindirect-branch=thunk: 11104.86 (SE +/- 276.43, N = 6)
- Stock: 11387.09 (SE +/- 221.41, N = 6)
- -mindirect-branch=thunk-inline: 11460.43 (SE +/- 63.39, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -fPIC -lpgcommon -lpgport -lpthread -lrt -lcrypt -ldl -lm

FFmpeg

H.264 HD To NTSC DV

FFmpeg 3.3.3 (Seconds, Fewer Is Better)
- -mindirect-branch=thunk: 13.89 (SE +/- 0.30, N = 6)
- Stock: 13.29 (SE +/- 0.31, N = 6)
- -mindirect-branch=thunk-inline: 14.13 (SE +/- 0.32, N = 6)
1. (CC) gcc options: -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -ldl -lxcb -lxcb-shm -lxcb-xfixes -lxcb-shape -lasound -lm -llzma -lbz2 -pthread -O3 -march=native -std=c11 -fomit-frame-pointer -fno-math-errno -fno-signed-zeros -fno-tree-vectorize -MMD -MF -MT

Bullet Physics Engine

Test: Convex Trimesh

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)
- -mindirect-branch=thunk: 1.23 (SE +/- 0.02, N = 3)
- Stock: 1.08 (SE +/- 0.04, N = 3)
- -mindirect-branch=thunk-inline: 1.28 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lglut -lGL -lGLU

Bullet Physics Engine

Test: Prim Trimesh

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)
- -mindirect-branch=thunk: 1.00 (SE +/- 0.02, N = 3)
- Stock: 0.92 (SE +/- 0.03, N = 3)
- -mindirect-branch=thunk-inline: 1.03 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lglut -lGL -lGLU

Bullet Physics Engine

Test: 1000 Convex

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)
- -mindirect-branch=thunk: 4.65 (SE +/- 0.11, N = 3)
- Stock: 4.37 (SE +/- 0.17, N = 3)
- -mindirect-branch=thunk-inline: 4.93 (SE +/- 0.06, N = 3)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lglut -lGL -lGLU

Bullet Physics Engine

Test: 1000 Stack

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)
- -mindirect-branch=thunk: 4.57 (SE +/- 0.03, N = 3)
- Stock: 4.44 (SE +/- 0.04, N = 3)
- -mindirect-branch=thunk-inline: 4.83 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lglut -lGL -lGLU

Bullet Physics Engine

Test: 3000 Fall

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)
- -mindirect-branch=thunk: 4.11 (SE +/- 0.02, N = 3)
- Stock: 3.88 (SE +/- 0.02, N = 3)
- -mindirect-branch=thunk-inline: 6.53 (SE +/- 2.39, N = 3)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lglut -lGL -lGLU
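The 3000 Fall result above reports SE +/- 2.39 for -mindirect-branch=thunk-inline against +/- 0.02 for the other two configurations, which suggests the 6.53 second mean is dominated by run-to-run noise. A rough way to check is to express the gap between two means in units of their combined standard error. The sketch below assumes the SE figures are standard errors of the mean over N independent runs, which this export does not state; the helper names are hypothetical, and with N = 3 the result is only indicative.

```python
import math


def combined_se(se_a, se_b):
    """Standard error of the difference of two independent means."""
    return math.sqrt(se_a ** 2 + se_b ** 2)


def difference_in_se_units(mean_a, se_a, mean_b, se_b):
    """Absolute gap between two means, divided by the combined SE."""
    return abs(mean_a - mean_b) / combined_se(se_a, se_b)


# Bullet 3000 Fall, seconds (values taken from the result above).
# Assumption: SE is the standard error of the mean for each configuration.
stock_vs_inline = difference_in_se_units(3.88, 0.02, 6.53, 2.39)
stock_vs_thunk = difference_in_se_units(3.88, 0.02, 4.11, 0.02)

if __name__ == "__main__":
    print(f"Stock vs thunk-inline: {stock_vs_inline:.2f} combined SEs")
    print(f"Stock vs thunk:        {stock_vs_thunk:.2f} combined SEs")
```

Under these assumptions the Stock versus thunk-inline gap is only about 1.1 combined standard errors, while the much smaller Stock versus thunk gap is about 8, so the thunk slowdown is the better supported of the two on this test.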

Bullet Physics Engine

Test: Raytests

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)
- -mindirect-branch=thunk: 2.78 (SE +/- 0.04, N = 3)
- Stock: 2.53 (SE +/- 0.04, N = 3)
- -mindirect-branch=thunk-inline: 2.89 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lglut -lGL -lGLU

Stockfish

Total Time

Stockfish 2014-11-26 (ms, Fewer Is Better)
- -mindirect-branch=thunk: 3074 (SE +/- 44.96, N = 3)
- Stock: 2904 (SE +/- 6.11, N = 3)
- -mindirect-branch=thunk-inline: 3228 (SE +/- 56.05, N = 3)
1. (CXX) g++ options: -lpthread -O3 -march=native -fno-exceptions -fno-rtti -ansi -pedantic -msse -msse3 -mpopcnt -flto

TSCP

AI Chess Performance

TSCP 1.81 (Nodes Per Second, More Is Better)
- -mindirect-branch=thunk: 1185357 (SE +/- 10473.03, N = 5)
- Stock: 1386794 (SE +/- 18404.85, N = 6)
- -mindirect-branch=thunk-inline: 1116421 (SE +/- 17216.49, N = 5)
1. (CC) gcc options: -O3 -march=native

High Performance Conjugate Gradient

High Performance Conjugate Gradient 3.0 (GFLOP/s, More Is Better)
- -mindirect-branch=thunk: 1.35 (SE +/- 0.00, N = 3)
- Stock: 1.38 (SE +/- 0.01, N = 3)
- -mindirect-branch=thunk-inline: 1.38 (SE +/- 0.01, N = 3)

HPC Challenge

Test / Class: G-Ffte

HPC Challenge 1.5.0 (GFLOP/s, More Is Better)
- -mindirect-branch=thunk: 5.58564 (SE +/- 0.01968, N = 3)
- Stock: 5.88475 (SE +/- 0.18843, N = 3)
- -mindirect-branch=thunk-inline: 5.56022 (SE +/- 0.02991, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -march=native -funroll-loops
2. BLAS + Open MPI 2.0.2

HPC Challenge

Test / Class: G-HPL

HPC Challenge 1.5.0 (GFLOPS, More Is Better)
- -mindirect-branch=thunk: 86.05 (SE +/- 0.10, N = 3)
- Stock: 85.93 (SE +/- 0.29, N = 3)
- -mindirect-branch=thunk-inline: 85.98 (SE +/- 0.17, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -march=native -funroll-loops
2. BLAS + Open MPI 2.0.2

GNU MPC

Multi-Precision Benchmark

GNU MPC 1.1.0 (Global Score, More Is Better)
- -mindirect-branch=thunk: 9643 (SE +/- 84.52, N = 3)
- Stock: 10013 (SE +/- 43.72, N = 3)
- -mindirect-branch=thunk-inline: 9830 (SE +/- 75.72, N = 3)
1. (CC) gcc options: -lm -O3 -march=native -MT -MD -MP -MF


Phoronix Test Suite v10.8.5