Apache Spark TPC-H 1.0.0
pts/spark-tpch-1.0.0
- 04 December 2023 -
Add Apache Spark TPC-H benchmark.
downloads.xml
<?xml version="1.0"?>
<!--Phoronix Test Suite v10.8.4-->
<PhoronixTestSuite>
  <Downloads>
    <Package>
      <URL>https://archive.apache.org/dist/spark/spark-3.5.0/spark-3.5.0-bin-hadoop3.tgz</URL>
      <MD5>4a3b0cfc251827d78ff468db7016f2ee</MD5>
      <SHA256>ac064ec620129b4b9630e180e25ac2b71a8ccd4c6912bf5b5631943b742f0379</SHA256>
      <FileName>spark-3.5.0-bin-hadoop3.tgz</FileName>
      <FileSize>400395283</FileSize>
    </Package>
    <Package>
      <URL>https://github.com/ssavvides/tpch-spark/archive/7d64aa57368f092969a855fdd10781c00813c9c4.zip</URL>
      <MD5>1f981d33ba100fc31c51204156037bab</MD5>
      <SHA256>55a82833b0e76e77c1d0bc95ff3a04f5c9bee9b8bc02a5c64193ad4cbc9e4a0b</SHA256>
      <FileName>tpch-spark-7d64aa57368f092969a855fdd10781c00813c9c4.zip</FileName>
      <FileSize>1705811</FileSize>
    </Package>
    <Package>
      <URL>https://github.com/sbt/sbt/releases/download/v1.9.7/sbt-1.9.7.zip</URL>
      <MD5>9e4cdee25e6076a78810f02f0bda77ef</MD5>
      <SHA256>74f36c2b044b63f3a24300b8e32eea20afc462b16974bf13f6e393a64530116d</SHA256>
      <FileName>sbt-1.9.7.zip</FileName>
      <FileSize>47154988</FileSize>
    </Package>
  </Downloads>
</PhoronixTestSuite>
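Each package in downloads.xml carries a SHA256 digest. The Phoronix Test Suite is expected to check downloads itself, so the following is only a minimal sketch for verifying a manually fetched archive. The function name verify_sha256 is hypothetical and not part of the test profile, and the sketch assumes the coreutils sha256sum tool is available.

```shell
#!/bin/sh
# Hypothetical helper (not part of the test profile): succeed only if the
# SHA-256 digest of a file matches the expected hex string, such as one of
# the <SHA256> values listed in downloads.xml.
verify_sha256() {
    file=$1
    expected=$2
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(sha256sum "$file" | cut -d ' ' -f 1)
    [ "$actual" = "$expected" ]
}

# Example usage, assuming the archive sits in the current directory:
#   verify_sha256 spark-3.5.0-bin-hadoop3.tgz \
#     ac064ec620129b4b9630e180e25ac2b71a8ccd4c6912bf5b5631943b742f0379 \
#     && echo "checksum OK" || echo "checksum mismatch"
```

The function returns a shell exit status rather than printing anything, so it can be chained with && and || as in the commented usage.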
install.sh
#!/bin/sh
tar -xf spark-3.5.0-bin-hadoop3.tgz
unzip -o tpch-spark-7d64aa57368f092969a855fdd10781c00813c9c4.zip
unzip -o sbt-1.9.7.zip

export PATH=$HOME/spark-3.5.0-bin-hadoop3/bin:$HOME/sbt/bin:$PATH

# Avoid out of memory errors
echo "spark.driver.memory 16g
spark.executor.memory 16g" > ~/spark-3.5.0-bin-hadoop3/conf/spark-defaults.conf

cd tpch-spark-7d64aa57368f092969a855fdd10781c00813c9c4/dbgen/
make
cd ~/tpch-spark-7d64aa57368f092969a855fdd10781c00813c9c4/
sbt package
echo $? > ~/install-exit-status
cd ~

echo "#!/bin/bash
export PATH=\$HOME/spark-3.5.0-bin-hadoop3/bin:\$PATH
cd tpch-spark-7d64aa57368f092969a855fdd10781c00813c9c4/
rm -f tpch_execution_times.txt
spark-submit --class \"main.scala.TpchQuery\" target/scala-2.12/spark-tpc-h-queries_2.12-1.0.jar > \$LOG_FILE 2>&1
echo \$? > ~/test-exit-status
cat tpch_execution_times.txt >> \$LOG_FILE
" > spark-tpch
chmod +x spark-tpch
post.sh
#!/bin/sh
export PATH=$HOME/spark-3.5.0-bin-hadoop3/bin:$PATH
cd tpch-spark-7d64aa57368f092969a855fdd10781c00813c9c4/
rm -f dbgen/*.tbl
pre.sh
#!/bin/sh
export PATH=$HOME/spark-3.5.0-bin-hadoop3/bin:$PATH
cd tpch-spark-7d64aa57368f092969a855fdd10781c00813c9c4/
rm -f tpch_execution_times.txt
rm -f dbgen/*.tbl
cd dbgen
./dbgen $@
results-definition.xml
<?xml version="1.0"?>
<!--Phoronix Test Suite v10.8.4-->
<PhoronixTestSuite>
  <ResultsParser>
    <OutputTemplate>Q05 #_RESULT_#</OutputTemplate>
    <LineHint>Q</LineHint>
    <MultiMatch>GEOMETRIC_MEAN</MultiMatch>
    <AppendToArgumentsDescription>Geometric Mean Of All Queries</AppendToArgumentsDescription>
    <DeleteOutputBefore>Query Time (seconds)</DeleteOutputBefore>
  </ResultsParser>
  <ResultsParser>
    <OutputTemplate>Q01 #_RESULT_#</OutputTemplate>
    <LineHint>Q01</LineHint>
    <AppendToArgumentsDescription>Q01</AppendToArgumentsDescription>
    <DeleteOutputBefore>Query Time (seconds)</DeleteOutputBefore>
    <Importance>Secondary</Importance>
  </ResultsParser>
  <ResultsParser>
    <OutputTemplate>Q02 #_RESULT_#</OutputTemplate>
    <LineHint>Q02</LineHint>
    <AppendToArgumentsDescription>Q02</AppendToArgumentsDescription>
    <DeleteOutputBefore>Query Time (seconds)</DeleteOutputBefore>
    <Importance>Secondary</Importance>
  </ResultsParser>
  <ResultsParser>
    <OutputTemplate>Q03 #_RESULT_#</OutputTemplate>
    <LineHint>Q03</LineHint>
    <AppendToArgumentsDescription>Q03</AppendToArgumentsDescription>
    <DeleteOutputBefore>Query Time (seconds)</DeleteOutputBefore>
    <Importance>Secondary</Importance>
  </ResultsParser>
  <ResultsParser>
    <OutputTemplate>Q04 #_RESULT_#</OutputTemplate>
    <LineHint>Q04</LineHint>
    <AppendToArgumentsDescription>Q04</AppendToArgumentsDescription>
    <DeleteOutputBefore>Query Time (seconds)</DeleteOutputBefore>
    <Importance>Secondary</Importance>
  </ResultsParser>
  <ResultsParser>
    <OutputTemplate>Q05 #_RESULT_#</OutputTemplate>
    <LineHint>Q05</LineHint>
    <AppendToArgumentsDescription>Q05</AppendToArgumentsDescription>
    <DeleteOutputBefore>Query Time (seconds)</DeleteOutputBefore>
    <Importance>Secondary</Importance>
  </ResultsParser>
  <ResultsParser>
    <OutputTemplate>Q06 #_RESULT_#</OutputTemplate>
    <LineHint>Q06</LineHint>
    <AppendToArgumentsDescription>Q06</AppendToArgumentsDescription>
    <DeleteOutputBefore>Query Time (seconds)</DeleteOutputBefore>
    <Importance>Secondary</Importance>
  </ResultsParser>
  <ResultsParser>
    <OutputTemplate>Q07 #_RESULT_#</OutputTemplate>
    <LineHint>Q07</LineHint>
    <AppendToArgumentsDescription>Q07</AppendToArgumentsDescription>
    <DeleteOutputBefore>Query Time (seconds)</DeleteOutputBefore>
    <Importance>Secondary</Importance>
  </ResultsParser>
  <ResultsParser>
    <OutputTemplate>Q08 #_RESULT_#</OutputTemplate>
    <LineHint>Q08</LineHint>
    <AppendToArgumentsDescription>Q08</AppendToArgumentsDescription>
    <DeleteOutputBefore>Query Time (seconds)</DeleteOutputBefore>
    <Importance>Secondary</Importance>
  </ResultsParser>
  <ResultsParser>
    <OutputTemplate>Q09 #_RESULT_#</OutputTemplate>
    <LineHint>Q09</LineHint>
    <AppendToArgumentsDescription>Q09</AppendToArgumentsDescription>
    <DeleteOutputBefore>Query Time (seconds)</DeleteOutputBefore>
    <Importance>Secondary</Importance>
  </ResultsParser>
  <ResultsParser>
    <OutputTemplate>Q10 #_RESULT_#</OutputTemplate>
    <LineHint>Q10</LineHint>
    <AppendToArgumentsDescription>Q10</AppendToArgumentsDescription>
    <DeleteOutputBefore>Query Time (seconds)</DeleteOutputBefore>
    <Importance>Secondary</Importance>
  </ResultsParser>
  <ResultsParser>
    <OutputTemplate>Q11 #_RESULT_#</OutputTemplate>
    <LineHint>Q11</LineHint>
    <AppendToArgumentsDescription>Q11</AppendToArgumentsDescription>
    <DeleteOutputBefore>Query Time (seconds)</DeleteOutputBefore>
    <Importance>Secondary</Importance>
  </ResultsParser>
  <ResultsParser>
    <OutputTemplate>Q12 #_RESULT_#</OutputTemplate>
    <LineHint>Q12</LineHint>
    <AppendToArgumentsDescription>Q12</AppendToArgumentsDescription>
    <DeleteOutputBefore>Query Time (seconds)</DeleteOutputBefore>
    <Importance>Secondary</Importance>
  </ResultsParser>
  <ResultsParser>
    <OutputTemplate>Q13 #_RESULT_#</OutputTemplate>
    <LineHint>Q13</LineHint>
    <AppendToArgumentsDescription>Q13</AppendToArgumentsDescription>
    <DeleteOutputBefore>Query Time (seconds)</DeleteOutputBefore>
    <Importance>Secondary</Importance>
  </ResultsParser>
  <ResultsParser>
    <OutputTemplate>Q14 #_RESULT_#</OutputTemplate>
    <LineHint>Q14</LineHint>
    <AppendToArgumentsDescription>Q14</AppendToArgumentsDescription>
    <DeleteOutputBefore>Query Time (seconds)</DeleteOutputBefore>
    <Importance>Secondary</Importance>
  </ResultsParser>
  <ResultsParser>
    <OutputTemplate>Q15 #_RESULT_#</OutputTemplate>
    <LineHint>Q15</LineHint>
    <AppendToArgumentsDescription>Q15</AppendToArgumentsDescription>
    <DeleteOutputBefore>Query Time (seconds)</DeleteOutputBefore>
    <Importance>Secondary</Importance>
  </ResultsParser>
  <ResultsParser>
    <OutputTemplate>Q16 #_RESULT_#</OutputTemplate>
    <LineHint>Q16</LineHint>
    <AppendToArgumentsDescription>Q16</AppendToArgumentsDescription>
    <DeleteOutputBefore>Query Time (seconds)</DeleteOutputBefore>
    <Importance>Secondary</Importance>
  </ResultsParser>
  <ResultsParser>
    <OutputTemplate>Q17 #_RESULT_#</OutputTemplate>
    <LineHint>Q17</LineHint>
    <AppendToArgumentsDescription>Q17</AppendToArgumentsDescription>
    <DeleteOutputBefore>Query Time (seconds)</DeleteOutputBefore>
    <Importance>Secondary</Importance>
  </ResultsParser>
  <ResultsParser>
    <OutputTemplate>Q18 #_RESULT_#</OutputTemplate>
    <LineHint>Q18</LineHint>
    <AppendToArgumentsDescription>Q18</AppendToArgumentsDescription>
    <DeleteOutputBefore>Query Time (seconds)</DeleteOutputBefore>
    <Importance>Secondary</Importance>
  </ResultsParser>
  <ResultsParser>
    <OutputTemplate>Q19 #_RESULT_#</OutputTemplate>
    <LineHint>Q19</LineHint>
    <AppendToArgumentsDescription>Q19</AppendToArgumentsDescription>
    <DeleteOutputBefore>Query Time (seconds)</DeleteOutputBefore>
    <Importance>Secondary</Importance>
  </ResultsParser>
  <ResultsParser>
    <OutputTemplate>Q20 #_RESULT_#</OutputTemplate>
    <LineHint>Q20</LineHint>
    <AppendToArgumentsDescription>Q20</AppendToArgumentsDescription>
    <DeleteOutputBefore>Query Time (seconds)</DeleteOutputBefore>
    <Importance>Secondary</Importance>
  </ResultsParser>
  <ResultsParser>
    <OutputTemplate>Q21 #_RESULT_#</OutputTemplate>
    <LineHint>Q21</LineHint>
    <AppendToArgumentsDescription>Q21</AppendToArgumentsDescription>
    <DeleteOutputBefore>Query Time (seconds)</DeleteOutputBefore>
    <Importance>Secondary</Importance>
  </ResultsParser>
  <ResultsParser>
    <OutputTemplate>Q22 #_RESULT_#</OutputTemplate>
    <LineHint>Q22</LineHint>
    <AppendToArgumentsDescription>Q22</AppendToArgumentsDescription>
    <DeleteOutputBefore>Query Time (seconds)</DeleteOutputBefore>
    <Importance>Secondary</Importance>
  </ResultsParser>
</PhoronixTestSuite>
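The first parser in results-definition.xml sets MultiMatch to GEOMETRIC_MEAN, which suggests the headline result is the geometric mean of the per-query times. A minimal sketch of that arithmetic follows. It assumes the timing output consists of a header line followed by lines of the form "Qnn seconds" separated by whitespace, and the function name geomean_queries is hypothetical rather than anything the Phoronix Test Suite provides.

```shell
#!/bin/sh
# Hypothetical helper: print the geometric mean of per-query times.
# Assumed input format (not confirmed by the profile itself):
#   Query   Time (seconds)
#   Q01     12.5
#   Q02     3.1
# Reads the named files, or standard input when no file is given.
geomean_queries() {
    awk '
        # Only lines whose first field is Q followed by two digits count;
        # this skips the "Query Time (seconds)" header.
        $1 ~ /^Q[0-9][0-9]$/ { sum += log($2); n++ }
        # The geometric mean is exp of the average of the logarithms.
        END { if (n > 0) printf "%.3f\n", exp(sum / n) }
    ' "$@"
}
```

For example, `geomean_queries tpch_execution_times.txt` would print a single number if the file follows the assumed format. Averaging logarithms instead of multiplying the raw times avoids overflow when many values are involved.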
test-definition.xml
<?xml version="1.0"?>
<!--Phoronix Test Suite v10.8.4-->
<PhoronixTestSuite>
  <TestInformation>
    <Title>Apache Spark TPC-H</Title>
    <AppVersion>3.5</AppVersion>
    <Description>This is a benchmark of Apache Spark using the TPC-H data set. Apache Spark is an open-source unified analytics engine for large-scale data processing and dealing with big data. This test profile benchmarks Apache Spark in a single-system configuration using spark-submit. The test makes use of https://github.com/ssavvides/tpch-spark/ for facilitating the TPC-H benchmark.</Description>
    <ResultScale>Seconds</ResultScale>
    <Proportion>LIB</Proportion>
    <TimesToRun>3</TimesToRun>
  </TestInformation>
  <TestProfile>
    <Version>1.0.0</Version>
    <SupportedPlatforms>Linux</SupportedPlatforms>
    <SoftwareType>Application</SoftwareType>
    <TestType>System</TestType>
    <License>Free</License>
    <Status>Verified</Status>
    <ExternalDependencies>java, python</ExternalDependencies>
    <EnvironmentSize>5000</EnvironmentSize>
    <ProjectURL>https://spark.apache.org/</ProjectURL>
    <RepositoryURL>https://github.com/apache/spark</RepositoryURL>
    <Maintainer>Michael Larabel</Maintainer>
  </TestProfile>
  <TestSettings>
    <Option>
      <DisplayName>Scale Factor</DisplayName>
      <Identifier>scale</Identifier>
      <ArgumentPrefix>-s </ArgumentPrefix>
      <Menu>
        <Entry>
          <Name>1</Name>
          <Value>1</Value>
          <Message>About 1GB of data</Message>
        </Entry>
        <Entry>
          <Name>10</Name>
          <Value>10</Value>
          <Message>About 10GB of data</Message>
        </Entry>
        <Entry>
          <Name>50</Name>
          <Value>50</Value>
          <Message>About 50GB of data</Message>
        </Entry>
        <Entry>
          <Name>100</Name>
          <Value>100</Value>
          <Message>About 100GB of data</Message>
        </Entry>
        <Entry>
          <Name>3000</Name>
          <Value>3000</Value>
          <Message>About 3TB of data; many hours of testing</Message>
        </Entry>
        <Entry>
          <Name>10000</Name>
          <Value>10000</Value>
          <Message>About 10TB of data, extremely time consuming!</Message>
        </Entry>
      </Menu>
    </Option>
  </TestSettings>
</PhoronixTestSuite>