5 Pitfalls of Benchmarking Big Data Systems

Benchmarking can be valuable in virtually any circumstance, but it is especially important for big data systems, where speed is everything. You should thus strive for as much accuracy as possible when measuring. In an article for Cloud Data Architect, Justin Kestelyn highlights five pitfalls to avoid for more reliable results:

  1. Comparing apples to oranges
  2. Not testing at scale
  3. Believing in miracles
  4. Using unrealistic benchmarks
  5. Communicating results poorly

Data Deception

In the first place, a benchmark you run may not be as objective as you think. You might intend to change only one parameter between two tests but find that many parameters have changed, leaving data that can no longer be meaningfully compared. For instance, a benchmark might compress data less effectively in one case than in another simply because of how it was designed. Mistakes like these can be hard to find, but you need to watch for them all the same.
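One way to guard against accidental parameter drift is to record the complete configuration alongside every measurement, so any unintended change is visible when you compare runs. The sketch below assumes a hypothetical `run_query` stand-in and a made-up `BASELINE` config; it is illustrative only, not from the original article.

```python
import time

# Hypothetical baseline configuration; these keys and values are
# illustrative assumptions, not from the article.
BASELINE = {"compression": "snappy", "file_format": "parquet", "executors": 8}

def run_query(config):
    # Stand-in for real query execution; replace with your system's client.
    time.sleep(0.001)

def benchmark(varied_key, values, trials=3):
    """Vary exactly one parameter, holding the rest at BASELINE, and store
    the full config with each result so parameter drift is detectable."""
    results = []
    for value in values:
        config = dict(BASELINE, **{varied_key: value})
        timings = []
        for _ in range(trials):
            start = time.perf_counter()
            run_query(config)
            timings.append(time.perf_counter() - start)
        results.append({"config": config, "median_s": sorted(timings)[trials // 2]})
    return results

results = benchmark("file_format", ["parquet", "orc"])
for r in results:
    print(r["config"]["file_format"], round(r["median_s"], 4))
```

Because every result carries its full configuration, an apples-to-oranges comparison shows up as a diff in more than one key.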

Another pitfall is neglecting to test at scale. If you do not push your execution framework to see how it performs under heavy jobs and diverse conditions, you cannot be sure it fully works. Scale here can refer to data scale, concurrency scale, cluster scale (nodes and racks), and node scale (the hardware size of individual nodes).
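Concurrency scale is the easiest of these to probe from a single machine: run the same batch of jobs at increasing concurrency levels and watch how wall time behaves. The sketch below uses a hypothetical `run_job` stand-in (a sleep in place of real work) purely to show the measurement loop's shape.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_job(job_id):
    # Illustrative stand-in for submitting a real job to your framework.
    time.sleep(0.01)  # simulate work
    return job_id

def measure_at_concurrency(levels, jobs_per_level=20):
    """Run the same batch of jobs at each concurrency level and record
    elapsed wall time; contention shows up as sub-linear speedup."""
    report = {}
    for level in levels:
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=level) as pool:
            list(pool.map(run_job, range(jobs_per_level)))
        report[level] = time.perf_counter() - start
    return report

report = measure_at_concurrency([1, 4, 16])
for level, elapsed in sorted(report.items()):
    print(f"concurrency={level}: {elapsed:.3f}s")
```

The same pattern extends to data scale (grow the input between runs) and cluster scale (repeat the sweep as nodes are added).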

Next, believing in “miracles” is a cute way of saying that you believe something that is too good to be true. Kestelyn gives this example:

A customer came to us and declared that Impala performs more than 1000x better than its existing data warehouse system, and wanted us to help it set up a new cluster to handle a growing production workload. The 1000x difference is orders of magnitude larger than our own measurements, and immediately made us skeptical. Following much discussion, we realized that the customer was comparing very simple queries running on a proof-of-concept Impala cluster versus complex queries running on a heavily loaded production system. We helped the customer do an apples-to-apples comparison, yet it turns out Impala still has an advantage. We left the customer with realistic plans for how to grow its data management systems.

Something else to watch out for is blatantly unrealistic benchmarks. Be wary of misleading workloads touted by vendors, data bolstered by premium hardware the general public does not use, or queries that appear cherry-picked.

Lastly, your benchmarks need to tell people something explicit and useful, and the reasoning behind how they are conducted should be sound. Ideally, you would invite independent audit, but no widely accepted audit or verification process yet exists in big data. Give it another year or two.

For further examples of these points, you can view the original article here:
