Winning benchmarks before it’s released:
Red Hat Enterprise Linux 5

by Nick Carr

Note: To ensure accuracy in this article, we’ve made a few edits and reposted.

While many people are aware of SPEC benchmarks, they may be unfamiliar with the SPECompM and SPECompL series of benchmarks. These are used for characterizing the OpenMP shared memory performance in medium and large systems. OpenMP itself is a specification for compilers and libraries to make use of parallel directives. The types of problems this benchmark models are largely scientific in nature, covering everything from automobile crash simulation to ocean modeling to computational chemistry to genetics.

Many of these problems cannot be solved efficiently in a grid and still require shared memory multiprocessing resources. While Red Hat Enterprise Linux has been hugely successful in grid computing, its success in Symmetric Multiprocessing (SMP) has not been publicized as much.

SPEC OMP is of interest to High-performance computing (HPC) users, providing an objective and representative benchmark suite for measuring the performance of SMP systems. The focus is to deliver systems performance results appropriate for real scientific and engineering applications, so the benchmark places heavy demands on the processor, shared memory architecture, compiler and the OpenMP implementation. Companies have published results with OMPM2001 on systems up to 128 cores, while the OMPL2001 suite contains larger working sets and longer run times.

The Red Hat Enterprise Linux 5 SPECompM2001 result is the world record for a 16 core SMP configuration, and used an IBM POWER system that delivered a result of 45,895 - an incredible 78% faster than the previous Linux record holder. It is also a whole lot faster than the previously published Unix results: 210% over Sun Solaris, 132% over HP/UX.
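
The percentages above can be checked against the SPECompMpeak2001 scores given in the footnote at the end of this article and in Bill Buros's comment below. The following is only a minimal sketch of that arithmetic in Python; the function name `percent_faster` and the dictionary layout are illustrative choices, not part of any SPEC tooling.

```python
# SPECompMpeak2001 scores as quoted in the article footnote and in comment 10.
RHEL5_P5_560Q = 45_895
PREVIOUS_RESULTS = {
    "SGI Altix (SGI ProPack)": 25_789,
    "HP AlphaServer": 25_932,
    "Sun V890": 14_789,
}


def percent_faster(new_score: float, old_score: float) -> float:
    """Return how much faster new_score is than old_score, in percent."""
    return (new_score / old_score - 1.0) * 100.0


if __name__ == "__main__":
    for system, score in PREVIOUS_RESULTS.items():
        gain = percent_faster(RHEL5_P5_560Q, score)
        print(f"{system}: {gain:.0f}% faster")
```

By this arithmetic the SGI Altix and Sun V890 scores reproduce the 78% and 210% figures. The HP AlphaServer score works out to roughly 77%, so the 132% figure presumably rests on an HP result that is not listed in the footnote.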

Many people are unaware of Red Hat Enterprise Linux’s ability to optimize large memory configurations, schedule across large numbers of CPUs, and offer compilers and libraries tuned to this problem space. So this benchmark is a terrific proof point for people who were waiting for Linux to mature in the SMP space.

While commodity multiprocessors and server designs are cost-optimized for the best price/performance, large SMP systems are designed with performance as the prime goal. The ability to simulate an automobile crash in a computer, rather than building an actual model, allows engineers to design a lightweight yet strong car and to iterate on the design many times. This gives maximum safety, while the light weight allows great fuel efficiency. Likewise, the ability to model chemical properties in a computer allows tests for strength, toxicity, and cost of manufacture, and saves many times the cost of the computing systems.

This great result, beating all previous 16-core Linux results, shows the power and suitability of Red Hat Enterprise Linux in the scientific environment.

SPEC® and the benchmark name SPEComp® are registered trademarks of the Standard Performance Evaluation Corporation. Competitive benchmark results stated above reflect 16-core results published on http://www.spec.org as of Feb 14, 2007. IBM System p5 560Q (1.8 GHz) 16-core (8 chips, 32 threads) SPEC OMPM2001 result running RHEL 5 of 45,895 SPECompMpeak2001 vs. HP AlphaServer 16-core (16 chips, 16 threads) SPEC OMP2001 result of 25,932 SPECompMpeak2001 vs. Sun V890 16-core (8 chips, 16 threads) SPEC OMP2001 result of 14,789 SPECompMpeak2001. Source: http://www.spec.org/

16 responses to “Winning benchmarks before it’s released:
Red Hat Enterprise Linux 5”

  1. Julian Yap says:

    Nick, I couldn’t find the link to RHEL results on the SPEC OMP Results page.

  2. Georgi says:

    How about comparing the RHEL5 and AIX on the exact same hardware.

  3. Allan says:

    Where can we learn more about the compilers, libraries, etc. mentioned? The RHEL section on this site doesn’t say much (other than the price for the HPC subscriptions - guys, please get your priorities straight).

  4. Stefan says:

    I fully agree with the previous comment that these numbers are rather meaningless if you don’t provide any facts about what was compared. Some details would help: was the improvement due to GCC optimizations or kernel-level changes? How about other compilers? How about Xeon or Opteron platforms?

  5. Otheus says:

    I want to run these SPECs on our 16-core Sun x4600, and several times I’ve emailed SPEC, but I never hear back from them. Does anyone know how to get a hold of these benchmark suites?

  6. Nick Carr says:

    The result was submitted to SPEC.org on 2/14. However, there’s a 2 week SPEC review cycle before approved results are posted on the web. So we expect the results to be approved and go public on Feb. 28 or March 1.

    Once it’s at spec.org you will be able to find full details about the benchmark environment.

  7. Sigurd Urdahl says:

    Congrats on the big number!

    But, isn’t the comparison you do quite useless? I’d assume comparing 16-way systems that are completely different is a lot like comparing the horsepower rating of a four cylinder motorcycle engine (e.g. Mazda OHV 586 outputting 28HP) and a four cylinder Porsche engine (e.g. Porsche 968 outputting 236 HP) and claiming that it’s impressive with the 843% difference in horsepower? (numbers stolen off wikipedia)

    After a quick glance at the spec.org site it seems that the interesting comparison would be with IBM System p5 575 system from March 2006 that you beat with around 1.3%. I’m looking forward to comparing the specifications from your benchmark with that of IBM’s when the review cycle is over. Hopefully that will actually compare OSes on the same hardware.

  8. Fred Jones says:

    Sounds like the New spec is far from an apples to apples config with the prior Solaris record.. (older HW.. they should redo the same test independently on the latest Sun HW w Solaris .. vs. IBM / RHEL .. that would be one that I don’t feel would be swayed towards linux)
    .. I’m a perf expert and have seen Solaris 10u3 on Sun’s latest HW.. blow the socks off RHEL… We need Fair and Accurate comparisons.. where’s the data ?

  9. Riiight... says:

    “I’m a perf expert and have seen Solaris 10u3 on Sun’s latest HW.. blow the socks off RHEL… We need Fair and Accurate comparisons.. where’s the data ?”

    And we take your word on that why?

  10. Bill Buros says:

    Technically, with SPEC.org claims you’re supposed to provide the footnotes and the specific comparisons. I’ve copied that from the IBM press release which announced the new system.

    IBM System p5 560Q (1.8 GHz)16-core (8 chips, 32 threads) SPEC OMPM2001 result running RHEL 5 of 45,895 SPECompMpeak2001 vs. SGI Altix 16-core (16 chips, 16 threads) SPEC OMPM2001 result running SGI Propack of 25,789 SPECompMpeak2001 vs. HP AlphaServer 16-core (16 chips, 16 threads) SPEC OMP2001 result of 25,932 SPECompMpeak2001 vs. Sun V890 16-core (8 chips, 16 threads) SPEC OMP2001 result of 14,789 SPECompMpeak2001. Source: http://www.spec.org/

    All results current as of 2/14/07. IBM SPEC results were submitted to SPEC on 2/14/07.

    As Nick noted earlier, the results are still under the normal review cycle.

    RHEL 5 is a very good performance base for HPC type workloads like SPECompM2001 on the Power 5+ systems. RHEL 5’s kernel uses the 64KB memory pages available on these systems, which provides easy performance improvements with no application source code changes or kernel re-compiles.

    The IBM compilers were used for the result, which are the same compilers used by AIX on Power systems.

  11. Bill Buros says:

    Technically, when making SPEC references a footnote of the
    comparisons being made is provided. So I’ve copied the text from
    the IBM press release for the new system. The intent is to compare
    16-core systems.

    IBM System p5 560Q (1.8 GHz)16-core (8 chips, 32 threads)
    SPEC OMPM2001 result running RHEL 5 of 45,895 SPECompMpeak2001
    vs. SGI Altix 16-core (16 chips, 16 threads) SPEC OMPM2001
    result running SGI Propack of 25,789 SPECompMpeak2001
    vs. HP AlphaServer 16-core (16 chips, 16 threads) SPEC OMP2001
    result of 25,932 SPECompMpeak2001
    vs. Sun V890 16-core (8 chips, 16 threads) SPEC OMP2001
    result of 14,789 SPECompMpeak2001.
    Source: http://www.spec.org/

    All results current as of 2/14/07.
    IBM SPEC results were submitted to SPEC on 2/14/07.
    As Nick pointed out earlier, the submission is under review.

    The RHEL 5 release is ideal for performance for Power with its
    new support of 64KB memory page sizes which is built into the
    kernel and operating system. It allows HPC applications like
    SPEComp2001 to show pretty good performance with no application
    or kernel changes.

    The compilers used were the IBM compilers, which have the same
    base as the compilers AIX uses for Power systems.

  12. Big Walt says:

    Looking fwd to RHEL5 - thought it was due out on the 28 Feb. :)

  13. Sigurd Urdahl says:

    The results are available on spec.org now:-)

    The system that I find most interesting to compare with is IBM’s AIX-based benchmark from 2006Q1. That system should be quite comparable to the Redhat one. Unfortunately the result page for IBM [1] and the config page [2] are inconsistent as to what kind of CPU frequency the tested system had (1900 MHz vs 2200 MHz). I have sent an email to spec-org about this.

    The 560Q with Redhat should IMHO be quite comparable to a 575 with AIX architecture wise, and thus the results should be comparable in a meaningful way (at least more so than comparing with e.g a 16-way Sparc-based system). I’d expect (but I’m no expert here) that most of the differences between them should be due to OS and CPU frequency.

    Without taking hardware differences between the p575 and the p560Q into consideration the former with AIX beats the latter with RHEL5. RHEL5/p560Q get 82% of AIX/p575 score in SPECompMpeak2001 and 78% in SPECompMbase2001. If the p575 was run with 2200 MHz CPUs the difference in results is the same as the difference in CPU speed (82%). I’d really love to see this done with the exact same hardware…

    Bill: Isn’t comparing a 16-way Power5+ system with a 16-way Sparc system a lot like the comparison of the four cylinder engines, technically speaking? I can see that it makes sense in a marketing perspective, but to me it really feels like comparing apples and toasters. How comparable are the p575 and the p560Q?

    But by all means, I look forward to testing RHEL5 on one of our p720’s, I’m confident that there are performance enhancing features in RHEL5 and I congratulate Redhat on releasing it.

    [1] http://www.spec.org/omp/results/res2006q1/omp2001-20060213-00211.html

    [2]
    http://www.spec.org/omp/results/res2006q1/omp2001-20060213-00211.cfg

  15. Bill Buros says:

    imho, the p5 575 is a sweeter (for HPC) hardware implementation designed more for the HPC world and customers buying racks of those systems. The p5 575 referenced above is really a 1.90Ghz system (as specified in the html file).

    The updated p5 560Q seems more general purpose and would be a better choice for a mix of workloads, including HPC if desired, and is very popular for the virtualized customer environments.

    RHEL 5 does very nicely on both systems. On a peak level, AIX and RHEL 5 have very similar performance characteristics when we run them side by side for customer assessments. The base level differences are due to some differences in the approach used for the base runs.

    The SPEC.org cfg files for the older benchmarks are misleading with respect to the comments content (anything that doesn’t control the compilation and the run). The published cfg file is literally what was used to test with. For submissions, companies are allowed (actually, have) to edit the comments to reflect what was really run and to clarify details. This allows the runs to be made independently of the “polish” clarifications needed for publishing. So the html and pdf files are the “correct” content.

  16. JA says:

    About the claim that the IBM RHEL5 result is 210% faster than the Sun Solaris result:

    IBM: 45,895 (IBM p560Q) for 32 threads, 16 cores. Cost of the system, according to the result located at notesbench, is $161K.

    Sun: 21,167 (Sun Fire x4600) for 8 threads, 8 cores. Cost of the system according to store.sun.com for $25K.

    So, for 1/2 the performance, and 1/4 the parallelism complexity at less than 1/6th the cost, Sun is clearly the better choice.
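
The ratios in the comment above can be reproduced from the figures it quotes. This is a rough sketch only: the scores, thread counts and prices are taken as stated in the comment and are not independently verified, and the helper name `ratio` is illustrative.

```python
# Figures exactly as quoted in the comment; none are independently verified.
IBM = {"score": 45_895, "threads": 32, "price_usd": 161_000}
SUN = {"score": 21_167, "threads": 8, "price_usd": 25_000}


def ratio(key: str) -> float:
    """Return the Sun value divided by the IBM value for the given field."""
    return SUN[key] / IBM[key]


if __name__ == "__main__":
    print(f"performance ratio: {ratio('score'):.2f}")      # roughly 1/2
    print(f"thread ratio:      {ratio('threads'):.2f}")    # 1/4
    print(f"price ratio:       {ratio('price_usd'):.3f}")  # under 1/6
    for name, system in (("IBM", IBM), ("Sun", SUN)):
        per_kusd = system["score"] / (system["price_usd"] / 1000)
        print(f"{name}: {per_kusd:.0f} points per $1,000")
```

Note that this compares a 16-core result with an 8-core one, so it speaks to price/performance rather than to the 16-core ranking the article describes.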
