EV7 AlphaServers Deliver Enhanced RAS and Powerful Performance






June 2003

A D.H. Brown Associates, Inc. White Paper Prepared for

Hewlett-Packard


This document is copyrighted by D.H. Brown Associates, Inc. (DHBA) and is protected by U.S. and international copyright laws and conventions. This document may not be copied, reproduced, stored in a retrieval system, transmitted in any form, posted on a public or private website or bulletin board, or sublicensed to a third party without the written consent of DHBA. No copyright may be obscured or removed from the paper. D.H. Brown Associates, Inc. and DHBA are trademarks of D.H. Brown Associates, Inc. All trademarks and registered marks of products and companies referred to in this paper are protected.

This document was developed on the basis of information and sources believed to be reliable. This document is to be used “as is.” DHBA makes no guarantees or representations regarding, and shall have no liability for the accuracy of, data, subject matter, quality, or timeliness of the content. The data contained in this document are subject to change. DHBA accepts no responsibility to inform the reader of changes in the data. In addition, DHBA may change its view of the products, services, and companies described in this document.

DHBA accepts no responsibility for decisions made on the basis of information contained herein, nor from the reader’s attempts to duplicate performance results or other outcomes. Nor can the paper be used to predict future values or performance levels. This document may not be used to create an endorsement for products and services discussed in the paper or for other products and services offered by the vendors discussed.


TABLE OF CONTENTS

   
     
EXECUTIVE SUMMARY
INTRODUCTION
KEY RAS IMPROVEMENTS
   ON-CHIP L2
   RAID MEMORY
   SWITCHLESS INTERCONNECT
      Figure 1: System Interconnects - EV68 GS160 and EV7 Marvel
   SYSTEM CLOCK
   I/O HARDWARE
   POWER AND COOLING INFRASTRUCTURE
   SOFTWARE FEATURES ENHANCING PLATFORM RAS
FINAL THOUGHTS



EXECUTIVE SUMMARY

Launched in early 2003, the AlphaServer ES47, ES80, and GS1280 systems represent the latest generation of high-performance servers to satisfy the demanding requirements of Tru64 and OpenVMS customers. Known under the project code name “Marvel” while under development first at Digital Equipment, then Compaq, and now Hewlett-Packard (HP), these systems offer the top-notch performance that AlphaServer customers have come to expect. Every bit as important, these new servers feature enhanced RAS (Reliability, Availability, Serviceability) capabilities that position them to be the centerpiece of business-critical computing.

Historically, the blazing performance leadership of the Alpha processor proved irresistible to customers dealing with compute-intensive workloads, especially the high-performance technical computing (HPTC) community. The new EV7 systems do not disappoint this constituency – improved memory and I/O bandwidths enable the Alpha processor to deliver impressive performance. Particularly noteworthy, the new servers achieve powerful performance with an innovative system design that enhances their RAS characteristics.

By implementing switchless system interconnect, L2 cache, and memory controllers on the EV7 die, the AlphaServer developers dramatically reduced the number of components and connections that could contribute to failure. That architectural simplification is estimated to extend the average interval between failures by as much as 30%. Innovative features such as RAID memory and memory troller background correction help avoid uncorrectable memory errors even as the amount of memory grows very large. Error Correcting Codes (ECC) are widely used to protect data paths – HP indicates that over 90% of the EV7’s signal pins are covered by ECC or parity, and over 90% of the chip’s circuitry is protected by ECC. Extensive use of N+1 and hot-plug capabilities in the power and cooling subsystem helps to minimize failures of this underlying infrastructure. On top of the hardware features, enhancements in Tru64 UNIX and OpenVMS support multi-path I/O and predictive failure analysis.

Copyright 2003 D.H. Brown Associates, Inc.


INTRODUCTION

Although certainly not immune to the effects of the economic environment, demand for computing continues to grow. In fact, computing plays an increasingly critical role for technical users and mainstream business customers. Engineering, scientific, and academic developers and researchers look to take advantage of improved price/performance to attack larger, more complex problems. Enterprises seek a differentiated edge over their competitors, for example, through increased productivity or enhanced customer relationship applications. Additionally, consolidation of distributed servers into an efficiently managed, centralized infrastructure further drives the need for larger, more powerful systems.

Reflecting the ever-growing demand for computing power, the bragging rights of benchmark leadership often dominate the IT headlines. Performance is significant, for both HPTC and enterprise customers. At the same time, these critical computing resources must remain up and running to deliver that performance. Customers often take for granted that their systems will be robust and resilient. And indeed, those assumptions are well founded; today’s platforms are highly reliable workhorses. However, such dependable servers do not “just happen.” They result from a well-planned design that often remains unrecognized. This paper looks at some of the “behind-the-headlines” design decisions that assure the latest AlphaServers will meet increasing customer expectations for robust and resilient computing.

Let us begin by placing server failures into perspective. Customers recognize that most causes of unavailability can be traced to human operational errors. While the proportion of human-induced failures varies considerably, many customers feel about 60% of their failures can be attributed to human error. Some are just plain mistakes that should have been easily avoidable. Others involve confused decisions stemming from unclear or complex operational procedures. Part of the solution can involve automating some of the routine procedures. But because so much is not merely routine, detailed procedures and objectives need to be documented clearly and followed by comprehensive staff training. While vendors can guide customers in this process, responsibility for reducing operational errors lies with the customer.

The remaining 40% or so of failures typically splits fairly evenly between software and hardware. Many of the software failures relate to integration incompatibilities or change management issues. Since most customers manage somewhat unique workloads, resolution falls primarily on the customer. The customer needs a disciplined approach involving thorough stress testing of applications under conditions that represent production environments, as well as a clear process for regression testing of patches and fixes before applying them to the production environment. Hardware failures represent the remaining 20%. While not a large share, for the most part these problem areas can be addressed by the hardware vendor, rather than by the end user.


In addition to designing for high performance, the Marvel development team was chartered to enhance the RAS characteristics of this new generation of AlphaServers. The developers examined predecessor AlphaServer systems to understand field failure scenarios, and reviewed failure rate statistics of the underlying components.

 

Performance of the New AlphaServers

Announced in January 2003, the 1.0 GHz EV7 Marvel midrange includes the up-to-four-way ES47 and the up-to-eight-way ES80. At the high end, the 1.15 GHz GS1280 currently ships in eight- and sixteen-processor configurations. The GS1280 is the new flagship of the AlphaServer line.

Building on a long-standing reputation for high performance, the EV7-powered AlphaServers can be expected to deliver competitive performance even in the face of increased pressure from other recently updated chips. Benchmarks have been reported for the new AlphaServer systems that illustrate performance leadership across a broad range of applications. Compared to predecessor EV68-based systems, the EV7 servers are projected to offer performance gains of 35% to 50% or more under Tru64 UNIX. OpenVMS users are expected to see even greater performance gains. In general, larger configurations benefit more from the high system and memory bandwidth coupled with the low access latency of the switchless mesh interconnect. Thus, larger GS1280 configurations deliver even more performance relative to other systems.

On the Oracle 11i Application Standard Benchmark, an eight-way GS1280 measured 7728 users, which leads all other eight-way systems. The GS1280’s SPECint_rate/SPECfp_rate of 536/313 currently leads all other measured 32-processor systems. In the STREAM memory bandwidth benchmark, the GS1280 delivers the best memory bandwidth for non-vector 16-processor systems, and far exceeds the memory bandwidth of other non-supercomputer RISC-based systems in configurations containing twice as many processors. On the SAP SD 2-Tier Standard Application Benchmark, a 32-way GS1280 beat all other 32-way systems. Reported results for OpenVMS customers in telecommunications and healthcare show performance gains of 100% over previous-generation AlphaServer systems.

In the ever-leapfrogging benchmarking race, others may slip ahead of the EV7 AlphaServer systems at some point. One point is clear – the new AlphaServers yield impressive memory and I/O bandwidths as well as lower latencies, characteristics that will deliver strong performance for both technical and commercial computing environments.

 

 


 

The resulting EV7-based AlphaServers rely on a simple, straightforward architecture that reduces chip count to inherently improve reliability. Supporting infrastructure, such as power supplies and fans, has also been redesigned to improve RAS. AlphaServer developers estimate that Mean Time Between Failure (MTBF) has been improved by 15% to 30% thanks to the simplified architecture. Furthermore, single system availability has been extended, since virtually all hardware failures are backed up by redundant components. In addition, many of the components with the lowest MTBF – power supplies, fans, and management system service processors – can be hot-swapped while the system continues to run. Details of the enhancements form the subject of this paper.

Even with increased efforts to reduce outages, unexpected failures caused by hardware, software, or personnel will occasionally result in unplanned downtime. Failover clustering remains an important option for continued application availability. Single-system image clustering, as found in TruCluster Server from HP, dramatically simplifies cluster management. Because the cluster is managed as a single system, its complexity drops and the opportunities for human error are greatly reduced. Note, however, that clustering is not true fault tolerance, and application users may still suffer some disruption before the backup system takes over fully. Improved single-system RAS can reduce the number of occasions on which failover clustering is called upon. That is, enhanced single-system RAS complements failover clustering. The combination of a robust and resilient single system and failover clustering provides a solid, dependable computing environment.


KEY RAS IMPROVEMENTS

EV68 Alpha chips use 15.2 million circuits; the EV7 chips contain over 150 million circuits. Certainly a primary goal of the Alpha designers was to employ the additional circuits to drive performance higher. At the same time, using those circuits in a way that simplified overall system design to deliver a highly available, dependable platform stood out as a key issue. The descriptions below highlight how the AlphaServer team enhanced RAS at the same time it created the high-performance server sought by its customers. While not an exhaustive list of all RAS features, the breadth of areas described indicates the concerted drive to create a powerful and reliable server.

ON-CHIP L2

The EV68 processing core remains a highly respected, high-performance compute engine. Alpha designers wished to preserve that core so that applications would not need the re-optimization that typically accompanies the introduction of a new core design. Since the processor core was not limiting performance, the Alpha team addressed the task of feeding data to the core fast enough, instead of implementing a new core. EV68 employed an off-chip Level 2 (L2) cache, a common design choice when the EV6 was first designed. Exploiting the larger circuit capacity of the EV7, Alpha designers brought L2 on-chip, substantially boosting performance thanks to the low latencies of an on-chip cache. Equally important, the elimination of off-chip L2 memory chips, chip carriers, and sockets dramatically reduces the number of parts and interconnections, with a corresponding reduction in possible failures. As with the L1 caches, the on-chip L2 is covered by ECC that corrects single-bit failures and detects double-bit failures.

RAID MEMORY

The Alpha designers also brought the control circuitry for addressing main memory onto the EV7 chip. Once again performance improves: the on-chip memory controllers allow much lower memory access times and higher memory bandwidths than an external memory controller. In addition, reliability is enhanced: the elimination of external chips and interconnections also removes potential points of failure. Furthermore, the EV7 team added another layer of hardware error correction beyond ECC: a RAID memory option.

In most servers today, memory is protected by means of a Single Error Correct, Double Error Detect (SECDED) ECC scheme. Memory error rates are fairly low and SECDED is usually adequate, since it is reasonably rare that more than one error will occur in the same word of memory. However, high-end servers can have vast amounts of memory installed, hundreds of gigabytes today, soon stretching into the range of 512 GB - 1 TB. Although the probability that an individual memory location fails may be tiny, when trillions of bits are considered, the chance of a double-bit error somewhere raises a concern.
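The scaling argument can be sketched numerically. The example below assumes independent bit errors, a 72-bit ECC word (64 data bits plus 8 check bits) and an arbitrary per-bit error probability over some exposure window; none of these figures comes from the paper, and the function names are hypothetical.

```python
import math


def p_word_multi_error(p_bit: float, bits_per_word: int = 72) -> float:
    """Probability that a single word holds two or more bit errors.

    Summing the binomial terms for k >= 2 directly avoids the cancellation
    that 1 - P(0 errors) - P(1 error) would suffer for very small p_bit.
    """
    return sum(
        math.comb(bits_per_word, k)
        * p_bit**k
        * (1.0 - p_bit) ** (bits_per_word - k)
        for k in range(2, bits_per_word + 1)
    )


def p_any_multi_error(memory_bytes: int, p_bit: float,
                      bits_per_word: int = 72,
                      data_bytes_per_word: int = 8) -> float:
    """Probability that at least one word in memory has 2+ bit errors."""
    words = memory_bytes // data_bytes_per_word
    q = p_word_multi_error(p_bit, bits_per_word)
    # 1 - (1 - q)**words, computed stably for tiny q and huge word counts.
    return -math.expm1(words * math.log1p(-q))


GIB = 1 << 30
small = p_any_multi_error(64 * GIB, p_bit=1e-9)    # illustrative rate only
large = p_any_multi_error(1024 * GIB, p_bit=1e-9)
print(f"64 GiB: {small:.2e}   1 TiB: {large:.2e}")
```

Under these assumptions the per-word risk is unchanged, yet the system-wide risk grows almost linearly with installed memory, which is the concern the paragraph above raises.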


Most memory errors are soft errors, the inadvertent loss of a data bit due to electrical interference, including cosmic rays. The memory circuits themselves remain fully functional and can be reused to hold data. In contrast, a hard error signifies permanent failure of the memory circuitry for that bit. If a soft error can be corrected before another error might occur in the same memory word, most double-bit errors can be avoided. Tru64 UNIX features a memory “troller” mechanism that proactively scans AlphaServer memory for single-bit errors, allowing the hardware to automatically correct them using ECC. By cleaning up these intermittent soft errors, the troller significantly reduces the probability that two errors will eventually appear in the same memory word. By keeping track of where single-bit errors were located, the troller can also identify memory locations suffering repeated failures, allowing the operating system to avoid using that memory before the fault turns into a permanent, hard failure.
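The two roles of the troller described above, clearing single-bit errors and tracking repeat offenders, can be modeled in a few lines. This is a simplified sketch and not the Tru64 implementation: the `Troller` class, the `suspect_threshold` parameter and the representation of memory as a map from word address to the set of flipped bit positions are all assumptions.

```python
from collections import Counter


class Troller:
    """Toy model of a background memory scrubber."""

    def __init__(self, suspect_threshold: int = 3):
        self.hits = Counter()                  # corrected errors per address
        self.suspect_threshold = suspect_threshold

    def scrub(self, error_bits: dict[int, set[int]]) -> list[int]:
        """Scan every word once.

        Words with exactly one flipped bit are repaired (the set is cleared,
        standing in for a hardware ECC correction) and counted. Words with
        two or more flipped bits are beyond SECDED and are returned.
        """
        uncorrectable = []
        for addr, flipped in error_bits.items():
            if len(flipped) == 1:
                flipped.clear()
                self.hits[addr] += 1
            elif len(flipped) >= 2:
                uncorrectable.append(addr)
        return uncorrectable

    def suspects(self) -> list[int]:
        """Addresses corrected often enough to be worth retiring."""
        return sorted(addr for addr, count in self.hits.items()
                      if count >= self.suspect_threshold)
```

In this model, a word that is scrubbed between two separate bit flips never accumulates a double-bit error, while a word that flips twice between scans does.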

Once in a while, hard errors do occur, and may even disable multiple memory bits. For those occasions, something beyond SECDED ECC or memory troller is called for. To address such problems, the EV7 on-chip memory controller incorporates a provision for RAID memory.

Similar to a RAID disk configuration, the RAID memory option adds a redundant-memory RIMM (RDRAM In-line Memory Module) that can be used to correct for multiple failures within one of the other RIMMs. The base EV7 AlphaServer memory configuration spreads data across four RIMMs per memory port. RAID memory adds a fifth RIMM that can be used to recreate the data from any one of the other four RIMMs. The memory controller uses the standard memory ECC to identify which RIMM has failed, and passes reconstructed, correct data along to the processing core. Since the entire bad RIMM is being bypassed, maintenance can be scheduled at a time convenient to the customer’s operational and service-level agreement requirements.
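The four-plus-one arrangement can be illustrated with simple XOR parity, as used in some RAID disk levels. The paper does not state which redundancy code the EV7 memory controller actually uses, so XOR parity here is an assumption, as are the function names and the byte-string model of a RIMM's contents.

```python
from functools import reduce
from operator import xor


def parity(stripes: list[bytes]) -> bytes:
    """Byte-wise XOR across equally sized stripes."""
    return bytes(reduce(xor, column) for column in zip(*stripes))


def reconstruct(stripes: list[bytes], parity_stripe: bytes,
                failed_index: int) -> bytes:
    """Rebuild the stripe at failed_index from the survivors plus parity."""
    survivors = [s for i, s in enumerate(stripes) if i != failed_index]
    return parity(survivors + [parity_stripe])


# Four data RIMMs plus one redundant RIMM holding their parity.
rimms = [b"\x01\x02", b"\x10\x20", b"\xaa\x55", b"\xff\x00"]
redundant = parity(rimms)
# Pretend RIMM 2 has failed; its contents are ignored and rebuilt.
assert reconstruct(rimms, redundant, failed_index=2) == rimms[2]
```

Parity alone cannot say which module is bad. That matches the description above, in which the standard memory ECC identifies the failed RIMM and the redundant RIMM supplies the data needed to rebuild it.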

SWITCHLESS INTERCONNECT

A major challenge to system developers is to design a scalable interconnect that efficiently extends to large configurations but that does not penalize smaller systems with the costs of an elaborate interconnect infrastructure. Marvel designers addressed this challenge by integrating the processor interconnect within the EV7 chip. Each EV7 chip contains an internal router that directly connects the processor core to local memory, a high-performance I/O bus, and four connections to the routers on adjacent EV7 chips, all as passive interconnect without the need for additional external logic chips.

Figure 1 illustrates sixteen-processor configurations of the EV68 GS160 system, and the EV7 “Marvel” GS1280. As the left side of the figure shows, the EV68 systems employed a hierarchy of switches to interconnect processors, memory, and I/O. The right side highlights the Marvel design, which does not require any external active electronic switches for system interconnect. Every EV7 processor brings its own portion of the interconnect. A four-processor Marvel configuration (one row of processors in the figure) could easily expand to 64-processor systems by adding rows and columns of processors to the mesh. (To be accurate, the left and right edges of the mesh connect to each other, as do the top and bottom edges. Thus, with the edges wrapped around, the EV7 interconnect is properly called a 2D torus topology, or donut in more conventional terms.) Switch-based systems typically cover a narrower range of configuration sizes since switches cannot easily be expanded in small increments. For example, the GS160 illustrated in Figure 1 carries the cost of the full Global Switch needed to support a 32-processor GS320. 64-processor EV68 systems are not offered since a switch to interconnect twice as many processors would require far more circuitry.
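The wrap-around connections are what distinguish a 2D torus from a plain mesh. As a small sketch under the assumption that processors are addressed by (row, column) coordinates, which the paper does not specify, the four neighbors of a node and the shortest hop count between two nodes can be computed as follows. The function names are hypothetical.

```python
def torus_neighbors(row: int, col: int,
                    rows: int, cols: int) -> list[tuple[int, int]]:
    """Return the up, down, left and right neighbors with wrap-around.

    In grids with only one or two rows (or columns), some of these
    entries coincide.
    """
    return [
        ((row - 1) % rows, col),
        ((row + 1) % rows, col),
        (row, (col - 1) % cols),
        (row, (col + 1) % cols),
    ]


def torus_hops(a: tuple[int, int], b: tuple[int, int],
               rows: int, cols: int) -> int:
    """Minimum number of links between nodes a and b on the torus."""
    dr = abs(a[0] - b[0])
    dc = abs(a[1] - b[1])
    # In each dimension, go whichever way around the ring is shorter.
    return min(dr, rows - dr) + min(dc, cols - dc)


# On a 4x4 grid, opposite corners are 6 hops apart in a plain mesh
# but only 2 hops apart once the edges wrap around.
print(torus_neighbors(0, 0, 4, 4))
print(torus_hops((0, 0), (3, 3), 4, 4))
```

Adding a row or a column changes only `rows` or `cols`; every node keeps exactly four links, which is why the design scales in small increments.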

FIGURE 1: System Interconnects - EV68 GS160 and EV7 Marvel

 

Not only does the EV7 internal router permit efficient scalability from small to large configurations, it also improves performance, since the internal router circuitry operates faster than the external switches. In addition, the elimination of local and global switches vastly reduces the number of chips and connections that carry inherent failure rates.

The Marvel interconnect is protected by SECDED ECC to recover from intermittent bit failures. The ECC correction is applied “on the fly,” so as not to affect overall system interconnect bandwidth or latency.

ECC is intended to overcome individual bit errors and cannot correct for permanent interconnect failures. Since the GS160/320 switches did not contain redundant paths, a switch failure could isolate processors and memory, necessitating a switch repair to regain full use of the system. Even hard partitions on EV68 systems could all fail together, if the switch failed. On the other hand, failure of an interconnect link in Marvel systems may cause a crash in the affected hard partition, but nowhere else in the system. Upon reboot, Marvel’s 2D torus automatically reroutes around defects.

As part of normal operation, Marvel’s interconnect network monitors traffic on each of the links to detect congestion hot spots. If the network seems overly busy along a particular path because of a large number of data packets, subsequent packets of data will be adaptively routed to their final destination along an alternate set of interconnect paths. An actual router failure will be detected by the same network traffic monitor, which will then reroute around the failure during reboot. [1]

The failure of the computing core of a processor does not require the router portion of the EV7 to be taken out of service. Rather, the core processor functionality can be disabled while the memory, I/O connections, and interconnect links remain in use. In that way, other processors in the system can continue to access that memory and I/O, and the network can go on routing packets along the paths passing through that chip. Similarly, a failure in the I/O access circuitry can be isolated to permit memory and interconnect to remain accessible. [2]

SYSTEM CLOCK

Typical large systems require elaborate clock distribution circuitry to guarantee that clock signals are precisely aligned across the entire system. The interconnect mechanism of the EV7 router function has been designed to tolerate misalignment of clock signals between the EV7 chips. (In technical terms, the EV7 uses pseudo-synchronous clocking to transfer data between chips using clock forwarding.) While clock distribution failures are uncommon, the use of clock forwarding illustrates that the EV7 team devised simple, straightforward designs. They did not rely on complex techniques that might carry higher failure rates.

In essence, each EV7 processor incorporates its own clock that supports the EV7 processor, interconnect, memory, and I/O. The predecessor EV68 systems used a single clock for the entire system, albeit implemented with extremely reliable components. Because the GS160/GS320 clocking used very low failure rate parts, with a mean time between failures calculated in decades, there were no customer complaints regarding the clocking. Nonetheless, for Marvel, the AlphaServer team took the extra step of replicating clocks for simplicity of design and enhanced availability.

[1] When the system reboots, firmware performs an interconnect integrity test. If that test fails then the system will map around the failed link. If the interconnect integrity test passes, then the link is used.

[2] If the system reboots and the core fails self-test, then its memory and I/O will be unavailable. If the core fails after the system is up and running, and the indictment software determines it is safe to off-line the core and leave the system running, then the I/O and memory will be accessible. If the core fails in a more catastrophic manner, then it will crash the system, and more than likely all of its memory and I/O will not be accessible on reboot.


I/O HARDWARE

The EV7 chip was not alone in taking advantage of large circuit count to absorb functions previously contained on external chips. The I/O chip “IO7” ASIC (Application-Specific Integrated Circuit) integrates the functionality previously contained on eight separate chips in the GS160/GS320 implementation. Marvel also added the functionality of AGP support in this chip, which the EV68 systems did not implement. Once again, fewer chips and fewer chip interconnections can improve reliability dramatically.

Each IO7 chip drives a set of paths to remote I/O drawers containing PCI/PCI-X and AGP slots. ECC protects all I/O data paths within the drawers as well as those connecting to IO7 chips. The I/O drawers offer hot-plug capability so the I/O adapter cards can be repaired or replaced without taking the system down.

POWER AND COOLING INFRASTRUCTURE

As discussed above, the scalable interconnect of the EV7 allows the same design point to serve low-end through high-end configurations. While the electronics remain common across systems, there is an added advantage: the power distribution and heat removal technologies can also be common across the product line.

Power supplies, voltage regulators, and fans typically carry higher failure rates than the logic circuitry. Specifying components to achieve high reliability at reasonable cost involves a tricky set of tradeoffs. (Money spent on power and cooling does not increase the benchmark performance and only raises overall system price.) Deploying a common power/cooling technology across low- to high-end systems can increase procurement volumes enough to drive down costs even when specifying premium components. For example, AlphaServer developers indicate that a common design point allowed them to employ a more reliable power supply at an attractive price. Compared with the more usual practice of selecting a different power supply for each model in the product line, the AlphaServer team chose the cost-effective route.

All Marvel systems, from two processors to sixty-four processors, possess redundant or N+1 power supplies, voltage regulator modules (VRM), and fans. Should one of these infrastructure components fail, the system continues to run. Except for VRMs, hot-swap capabilities allow the failed unit to be removed and replaced while the system continues to operate. Dual AC input allows configuring a second AC power source as backup.
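The benefit of N+1 redundancy can be estimated with a short calculation. The sketch below assumes units fail independently and uses an arbitrary per-unit availability of 0.99; neither assumption comes from the paper, and the function name is hypothetical.

```python
import math


def p_at_least(required: int, installed: int, p_unit_up: float) -> float:
    """Probability that at least `required` of `installed` units are working,
    given each unit is independently up with probability p_unit_up."""
    return sum(
        math.comb(installed, k)
        * p_unit_up**k
        * (1.0 - p_unit_up) ** (installed - k)
        for k in range(required, installed + 1)
    )


# A subsystem that needs three power supplies to carry the load.
exact_fit = p_at_least(3, 3, 0.99)   # N supplies, no spare
n_plus_1 = p_at_least(3, 4, 0.99)    # N+1 supplies
print(f"N: {exact_fit:.6f}   N+1: {n_plus_1:.6f}")
```

With these illustrative numbers, one spare unit cuts the chance of being short of power from roughly 3% to well under 0.1%. Hot-swap replacement improves matters further by shortening the time a failed unit stays out of service, which this simple model does not capture.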

In a similar vein to the single/dual clock design change, the EV68 systems employed a single high-performance blower with a calculated mean time between failures that far exceeded other server components. However, even a minuscule chance of failure concerned some customers since the GS160/GS320’s blower was not duplexed. Now, with a redundant, hot-plug fan assembly used across all Marvel systems, a cost-effective alternative removes any single point of failure concern.


SOFTWARE FEATURES ENHANCING PLATFORM RAS

The Tru64 memory troller was highlighted earlier. Most of the discussion so far has focused on hardware implementations specific to the Marvel platforms. There are additional software features, in both Server Management software and in the operating system, that complement the hardware to further enhance system RAS.

As mentioned earlier, a failure of part of the system interconnect 2D torus can be overcome by adaptive rerouting. Thus, I/O circuitry within an EV7 should not become isolated due to a system-interconnect failure. But if a critical I/O device is connected to only one IO7 controller, the device can be lost due to a failure in that controller. The preferred alternative is to configure multiple paths to critical I/O devices, each from a different EV7 I/O controller. Then, in case of failure, the Tru64 or OpenVMS operating system can reach the device through its Dynamic Multi-path I/O support.

In conjunction with the server management hardware embedded within the systems, Tru64 5.1B can monitor resources and track intermittent failures of processors, memory, disks, and various I/O adapters and devices. Recognizing that repeated failures may indicate an impending permanent, non-recoverable failure, the operating system can suspend further use of the suspect resource and log it for deferred diagnosis and/or maintenance.

FINAL THOUGHTS

Historically, Alpha processors were renowned for their fast clock rates delivering blazing floating-point performance. But AlphaServer developers understand that high performance derives not just from clock frequency or complex superscalar design; it also requires a robust cache/memory system to feed the insatiable appetite of high-performance processors. For Marvel, the AlphaServer design team focused on using EV7 circuits to provide high bandwidth and low latency paths to memory, I/O, and other processors in the system. Beyond boosting performance, the simple design also enhanced RAS by eliminating failure-prone chips and connections. In addition, N+1 and hot-plug power and cooling further enhance the dependability of the systems. This double win – higher performance and enhanced RAS – positions the EV7 AlphaServer as an attractive platform for existing Tru64 and OpenVMS customers.

In 2004, a fabrication shrink will allow the EV79 to achieve faster clock rates and corresponding higher performance. Since the EV79 processors can be added to EV7 systems, investing in Marvel today allows additional growth next year.

In the longer term, the AlphaServer road map entails an evolution to Itanium-based systems running HP-UX or OpenVMS. AlphaServer users should be planning their testing, pilot, and rollout scenarios and can begin that transition at any time. While such testing and piloting are underway, those customers want to ensure that their production workloads continue running on solid, dependable, high-performance platforms. The new EV7 AlphaServers are those platforms.
