The Power10 processor-based E1050 server introduces a new 4U-tall differential DIMM (DDIMM), which uses the new OpenCAPI Memory Interface (OMI) for resilient, fast communication with the processor. This memory subsystem design delivers the following resiliency features:
► Memory buffer: The DDIMM contains a memory buffer with key resiliency features, including protection of critical data and address flows by using cyclic redundancy check (CRC), error correction code (ECC), and parity; a maintenance engine for background memory scrubbing and memory diagnostics; and a Fault Isolation Register (FIR) structure, which enables firmware attention-based fault isolation and diagnostics.
► OMI: The OMI link between the memory buffer and the processor memory controller is protected by dynamic lane calibration and a CRC retry/recovery facility that retransmits lost frames so that the link survives intermittent bit flips. A complete lane failure can also be survived by triggering a dynamic lane reduction from eight lanes to four, independently for the upstream and downstream directions. A key advantage of OMI is that it reduces the number of critical signals that must cross connectors from processor to memory compared with a typical industry-standard DIMM design.
► Memory ECC: The DDIMM includes a robust 64-byte Memory ECC with 8-bit symbols, which can correct up to five symbol errors (one x4 chip and one more symbol) and retries on uncorrectable data and address errors.
► Dynamic row repair: To further extend the life of the DDIMM, the dynamic row repair feature can restore full use of a dynamic RAM (DRAM) for a fault that is contained to a DRAM row while the system continues to operate.
► Spare temperature sensors: Each DDIMM provides spare temperature sensors so that the failure of one sensor does not require a DDIMM replacement.
► Spare DRAMs: 4U DDIMMs include two spare x4 memory modules (DRAMs) per rank, which can be substituted for failed DRAMs during runtime operation. Combined with ECC correction, the two spares allow the 4U DDIMM to continue to function with three bad DRAMs per rank, compared with one (single device data correct) or two (double device data correct) bad DRAMs in a typical industry-standard DIMM design. This setup extends self-healing capabilities beyond what is provided by the dynamic row repair capability.
► Spare Power Management Integrated Circuits (PMICs): 4U DDIMMs include spare PMICs so that the failure of one PMIC does not require a DDIMM replacement.
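The CRC retry/recovery behavior described for the OMI link in the list above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the frame layout, the use of CRC-32 from `zlib`, the function names, and the retry budget are hypothetical and do not represent the actual OMI frame format or recovery protocol. It shows only the general idea that a receiver detects a corrupted frame by its CRC and the sender retransmits it.

```python
import zlib
from typing import Optional, Set, Tuple


def make_frame(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 to the payload (hypothetical frame layout)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def check_frame(frame: bytes) -> Optional[bytes]:
    """Return the payload if its CRC matches, otherwise None."""
    payload, received_crc = frame[:-4], int.from_bytes(frame[-4:], "big")
    return payload if zlib.crc32(payload) == received_crc else None


def flip_bit(frame: bytes, bit_index: int) -> bytes:
    """Simulate an intermittent single-bit flip on the link."""
    corrupted = bytearray(frame)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def send_with_retry(
    payload: bytes, corrupt_attempts: Set[int], max_attempts: int = 3
) -> Tuple[bytes, int]:
    """Retransmit the frame until the CRC check passes.

    corrupt_attempts lists the attempt numbers (starting at 1) on which
    the simulated link flips a bit. max_attempts is an assumed budget.
    """
    frame = make_frame(payload)
    for attempt in range(1, max_attempts + 1):
        on_wire = flip_bit(frame, 5) if attempt in corrupt_attempts else frame
        received = check_frame(on_wire)
        if received is not None:
            return received, attempt
    raise RuntimeError("link did not recover within the retry budget")


if __name__ == "__main__":
    data, attempts = send_with_retry(b"cache line data", corrupt_attempts={1})
    print(data, attempts)
```

In this sketch, the first transmission is corrupted and rejected, and the second transmission succeeds. A persistent fault exhausts the retry budget; on real hardware, a persistent lane fault is the kind of condition that the dynamic lane reduction described above is meant to handle.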
Note: DDIMMs are also available in a 2U form factor. These 2U DDIMMs are not supported in the Power E1050 server.
Figure 2-10 shows a 4U DDIMM with a plug that connects into an OMI slot.
Figure 2-10 4U DDIMM feature
Maximum memory and maximum theoretical memory bandwidth
The OMI physical interface enables low-latency, high-bandwidth, technology-neutral host memory semantics to the processor and allows attaching established and emerging memory elements. With the Power10 processor-based E1050 server, OMI initially supports one main-tier, low-latency, enterprise-grade DDR4 DDIMM per OMI link. This architecture yields a total of 16 DDIMMs per populated processor module (64 with all four modules populated).
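The DDIMM counts above translate directly into total memory capacity once a DDIMM size is chosen. The following Python sketch works through that arithmetic. The 16-DDIMMs-per-module and four-module figures come from the text; the function names and the 128 GB DDIMM size are illustrative assumptions, and the sketch does not model any minimum-configuration or plugging rules.

```python
DDIMMS_PER_SOCKET = 16  # from the text: 16 DDIMMs per populated processor module
MAX_SOCKETS = 4         # from the text: four processor modules when fully populated


def ddimm_count(populated_sockets: int) -> int:
    """Number of DDIMM positions available for the populated processor modules."""
    if not 1 <= populated_sockets <= MAX_SOCKETS:
        raise ValueError("populated_sockets must be between 1 and 4")
    return DDIMMS_PER_SOCKET * populated_sockets


def total_capacity_gb(ddimm_size_gb: int, populated_sockets: int) -> int:
    """Total capacity in GB, assuming every position holds a DDIMM of the same size."""
    return ddimm_count(populated_sockets) * ddimm_size_gb


if __name__ == "__main__":
    size_gb = 128  # illustrative DDIMM size, not taken from the text
    print("DDIMMs, fully populated:", ddimm_count(MAX_SOCKETS))
    print("GB per socket:", total_capacity_gb(size_gb, 1))
    print("GB, fully populated:", total_capacity_gb(size_gb, MAX_SOCKETS))
```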
The memory bandwidth and the total memory capacity depend on the DDIMM density and the associated DDIMM frequency that are configured for the Power E1050 server. Table 2-7 lists the maximum memory and memory bandwidth per populated socket and the maximum values for a fully populated server.
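The dependency of theoretical bandwidth on DDIMM frequency can be approximated with a short Python sketch. It rests on an assumption that the text does not state: that each DDIMM contributes the peak rate of a 64-bit DDR4 data path (8 bytes per transfer). The function name and the 3200 MT/s frequency are likewise illustrative, and the results can differ slightly from published figures because of rounding, so Table 2-7 remains the authoritative source.

```python
DDIMMS_PER_SOCKET = 16   # from the text: one DDIMM per OMI link, 16 per processor module
BYTES_PER_TRANSFER = 8   # assumption: 64-bit DDR4 data path per DDIMM


def peak_bandwidth_gb_per_s(transfers_mt_per_s: int, populated_sockets: int = 1) -> float:
    """Theoretical peak bandwidth in GB/s under the assumptions stated above."""
    ddimms = DDIMMS_PER_SOCKET * populated_sockets
    # MT/s * bytes per transfer = MB/s per DDIMM; divide by 1000 for GB/s.
    return transfers_mt_per_s * BYTES_PER_TRANSFER * ddimms / 1000


if __name__ == "__main__":
    frequency = 3200  # illustrative DDIMM frequency in MT/s, not taken from the text
    print("GB/s per socket:", peak_bandwidth_gb_per_s(frequency))
    print("GB/s, four sockets:", peak_bandwidth_gb_per_s(frequency, populated_sockets=4))
```

Running the DDIMMs at a lower frequency scales the result linearly, which is why the configured DDIMM frequency determines the maximum theoretical bandwidth.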
Table 2-7 Maximum theoretical memory and memory bandwidth for the Power E1050 server