
 

Memory Architectures vs bandwidth, power and price (Courtesy of TSMC)

 

From Phil Garrou for Yole Développement - Sept. 29, 2014. Emerging DRAM architectures offer different approaches to bandwidth (BW), power consumption and density, so iMicronews thought a comparison of the technologies was worth... A Closer Look.

The challenges for DRAM are to: (1) reduce power consumption, (2) satisfy bandwidth requirements and (3) satisfy density (miniaturization) requirements, all while maintaining low cost. Applications are evolving with different demands on these basic requirements. For example, graphics in a smartphone may require a bandwidth of 15GB/sec, which LPDDR4 can meet, but a networking router may require 300GB/sec, which would require one of the new high bandwidth technologies such as HMC or HBM.

DRAM bandwidth (BW) is primarily limited by I/O speed, and increasing I/O speed carries power, cost, and area penalties. For portable devices, increased power means higher temperatures and shorter battery life. Memory is also known to be the biggest consumer of power in server farms; hence the requirement, in portable devices as well as networking and server applications, for low power memory solutions.

Many expect the current DDR SDRAMs, both the compute variety (DDR3 / DDR4) and the mobile variety (LPDDR3/LPDDR4) to reach the end of their road soon as the DDR interface cannot be expected to run at data rates higher than 3.2 Gbps in a traditional computer main memory environment (more).

Emerging DRAM technologies such as wide I/O, HMC and HBM are being optimized for different applications and present different approaches to the bandwidth, power, and area challenges. The common element of HMC, HBM and Wide IO is 3D technology, i.e. stacked chips, interposers and TSVs.

First some simple definitions:
DDR (double data rate) transfers twice the data per clock as single data rate SDRAM. DDR3 and DDR4 are successive generations of this interface; they still transfer twice per clock cycle, but at progressively higher clock rates and lower voltages. DDR and wide IO are complementary technologies for increasing bandwidth.
LPDDR is a low power volatile memory with architecture and interfaces optimized to reduce power consumption and be used in mobile products.
GDDR is used mainly in graphics boards and is optimized to function with GPUs.
Bandwidth (BW) = # pins x data rate per pin, expressed in GBps (Gigabytes/sec). For instance, wide IO with 512 pins and a transfer speed of 1.066 Gbps (Gigabits/sec) per pin would have a bandwidth of 68 GB/sec (recall there are 0.125 GB per Gb).
DIMMs (dual inline memory modules) are a memory configuration. They provide two pathways from the memory module to the system, one on the front and one on the backside. DIMMs can be built with standard memory or using stacked TSV based memory as in the recent announcement by Samsung that they will start mass producing 64 GB DDR4 DIMMs that use TSV technology for enterprise servers and cloud-based applications (more).
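The bandwidth arithmetic in the definitions above can be sketched in a few lines of Python, using the 512-pin / 1.066 Gbps wide IO figures from the text:

```python
def bandwidth_gb_per_s(pins, gbps_per_pin):
    """Bandwidth (GB/s) = pin count x per-pin data rate (Gb/s) / 8 bits per byte."""
    return pins * gbps_per_pin / 8

# Wide IO example from the definitions above: 512 pins at 1.066 Gb/s each.
print(round(bandwidth_gb_per_s(512, 1.066), 1))  # -> 68.2, i.e. ~68 GB/s
```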

LPDDR3 and LPDDR4: Addressing the Mobile Market
The 2012 JEDEC standards for LPDDR3 were designed to meet the requirements of 4G networks along with the performance and memory density requirements of mobile devices. LPDDR3 provides a higher data rate (2.1Gbps), more bandwidth (12.8GB/s), higher memory densities, and lower power than LPDDR2.
The 2014 JEDEC LPDDR4 standard JESD209-4 is optimized to meet DRAM bandwidth requirements for advanced mobile devices (more). LPDDR4 offers twice the bandwidth of LPDDR3 at similar power and cost points.
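The "twice the bandwidth" claim can be checked from the per-pin data rates and I/O counts listed in Table 1 below (48 pins at 2.133 Gbps for LPDDR3, 64 pins at 3.2 Gbps for LPDDR4):

```python
# GB/s = pins x Gb/s per pin / 8 bits per byte
lpddr3_bw = 48 * 2.133 / 8   # ~12.8 GB/s
lpddr4_bw = 64 * 3.200 / 8   # 25.6 GB/s
print(round(lpddr4_bw / lpddr3_bw, 2))  # -> 2.0
```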

Wide I/O 2: Supporting 3D-IC Packaging for PC and Server Applications
Wide I/O increases the bandwidth between memory and its driver IC logic by increasing the IO data bus between the two circuits. Wide I/O typically uses TSVs, interposers and 3D stacking technologies.
The 2014 Wide I/O 2 standard JESD229-2 from JEDEC is designed for high-end mobile applications that require high bandwidth at the lowest possible power (more). Wide I/O 2 provides up to 68GBps bandwidth at lower power consumption (better bandwidth/Watt) with a 1.1V supply voltage. From a packaging standpoint, Wide I/O 2 is optimized to stack on top of a system on chip (SOC) to minimize power consumption and footprint. The standard trades a significantly larger I/O pin count for a lower operating frequency, and stacking reduces interconnect length and capacitance. The overall effect is to reduce I/O power while enabling higher bandwidth.
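The wide-and-slow tradeoff can be illustrated with the first-order CMOS switching-power relation P ≈ N·C·V²·f. The capacitance values below are illustrative assumptions only, not vendor figures; the point is that short TSV connections inside a stack present far less load than off-package traces:

```python
def io_switching_power_w(pins, cap_per_pin_farads, v_supply, toggle_hz):
    # First-order dynamic switching power: P = N * C * V^2 * f
    # (activity factor folded into the toggle rate for simplicity)
    return pins * cap_per_pin_farads * v_supply**2 * toggle_hz

# Two hypothetical ways to deliver roughly the same bandwidth
# (per-pin capacitances are assumed values for illustration):
wide_slow   = io_switching_power_w(512, 0.5e-12, 1.1, 1.066e9)  # TSV stack
narrow_fast = io_switching_power_w(64,  5.0e-12, 1.2, 8.5e9)    # off-package bus
print(wide_slow < narrow_fast)  # the wide, slow, short-wire bus uses less power
```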

In the 2.5D-stacked configuration, cooling solutions can be placed on top of the two dies. With the 3D-stacked form of Wide I/O 2, heat dissipation can be an issue since there is no standard way to cool stacked die. The Hybrid Memory Cube is a specialized form of the wide I/O architecture.

The Hybrid Memory Cube (HMC), developed by Micron and IBM, is expected to be in mass production in 2014. This architecture consists of 3D stacked DRAM on top of a logic die. For example, 4 DRAM die are each divided into 16 "cores" and then stacked. The logic base at the bottom has 16 different logic segments, each controlling the four DRAM cores that sit directly on top of it. This type of memory architecture supports a very large number of I/O pins between the logic and the DRAM cores, which deliver bandwidths as high as 400GB/s. According to the Hybrid Memory Cube Consortium, a single HMC can deliver more than 15x the performance of a DDR3 module while consuming 70 per cent less energy per bit than DDR3.

In addition to Micron and IBM, the HMC architecture developer members include Samsung, Hynix, ARM, Open Silicon, Altera, and Xilinx.

Intel recently announced that their Xeon Phi processor "Knights Landing," which will debut in 2015, will use 16GB of Micron HMC stacked DRAM on-package, providing up to 500GB/sec of memory bandwidth (more).

The HMC specs can be found here.

High Bandwidth Memory (HBM)
The 2013 JEDEC HBM memory standard JESD235 was developed for high end graphics and gaming applications (more). HBM, consisting of stacked DRAM die built with wide I/O interfaces and TSVs, supports bandwidths of 128GB/s to 256GB/s.

Architecture Choice Depends on Application
Different applications have different requirements in terms of bandwidth, power consumption, and footprint (more).

Because thermal characteristics are critical in high end smartphones, the industry consensus has turned to Wide I/O 2 as the best choice. Wide I/O 2 meets heat dissipation, power, bandwidth, and density requirements. However, it is more costly than LPDDR4.

Given its lower silicon cost, LPDDR4 may be the better fit for tablets and low end smartphones, the more cost-sensitive mobile markets.

For high-end computer graphics processing, which is less constrained by cost than mobile devices, HBM memory may be the correct choice.

High performance computing (HPC) or a networking router requiring 300GBps BW is best matched to the HMC.

TSMC has compared the different memory architectures graphically in terms of bandwidth vs power and price.


Table 1 compares some of the properties of these standardized memory architectures.

Memory Std | Bandwidth (GBps) | Voltage (V) | Data rate/pin max (Gbps) | I/O pins | Applications
LPDDR3 | 12.8 | 1.2 | 2.133 | 48 | Smartphones, tablets
LPDDR4 | 25.6 | 1.1 | 3.200 | 64 | Smartphones, tablets
Wide IO 2 | 68 | 1.1 | 1.066 | 512 | High end smartphones
HMC | 160 | 1.2 | 10, 12.5 or 15 (SerDes) | - | High end servers, networking, graphics
HBM | 128 (gen 1) / 256 (gen 2) | 1.2 | 1 (gen 1) / 2 (gen 2) | 1024 | High end graphics, networking and HPC
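The parallel-bus rows above can be sanity-checked against BW = pins x data rate / 8; HMC is omitted because its bandwidth comes from SerDes links rather than a wide parallel bus (per-pin rates taken from the text):

```python
# (pins, Gb/s per pin, quoted bandwidth in GB/s)
rows = {
    "LPDDR3":    (48,   2.133,  12.8),
    "LPDDR4":    (64,   3.200,  25.6),
    "Wide IO 2": (512,  1.066,  68.0),
    "HBM gen 1": (1024, 1.000, 128.0),
    "HBM gen 2": (1024, 2.000, 256.0),
}
for name, (pins, rate, quoted_bw) in rows.items():
    calc = pins * rate / 8  # 8 bits per byte
    print(f"{name}: {calc:.1f} GB/s (quoted: {quoted_bw:g})")
```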

 

Figure 1 Memory Architecture comparisons


Source: http://www.yole.fr 