
Nvidia's Tegra 4 GPU: Doubling Down On Efficiency

Written By TahaS. on Monday, March 4, 2013 | 9:20 AM

After fending off barbs from its competition about Tegra 3's power consumption under load, Nvidia wanted to show off the architectural efficiency of Tegra 4. We sat down with the company for a deep-dive on the SoC family's unique GPU implementations.
From the announcement of its first Tegra SoC back in 2008, Nvidia’s greatest advantages in the mobile segment were its background with GPUs and platforms. The company’s close relationships with game developers were destined to be a boon too, since most mobile titles are easily characterized as mainstream. Increasingly, though, ISVs are utilizing powerful architectures to rival current-gen consoles, benefiting from more mature tools to fully exploit potent graphics engines.
Although we saw companies like Intel go after the power consumption of the Cortex-A9-based Tegra 3, Nvidia is eager to show its Tegra 4 as a solution designed with performance per square millimeter and, in turn, performance per watt in mind. In fact, Nvidia already has a reference phone design, called Phoenix, based on its Tegra 4i SoC. The two boards below fit into the 5" device and host different implementations of Tegra 4.

We already know that Tegra 4's GPU isn’t a unified shader design. Nvidia claims the time is simply not yet right to make that transition. And so, we still have separate programmable pixel and vertex shaders. The company also isn’t able to declare OpenGL ES 3.0 compatibility, though it’s emphatic this doesn’t adversely affect what developers are able to do with Tegra 4.
And so, the GPU in its newest SoC looks a lot like an evolution of Tegra 3, plus a number of improvements.

                          | Tegra 4                 | Tegra 4i                | Tegra 3
Vertex Processing Engines | 6                       | 3                       | 1
Pixel Pipes               | 4                       | 2                       | 2
MADs                      | 72                      | 60                      | 12
Clock Rate                | 672 MHz                 | 660 MHz                 | 416/520 MHz
Fill Rate                 | 2.68 Gpix/s             | 1.32 Gpix/s             | 1.04 Gpix/s
Memory Interface          | 2 x 32-bit              | 1 x 32-bit              | 1 x 32-bit
Memory Support            | DDR3L-1866, LPDDR3-1866 | DDR3L-1866, LPDDR3-2133 | DDR3-1600, LPDDR2-1066
Manufacturing             | 28 nm                   | 28 nm                   | 40 nm
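
The fill rate figures in the table line up with a simple product of pixel pipes and clock rate. The Python sketch below checks that arithmetic. It assumes each pixel pipe outputs one pixel per clock, which is our inference from the numbers rather than something Nvidia states here, and the function name is purely illustrative. Tegra 4 works out to 2.688 Gpix/s, which the table lists as 2.68.

```python
def fill_rate_gpix(pixel_pipes, clock_mhz):
    """Peak fill rate in Gpix/s.

    Assumption (not from the source): one pixel per pipe per clock.
    """
    return pixel_pipes * clock_mhz / 1000.0

# (pixel pipes, GPU clock in MHz) as listed in the table above.
# Tegra 3 uses its higher 520 MHz clock.
chips = {
    "Tegra 4": (4, 672),
    "Tegra 4i": (2, 660),
    "Tegra 3": (2, 520),
}

rates = {name: fill_rate_gpix(pipes, mhz) for name, (pipes, mhz) in chips.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.3f} Gpix/s")
```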

Tegra 3 employs a single vertex shading unit with four FP32-capable cores. It also includes two fragment pipes, each with four cores capable of FP20 precision. Those four vertex and eight pixel shader cores are why we call Tegra 3’s GPU a 12-core design.

In contrast, Tegra 4 has six vertex processing engines with four “cores” each, for 24 in total against Tegra 3's four. Factor in clock rate differences (using 672 MHz for Tegra 4 and 520 MHz for Tegra 3), and that adds up to about 7.75x the vertex shading performance this generation.
Its four pixel pipes contain 12 shader “cores” each (that’s three ALUs per pipe, and four multiply-add units per ALU), adding up to 48 against Tegra 3's eight. Using the same 672 MHz and 520 MHz clocks, you’re again looking at about 7.75x the fragment shader performance.
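
For readers who want to check the 7.75x figures, here is a minimal Python sketch of the arithmetic. It assumes peak shader throughput scales linearly with core count and clock rate, which is a simplification. The dictionary layout and function name are ours, for illustration only.

```python
# Core counts and clocks quoted in the text above.
TEGRA3 = {"vertex_cores": 1 * 4, "pixel_cores": 2 * 4, "clock_mhz": 520}
TEGRA4 = {"vertex_cores": 6 * 4, "pixel_cores": 4 * 3 * 4, "clock_mhz": 672}

def relative_throughput(new, old, key):
    """Ratio of (core count x clock) between two chips.

    Assumes throughput scales linearly with both factors.
    """
    return (new[key] * new["clock_mhz"]) / (old[key] * old["clock_mhz"])

vertex_ratio = relative_throughput(TEGRA4, TEGRA3, "vertex_cores")
pixel_ratio = relative_throughput(TEGRA4, TEGRA3, "pixel_cores")

print(f"vertex: {vertex_ratio:.2f}x, pixel: {pixel_ratio:.2f}x")
# prints "vertex: 7.75x, pixel: 7.75x"
```

The same core counts also account for the MADs row in the table: 24 vertex plus 48 pixel cores gives Tegra 4's 72, and 4 plus 8 gives Tegra 3's 12.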
