NVIDIA Discloses Next-Generation Tegra SoC; Parker Inbound?
by Joshua Ho on January 5, 2016 4:31 AM EST
While NVIDIA has been rather quiet about the SoC portion of the DRIVE PX 2, it’s unmistakable that a new iteration of the Tegra SoC is present.
The GPUs and SoCs of the DRIVE PX 2 are fabricated on TSMC’s 16nm FinFET processes, which is something that we haven’t seen yet from NVIDIA. The other obvious difference is the CPU configuration. While Tegra X1 had four Cortex A57s and four Cortex A53s, this new SoC (Tegra P1?) has four Cortex A57s and two Denver CPUs. As of now it isn’t clear whether this is the same iteration of the Denver architecture that we saw in the Tegra K1. However, regardless of what architecture it is we’re still looking at a CPU architecture that is at least partially an ARM in-order core with a wide, out of order VLIW core that relies on dynamic code optimization to translate ARM instructions into the VLIW core ISA.
Based on the description of the SoC, while NVIDIA is not formally announcing this new SoC or giving it a name at this time, the feature set lines up fairly well with the original plans for the SoC known as Parker. Before it was bumped to make room for Tegra X1, it had been revealed that Parker would be NVIDIA's first 16nm FinFET SoC, and would contain Denver CPU cores, just like this new SoC.
NVIDIA's Original 2013 Tegra Roadmap, The Last Sighting of Parker
Of course Parker was also said to include a Maxwell GPU, whereas NVIDIA has confirmed that this new Tegra is Pascal based. Though with Parker's apparent delay, an upgrade to Pascal makes some sense here. Otherwise we have limited information on the GPU at present besides its Pascal heritage; NVIDIA is not disclosing anything about the number of CUDA cores or other features.
|NVIDIA Tegra Specification Comparison|
| ||Tegra X1||New Tegra|
|CPU Cores||4x ARM Cortex A57 + 4x ARM Cortex A53||2x NVIDIA Denver + 4x ARM Cortex A57|
|Memory Clock||1600MHz (LPDDR4)||?|
|Memory Bus Width||64-bit||?|
|FP16 Peak||1024 GFLOPS||?|
|FP32 Peak||512 GFLOPS||?|
|Manufacturing Process||TSMC 20nm SoC||TSMC 16nm FinFET|
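As a rough sanity check on the known entries in the table, the 1600MHz LPDDR4 and 64-bit figures imply a theoretical peak memory bandwidth, and the two throughput figures imply an FP16-to-FP32 ratio. The Python sketch below works both out. It assumes the 1600MHz figure is the memory clock with two transfers per clock, as is usual for DDR-type memory; `peak_bandwidth_gbs` is just an illustrative helper name, and none of this says anything about the new Tegra, whose entries are still unknown.

```python
def peak_bandwidth_gbs(memory_clock_mhz: float,
                       bus_width_bits: int,
                       transfers_per_clock: int = 2) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes).

    Assumption: double data rate, i.e. two transfers per memory clock.
    """
    transfers_per_second = memory_clock_mhz * 1e6 * transfers_per_clock
    bytes_per_transfer = bus_width_bits / 8
    return transfers_per_second * bytes_per_transfer / 1e9


# Figures taken from the table: 1600MHz LPDDR4 on a 64-bit bus.
bandwidth = peak_bandwidth_gbs(1600, 64)

# Figures taken from the table: 1024 GFLOPS FP16 vs 512 GFLOPS FP32.
fp16_to_fp32_ratio = 1024 / 512

print(f"Peak bandwidth: {bandwidth:.1f} GB/s")       # 25.6 GB/s
print(f"FP16:FP32 ratio: {fp16_to_fp32_ratio:.1f}")  # 2.0
```

These are theoretical ceilings only; sustained bandwidth and throughput in real workloads will be lower.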
But for now the bigger story is the new Tegra's CPU configuration. Needless to say, this is at least somewhat of an oddball architecture. As Denver is a custom CPU core, we’re looking at a custom interconnect by NVIDIA to make the Cortex A57 and Denver cores work together. The question then is why NVIDIA would want to pair Denver CPU cores with Cortex A57 cores that are themselves relatively high performing.
At least part of the answer is going to rely on whether NVIDIA’s software stack uses the two clusters in a cluster migration scheme or some kind of HMP scheme. Comments made by NVIDIA during their press conference indicate that they believe the Denver cores on the new Tegra will offer better single-threaded performance than the A57s. Without knowing more about the version of Denver in the new Tegra, this is somewhat surprising, as it’s pretty much public that Denver has had issues when dealing with code that doesn’t resemble a non-branching loop. More troublesome yet, code generation for Denver can take up a pretty significant amount of time. As we saw with the Denver TK1, Cortex A57s can actually be faster clock for clock if the code is particularly unfavorable to Denver.
Consequently, if NVIDIA is using a traditional cluster migration or HMP scheme where Denver is treated as a consistently faster core in all scenarios, I would be at least slightly concerned if NVIDIA decided to ship this configuration with the same iteration of Denver as in the Tegra K1. Though equally likely, NVIDIA has had over a year to refine Denver and may be rolling out an updated (and presumably faster) version for the new Tegra. Otherwise it also wouldn’t surprise me if the vast majority of CPU work for PX 2 is run on the A57 cluster while the Denver cluster is treated as a co-processor of sorts, in which only specific cases can even access the Denver CPUs.
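To make the three scenarios above concrete, here is a toy Python model of which cores a scheduler could see under cluster migration, under HMP, and under a co-processor style arrangement. It is only an illustrative sketch: the core names, the round-robin `place` helper and the `denver_allowed` set are hypothetical, and none of it reflects NVIDIA's actual software stack, which has not been disclosed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    name: str
    cluster: str


# CPU configuration as described in the article: 2x Denver + 4x Cortex A57.
CORES = (
    [Core(f"denver{i}", "denver") for i in range(2)]
    + [Core(f"a57_{i}", "a57") for i in range(4)]
)


def cluster_migration_cores(active_cluster):
    """Cluster migration: only one cluster is visible at any given time."""
    return [c for c in CORES if c.cluster == active_cluster]


def hmp_cores():
    """HMP: every core in both clusters is visible to the scheduler."""
    return list(CORES)


def place(tasks, cores):
    """Hypothetical round-robin placement of tasks onto the given cores."""
    return {task: cores[i % len(cores)].name for i, task in enumerate(tasks)}


def coprocessor_place(tasks, denver_allowed):
    """Co-processor style: only explicitly allowed tasks reach Denver."""
    denver = cluster_migration_cores("denver")
    a57 = cluster_migration_cores("a57")
    special = [t for t in tasks if t in denver_allowed]
    general = [t for t in tasks if t not in denver_allowed]
    return {**place(general, a57), **place(special, denver)}


tasks = ["ui", "vision", "net"]
print(place(tasks, cluster_migration_cores("denver")))  # Denver cluster only
print(place(tasks, hmp_cores()))                         # all six cores
print(coprocessor_place(tasks, {"vision"}))              # A57 by default
```

The model ignores everything that makes the real decision hard, such as per-core performance on a given workload, power, and migration cost; it only shows how the set of eligible cores differs between the schemes.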
phoenix_rizzen - Tuesday, January 5, 2016
From the article:
"this new SoC (Tegra P1?) has four Cortex A57s and two Denver CPUs. As of now it isn’t clear whether this is the same iteration of the Denver architecture that we saw in the Tegra K1. However, regardless of what architecture it is we’re still looking at a CPU architecture that is at least partially an ARM in-order core with a wide, out of order VLIW core that relies on dynamic code optimization to translate ARM instructions into the VLIW core ISA."
The Cortex-A57 CPU is an out-of-order CPU architecture, so the phrase "an ARM in-order core with a wide, out of order VLIW core" seems out-of-place and incorrect.
Krysto - Tuesday, January 5, 2016
Nvidia seems to always be behind one way or another. Denver is a 2014 core, and A57 is a first half of 2015 core. This will arrive in the second half of 2016 most likely, and will have to compete against Kryo and A72.
I blame this on Nvidia's lazy execution. They'd rather keep being lazy and use an old core, instead of running a tighter ship over there and launching a more contemporary CPU like everyone else.
Yojimbo - Tuesday, January 5, 2016
Denver is a project name, and NVIDIA doesn't seem to use it the way you are using it here by putting it in the same category as the term "A57" and assigning it a particular date (2014). I think it's safe to assume that the 2016 Denver CPU will not be the same core as the 2014 Denver CPU. Besides, speculation on Parker based on Drive PX 2 is dubious being that NVIDIA didn't mention any SOC name when talking about Drive PX 2. Drive PX 2 has particular demands which are different from the demands of a tablet or smartphone device. Even if the Denver used in Drive PX 2 is the 2014 Denver, there is no good reason to assume that NVIDIA's next tablet/console SOC, assuming there is one, will use the same core or resemble Drive PX 2 in any way.