Nvidia’s GTX 680 emphasizes efficiency, pours on the speed

By: Hugo Repetto. Posted date: April 20, 2012

Nvidia takes the wraps off Kepler, its next-generation GPU, and quite possibly the greatest step forward for graphics processing since it unveiled the G80 in November 2006. 

The first desktop card out of the gate is the GTX 680. Unlike the 5xx series, which was based on a refined version of the Fermi architecture Nvidia debuted back in 2010, the GTX 680 uses a new Kepler GPU, the GK104, and its design is a sharp departure from Nvidia’s previous architectures.

Over the past five years, Nvidia’s GPU strategy has more or less amounted to “Everything+Kitchen Sink and we’ll sort things out when we do the refresh.” After the disastrous debut of its R600 architecture in 2007, AMD adopted a strategy of building smaller, mid-range-oriented parts and doubling them up to address the high end of the market. Nvidia, in contrast, adamantly stuck to its monolithic guns. Until Kepler.

The transistor counts below are from Nvidia; Kepler’s die size is estimated but should be close to the mark. Kepler’s die size and transistor count are notable achievements in and of themselves, but we’ve barely scratched the surface of the new core. Here’s a table comparing the vitals of Nvidia’s GT200, which debuted in 2008 (the “Tesla” moniker refers to the GPU family, not the high-end scientific computing cards), Fermi, and GK104.
[Table image: Kepler-Comparison (GT200, Fermi, GK104)]
Shaders are now clocked at the same speed as the graphics core. Kepler is clocked 30% higher than Fermi and packs 3x as many cores, but we want to highlight a change Nvidia wouldn’t explain during its presentation: the GK104’s cores aren’t as capable as the GF110’s. With 3x the core count and a 30% clock speed boost, Kepler “only” offers twice the GFLOP throughput. Not that that’s a bad thing.
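To see where the “twice the GFLOP throughput” figure can come from, here is a rough back-of-the-envelope sketch. The core counts and clock speeds are assumptions taken from the cards’ public spec sheets (GTX 580: 512 cores, 772MHz core clock, 1544MHz shader clock; GTX 680: 1536 cores at 1006MHz), not read from the comparison table, and `peak_gflops` is a hypothetical helper that counts one fused multiply-add (two floating-point operations) per core per shader clock.

```python
# Peak single-precision throughput estimate.
# All core counts and clocks below are assumed from public spec sheets,
# not taken from the article's table.

def peak_gflops(cores: int, shader_clock_mhz: float, flops_per_clock: int = 2) -> float:
    """Cores x shader clock x FLOPs per clock, converted from MFLOPS to GFLOPS."""
    return cores * shader_clock_mhz * flops_per_clock / 1000.0

# GF110 runs its shaders at twice the core clock; GK104 runs them at the core clock.
gtx_580 = {"cores": 512, "core_mhz": 772, "shader_mhz": 1544}
gtx_680 = {"cores": 1536, "core_mhz": 1006, "shader_mhz": 1006}

fermi = peak_gflops(gtx_580["cores"], gtx_580["shader_mhz"])
kepler = peak_gflops(gtx_680["cores"], gtx_680["shader_mhz"])

core_ratio = gtx_680["cores"] / gtx_580["cores"]
core_clock_ratio = gtx_680["core_mhz"] / gtx_580["core_mhz"]
shader_clock_ratio = gtx_680["shader_mhz"] / gtx_580["shader_mhz"]
throughput_ratio = kepler / fermi

print(f"Fermi (GTX 580):  {fermi:.0f} GFLOPS")
print(f"Kepler (GTX 680): {kepler:.0f} GFLOPS")
print(f"cores x{core_ratio:.2f}, core clock x{core_clock_ratio:.2f}, "
      f"shader clock x{shader_clock_ratio:.2f}, throughput x{throughput_ratio:.2f}")
```

Under those assumptions the throughput ratio lands a little under 2x. The core clock rises about 30%, but the shader clock that actually drives the cores drops by roughly a third, because Fermi ran its shaders at twice the core clock. That is one plausible reading of the gap, not something Nvidia confirmed.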
A number of other GPU resources have been shuffled around as well.
[Table image: Kepler2 (GF110 vs. GK104 resources, with Nvidia’s ratio column)]
Nvidia’s ratio column is remarkably unhelpful; it only describes the increase between Fermi and Kepler rather than how resources are distributed relative to each other. GK104 packs four times as many special function units (SFUs) and twice as many texture units as GF110; the core can process twice as many instructions per clock (though it has three times as many cores to fill with those instructions).
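A more useful view is how each resource scales relative to the core count. The sketch below uses only the multipliers quoted in the paragraph above (3x cores, 4x SFUs, 2x texture units, 2x instructions per clock); `per_core_change` is a hypothetical helper name, and the result is a ratio of ratios, not an absolute unit count.

```python
from fractions import Fraction

# GK104-versus-GF110 multipliers as stated in the text.
increase = {
    "cores": Fraction(3),
    "special function units": Fraction(4),
    "texture units": Fraction(2),
    "instructions per clock": Fraction(2),
}

def per_core_change(multipliers):
    """Divide each resource's multiplier by the core-count multiplier."""
    cores = multipliers["cores"]
    return {name: ratio / cores
            for name, ratio in multipliers.items()
            if name != "cores"}

relative = per_core_change(increase)
for name, ratio in relative.items():
    print(f"{name}: {float(ratio):.2f}x per core vs. GF110")
```

Read this way, each Kepler core has about a third more SFU capacity to draw on than a Fermi core did, but only two-thirds of the texture units and two-thirds of the instruction issue rate.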
One area Nvidia did shed some light on is the set of changes it made to its warp scheduler. In weaving (with a loom), the term “warp” refers to the longitudinal threads in a pattern; Nvidia uses the term to mean a group of threads. For our purposes, the warp scheduler roughly corresponds to a thread scheduler.
[Image: InstructionRegister (scheduler diagram)]
Fermi’s scheduler was designed with hardware stages to “prevent data hazards in the math datapath itself.” Registers were tracked and checked before data was issued to ensure that they were ready for new instructions, while decoded instructions were kept available for fast dispatch when applicable. Kepler simplifies this structure and handles some of the checking in software; instruction latency information is now worked out ahead of time and supplied alongside the instructions themselves.
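The difference between the two approaches can be illustrated with a toy model. This is a simplified sketch, not Nvidia’s actual design: it assumes a single in-order issue slot, fixed instruction latencies known at compile time, and hypothetical names throughout (`Instr`, `issue_with_scoreboard`, `compile_stall_counts`, `issue_with_stall_counts`). The first function checks register readiness at issue time, as a hardware scoreboard would; the other two precompute a wait count per instruction and then simply obey it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Instr:
    dest: str       # register written by this instruction
    srcs: tuple     # registers read by this instruction
    latency: int    # cycles until dest is readable (assumed fixed and known)

def issue_with_scoreboard(program):
    """Hardware-style: look up each source register's ready time at issue."""
    ready_at = {}   # register -> first cycle its value can be read
    cycle = 0
    issue_cycles = []
    for ins in program:
        # Stall until every source register is ready.
        cycle = max([cycle] + [ready_at.get(r, 0) for r in ins.srcs])
        issue_cycles.append(cycle)
        ready_at[ins.dest] = cycle + ins.latency
        cycle += 1  # at most one instruction issues per cycle
    return issue_cycles

def compile_stall_counts(program):
    """Compiler-style: work out ahead of time how long each instruction waits."""
    ready_at = {}
    cycle = 0
    stalls = []
    for ins in program:
        earliest = max([cycle] + [ready_at.get(r, 0) for r in ins.srcs])
        stalls.append(earliest - cycle)
        ready_at[ins.dest] = earliest + ins.latency
        cycle = earliest + 1
    return stalls

def issue_with_stall_counts(program, stalls):
    """Simplified hardware: wait the encoded number of cycles, then issue."""
    cycle = 0
    issue_cycles = []
    for _ins, wait in zip(program, stalls):
        cycle += wait
        issue_cycles.append(cycle)
        cycle += 1
    return issue_cycles

program = [
    Instr("r0", (), 4),
    Instr("r1", ("r0",), 4),        # depends on r0
    Instr("r2", (), 4),
    Instr("r3", ("r1", "r2"), 4),   # depends on r1 and r2
]

hardware_schedule = issue_with_scoreboard(program)
stalls = compile_stall_counts(program)
software_schedule = issue_with_stall_counts(program, stalls)
print(hardware_schedule, stalls, software_schedule)
```

In this model both paths produce the same issue cycles, but the second needs no register tracking at run time, which is the kind of simplification that would save die area and power. It only works because the latencies are predictable in advance.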

The company also notes: “We also developed a new design for the processor execution core, again with a focus on best performance per watt. Each processing unit was scrubbed to maximize clock gating efficiency and minimize wiring and retiming overheads.”
What all this adds up to is a rearchitected GPU with a focus on power efficiency that’s been notably lacking from the company’s previous high-end efforts. Those of you familiar with Nvidia’s historic naming schemes will recognize the GK104 moniker as one that Team Green typically would reserve for a mid-range GPU. Thus far, there’s no indication of a higher-end part in the works, and no obvious places where NV might have disabled compute units to improve yields, as it did with GF100.

Nvidia wasn’t able to ship us a card for testing (heck, the company wasn’t even able to brief us until less than 24 hours ago), so we have to preface our data with the hefty caveat that these figures are drawn from Nvidia’s own testing. The only good news is that these figures are from the company’s whitepaper rather than poorly labeled slides, meaning we were able to at least check the appendices for config details. The company also included results in a wide range of titles and two prominent resolutions. Generally speaking, the more confident a company is in a product’s performance, the more results it will hand you on launch day.
[Chart images: GTX680-Perf, GTX680-Perf2 (Nvidia’s benchmark results)]
For those of you who are curious, Nvidia’s figures put the GTX 680 a consistent 14% ahead of the Radeon 7970, AMD’s highest-end card. That’s not a margin that blows the doors off, but there are other factors to consider. The GTX 680 is priced at $499 (we’ll see if NV can hold the price there post-launch), while the Radeon 7970 is $50 more. This time around, Nvidia appears to have beaten AMD on both die size and transistor count. Other factors, such as opting for 2GB of RAM instead of 3GB, and using a narrower 256-bit memory bus instead of the Radeon’s 384-bit option, also tilt cost structures in Nvidia’s favor.
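Putting the price and performance figures together gives a rough performance-per-dollar comparison. The sketch below uses only the numbers quoted above (a 14% lead from Nvidia’s own testing, $499 versus $50 more); normalizing the Radeon 7970 to 1.0 and the variable names are our assumptions.

```python
# Figures from the text: Nvidia's own 14% lead, $499 vs. $50 more.
gtx_680_price = 499
radeon_7970_price = gtx_680_price + 50
relative_perf = 1.14  # GTX 680 performance, with the Radeon 7970 normalized to 1.0

perf_per_dollar_680 = relative_perf / gtx_680_price
perf_per_dollar_7970 = 1.0 / radeon_7970_price
advantage = perf_per_dollar_680 / perf_per_dollar_7970

print(f"GTX 680 performance-per-dollar advantage: {(advantage - 1) * 100:.0f}%")
```

On launch pricing, that works out to roughly a 25% performance-per-dollar edge for the GTX 680, provided Nvidia’s numbers hold up in independent testing.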

It’s been years since Nvidia has been able to claim a GPU that decisively took home both the power efficiency and performance crowns, but the GK104-based GTX 680 appears to have done just that. We’ll reserve final judgment until we’re able to run our own numbers, but this chip is impressive on multiple fronts. The one fly in the ointment is its GPU compute performance; figures on that front were very noticeably absent from Nvidia’s briefing yesterday, but the technical data available suggests that GK104 trades some raw math muscle for its new gaming oomph. Then again, that’s not necessarily a bad thing, since AMD has effectively left the GPGPU compute field (at least where scientific computing is concerned).