
Nvidia’s GTX 680 emphasizes efficiency, pours on the speed

Nvidia takes the wraps off Kepler, its next-generation GPU, and quite possibly the greatest step forward for graphics processing since it unveiled the G80 in November 2006. 

The first desktop card out of the gate is the GTX 680. Unlike the 5xx series, which was based on a refined version of the Fermi architecture Nvidia debuted back in 2010, the GTX 680 uses a new Kepler-based GPU, the GK104, and its design is a sharp departure from Nvidia’s previous architectures.

Over the past five years, Nvidia’s GPU strategy has more or less amounted to “Everything+Kitchen Sink and we’ll sort things out when we do the refresh.” After the disastrous debut of its R600 architecture in 2007, AMD adopted a strategy of building smaller, mid-range-oriented parts and doubling them up to address the high end of the market. Nvidia, in contrast, adamantly stuck to its monolithic guns. Until Kepler.

The transistor counts below are from Nvidia; Kepler’s die size is estimated but should be close to the mark. Kepler’s die size and transistor count are notable achievements in and of themselves, but we’ve barely scratched the surface of the new core. Here’s a table comparing the vitals of GK104 with those of Fermi and of Nvidia’s GT200, which debuted in 2008 (the “Tesla” moniker refers to the GPU family, not the high-end scientific computing cards).
Kepler-Comparison
Shaders are now clocked at the same speed as the graphics core. Kepler is clocked 30% higher than Fermi and packs 3x as many cores, but we want to highlight a change Nvidia wouldn’t explain during its presentation: the GK104’s cores aren’t as capable as the GF110’s. With 3x the core count and a 30% clock speed boost, Kepler “only” offers twice the GFLOP throughput. Not that that’s a bad thing.
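To see how those numbers can fit together, here is a back-of-the-envelope sketch in Python. The core counts and clocks are the publicly listed reference specs for the GTX 580 and GTX 680, not figures quoted in this article, and the assumption that each core retires one fused multiply-add (two floating-point operations) per shader clock is the usual convention for quoting peak throughput, not something Nvidia spelled out in its briefing.

```python
# Reference-card specs (assumed from public spec sheets, not from this article).
# Fermi ran its shaders at twice the core clock; Kepler runs them at the core clock.
GTX_580 = {"cores": 512, "core_clock_mhz": 772, "shader_clock_mhz": 1544}
GTX_680 = {"cores": 1536, "core_clock_mhz": 1006, "shader_clock_mhz": 1006}


def peak_gflops(card):
    """Peak single-precision GFLOPS, counting one FMA (2 FLOPs) per core per shader clock."""
    return card["cores"] * 2 * card["shader_clock_mhz"] / 1000.0


fermi_gflops = peak_gflops(GTX_580)
kepler_gflops = peak_gflops(GTX_680)

core_ratio = GTX_680["cores"] / GTX_580["cores"]
core_clock_ratio = GTX_680["core_clock_mhz"] / GTX_580["core_clock_mhz"]
shader_clock_ratio = GTX_680["shader_clock_mhz"] / GTX_580["shader_clock_mhz"]
throughput_ratio = kepler_gflops / fermi_gflops

print(f"cores: {core_ratio:.2f}x, core clock: {core_clock_ratio:.2f}x, "
      f"shader clock: {shader_clock_ratio:.2f}x, peak GFLOPS: {throughput_ratio:.2f}x")
```

Under these assumptions, the 30% bump applies to the core clock, while the shader clock actually drops to roughly 65% of Fermi’s. Three times the cores at that lower shader clock works out to a little under twice the peak throughput, which is in line with the “twice the GFLOP” figure.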
A number of other GPU resources have been shuffled around as well.
Kepler2
Nvidia’s ratio column is remarkably unhelpful; it only describes the increase between Fermi and Kepler rather than how resources are distributed relative to each other. GK104 packs four times as many special function units (SFUs) and twice as many texture units as GF110. The core is capable of processing twice as many instructions per clock, though it has three times as many cores to fill with those instructions.
One area where Nvidia did shed some light is the changes it made to its warp scheduler. In weaving (with a loom), the term “warp” refers to the longitudinal threads in a pattern; Nvidia uses the term to mean a group of threads. For our purposes, the warp scheduler roughly corresponds to a thread scheduler.
InstructionRegister
Fermi’s scheduler was designed with hardware stages to “prevent data hazards in the math datapath itself.” Registers were tracked and checked before instructions were issued to ensure that their data was ready, while decoded instructions were kept available for fast dispatch when applicable. Kepler simplifies this structure and handles some of the checking in software; dispatch latency information is now issued alongside the instructions themselves.
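The difference between the two approaches can be illustrated with a toy model. This is a minimal sketch, not Nvidia’s actual scheduler: the `Instr` type, the register names, and the four-cycle latencies are all made up for illustration, and the model assumes in-order issue of one instruction per cycle with fixed, known latencies. The first function checks a register scoreboard at issue time, the way a hardware-checked design might; the other two split the work so that a “compiler” pass precomputes how long each instruction must wait and the “hardware” merely counts down.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    dest: str       # register written by this instruction
    srcs: tuple     # registers read by this instruction
    latency: int    # fixed pipeline latency in cycles (hypothetical value)


def scoreboard_issue_cycles(program):
    """Hardware-checked model: consult a register scoreboard before each issue."""
    ready_at = {}   # register -> cycle at which its value becomes available
    cycle = 0
    issued = []
    for ins in program:
        # Stall until every source register is ready.
        cycle = max([cycle] + [ready_at.get(reg, 0) for reg in ins.srcs])
        issued.append(cycle)
        ready_at[ins.dest] = cycle + ins.latency
        cycle += 1  # in-order, one issue per cycle
    return issued


def compile_stalls(program):
    """Software model: precompute the wait before each instruction from known latencies."""
    ready_at = {}
    cycle = 0
    stalls = []
    for ins in program:
        earliest = max([cycle] + [ready_at.get(reg, 0) for reg in ins.srcs])
        stalls.append(earliest - cycle)
        ready_at[ins.dest] = earliest + ins.latency
        cycle = earliest + 1
    return stalls


def static_issue_cycles(stalls):
    """Simplified hardware: no register checks, just wait the precomputed number of cycles."""
    cycle = 0
    issued = []
    for wait in stalls:
        cycle += wait
        issued.append(cycle)
        cycle += 1
    return issued


PROGRAM = [
    Instr("r0", (), 4),
    Instr("r1", (), 4),
    Instr("r2", ("r0", "r1"), 4),   # depends on both loads above
    Instr("r3", ("r2",), 4),        # depends on the previous result
]

print(scoreboard_issue_cycles(PROGRAM))
print(compile_stalls(PROGRAM))
print(static_issue_cycles(compile_stalls(PROGRAM)))
```

Both paths issue the instructions on the same cycles; the difference is where the dependency checking happens. Moving it ahead of time only works because the latencies in this model are fixed and known in advance.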

The company also notes that “We also developed a new design for the processor execution core, again with a focus on best performance per watt. Each processing unit was scrubbed to maximize clock gating efficiency and minimize wiring and retiming overheads.”
What all this adds up to is a rearchitected GPU with a focus on power efficiency that’s been notably lacking from the company’s previous high-end efforts. Those of you familiar with Nvidia’s historic naming schemes will recognize the GK104 moniker as one that Team Green typically would reserve for a mid-range GPU. Thus far, there’s no indication of a higher-end part in the works, and no obvious places where NV might have disabled compute units to improve yields, as it did with GF100.

Nvidia wasn’t able to ship us a card for testing (heck, the company wasn’t even able to brief us until less than 24 hours ago), so we have to preface our data with the hefty caveat that these figures are drawn from Nvidia’s own testing. The only good news is that these figures are from the company’s whitepaper rather than poorly labeled slides, meaning we were able to at least check the appendices for config details. The company also included results in a wide range of titles and at two prominent resolutions. Generally speaking, the more confident a company is in a product’s performance, the more results it will hand you on launch day.
GTX680-Perf
GTX680-Perf2
For those of you who are curious, Nvidia’s figures put the GTX 680 a consistent 14% ahead of AMD’s highest-end card, the Radeon 7970. That’s not a margin that flat-out blows the doors off, but there are other factors to consider. The GTX 680 is priced at $499 (we’ll see if NV can hold the price there post-launch), while the Radeon 7970 is $50 more. This time around, Nvidia appears to have beaten AMD on both die size and transistor count. Other factors, such as opting for 2GB of RAM instead of 3GB and using a narrower 256-bit memory bus instead of the Radeon’s 384-bit option, also tilt cost structures in Nvidia’s favor.
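Putting the performance lead and the price gap together gives a rough sense of value. The sketch below simply combines the figures quoted above (a 14% lead from Nvidia’s own testing, $499 versus $50 more); the variable names are ours, and the result is only as good as those vendor-supplied numbers and launch prices.

```python
# Figures quoted in the article; the performance lead comes from Nvidia's own testing.
gtx680_relative_perf = 1.14   # Radeon 7970 = 1.00
gtx680_price = 499            # USD at launch
hd7970_price = gtx680_price + 50


def perf_per_dollar(relative_perf, price):
    """Relative performance delivered per dollar spent."""
    return relative_perf / price


value_ratio = (perf_per_dollar(gtx680_relative_perf, gtx680_price)
               / perf_per_dollar(1.0, hd7970_price))

print(f"GTX 680 performance per dollar vs. Radeon 7970: {value_ratio:.2f}x")
```

On those numbers, the GTX 680 delivers roughly 25% more performance per dollar, a wider gap than the raw 14% lead suggests.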

It’s been years since Nvidia has been able to claim a GPU that decisively took home both the power efficiency and performance crowns, but the GK104-based GTX 680 appears to have done just that. We’ll reserve final judgment until we’re able to run our own numbers, but this chip is impressive on multiple fronts. The one fly in the ointment is its GPU compute performance; figures on that front were very noticeably absent from Nvidia’s briefing yesterday, but the technical data available suggests that GK104 trades some raw math muscle for its new gaming oomph. Then again, that’s not necessarily a bad thing: AMD has effectively left the GPGPU compute field (at least where scientific computing is concerned).