GEAR: The Titan Supercomputer
posted on: 6 march 2014
AT SOME POINT DURING DEVELOPMENT OF A KINECT-BASED APP I faced an annoying issue. I was working on an HP EliteBook 8530w, a high-end workstation model that my university recommended (and offered at a hefty discount in 2009) for the practicals of my Computer Science degree.

The machine runs Windows 8.1 Pro and has 4 GB of DDR2 RAM, an Intel Core 2 Duo T9600 @ 2.8 GHz, a dedicated 512 MB Quadro FX 770M graphics card and a 250 GB old-school platter disk. When playing with (read: manipulating) the Kinect output in Visual Studio 2013, my screen often just froze. Other times it became very unresponsive. Obviously my computer was not able to handle the computations. The fix for the unresponsiveness was offloading work from the UI thread to a separate one. After implementing a simple frame counter I was able to measure performance: around 1-5 frames per second. And I was not even doing fancy things! Just the usual image processing like depth slicing (from 3D to 2D), pixel corrections over the whole image, etc. Next came playing with pattern recognition algorithms, and that's when things turned really sour. At that point experimenting with code samples went from fast trial-and-error to slow think-before-trying.
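To make the two measurements above concrete, here is a minimal sketch of depth slicing and a frame counter. It is written in Python rather than the C# I actually used, it does not touch the Kinect SDK, and every name in it (slice_depth, FrameCounter, the millimetre thresholds) is made up for illustration. It assumes a depth frame is simply a grid of distances in millimetres.

```python
import time


def slice_depth(depth_frame, near_mm, far_mm):
    """Turn a 2D grid of depth readings (millimetres) into a binary mask.

    Pixels whose depth lies within [near_mm, far_mm] become 1, all others 0.
    This is an illustrative stand-in, not the Kinect SDK's own API.
    """
    return [[1 if near_mm <= d <= far_mm else 0 for d in row]
            for row in depth_frame]


class FrameCounter:
    """Counts processed frames and reports the average frames per second."""

    def __init__(self, clock=time.perf_counter):
        # The clock is injectable so the counter can be tested without waiting.
        self._clock = clock
        self._start = clock()
        self._frames = 0

    def tick(self):
        """Call once for every frame that has been fully processed."""
        self._frames += 1

    def fps(self):
        """Average frames per second since the counter was created."""
        elapsed = self._clock() - self._start
        return self._frames / elapsed if elapsed > 0 else 0.0
```

In a processing loop you would call slice_depth on each incoming frame, then tick() on the counter, and print fps() every second or so. An average over the whole run is the simplest possible measure; a sliding window would react faster to changes.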

The solution was a supercomputer. As a student, I tend not to have one of those lying around, nor the Bitcoins to buy one. Sure, I could opt for time at the TU Delft, but that would be impractical: their systems are geared towards batch processing rather than real-time development. So with a fair bit of help from my family I was able to buy my dream computer.

I wanted to play with the SIFT/SURF algorithms and offload the calculations to my GPU. By that point I had come to love working with the Emgu CV framework, so the build required the best CUDA graphics card out there. That became the Nvidia GTX Titan, a 6 GB GDDR5 card with CUDA compute capability 3.5. Oh, and it delivers roughly 4.5 teraFLOPS of single-precision performance. That's crazy considering that 20 years ago you needed a house + power plant for that kind of power.

Next was the processor. Ideally that would have been TheNextTel 128 core @ 4 GHz processor, but alas, it's not out yet. Second best was the Xeon E7 series with 15 cores, but at €6000 for just the CPU that was not really feasible. Third best was either the new Extreme Edition or the Haswell i7-4770K. They perform toe to toe, although in my case the 2 extra cores of the Extreme Edition would have helped. Still, given the Extreme Edition's lack of native USB 3.0 support and the new features Haswell brings, I chose the i7-4770K.

To make a long story short, I went for 32 GB of low-profile RAM (allowing for a RAM disk :D ) and a RAID 0 SSD array of 2x 250 GB Crucials with power loss protection. 1000 MB/s throughput. Woop Woop! Other components are: Dark Rock Pro 2 + Arctic MX-4 cooling paste + ArctiClean purifier, MSI Z87-G45, Seasonic Platinum 660 Watt, Fractal Design Define R4 Black w/ Window and an Acer T232HLbmidz 23" Touchscreen IPS LED.

Well, the system flies. I had to refactor my code into tasks and really embrace the parallel programming paradigm. The new .NET 4.5 with async/await, tasks and the TPL Dataflow library really helps to that end; maybe more on that in another post. There are still plenty of performance gains to be made by exploiting the GPU more and using smart caching strategies. But for prototyping, this machine is really the ultimate development workstation.
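To give an idea of what "refactoring into tasks" means, here is a rough sketch of the pattern: each frame becomes an independent unit of work that is handed to a pool of workers, and the results come back in the original order. It is written in Python with concurrent.futures as a stand-in, not the .NET TPL code I actually use, and the two stages (correct_pixels, count_bright) are hypothetical placeholders for real image processing.

```python
from concurrent.futures import ThreadPoolExecutor


def correct_pixels(frame):
    """Hypothetical correction stage: clamp every pixel to the 0-255 range."""
    return [[min(255, max(0, p)) for p in row] for row in frame]


def count_bright(frame, threshold=128):
    """Hypothetical analysis stage: count pixels at or above the threshold."""
    return sum(1 for row in frame for p in row if p >= threshold)


def process_frame(frame):
    """One self-contained task: correct a frame, then analyse it."""
    return count_bright(correct_pixels(frame))


def process_frames(frames, workers=4):
    """Run process_frame over all frames on a worker pool.

    pool.map returns results in the same order as the input frames,
    regardless of which worker finishes first.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_frame, frames))
```

The important design choice is that a task shares no state with the others, which is what makes it safe to run them side by side. One caveat for this Python version: threads in CPython do not run pure-Python number crunching on several cores at once, so for CPU-bound work a ProcessPoolExecutor (or native code) would be the closer analogue to what the TPL gives you in .NET.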

Steven Bos
