Saturday, 16 November 2013

CUDA

CUDA™ is a parallel computing platform and programming model invented by Nvidia. It is a proprietary technology for general-purpose GPU (GPGPU) programming that can deliver large increases in computing performance by harnessing the power of the graphics processing unit (GPU). CUDA is not just an API and a set of tools; it is the name for the whole architecture.

CUDA (Compute Unified Device Architecture) is the computing engine in Nvidia graphics processing units (GPUs) that is accessible to software developers through variants of industry-standard programming languages. Programmers use 'C for CUDA' (C with Nvidia extensions and certain restrictions), compiled with Nvidia's nvcc compiler (early releases were based on the PathScale Open64 C compiler), to code algorithms for execution on the GPU. The CUDA architecture shares a range of computational interfaces with two competitors: the Khronos Group's OpenCL and Microsoft's DirectCompute. Third-party wrappers are also available for Python, Perl, Fortran, Java, Ruby, Lua, Haskell, MATLAB and IDL, and native support exists in Mathematica (a computational software program used in scientific, engineering, mathematical and other areas of technical computing).

CUDA gives developers access to the virtual instruction set and memory of the parallel computational elements in CUDA GPUs. Using CUDA, the latest Nvidia GPUs become accessible for computation in the same way as CPUs. Unlike CPUs, however, GPUs have a parallel throughput architecture that emphasizes executing many concurrent threads slowly, rather than executing a single thread very quickly. This approach of solving general-purpose problems on GPUs is known as GPGPU.
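To make the "many concurrent threads" model more concrete, here is a minimal sketch of element-wise vector addition in C for CUDA. The names `vector_add` and `vector_add_on_gpu` are hypothetical, chosen only for this illustration, and the block size of 256 threads is an assumption rather than a tuned value. Each GPU thread handles exactly one array element.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Kernel (runs on the GPU): each thread adds one pair of elements.
__global__ void vector_add(const float* a, const float* b, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global index of this thread
    if (i < n) {                                    // guard the last, partly filled block
        out[i] = a[i] + b[i];
    }
}

// Hypothetical host helper: copies the inputs to the GPU, launches the kernel,
// and copies the result back. Returns false if any CUDA runtime call fails.
bool vector_add_on_gpu(const std::vector<float>& a,
                       const std::vector<float>& b,
                       std::vector<float>& out) {
    if (a.size() != b.size()) return false;
    const int n = static_cast<int>(a.size());
    out.assign(n, 0.0f);
    if (n == 0) return true;

    const size_t bytes = n * sizeof(float);
    float *d_a = nullptr, *d_b = nullptr, *d_out = nullptr;
    bool ok = cudaMalloc(&d_a, bytes) == cudaSuccess &&
              cudaMalloc(&d_b, bytes) == cudaSuccess &&
              cudaMalloc(&d_out, bytes) == cudaSuccess &&
              cudaMemcpy(d_a, a.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess &&
              cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        const int threads = 256;                         // assumed block size
        const int blocks = (n + threads - 1) / threads;  // round up to cover all n elements
        vector_add<<<blocks, threads>>>(d_a, d_b, d_out, n);
        // The device-to-host copy waits for the kernel to finish.
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d_a);  // cudaFree accepts null pointers, so cleanup is unconditional
    cudaFree(d_b);
    cudaFree(d_out);
    return ok;
}
```

The `__global__` qualifier and the `<<<blocks, threads>>>` launch syntax are the kind of Nvidia extensions to C mentioned above. A file containing this sketch would be compiled with nvcc and needs a CUDA-capable GPU at run time.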

In the computer game industry, GPUs are used not only for graphics rendering but also for game physics calculations (physical effects such as debris, smoke, fire and fluids); examples include PhysX and Bullet. CUDA has also been used to accelerate non-graphical applications in computational biology, cryptography and other fields by an order of magnitude or more. An example of this is the BOINC distributed computing client.

CUDA provides both a low-level API and a higher-level API. The initial CUDA SDK was made public on 15 February 2007, for Microsoft Windows and Linux. Mac OS X support was later added in version 2.0, which supersedes the beta released on 14 February 2008.

CUDA works with all Nvidia GPUs from the G8x series onwards, including the GeForce, Quadro and Tesla lines. CUDA is compatible with most standard operating systems. Nvidia states that programs developed for the G8x series will also work without modification on all future Nvidia video cards, due to binary compatibility.
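One way to see which CUDA-capable GPUs a machine has is to query them through the higher-level (runtime) API. The following is a small sketch under that assumption; the helper name `list_cuda_devices` is hypothetical, while `cudaGetDeviceCount` and `cudaGetDeviceProperties` are runtime API calls.

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Hypothetical helper: prints one line per CUDA device.
// Returns the number of devices found, or -1 if the query fails
// (for example, when no driver or no CUDA-capable GPU is present).
int list_cuda_devices() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess) {
        return -1;
    }
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) {
            return -1;
        }
        std::printf("Device %d: %s, compute capability %d.%d, %d multiprocessors\n",
                    dev, prop.name, prop.major, prop.minor, prop.multiProcessorCount);
    }
    return count;
}
```

The reported compute capability (`prop.major`, `prop.minor`) identifies the hardware generation, which is how a program can decide at run time whether a given GPU supports the features it needs.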





With millions of CUDA-enabled GPUs sold to date, software developers, scientists and researchers are finding broad-ranging uses for GPU computing with CUDA. Here is one example:

Identify hidden plaque in arteries: Heart attacks are the leading cause of death worldwide. Harvard Engineering, Harvard Medical School and Brigham & Women's Hospital have teamed up to use GPUs to simulate blood flow and identify hidden arterial plaque without invasive imaging.

GPU PerfStudio

GPU PerfStudio is a real-time performance analysis tool designed to help tune the graphics performance of your DirectX 9, DirectX 10, DirectX 11 and OpenGL applications. GPU PerfStudio displays real-time API, driver and hardware data, which can be visualized using flexible plotting and bar chart mechanisms. The application being profiled may be executed locally or remotely over the network. GPU PerfStudio allows the developer to override key rendering states in real time for rapid bottleneck detection. An auto-analysis window can be used to identify performance issues at various stages of the graphics pipeline. No special drivers or code modifications are needed to use GPU PerfStudio.

GPU PerfStudio gives developers control with seamless workflow integration. Spend more time writing code and less time debugging. Identify performance and algorithm issues early in the development cycle, and meet your quality and performance goals.
Key Features:
  • Integrated Frame Profiler
  • Integrated Frame Debugger
  • Integrated Shader Debugger with support for DirectX™ HLSL and ASM
  • Integrated API Trace with CPU timing information
  • Client / Server model
  • GPU PerfStudio 2 Client runs locally or remotely over the network
  • GPU PerfStudio 2 Server supports 32-bit and 64-bit applications
  • Supports DX11, DX10.1, DX10 and OpenGL 4.0 applications
  • No special build required for your application
  • Customizable Client GUI, define and save your own window layouts
  • Drag and drop your application onto the server to start debugging
  • No installation required – copy and run anywhere – your settings go with you.
Integrated tools:
GPU PerfStudio integrates the following tools, which are key for the contemporary graphics developer:
  • Frame Debugger: The Frame Debugger gives you access to the draw calls within your application and allows you to view their states and resources. It is possible to pause your application on any given frame and analyze the elements that make up the current frame buffer image. The user may scrub through the draw calls to locate the draw call of interest. The Frame Debugger specializes in viewing the active state for any draw call in a frame and has specialized data viewers for image and description-based data. Each data viewer is a dockable window that can be placed and resized by the user to make custom layouts that can be saved and reloaded.
  • Frame Profiler: The Frame Profiler provides both a simple overview of the current frame profile and more in-depth analysis tools. The initial overview allows you to determine whether your application is bound by the CPU or the GPU. The in-depth analysis provides access to individual counters and allows you to save custom selections for specialized workflows.
  • Shader Debugger: The Shader Debugger allows you to debug HLSL and ASM Pixel, Compute, and Vertex Shaders inside your application. It allows you to step through your shaders one line at a time and view the registers, variables, and constant values at each pixel. It is even possible to insert breakpoints in the code so that you can quickly jump to a particular line and start debugging. To aid in understanding the flow control of your shader, a Draw Mask image visualizes which pixels were written by the previous instruction.
  • Shader Editing: PerfStudio 2.5 introduces shader editing as a new feature to help the developer author and debug shaders from inside a running application. The user is able to edit DirectX 11 HLSL code in the shader Code Window, re-compile it using the original or modified compiler flags, and insert the new shader into the application being debugged. This can significantly speed up the edit/save/app-restart cycle, as multiple edits can be applied in one debug session without having to restart the app or the debug tools. Re-insertion of the modified shader into the running application allows the user to immediately see the results of their edits and quickly assess their impact. Coupled with the profiler, it is possible to measure the performance impact of an edit by profiling before and after the edit and comparing the results.
  • API Trace: The API Trace allows you to see all the API calls made by your application in a single frame. If your application uses Direct3D markers, the API Trace will use them to create a navigation tree to help you explore the trace.
The screenshot shows the Frame Profiler and Frame Debugger in use at the same time. In this scenario, the profiler was used to identify an expensive draw call. The draw call was selected in the blue list on the right-hand side, causing the Frame Debugger to jump to that draw call. The vertex and index buffers, the texture assets, and the depth buffer for this draw call are currently displayed. The pixel shader code can then be stepped through, and the relationship between the code and the assets explored, to identify costly aspects of the shader.
GPU PerfStudio 2.9:
GPU PerfStudio 2.9 is a fully featured Performance Tool with Integrated Frame Debugger, Frame Profiler and Shader Debugger.

  • This release focuses on improving stability of the product and fixes critical issues in Frame Capture and the Frame Debugger
  • Several issues with Vertex Shader debugging have also been resolved
  • The Shader Debugger constants table can now be saved to disk