Category Archives: HPC

GPU Power outperforming CPU Power

According to the latest
F@H statistics
, graphics processors are by now clearly
outperforming traditional CPUs.
The current factor is about 75 GFlops on a GPU against an average of
1 GFlops on a CPU. (The average varies because very old processors
are counted alongside modern processors that reach about 14 GFlops.)

Anyway, at the moment we see 370 GPUs doing the same amount of work
as 29,000 CPUs. This is really great for the F@H project, although the GPU
cores are currently only capable of the basic calculations
of the GROMACS application. Other applications would have to be heavily
rewritten, and/or cannot run every calculation on the GPU and
therefore have to use the main CPU for those parts of the
calculation.
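As a quick sanity check, the two throughput figures quoted above really do line up. A minimal sketch using only the numbers from the post:

```python
# Figures from the post: per-device throughput and device counts.
gpu_gflops = 75.0    # per GPU
cpu_gflops = 1.0     # average per CPU
gpus = 370
cpus = 29000

gpu_total = gpus * gpu_gflops   # total GPU throughput in GFlops
cpu_total = cpus * cpu_gflops   # total CPU throughput in GFlops

# Both pools deliver roughly the same amount of work.
print(gpu_total, cpu_total)  # 27750.0 29000.0
```

So 370 GPUs at 75 GFlops each land within about 5% of the 29,000-CPU pool, which matches the "same amount of work" claim.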

Science Sports – Calculate PI

And here is another game for the next birthday party:

dart-pi

Let your guests calculate Pi by playing a few rounds of
“darts”.

Do this by aiming at the blue square marked in the figure above.
Then simply count the darts you throw that land inside the
square, plus the darts that miss the board but land inside the
marked blue area.

After a few rounds, let’s say 1,000-100,000 ;), that should be enough
to see some good results.
Now use this formula:

dart-pi-formula

A small hint for the frustrated: invite people who are able to aim
at the upper-left square, but who are not such pros that they
always hit the board.
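If you would rather not throw tens of thousands of real darts, the same experiment can be simulated. This sketch assumes the standard Monte Carlo setup behind the game: darts land uniformly in a square with an inscribed (quarter) circle, and Pi falls out of the hit ratio:

```python
import random

def estimate_pi(darts=100_000, seed=42):
    """Throw random darts at a unit square; count those inside the quarter circle."""
    random.seed(seed)  # fixed seed so the run is reproducible
    on_board = 0
    for _ in range(darts):
        x, y = random.random(), random.random()
        if x * x + y * y <= 1.0:  # dart landed inside the circle
            on_board += 1
    # ratio of areas: (pi/4) = on_board / darts
    return 4.0 * on_board / darts

print(estimate_pi())  # close to 3.14
```

With 100,000 darts the estimate typically lands within a few thousandths of Pi, which matches the hint that more rounds give better results.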

Supercomputer on your graphicsboard

Most people don’t even know that they already own awesome
computing power.
They take a look at their CPU, compare its speed against a
supercomputing archive like Top
500.org
, and come away disappointed. But compared to CPUs (the main processor), the GPUs on modern
graphics boards are already a lot faster than their
complex-instruction-processing counterparts.

Fast GPUs (graphics processors) are needed for the latest 3D games,
which are getting more and more hype. Vendors even bundle lots of
GPUs into virtual graphics boards for maximum performance. For
consumers that is called SLI or CrossFire; for graphics offices,
Nvidia recently introduced the Quadro
Plex.

nvidia_quadro_plex

Now back to supercomputing. OK, GPUs are faster, so why still use
CPUs?
A GPU is limited to simple, highly optimized instructions, and it
works from its own on-board but extremely fast memory banks. So a
CPU is still needed for complex calculations, system management,
and flexible functions, whereas the GPU can do the simple but
effective calculations.

Some smart people started to use GPUs to accelerate physics
calculations for games, making Ageia’s PhysX accelerator
board
a nonsense investment.

More smart people started to figure out how to use GPUs for their
own purposes.
A programmer needs an API that abstracts the function calls to the
hardware.
The common API for GPUs is OpenGL, and this leads to a very
unusual way of thinking
in programming, since a graphics API like OpenGL only offers
graphics functions.
Let’s look at some differences in terminology.

On a graphics board you use expressions like:
– texture (pixels in x,y space)
– drawing
– shader program
The same things in a CPU language would simply be:
– array (two-dimensional, x,y)
– computing
– algorithm / calculation formula
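The left column really is just graphics vocabulary for plain data structures. For example, a texture is nothing but a two-dimensional array stored as one flat buffer and addressed by (x, y). A minimal sketch of that idea (the names are mine, not from any GPU API):

```python
# A 'texture' is a width x height grid of texels, stored flat like a texture buffer.
width, height = 4, 3
texture = [0.0] * (width * height)

def texel(x, y):
    """Read the texel at (x, y): row-major addressing into the flat buffer."""
    return texture[y * width + x]

def set_texel(x, y, value):
    """Write the texel at (x, y)."""
    texture[y * width + x] = value

set_texel(2, 1, 7.5)
print(texel(2, 1))  # 7.5
```

Reading "texture" as "2D array" this way makes the rest of the GPU workflow much less mysterious.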

Now, if we want to give the GPU work to do, we copy input data from
main memory to the graphics board’s texture buffer, just as we
would with graphics textures for game models.
Next, we implement the algorithm workflow in assembly as a shader
program. (The graphics card thinks we will animate a flickering
fire, for example, and runs the calculation according to the shader
program, from the input texture buffer to an output texture
buffer.)
Finally, we transfer the output texture buffer back to main system
memory, and lo and behold… the calculation was done by
the GPU.
Depending on the shader program and how well the problem can be
implemented this way, the GPU will be light-years faster at the
calculation than the CPU.
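The upload / shade / download round trip described above can be sketched in plain Python. This only mimics the data flow (a real implementation would use OpenGL textures and an actual shader language); all function names here are made up for illustration:

```python
def upload(data):
    """Step 1: main memory -> 'texture buffer' (here simply a deep copy)."""
    return [row[:] for row in data]

def run_shader(in_buffer, shader):
    """Step 2: the 'GPU' applies the shader to every texel,
    input texture buffer -> output texture buffer."""
    return [[shader(texel) for texel in row] for row in in_buffer]

def download(out_buffer):
    """Step 3: 'texture buffer' -> main memory (again a copy)."""
    return [row[:] for row in out_buffer]

# input data sitting in main memory
data = [[1.0, 2.0], [3.0, 4.0]]

in_tex = upload(data)                          # copy input to the graphics board
out_tex = run_shader(in_tex, lambda v: 2 * v)  # the 'flickering fire' shader: double everything
result = download(out_tex)                     # transfer the result back

print(result)  # [[2.0, 4.0], [6.0, 8.0]]
```

The point of the sketch: the shader never sees the whole problem, it only maps input texels to output texels, which is exactly why only simple, highly parallel calculations fit this model.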

The big advantage of GPUs is of course graphical calculation, which
is why it makes sense to use them for floating-point calculations
within those limits.
See here for some implemented examples:

Linear Algebra on GPUs


GPU Tutorials

GPGPU
Implementations

Now it’s only a matter of time until distributed computing projects
like
setiathome, faah, hpf, hpf2 and so
on run a hundred times faster on the graphics hardware of people’s
home computers.

Hallelujah!