When playing with SIMD intrinsics, it is a matter of finding the right instructions to do what you want. But sometimes it is tricky because there are several ways to do it. And sometimes, I forget that this or this… Read more: [SSE][AVX][SIMD] Horizontal Sum (sum simd vector – intrinsic)
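As a rough illustration of one of those possibilities (a sketch only, not necessarily the code from the full post), here is a horizontal sum of four packed floats that uses nothing beyond plain SSE. The function name `horizontal_sum` is made up for this example, and an x86 target is assumed.

```cpp
#include <xmmintrin.h>  // SSE intrinsics (x86 targets only)

// Hypothetical helper: returns v0 + v1 + v2 + v3 for v = (v0, v1, v2, v3).
inline float horizontal_sum(__m128 v) {
    // Swap neighbours inside each pair: shuf = (v1, v0, v3, v2).
    __m128 shuf = _mm_shuffle_ps(v, v, _MM_SHUFFLE(2, 3, 0, 1));
    // Pairwise sums: sums = (v0+v1, v0+v1, v2+v3, v2+v3).
    __m128 sums = _mm_add_ps(v, shuf);
    // Bring the upper half of sums down: lowest lane of shuf is now v2+v3.
    shuf = _mm_movehl_ps(shuf, sums);
    // Add the lowest lanes only: lowest lane becomes v0+v1+v2+v3.
    sums = _mm_add_ss(sums, shuf);
    // Extract the lowest lane as a scalar.
    return _mm_cvtss_f32(sums);
}
```

Calling `horizontal_sum(_mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f))` adds the lanes as (1 + 2) + (3 + 4). Other variants (for example with the SSE3 `_mm_hadd_ps` instruction) are possible; which one is fastest depends on the CPU.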
Just for fun, here is the code of an application that copies to the clipboard whatever it receives on its input.
You might have arrived on this page because you are facing a segfault or a strange problem of the kind “it works with 1 thread but not with 2” in a mixed OpenMP/CUDA application. It was the case for me, so let's have a look… Read more: [OpenMP][Cuda] How to manage CUDA GPU from OpenMP threads
Another example of SpMV, but with cuSPARSE this time. For the same reason, namely that I was not able to find a basic example on the internet, I suppose this one can be useful to others.
Here is a code sample that uses the MKL to perform SpMV (gemv). I split it into different functions, but the code is not clean (a mix of C and C++). However, it is easy to understand; there are the conversions… Read more: [C/C++] Sparse matrix MKL examples (C00, CSR, DIA, BCSR) gemv and conversions
I gave a presentation about C++11 as an overview (targeting non-C++ users). Here are the slides and the code examples.
It is true that I am the kind of guy who sometimes likes to re-create what already exists. But this time it was because I was not completely satisfied with what exists, since there is no standard double linked list… Read more: [C] Double Linked list in C with iterator (OpenSource LGPL)
GCC provides the usual operators (+, -, /, *) for the SSE types, but the Intel compiler did not (I write “did” because it seems that it now does). So we quickly implemented these operators to be able to write “c=a+b”.
Using MPI_Type_create_struct and MPI_Type_commit, here is a small example that creates a type based on a struct. It is clearly safer to do this than to use the size of the struct and cast to unsigned… Read more: [C++][MPI] Create custom data type in mpi
A quick code sample to replace several lines of text in lots of files.
Maybe you’re trying to put some SSE code into a (host) function in a .cu file. Well, you will not be able to compile it.
In this post I present some functions taken from different books and rewritten by myself (the first objective, a long time ago, was to refresh my memory with some BLAS stuff). It is composed of 3 modules: Utils, Matrix/vector operations, Linear… Read more: [C++/SIMD] Basic Linear Algebra Functions (some with SSE acceleration)
OpenMP provides a barrier for all threads. Here is a class to perform a barrier with only a group of threads.
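To give an idea of what such a class can look like, here is a minimal sketch. It is not the class from the post: the name `GroupBarrier`, the use of a C++11 mutex and condition variable instead of OpenMP primitives, and the boolean return value are all assumptions made for this illustration.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical reusable barrier for a fixed-size group of threads.
class GroupBarrier {
public:
    // groupSize: number of threads that must call wait() to release the group.
    explicit GroupBarrier(std::size_t groupSize) : groupSize_(groupSize) {}

    // Blocks until groupSize threads have called wait().
    // Returns true for the last thread to arrive, false for the others.
    bool wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        const std::size_t myGeneration = generation_;
        if (++arrived_ == groupSize_) {
            // Last arrival: reset the counter and wake up everyone else.
            arrived_ = 0;
            ++generation_;
            released_.notify_all();
            return true;
        }
        // Sleep until the generation changes (also guards against spurious wake-ups).
        released_.wait(lock, [&] { return generation_ != myGeneration; });
        return false;
    }

private:
    std::mutex mutex_;
    std::condition_variable released_;
    const std::size_t groupSize_;
    std::size_t arrived_ = 0;     // threads currently waiting in this round
    std::size_t generation_ = 0;  // incremented each time the barrier opens
};
```

The idea is to create one `GroupBarrier` per group and share it between the threads of that group only, so that threads outside the group never take part in the synchronization. The generation counter is what makes the object safe to reuse for several rounds.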
This quicksort class is a copy of the one from ScalFMM.
A month ago I read (and wrote) some algorithms for pattern matching in text. You can find plenty of these on the web, but anyway, here is my code.
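This excerpt does not say which algorithms the post covers, so the following is only an illustration of the kind of code involved: a sketch of the classic Knuth-Morris-Pratt search, chosen here as an example. The function name `kmp_search` is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the start index of every occurrence
// (overlapping ones included) of pattern in text.
std::vector<std::size_t> kmp_search(const std::string& text,
                                    const std::string& pattern) {
    std::vector<std::size_t> matches;
    if (pattern.empty() || pattern.size() > text.size()) return matches;

    // failure[i] = length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of pattern[0..i].
    std::vector<std::size_t> failure(pattern.size(), 0);
    for (std::size_t i = 1, k = 0; i < pattern.size(); ++i) {
        while (k > 0 && pattern[i] != pattern[k]) k = failure[k - 1];
        if (pattern[i] == pattern[k]) ++k;
        failure[i] = k;
    }

    // Scan the text once; k is the number of pattern characters matched so far.
    for (std::size_t i = 0, k = 0; i < text.size(); ++i) {
        while (k > 0 && text[i] != pattern[k]) k = failure[k - 1];
        if (text[i] == pattern[k]) ++k;
        if (k == pattern.size()) {
            matches.push_back(i + 1 - k);  // start index of this occurrence
            k = failure[k - 1];            // keep going to find overlaps
        }
    }
    return matches;
}
```

The failure table lets the search avoid re-reading text characters after a mismatch, which keeps the running time linear in the combined length of the text and the pattern.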
A few weeks ago I wanted to use unit tests. But when I searched for a framework that is easy to use, fast, and does not need 10 libs installed to make it work, etc… Well I did not… Read more: C++ – Unit test – easy, one file, basic, simple
How to create an application that allows only one instance at a time. Here is my solution, inspired by: http://www.qtcentre.org/wiki/index.php?title=SingleApplication