Here is my implementation of the paper “Fast Exponential Computation on SIMD Architectures” ( https://www.researchgate.net/publication/272178514_Fast_Exponential_Computation_on_SIMD_Architectures ). This code has been taken from Inastemp ( https://gitlab.mpcdf.mpg.de/bbramas/inastemp ), which is under the MIT licence.
When playing with SIMD intrinsics, it is a matter of finding the right instructions to do what you want. But sometimes it is tricky because there are various ways to do it. And sometimes, I forgot that this or this… Read more: [SSE][AVX][SIMD] Horizontal Sum (sum simd vector – intrinsic)
This tool is a must-know application. It lets you manage different versions of a program very easily. However, some people still do not know about it, mainly because a lot of by-hand solutions are proposed in various forums.
Just for fun, here is the code of an application that copies to the clipboard whatever it receives on its input.
You might have arrived on this page because you are facing a segfault or some strange problem such as “it works with 1 thread but not with 2” in a mixed OpenMP/CUDA application. It was the case for me, let have a look… Read more: [OpenMP][Cuda] How to manage CUDA GPU from OpenMP threads
Another example of SpMV, but with cuSPARSE this time. Here again, I was not able to find a basic example on the internet, so I suppose this one can be useful to others.
Here is a code sample that uses the MKL to perform SpMV (gemv). I split it into different functions, but the code is not clean (a mix of C and C++). However, it is easy to understand; there are the conversions… Read more: [C/C++] Sparse matrix MKL examples (COO, CSR, DIA, BCSR) gemv and conversions
The most common way to find the median is to sort the array and take the value in the middle. Another way is to partition the array several times, until we are sure that the value in the middle is… Read more: [Algorithm] Finding the median in (maybe) less than O(n log(n))
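To illustrate the partitioning idea, here is a minimal quickselect sketch. It is not the code from the post: the names `selectKth` and `median` are hypothetical, and the pivot choice (middle element) and the convention of returning the upper middle value for even sizes are assumptions made for this example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Returns the k-th smallest element (0-based) by repeated partitioning.
// The vector is taken by value, so the caller's data is left untouched.
double selectKth(std::vector<double> values, std::size_t k) {
    if (k >= values.size()) {
        throw std::out_of_range("k must be smaller than the number of values");
    }
    std::size_t left = 0;
    std::size_t right = values.size() - 1;
    // Invariant: the k-th smallest element lies in values[left..right].
    while (left < right) {
        // Lomuto partition around the middle element (moved to the end first).
        std::swap(values[left + (right - left) / 2], values[right]);
        const double pivot = values[right];
        std::size_t store = left;
        for (std::size_t i = left; i < right; ++i) {
            if (values[i] < pivot) {
                std::swap(values[i], values[store]);
                ++store;
            }
        }
        std::swap(values[store], values[right]);
        // The pivot now sits at its final sorted position 'store'.
        if (store == k) {
            return values[k];
        }
        if (store < k) {
            left = store + 1;   // keep only the right part
        } else {
            right = store - 1;  // keep only the left part (store > k >= 0)
        }
    }
    return values[left];
}

// Median taken as the middle element (upper middle for even sizes).
double median(const std::vector<double>& values) {
    return selectKth(values, values.size() / 2);
}
```

Each pass discards one side of the partition, so the expected cost is linear in the array size, although an unlucky sequence of pivots can still make it quadratic.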
I was developing an OpenMP code that uses nested parallelism. And I realized that I had some problems with thread affinity (even though my number of threads was lower than or equal to my number of cores) so I looked to… Read more: [C++][OpenMP] Thread affinity manual (set CPU affinity and bind thread by hand)
I was working on a project that used MPI and OpenMP and whose compilation setup was already complete. And I had to add some CUDA code to it.
It is true that I am the kind of guy who sometimes likes to re-create what already exists. But this time it was because I was not completely satisfied with what exists, since there is no standard double linked list… Read more: [C] Double Linked list in C with iterator (OpenSource LGPL)
Using MPI_Type_create_struct and MPI_Type_commit, here is a small example that creates a type based on a struct. It is clearly safer to do this than to use the size of the struct and cast to unsigned… Read more: [C++][MPI] Create custom data type in mpi
A quick code sample to replace several lines of text in lots of files.
For fun, I developed an in-place merge algorithm and described it in a PDF file. Quentin helped me with the proof.
Maybe you’re trying to put some SSE code into a (host) function in a .cu file; if so, you will not be able to compile it.
A simple Legendre polynomial computation in C/C++.
In this post I put the code of a bitonic sort in distributed memory. The methods are templated so you can use them as you like. Be aware that this version needs a number of processes that is a power… Read more: [C++][Mpi] Bitonic parallel Sort (Bitonic Sorting network in parallel)
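For readers unfamiliar with the bitonic network, here is a minimal sequential sketch of it on a single array. It is not the distributed version from the post: there is no MPI here, the name `bitonicSort` is hypothetical, and the requirement that the number of elements be a power of two is an assumption of this sketch.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Sorts 'data' in ascending order with a bitonic sorting network.
// The size must be a power of two (an empty vector is accepted as is).
template <class T>
void bitonicSort(std::vector<T>& data) {
    const std::size_t n = data.size();
    if ((n & (n - 1)) != 0) {
        throw std::invalid_argument("size must be a power of two");
    }
    // k: size of the bitonic sequences being merged at this stage.
    for (std::size_t k = 2; k <= n; k <<= 1) {
        // j: distance between the two elements of a compare-exchange.
        for (std::size_t j = k >> 1; j > 0; j >>= 1) {
            for (std::size_t i = 0; i < n; ++i) {
                const std::size_t partner = i ^ j;
                if (partner > i) {  // handle each pair only once
                    // Blocks of size k alternate between ascending
                    // and descending order.
                    const bool ascending = (i & k) == 0;
                    const bool misplaced = ascending
                        ? (data[partner] < data[i])
                        : (data[i] < data[partner]);
                    if (misplaced) {
                        std::swap(data[i], data[partner]);
                    }
                }
            }
        }
    }
}
```

The compare-exchange pattern depends only on the indices, never on the values, which is what makes the network easy to distribute: in a parallel setting each index could stand for a process exchanging data with its partner.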
This quicksort class is a copy of the one from ScalFMM.
In this post I put the code of a small program I developed a week ago: an OpenMP server for Linux sockets. The server uses a thread pool and tasks. I also wrote a minimalist client that… Read more: [C++] A tcp/ip server using OpenMP (with Linux socket)
About a month ago I read about (and wrote) some algorithms for pattern matching in text. You can find plenty of these on the web, but here is my code anyway.
A few weeks ago I wanted to use unit tests. But when I was searching for a framework that is easy to use, fast, and does not need 10 libraries installed to make it work, etc… Well I did not… Read more: C++ – Unit test – easy, one file, basic, simple
I use this post so that Google indexes my MS reports. They are hosted at http://hpcpp.com/data/academique/: 2010, 2009, 2008. They include my work during my internships at KAIST and Inria.
How to create an application that allows only one instance at a time. Here is my solution, inspired by: http://www.qtcentre.org/wiki/index.php?title=SingleApplication