This tool is a must-know application. It lets you manage different versions of a program very easily. However, some people still do not know about it, mainly because many by-hand solutions are proposed on various forums.
Just for fun, here is the code of an application that copies to the clipboard whatever it receives on its input.
You might have arrived on this page because you are facing a segfault or some strange problem such as “it works with 1 thread but not with 2”. That was the case for me, so let's have a look at why and how to make… Read more
Another example of SpMV, but with cuSPARSE this time. Again, I was not able to find a basic example on the internet, so I suppose this one can be useful to others.
Here is a code sample that uses the MKL to perform SpMV (gemv). I split it into several functions, but the code is not clean (a mix of C and C++). However, it is easy to understand; there are the conversions… Read more
Because the OpenMP standard says: “All critical constructs without a name are considered to have the same unspecified name.”
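To illustrate what that sentence implies, here is a minimal sketch, not the code from the post. The struct `Counters`, the function `count_parity` and the section names `even_update` and `odd_update` are hypothetical. Because unnamed critical sections all share one lock, two unrelated updates would block each other; giving each section its own name lets them proceed independently.

```cpp
#include <vector>

// Hypothetical example: two independent counters updated from a parallel loop.
struct Counters {
    long even = 0;
    long odd = 0;
};

Counters count_parity(const std::vector<int>& values) {
    Counters c;
    const long n = static_cast<long>(values.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        if (values[i] % 2 == 0) {
            // Named section: only serializes against other "even_update" sections.
            #pragma omp critical(even_update)
            { ++c.even; }
        } else {
            // A different name means a different lock.
            #pragma omp critical(odd_update)
            { ++c.odd; }
        }
    }
    return c;
}
```

Without the names, both blocks would be guarded by the same unspecified name, so a thread updating `c.odd` would wait for a thread updating `c.even`. The pragmas are ignored when the file is compiled without OpenMP, and the result is the same.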
I gave a presentation about C++11 to provide an overview (targeting non-C++ users). Here are the slides and the code example.
You can find the job descriptions here, in French, but French is not required! So if the descriptions look interesting to you, do not hesitate to contact me. (Please be aware that we prefer European students for visa reasons – getting… Read more
The most common way to find the median is to sort the array and take the value in the middle. Another way is to partition the array several times, until we are sure that the value in the middle is… Read more
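The partitioning idea can be sketched as follows. This is my own minimal version (a quickselect with a Lomuto partition), not necessarily the algorithm of the post; the names `partition_range` and `median_by_partition` are hypothetical, and for an even number of elements it returns the upper of the two middle values.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Lomuto partition of v[lo..hi] around the pivot v[hi].
// Returns the final index of the pivot.
std::size_t partition_range(std::vector<int>& v, std::size_t lo, std::size_t hi) {
    const int pivot = v[hi];
    std::size_t store = lo;
    for (std::size_t i = lo; i < hi; ++i) {
        if (v[i] < pivot) {
            std::swap(v[i], v[store]);
            ++store;
        }
    }
    std::swap(v[store], v[hi]);
    return store;
}

// Returns the value that would sit at index size/2 if v were sorted.
// v is taken by value because the partitioning reorders it.
int median_by_partition(std::vector<int> v) {
    assert(!v.empty());
    const std::size_t target = v.size() / 2;
    std::size_t lo = 0;
    std::size_t hi = v.size() - 1;
    // Invariant: lo <= target <= hi, so only one side is ever explored.
    while (lo < hi) {
        const std::size_t p = partition_range(v, lo, hi);
        if (p == target) break;
        if (p < target) lo = p + 1;
        else hi = p - 1;
    }
    return v[target];
}
```

Unlike a full sort, each step discards the side of the array that cannot contain the middle position, which is what makes the approach cheaper on average.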
I was developing an OpenMP code that uses nested parallelism, and I realized that I had some problems with thread affinity (even though my number of threads was lower than or equal to my number of cores), so I looked to… Read more
I was working on a project that used MPI and OpenMP and where the whole build was already set up, and I had to add some CUDA code to it.
It is true that I am the kind of guy who sometimes likes to re-create what already exists. But this time it was because I was not completely satisfied with what exists, since there is no standard doubly linked list… Read more
GCC provides the usual operators (+, -, /, *) for the SSE types, but the Intel compiler did not (I write “did not” because it seems that it now does). So we quickly implemented these operators to be able to write “c=a+b”.
Using MPI_Type_create_struct and MPI_Type_commit, here is a small example to create a type based on a struct. It is clearly safer to do this than to use the size of the struct and cast to unsigned… Read more
A quick sample of code to replace several lines of text in lots of files.
For fun, I developed an in-place merge algorithm and described it in a PDF file. Quentin helped me with the proof.
Maybe you are trying to put some SSE code into a (host) function in a .cu file. Well, you will not be able to compile it.
In this post I present some functions taken from different books and rewritten by me (the original goal, a long time ago, was to refresh my memory of some BLAS material). It is composed of 3 modules: Utils, Matrix/vector operations, Linear… Read more
A simple Legendre polynomial computation in C/C++.
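As an illustration, one common way to evaluate a Legendre polynomial is Bonnet's recurrence, (k+1) P_{k+1}(x) = (2k+1) x P_k(x) - k P_{k-1}(x), starting from P_0(x) = 1 and P_1(x) = x. The sketch below assumes that approach; the function name `legendre` is hypothetical and this is not necessarily the code of the post.

```cpp
#include <cassert>

// Evaluates the Legendre polynomial P_n at x with Bonnet's recurrence.
double legendre(int n, double x) {
    assert(n >= 0);
    if (n == 0) return 1.0;
    double prev = 1.0;  // P_0(x)
    double curr = x;    // P_1(x)
    for (int k = 1; k < n; ++k) {
        // (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}
        const double next = ((2.0 * k + 1.0) * x * curr - k * prev) / (k + 1.0);
        prev = curr;
        curr = next;
    }
    return curr;
}
```

For example, `legendre(2, 0.5)` evaluates (3x² - 1)/2 at x = 0.5, and `legendre(n, 1.0)` is 1 for every n.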
If you want to know more about flops (on CPU or on GPU), a good first step is this link: https://folding.stanford.edu/home/faq/faq-flops/ They give lots of details and are very clear; in brief, a good reference.
CMake provides a FindCUDA module (http://www.cmake.org/cmake/help/v3.0/module/FindCUDA.html), and I just show a small example here.
A quick summary of the CUDA __constant__ qualifier.
You may want to know at compile time which version of OpenMP you are using, in order to enable or disable some features. This is possible using the _OPENMP macro.
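A minimal sketch of the idea: when OpenMP is enabled, `_OPENMP` expands to a date of the form yyyymm that identifies the supported specification (for instance 200805 for OpenMP 3.0, which introduced tasks). The function names `openmp_version_date` and `openmp_tasks_available` are hypothetical helpers, not part of OpenMP.

```cpp
// Returns the raw value of _OPENMP (a yyyymm date),
// or 0 when the file is compiled without OpenMP support.
long openmp_version_date() {
#ifdef _OPENMP
    return _OPENMP;
#else
    return 0;
#endif
}

// Hypothetical feature switch: tasks require OpenMP 3.0 (200805) or later.
bool openmp_tasks_available() {
#if defined(_OPENMP) && _OPENMP >= 200805
    return true;
#else
    return false;
#endif
}
```

The same `#if` test can guard a block of code directly, so that a fallback implementation is compiled when the feature is missing.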
In this post I share the code of a distributed parallel bitonic sort. The methods are templated, so you can use them as you like. Be aware that this version needs a number of processes that is a power… Read more
OpenMP provides a barrier for all threads. Here is a class that performs a barrier among only a group of threads.
This quicksort class is a copy of the one from ScalFMM. It uses OpenMP tasks if they are supported; otherwise, it uses a homemade stack of intervals.
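To give an idea of what a "stack of intervals" fallback can look like, here is a small sequential sketch. It is my own illustration, not the ScalFMM code; the function name `quicksort_with_stack` is hypothetical and the Lomuto partition is an arbitrary choice.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Quicksort without recursion: intervals still to be sorted are kept on an explicit stack.
void quicksort_with_stack(std::vector<int>& v) {
    if (v.size() < 2) return;
    // Each entry is a closed interval [first, second] holding at least 2 elements.
    std::vector<std::pair<std::size_t, std::size_t>> pending;
    pending.emplace_back(std::size_t{0}, v.size() - 1);
    while (!pending.empty()) {
        const std::size_t lo = pending.back().first;
        const std::size_t hi = pending.back().second;
        pending.pop_back();
        // Lomuto partition around the pivot v[hi].
        const int pivot = v[hi];
        std::size_t store = lo;
        for (std::size_t i = lo; i < hi; ++i) {
            if (v[i] < pivot) {
                std::swap(v[i], v[store]);
                ++store;
            }
        }
        std::swap(v[store], v[hi]);
        // Push only the sub-intervals that still contain at least 2 elements.
        if (store > lo + 1) pending.emplace_back(lo, store - 1);
        if (store + 1 < hi) pending.emplace_back(store + 1, hi);
    }
}
```

In a task-based version, each pushed interval would instead become a task; the explicit stack plays the same role when tasks are not available.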
In this post I share the code of a small program I developed a week ago: an OpenMP server for Linux sockets. The server uses a thread pool and tasks. I also wrote a minimalist client that… Read more
A month ago I read about (and implemented) some algorithms for pattern matching in text. You can find plenty of these on the web, but here is my code anyway.
Because it took me 5 minutes to get the command right (do we use a pipe or not, do we put at the end or not, etc.), I am writing it down here (as a post-it for myself): So this command will… Read more
In this post I present an example of profiling OpenMP (or pthread) and MPI applications.
Of course there is a difference between static and dynamic scheduling (everyone knows that), but if you want to see how it can make a difference, look at the example above.
Here are 3 lines to use OpenMP in your CMake code.
A few weeks ago I wanted to use unit tests. But when I searched for a framework that is easy to use, fast, does not need 10 libraries installed to make it work, etc… Well, I did not… Read more
I use this post to get my MS reports indexed by Google. They are hosted at http://hpcpp.com/data/academique/: 2010, 2009, 2008. They include my work during my internships at KAIST and Inria.
A popular question is which one is better, or what their main differences are.
Here is an example of a reduction on a variable to sum the results from each thread.
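A minimal sketch of such a reduction, assuming OpenMP is the threading model meant here (the excerpt does not say so explicitly). The function name `parallel_sum` is hypothetical.

```cpp
#include <vector>

// Sums the values with an OpenMP reduction: each thread accumulates into
// its own private copy of total, and the copies are added together at the end.
long parallel_sum(const std::vector<int>& values) {
    long total = 0;
    const long n = static_cast<long>(values.size());
    #pragma omp parallel for reduction(+ : total)
    for (long i = 0; i < n; ++i) {
        total += values[i];
    }
    return total;
}
```

The reduction clause avoids both a data race on `total` and the cost of protecting every addition with a critical section. Compiled without OpenMP, the pragma is ignored and the loop simply runs sequentially.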
How to create an application that allows only one instance at a time. Here is my solution, inspired by: http://www.qtcentre.org/wiki/index.php?title=SingleApplication
You want to start learning OpenMP and you already know how to program in C/C++? Here is a good code excerpt for understanding how it works.