Ibis: My test compressible computational fluid dynamics code GPUs

4 minute read

Introduction to Ibis

My research in my PhD focussed on using Eilmer to computationally model the physical processes in the gas flowing around hypersonic vehicles. Most of the development I did was in the gas modules of the code, with only a few adventures into the core computational fluid dynamics routines. But I wanted a better understanding of the entire code, especially methods like the Jacobian-Free Newton-Krylov solver. I also had an itch to stratch of learning GPU programming and C++. So I decided to scratch all my itches at once, and write a GPU CFD code in C++. And so Ibis was born.

   MM      
  <' \___/| 
   \_  _/           _ _     _     
     ][            (_) |   (_)    
 ___/-\___          _| |__  _ ___ 
|---------|        | | '_ \| / __|
 | | | | |         | | |_) | \__ \
 | | | | |         |_|_.__/|_|___/
 | | | | |     
 | | | | |                
 |_______|

The name warrents some explanation… This is the long version of the story. My supervisor’s PhD supervisor recalled a question posed at a CFD conference in the early 2000’s, “would you rather your cart be pulled by 2 strong Ox, or one thousand chickens?” This was meant as a rhetorical question to answer the question of doing CFD on CPU or GPUs. For a few decades, the 2 strong Ox own the debate, and CFD was done on relatively few, but powerful, CPUs. Recently, the tide is shifting and CFD is being done on GPUs. So when our research group started playing with GPU codes, the code was obviously called “chicken”. This also fit with the long history of codes named after birds (Seagul, Puffin, Chicken, Spatz…). So when I decided to write my own GPU code, “Ibis”, or “bin chicken” seemed like an appropriate choice for what would likely be a worse version of chicken (I work on it in my spare time). This also tied in nicely with all the Ibis’s on campus at UQ.

So, with that important piece of information out of the way, what is ibis? It is a compressible CFD code. I used Kokkos to enable the code to run on any GPU that Kokkos supports (currently NVidia, AMD, and Intel).
Ibis uses fully unstructured, mixed element grids, and can operate in two or three dimensions. It is capable of solving the Navier-Stokes equations with an ideal gas equation of state with second order accuracy. It can use any explicit Runge-Kutta scheme to march forwards in time, or I also have Jacobian-Free Newton-Krylov for steady-state acceleration. I use dual numbers for automatic differentiation, and I have a genuinely matrix free method using flexible GMRES for the linear solver.

It is currently only single block; but because it is unstructured this doesn’t limit the complexity of geometry. The only limitation of this is that it can only run on a single GPU at once, but I plan on making it multi-GPU capable if I ever get time. The other limitation (for hypersonic simulations at least), is that it only has ideal gas equations of state.

I’m keeping a catelog of example simulations, so check that out for the latest cool simulations I’ve done that I could be bothered to upload. But my favourite simulation I’ve done so far is this Apollo simulation. The simulation doesn’t have a lot of scientific value (ideal gas, coarse, first order), but it converges quickly and it makes cool pictures!

Apollo capsule

Performance

Once I had the code working, we thought it would be intereseting to compare it with chicken (written in pure CUDA), to see if there is much penalty to using an abstraction like Kokkos. So I did some performance testing of Ibis. The test problem was gaseous injection into a Mach 4 cross-flow. The following image shows the flow field after 5ms of simualted time.

Flow field after 5ms

I found speedups comparable to others: Ibis on an Nvidia H100 was equivalent to about 200 CPU cores.

GPU acceleration

I haven’t done a great deal of profiling of Ibis to try and speed it up as of the moment, but that is something that I definitely want to do! I do have some performance numbers for it though. The following image is called a roofline plot; the black line is the ceiling of achievable performance. The higher the dots the better, but its impossible to go above the line. Ibis is not too shabby, but there is room for improvement!

GPU resource usage

Next steps

I’m working on implementing steady-state shock fitting at the moment. If the user can mark interfaces they want a shock to lie on, then Ibis will try to move the grid so that the Rankine-Hugoniot equations are satisfied on those interfaces. This will require some prior knowledge from the user on where the shock should be (perhaps from an initial simulation), but should produce much higher quality simulations. Using this in conjunction with GridPro’s shock orientated grid generation capability could provide a nice workflow for high quality simulations. I’ll probably write more about this in a future blog once I get it working nicely :)

After that, I have vague plans for multi-block (to enable multi-GPU) and non-ideal gases, but I’ll need to find time for that first, I do work on this in my spare time after all!

Updated: