Does anyone have experience optimizing large simulations on HPC clusters? I am currently running a Coupled Eulerian-Lagrangian (CEL) simulation (~3 million DOF) with Abaqus/Explicit 2020 on a Gigabit Ethernet HPC cluster, spanning 12 nodes (20 processors per node). I have the following questions:
I noticed that the simulation runs fastest when the number of domains equals the number of processors. If I over-decompose and use dynamic load balancing, the simulation is usually slower. Has anyone had any luck running simulations with fewer domains than processors?
I want to take advantage of the hybrid MPI feature of Abaqus 2020. However, I noticed that hybrid MPI usually performs slower than pure MPI. I do not fully understand why this is the case; I suspect I might not be setting up the problem correctly. Has anyone had experience with this issue?
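
To make the hybrid MPI question concrete, here is a rough Python sketch that lists the possible splits of MPI processes and threads on this cluster (12 nodes, 20 processors per node). It is plain arithmetic and does not call any Abaqus API. The function name `hybrid_layouts` is hypothetical, and the sketch assumes that each MPI process stays within a single node and that every node is fully used.

```python
def hybrid_layouts(nodes, cores_per_node):
    """List (threads_per_process, processes_per_node, total_processes).

    Assumption: an MPI process never spans nodes, so the thread count
    per process must divide the number of cores on a node evenly.
    """
    layouts = []
    for threads in range(1, cores_per_node + 1):
        if cores_per_node % threads == 0:
            processes_per_node = cores_per_node // threads
            total_processes = processes_per_node * nodes
            layouts.append((threads, processes_per_node, total_processes))
    return layouts


if __name__ == "__main__":
    # Cluster size taken from the post: 12 nodes x 20 processors.
    for threads, per_node, total in hybrid_layouts(12, 20):
        print(f"{threads:2d} threads/process, "
              f"{per_node:2d} processes/node, "
              f"{total:3d} MPI processes in total")
```

The first row (1 thread per process) corresponds to pure MPI, and the last row (20 threads per process) to one MPI process per node. Timing a few of the rows in between might show whether the slowdown depends on the split or affects every hybrid configuration.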