Is Meshing Run in Parallel in COMSOL Multiphysics®?
Have you ever wondered if the topology of your model geometry (in other words, the decomposition of the geometry into geometric entities like domains, boundaries, edges, and vertices) somehow influences how the mesh generation in the COMSOL Multiphysics® software makes use of your computational resources? If so, then this blog post will be of interest to you…
Parallelized Meshing in COMSOL Multiphysics®
The following operations make use of shared memory parallelism:
- Free Tetrahedral
- Free Triangular
- Free Quad
- Boundary Layers
The Free Tetrahedral operation is parallelized across domains and faces. When the operation generates the tetrahedra in a geometry with four domains, it can use at most four cores; with a single domain, it can use only one core, regardless of how many cores are available. Similarly, the Free Triangular, Free Quad, and Mapped operations are parallelized across domains in 2D and faces in 3D.
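The rule of thumb in the paragraph above can be written as a one-line bound. The following is a minimal Python sketch, not COMSOL code: the function name `usable_cores` is hypothetical, and it assumes the simplification that each domain (or face) being meshed occupies at most one core.

```python
def usable_cores(num_entities: int, num_cores: int) -> int:
    """Upper bound on the cores one mesh operation can keep busy.

    Assumption (simplified model, not COMSOL's scheduler): each domain
    or face is meshed by at most one core at a time.
    """
    if num_entities < 0 or num_cores < 1:
        raise ValueError("need num_entities >= 0 and num_cores >= 1")
    return min(num_entities, num_cores)


# Four domains on a six-core machine: at most four cores are busy.
print(usable_cores(4, 6))
# One domain on the same machine: only one core is busy.
print(usable_cores(1, 6))
```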
The Boundary Layers operation is partly parallelized. Unlike the other operations, this mesh operation is parallelized within each domain.
Lastly, the meshing of linking faces (done with the Mapped operation) is part of the Swept operation, which is parallelized.
Note that the mesh is built, one operation at a time, from the top down in the sequence of mesh operations. The parallelization is therefore done per operation, one at a time.
A Benchmark Performance Test
How much does parallelized meshing speed up the meshing time in an actual modeling scenario? Let’s take a 6×1×1-m block and mesh it on a regular desktop computer with 6 cores. One Free Tetrahedral operation is added under Mesh. The following mesh element size parameters are used throughout the test:
- Maximum element size:
- Minimum element size:
Mesh Size settings used in the benchmark test.
These settings give a very fine mesh, which results in roughly 13 million tetrahedral elements.
We set up three test cases in which the model has:
- One domain
- Six domains, partitioning the block into six equally sized domains
- Six domains, restricting the software to run on only one core
Left: A 6×1×1-m block with 1 domain (case 1). Right: A geometry with the same outer size as case 1, but divided into 6 domains of size 1×1×1 m (cases 2 and 3).
To restrict the software to run on only one core, you can add the option -np 1 to the start command. It is also possible to do this on the Multicore and Cluster Computing page in the Preferences dialog box.
Benchmark Results and Discussion
The results from the three tests are gathered in the table below. Cases 1 and 3 run on only one core and take about the same time to mesh. The geometry with six domains (cases 2 and 3) has more boundaries and therefore more triangle elements, so it requires slightly more meshing work. In case 2, where the mesh algorithm uses all six cores, the meshing time is reduced to less than 25% of the time it takes when using only one core.
|Case|Domains|Time|Triangle Elements|Tetrahedral Elements|
|---|---|---|---|---|
|3 (1 core)|6|147 s|1.81e3|13.0e6|
Results from a benchmark test run on a six-core desktop computer. The test shows a significant speedup from one core (first and third rows in the table) to all six cores (second row in the table).
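To put the "less than 25%" figure in terms of speedup, here is a small Python sketch of the arithmetic. It uses only the 147 s single-core time for the six-domain geometry and the 25% bound stated in the text; the function name `parallel_bounds` is hypothetical, and the results are bounds implied by those two numbers, not measured values.

```python
def parallel_bounds(single_core_time_s: float, max_fraction: float, num_cores: int):
    """Bounds implied by 'parallel time < max_fraction * single-core time'."""
    upper_time_s = single_core_time_s * max_fraction  # parallel time is below this
    min_speedup = 1.0 / max_fraction                  # speedup is above this
    min_efficiency = min_speedup / num_cores          # per-core efficiency is above this
    return upper_time_s, min_speedup, min_efficiency


# 147 s on one core (six domains), less than 25% of that on six cores.
upper_time_s, min_speedup, min_efficiency = parallel_bounds(147.0, 0.25, 6)
print(f"parallel time below {upper_time_s:.2f} s")
print(f"speedup above {min_speedup:.1f}x")
print(f"efficiency above {min_efficiency:.0%}")
```

In other words, the stated reduction corresponds to a speedup of more than 4 on six cores, which is short of the ideal factor of 6.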
You might be wondering if you should partition your geometry into as many domains as you have cores, assuming you have fewer domains to begin with. The answer? Not necessarily…
Partitioning a domain leads to more boundaries. This puts more constraints on the mesh, which, in turn, can make the meshing problem more complex. The additional boundaries also take longer to mesh, which is most clearly seen by comparing the times and number of triangle elements in cases 1 and 3. You should also consider that partitioning a domain can lead to narrow regions that require a finer mesh size. Note that the situation in this benchmark is idealized: there are six equal domains on six cores. In a real case, it is more likely that some domains are more complicated and dominate the meshing time, leading to less speedup.
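The effect of one dominating domain can be illustrated with a toy scheduling model in Python. This is a sketch under stated assumptions, not a description of how COMSOL assigns work: the per-domain costs are made-up numbers, each domain is meshed by a single core, and domains are handed out longest-first to the least-loaded core. The names `makespan` and `speedup` are hypothetical.

```python
import heapq


def makespan(domain_costs, num_cores):
    """Wall-clock time when domains are assigned longest-first to the
    least-loaded core (a simple greedy model, assumed for illustration)."""
    if num_cores < 1:
        raise ValueError("num_cores must be at least 1")
    loads = [0.0] * num_cores  # a list of equal values is already a valid heap
    for cost in sorted(domain_costs, reverse=True):
        lightest = heapq.heappop(loads)
        heapq.heappush(loads, lightest + cost)
    return max(loads)


def speedup(domain_costs, num_cores):
    """Single-core time divided by the modeled parallel time."""
    return sum(domain_costs) / makespan(domain_costs, num_cores)


# Idealized benchmark: six equal domains on six cores.
print(speedup([1, 1, 1, 1, 1, 1], 6))
# One complicated domain dominates: the other cores finish early and wait.
print(speedup([5, 1, 1, 1, 1, 1], 6))
```

In this model, the balanced case reaches the ideal speedup of 6, while the uneven case is limited to 2 because the total time can never drop below the cost of the slowest domain.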
Partitioning Domains with Boundary Layers
The Boundary Layers operation moves points in parallel when inserting boundary layer elements and can do so even when operating on a single domain. Therefore, partitioning domains will not improve performance, but rather the opposite, since the extra partitioning faces need more processing.
How Many Mesh Operations Can I Use Without Losing Performance?
Let’s use case 2 in the benchmark example above (six domains meshed on at most six cores). The parallelization is done per mesh operation, so if we now add another Free Tetrahedral operation and mesh three domains in each operation, a maximum of three cores can be used at any one time. In general, it is recommended to use as few mesh operations (of the same type) as possible. This not only allows as much parallelization as possible, but also allows for a good optimization of the mesh quality. To set different size settings on different domains or boundaries, use several global or local size attributes and only one Free Tetrahedral operation. The Using Meshing Sequences tutorial discusses the details of setting global and local size attributes.
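The cost of splitting one operation into two can be sketched with a simple Python model. It assumes, for illustration only, that all domains take the same time to mesh, that each domain uses one core, and that operations run strictly one after another, as described above. The function name `sequence_time` and the unit time per domain are hypothetical.

```python
import math


def sequence_time(domains_per_operation, num_cores, time_per_domain=1.0):
    """Modeled wall-clock time for a meshing sequence.

    Operations run one at a time; inside an operation, equally sized
    domains are meshed in rounds of at most num_cores domains each.
    """
    if num_cores < 1:
        raise ValueError("num_cores must be at least 1")
    total = 0.0
    for num_domains in domains_per_operation:
        rounds = math.ceil(num_domains / num_cores)
        total += rounds * time_per_domain
    return total


# One Free Tetrahedral operation meshing all six domains on six cores.
print(sequence_time([6], 6))
# Two operations meshing three domains each: three cores idle in each step.
print(sequence_time([3, 3], 6))
```

Under these assumptions, the two-operation sequence takes twice as long as the single operation, even though the total amount of meshing work is the same.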
Concluding Thoughts on Meshing Run in Parallel
In this blog post, we have discussed the ways in which the meshing algorithm in COMSOL Multiphysics is parallelized. The results from a simple benchmark test show that running the meshing in parallel can significantly reduce the meshing time by distributing the meshing of domains across more cores.