{"id":224,"date":"2022-09-27T14:34:42","date_gmt":"2022-09-27T05:34:42","guid":{"rendered":"https:\/\/www.ssil.co.jp\/product\/EMSolution\/en2\/?post_type=case&#038;p=224"},"modified":"2022-09-27T17:26:39","modified_gmt":"2022-09-27T08:26:39","slug":"openmp","status":"publish","type":"case","link":"https:\/\/www.ssil.co.jp\/product\/EMSolution\/en\/case\/openmp\/","title":{"rendered":"Parallel computing capability with OpenMP"},"content":{"rendered":"<h3>Summary<\/h3>\n<p>EMSolution has been developed for large-scale, high-speed electromagnetic field analysis. As multi-CPU and multi-core machines have become common in recent years, we have added parallel computing functions to parts of EMSolution.  <\/p>\n<p>Parallelization is based on OpenMP and therefore runs on a single node (one workstation), so parallel computing beyond the number of cores installed on that node is not possible. Even so, it is well suited to relatively large 3D analyses, because a speedup is obtained simply by increasing the number of parallel threads. This article introduces the following three parallel functions. We plan to parallelize other parts of the program as well.  <\/p>\n<ul>\n<li>Parallel computation of ICCG method<\/li>\n<li>Parallel computation of B_INTEG<\/li>\n<li>Parallel processing of MeshedCOIL<\/li>\n<\/ul>\n<h3>Explanation<\/h3>\n<h4>Parallel computation of ICCG method<\/h4>\n<p>The ICCG (incomplete Cholesky conjugate gradient) method is a solver for symmetric linear systems, used in ordinary static and transient magnetic field analysis. It can be used whether the magnetization characteristics are linear or nonlinear. Note that it is not applicable to two-dimensional magnetic anisotropy, where the matrix is asymmetric. The ICCG method is parallelized with the BlockICCG method, and part of the Newton-Raphson method used for nonlinear magnetization characteristics is also parallelized.  
<\/p>\n<p>As an example, we show the results for the generator model shown in Fig. 1 and for a three-dimensional analysis of the concentrated winding IPMSM (D1 model), the benchmark model of the Institute of Electrical Engineers of Japan. Both are nonlinear transient magnetic field analyses with a current source. Table 1 and Fig. 2 show the speedup for the generator model, and Table 2 and Fig. 3 show it for the concentrated winding IPMSM. For the generator model, which converges well, the speedup grows steadily with 2 and 4 parallel threads (1.56 and 2.15 times). For the D1 model, convergence is not as good, and the speedup tails off with 4 parallel threads. This is because the BlockICCG method performs a block incomplete Cholesky (BlockIC) decomposition, so convergence deteriorates as the number of parallel threads increases and the number of ICCG iterations tends to rise. Even so, the 4-thread calculation is 1.9 times faster than the single-thread calculation.  <\/p>\n<p>Parallelization by the BlockICCG method is thus useful, because the computation time can be shortened simply by increasing the number of parallel threads. Note that the speedup appears to depend strongly on the performance of the installed CPU. The machine used in this example has an Intel Xeon E5520 CPU (64-bit, Nehalem, 2.26 GHz, 4 cores) and 12 GB of memory. Hyper-Threading is not used. This CPU appears to yield a speedup readily.  
<\/p>\n<div class=\"img col2\">\n<div>\n    <a href=\"\/product\/EMSolution\/en\/wp-content\/uploads\/openmp01.png\" class=\"modal\"><br \/>\n    <img decoding=\"async\" src=\"\/product\/EMSolution\/en\/wp-content\/uploads\/openmp01.png\" alt=\"\" \/><\/a><br \/>\n<\/p>\n<p style=\"text-align:center\">(a) Generator model: Number of elements 143,180<\/p>\n<\/p><\/div>\n<div>\n    <a href=\"\/product\/EMSolution\/en\/wp-content\/uploads\/openmp02.png\" class=\"modal\"><br \/>\n    <img decoding=\"async\" src=\"\/product\/EMSolution\/en\/wp-content\/uploads\/openmp02.png\" alt=\"\" \/><\/a><br \/>\n<\/p>\n<p style=\"text-align:center\">(b) Concentrated winding IPMSM model: Number of elements 412,776<\/p>\n<\/p><\/div>\n<p class=\"caption\">Fig.1 Analysis model<\/p>\n<\/div>\n<p><\/p>\n<h2 id=\"tablepress-34-name\" class=\"tablepress-table-name tablepress-table-name-id-34\">Table 1: Speedup for the generator model<\/h2>\n\n<table id=\"tablepress-34\" class=\"tablepress tablepress-id-34\" aria-labelledby=\"tablepress-34-name\">\n<thead>\n<tr class=\"row-1\">\n\t<th class=\"column-1\">Number of parallel threads<\/th><th class=\"column-2\">Computation time (s)<\/th><th class=\"column-3\">Number of ICCG iterations<\/th><th class=\"column-4\">Number of NR iterations<\/th>\n<\/tr>\n<\/thead>\n<tbody class=\"row-hover\">\n<tr class=\"row-2\">\n\t<td class=\"column-1\">1<\/td><td class=\"column-2\">184.4<br \/>\n(1.00)<\/td><td class=\"column-3\">733<br \/>\n(1.00)<\/td><td class=\"column-4\">15<br \/>\n(1.00)<\/td>\n<\/tr>\n<tr class=\"row-3\">\n\t<td class=\"column-1\">2<\/td><td class=\"column-2\">118.3<br \/>\n(1.56)<\/td><td class=\"column-3\">763<br \/>\n(1.04)<\/td><td class=\"column-4\">15<br 
\/>\n(1.00)<\/td>\n<\/tr>\n<tr class=\"row-4\">\n\t<td class=\"column-1\">4<\/td><td class=\"column-2\">85.8<br \/>\n(2.15)<\/td><td class=\"column-3\">824<br \/>\n(1.12)<\/td><td class=\"column-4\">15<br \/>\n(1.00)<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<!-- #tablepress-34 from cache -->\n<p><\/p>\n<h2 id=\"tablepress-33-name\" class=\"tablepress-table-name tablepress-table-name-id-33\">Table 2: Speedup for the concentrated winding IPMSM model<\/h2>\n\n<table id=\"tablepress-33\" class=\"tablepress tablepress-id-33\" aria-labelledby=\"tablepress-33-name\">\n<thead>\n<tr class=\"row-1\">\n\t<th class=\"column-1\">Number of parallel threads<\/th><th class=\"column-2\">Computation time (s)<\/th><th class=\"column-3\">Number of ICCG iterations<\/th><th class=\"column-4\">Number of NR iterations<\/th>\n<\/tr>\n<\/thead>\n<tbody class=\"row-hover\">\n<tr class=\"row-2\">\n\t<td class=\"column-1\">1<\/td><td class=\"column-2\">391.8<br \/>\n(1.00)<\/td><td class=\"column-3\">2946.2<br \/>\n(1.00)<\/td><td class=\"column-4\">7.5<br \/>\n(1.00)<\/td>\n<\/tr>\n<tr class=\"row-3\">\n\t<td class=\"column-1\">2<\/td><td class=\"column-2\">232.6<br \/>\n(1.68)<\/td><td class=\"column-3\">2982.1<br \/>\n(1.01)<\/td><td class=\"column-4\">7.4<br \/>\n(0.99)<\/td>\n<\/tr>\n<tr class=\"row-4\">\n\t<td class=\"column-1\">4<\/td><td class=\"column-2\">206.8<br \/>\n(1.89)<\/td><td class=\"column-3\">3451.7<br \/>\n(1.17)<\/td><td class=\"column-4\">8.9<br \/>\n(1.18)<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<!-- #tablepress-33 from cache -->\n<p><\/p>\n<div class=\"img col2\">\n<div>\n        <a href=\"\/product\/EMSolution\/en\/wp-content\/uploads\/openmp03.png\" class=\"modal\"><br \/>\n        <img decoding=\"async\" src=\"\/product\/EMSolution\/en\/wp-content\/uploads\/openmp03.png\" alt=\"\" \/><\/a><br \/>\n<\/p>\n<p 
style=\"text-align:center\">Fig.2 Effect of number of cores on computation time <br \/>&#8211; Generator model &#8211;<\/p>\n<\/p><\/div>\n<div>\n        <a href=\"\/product\/EMSolution\/en\/wp-content\/uploads\/openmp04.png\" class=\"modal\"><br \/>\n        <img decoding=\"async\" src=\"\/product\/EMSolution\/en\/wp-content\/uploads\/openmp04.png\" alt=\"\" \/><\/a><br \/>\n<\/p>\n<p style=\"text-align:center\">Fig.3 Effect of number of cores on computation time <br \/>&#8211; Concentrated winding IPMSM model &#8211;<\/p>\n<\/p><\/div>\n<\/div>\n<h4>Parallel computation of B_INTEG<\/h4>\n<p>A parallel calculation function has also been added to <a href=\"\/product\/EMSolution\/en\/case\/integral\/\" target=\"_blank\" rel=\"noopener noreferrer\" style=\"display:inline\"><font color=\"Red\">&quot;B_INTEG (Calculation function of spatial fields generated by magnetization and currents by integration)&quot;<\/font><\/a>. B_INTEG outputs the magnetic flux density at any position in the space outside magnetic materials and conductors with high accuracy, independent of the mesh geometry. The calculation is very well suited to parallel computation and therefore yields a nearly ideal speedup.  <\/p>\n<p>As an example, we verify the speedup with the model used in <a href=\"\/product\/EMSolution\/en\/case\/b_integ_by_coil_without_mesh\/\" target=\"_blank\" rel=\"noopener noreferrer\" style=\"display:inline\"><font color=\"Red\">&quot;magnetic field distribution calculation for coils only&quot;<\/font><\/a>, shown in Fig. 4. The evaluation points are arranged on a grid spanning 0 to 250 mm along each axis, with each axis divided into (1) 50 equal parts and (2) 100 equal parts. Table 3 shows the speedup. 
The speedup is almost ideal. The machine used has two quad-core Intel Xeon 3.00 GHz CPUs (8 cores in total). This function appears to yield a speedup regardless of the CPU.  <\/p>\n<div class=\"img col1\">\n<div>\n        <a href=\"\/product\/EMSolution\/en\/wp-content\/uploads\/openmp05.png\" class=\"modal\"><br \/>\n        <img decoding=\"async\" src=\"\/product\/EMSolution\/en\/wp-content\/uploads\/openmp05.png\" alt=\"\" \/><\/a><br \/>\n<\/p>\n<p style=\"text-align:center\">Fig.4 COIL and magnetic field evaluation mesh<\/p>\n<\/p><\/div>\n<\/div>\n<p><\/p>\n<h2 id=\"tablepress-32-name\" class=\"tablepress-table-name tablepress-table-name-id-32\">Table 3: Speedup versus number of threads<\/h2>\n\n<table id=\"tablepress-32\" class=\"tablepress tablepress-id-32\" aria-labelledby=\"tablepress-32-name\">\n<thead>\n<tr class=\"row-1\">\n\t<th class=\"column-1\"><strong>B<\/strong><\/th><th colspan=\"2\" class=\"column-2\">100 divisions<\/th><th colspan=\"2\" class=\"column-4\">50 divisions<\/th>\n<\/tr>\n<\/thead>\n<tbody class=\"row-hover\">\n<tr class=\"row-2\">\n\t<td class=\"column-1\">Number of threads<\/td><td class=\"column-2\">Computation time (s)<\/td><td class=\"column-3\">Speedup<\/td><td class=\"column-4\">Computation time (s)<\/td><td class=\"column-5\">Speedup<\/td>\n<\/tr>\n<tr class=\"row-3\">\n\t<td class=\"column-1\">1<\/td><td class=\"column-2\">24.656<\/td><td class=\"column-3\">1.000<\/td><td class=\"column-4\">3.282<\/td><td class=\"column-5\">1.000<\/td>\n<\/tr>\n<tr class=\"row-4\">\n\t<td class=\"column-1\">2<\/td><td class=\"column-2\">12.516<\/td><td class=\"column-3\">1.970<\/td><td class=\"column-4\">1.656<\/td><td class=\"column-5\">1.982<\/td>\n<\/tr>\n<tr class=\"row-5\">\n\t<td class=\"column-1\">4<\/td><td class=\"column-2\">6.344<\/td><td 
class=\"column-3\">3.887<\/td><td class=\"column-4\">0.844<\/td><td class=\"column-5\">3.889<\/td>\n<\/tr>\n<tr class=\"row-6\">\n\t<td class=\"column-1\">8<\/td><td class=\"column-2\">3.188<\/td><td class=\"column-3\">7.734<\/td><td class=\"column-4\">0.438<\/td><td class=\"column-5\">7.493<\/td>\n<\/tr>\n<tr class=\"row-7\">\n\t<td class=\"column-1\">Evaluation points<\/td><td colspan=\"2\" class=\"column-2\">1,030,301<\/td><td colspan=\"2\" class=\"column-4\">132,651<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<!-- #tablepress-32 from cache -->\n<p><\/p>\n<h4>Parallel Processing of MeshedCOIL<\/h4>\n<p>Parallel processing has also been added to MeshedCOIL, described in <a href=\"\/product\/EMSolution\/en\/case\/meshedcoil\/\" target=\"_blank\" rel=\"noopener noreferrer\" style=\"display:inline\"><font color=\"Red\">&quot;Definition of COIL (external current field source) by hexahedral element meshes&quot;<\/font><\/a>. This function defines a COIL with a hexahedral finite element mesh, and its computational cost is proportional to the number of hexahedral elements and integration points. Within MeshedCOIL, the following are parallelized: COIL_INDUCTANCE, described in <a href=\"\/product\/EMSolution\/en\/case\/coil_force\/\" target=\"_blank\" rel=\"noopener noreferrer\" style=\"display:inline\"><font color=\"Red\">&quot;COIL inductance and electromagnetic force calculation&quot;<\/font><\/a>; the REGULARIZATION function, which guarantees current continuity, described in <a href=\"\/product\/EMSolution\/en\/case\/convergebceoptions\/\" target=\"_blank\" rel=\"noopener noreferrer\" style=\"display:inline\"><font color=\"Red\">&quot;Nonlinear Option Comparisons&quot;<\/font><\/a>; and the magnetic flux density output.   <\/p>\n<p>As an example, the armature coil of the generator model shown in Fig. 5 (a different model from Fig. 1) can be modeled with MeshedCOIL, including the coil ends. The field coil of the rotor is modeled with COIL. 
The MeshedCOIL of the armature coil is modeled over the full circumference: its reduced potential region in the finite element mesh includes the periodic boundaries, so the symmetry used for the finite element mesh cannot be applied to it. The COIL of the field coil needs to be modeled only within the finite element region, because its reduced potential region is confined to the surroundings of the COIL and does not touch the periodic boundary. Since COIL is used in both the rotor and the stator, the multi-potential method described in <a href=\"\/product\/EMSolution\/en\/case\/multi_potential\/\" target=\"_blank\" rel=\"noopener noreferrer\" style=\"display:inline\"><font color=\"Red\">&quot;About multi-potential method&quot;<\/font><\/a> is used. Because the use of a voltage source results in a long time constant, this model requires the TP-EEC method, and the parallel functions of both MeshedCOIL and the ICCG method can be applied.  <\/p>\n<div class=\"img col1\">\n<div>\n        <a href=\"\/product\/EMSolution\/en\/wp-content\/uploads\/openmp06.png\" class=\"modal\"><br \/>\n        <img decoding=\"async\" src=\"\/product\/EMSolution\/en\/wp-content\/uploads\/openmp06.png\" alt=\"\" \/><\/a><br \/>\n<\/p>\n<p style=\"text-align:center\">Fig.5 Generator model<\/p>\n<\/p><\/div>\n<\/div>\n<p>Although the scope of the OpenMP parallel computing capability is still limited, we believe these examples demonstrate that it is useful. As ever larger analyses become essential, we hope you will try it out. We also accept evaluations before installation, so please feel free to contact us from <a href=\"\/product\/EMSolution\/en\/contact\/\" target=\"_blank\" rel=\"noopener noreferrer\" style=\"display:inline\"><font color=\"Red\">&quot;here&quot;<\/font><\/a>.  
<\/p>\n<p><!--more--><\/p>\n<h3>How to use<\/h3>\n<h4>Parallel computing capability with OpenMP<\/h4>\n<p>The option PARALLEL_NO (number of parallel computations) has been added to the Handbook section &quot;4 Order of the Shape Function and Added Features&quot;. This single option determines whether parallel calculation is performed.   <\/p>\n<p>Parallel computation is performed if the specified number of parallel computations is less than or equal to the number of cores on the node performing the computation. This depends on your license: the Parallel module is required. PARALLEL_OPTION applies to the parallel computation of the ICCG method when NON_LINEAR=1 (nonlinear magnetization characteristics) is selected. Select =0 to prioritize speed at the cost of larger memory usage, or =1 to prioritize memory at the cost of speed. If you do not have enough memory, =1 is recommended.<br \/>\nBecause this function uses OpenMP, the parallel calculation is confined to a single node.  
<\/p>\n<p class=\"slideText\"><span>* NODE_ORDER * EDGE_ORDER *  METRIC_MOD * QUAD_TRI * CALC_IND *  THIN_ELEM <font color=\"Red\">* PARALLEL_NO * PARALLEL_OPTION<\/font><\/span><br \/>\n<span>               1                            1                         0                       0                    0                   0                       <font color=\"Red\">2                                0<\/font><\/span><br \/>\n<span>***** Number of parallel threads: 2 ****<font color=\"Red\">   \u2190<\/font><\/span><br \/>\n<span>***         priority : speed\u3000\u3000\u3000\u3000\u3000\u3000\u3000\u3000\u3000<font color=\"Red\">\u2190Confirmation comments such as the number of parallels are output to the check file.<\/font><\/span>\n<\/p>\n<h3>Download<\/h3>\n<h4>ICCG method parallel computation data<\/h4>\n<h5>D1 Model<\/h5>\n<p>Static magnetic field analysis for initial value calculation : \u3000\u3000<button type=\"button\" class=\"btn btn-danger btn-lg\"><a href=\"https:\/\/www.ssil.co.jp\/product\/EMSolution\/en\/wp-content\/uploads\/OpenMP01.zip\">Sample data DL<\/a><\/button><br \/>\n\u30fb input3D_static20deg.ems : Input file<br \/>\n\u30fb pre_geom2D.neu : Stator mesh data<br \/>\n\u30fb rotor_mesh2D.neu : Rotor mesh data<br \/>\n\u30fb 2D_to_3D : Rotor mesh data    <\/p>\n<p>Transient Analysis : \u3000\u3000<button type=\"button\" class=\"btn btn-danger btn-lg\"><a href=\"https:\/\/www.ssil.co.jp\/product\/EMSolution\/en\/wp-content\/uploads\/OpenMP02.zip\">Sample Data DL<\/a><\/button><br \/>\n\u30fb input3D_transient20deg.ems : Input file<br \/>\n\u30fb pre_geom2D.neu : Stator mesh data<br \/>\n\u30fb rotor_mesh2D.neu : Rotor mesh data<br \/>\n\u30fb 2D_to_3D : Rotor mesh data    <\/p>\n<h4>B_INTEG Parallel calculation data<\/h4>\n<h5>Coil Model (COIL)<\/h5>\n<p>For 50 divisions : \u3000\u3000<button type=\"button\" class=\"btn btn-danger btn-lg\"><a href=\"https:\/\/www.ssil.co.jp\/product\/EMSolution\/en\/wp-content\/uploads\/OpenMP03.zip\">Sample 
data DL<\/a><\/button><br \/>\n\u30fb input.txt : Input file<br \/>\n\u30fb B_integ_mesh.NEU : Mesh data  <\/p>\n","protected":false},"excerpt":{"rendered":"<p>Summary EMSolution has been developed with the aim of achieving large scale and high speed analysis of electromagnetic field. In recent years, as multi-CPU and multi-core machines have become more common, we have added a parallel function, albeit partially, to EMSolution. Note that parallelization is based on OpenMP, which can be performed on a single [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","tags":[],"case_cat":[23],"class_list":["post-224","case","type-case","status-publish","hentry","case_cat-shusoku"],"acf":[],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.ssil.co.jp\/product\/EMSolution\/en\/wp-json\/wp\/v2\/case\/224"}],"collection":[{"href":"https:\/\/www.ssil.co.jp\/product\/EMSolution\/en\/wp-json\/wp\/v2\/case"}],"about":[{"href":"https:\/\/www.ssil.co.jp\/product\/EMSolution\/en\/wp-json\/wp\/v2\/types\/case"}],"version-history":[{"count":7,"href":"https:\/\/www.ssil.co.jp\/product\/EMSolution\/en\/wp-json\/wp\/v2\/case\/224\/revisions"}],"predecessor-version":[{"id":4897,"href":"https:\/\/www.ssil.co.jp\/product\/EMSolution\/en\/wp-json\/wp\/v2\/case\/224\/revisions\/4897"}],"wp:attachment":[{"href":"https:\/\/www.ssil.co.jp\/product\/EMSolution\/en\/wp-json\/wp\/v2\/media?parent=224"}],"wp:term":[{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.ssil.co.jp\/product\/EMSolution\/en\/wp-json\/wp\/v2\/tags?post=224"},{"taxonomy":"case_cat","embeddable":true,"href":"https:\/\/www.ssil.co.jp\/product\/EMSolution\/en\/wp-json\/wp\/v2\/case_cat?post=224"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}