Nodes, cores and yambo parallelization
Posted: Sun Jul 17, 2016 11:45 pm
by Flex
Hello,
I finally updated to yambo 4.0.2 and I am trying to use the new parallelization system. I'm working on 16 nodes * 16 cores (so 256 CPUs?) and I plan on using more later. The batch script is attached.
I read the tutorials carefully and set the parallelization variables (for a GW calculation) to
X_all_q_CPU= "4 8 8 1" # [PARALLEL] CPUs for each role
X_all_q_ROLEs= "q k c v" # [PARALLEL] CPUs roles (q,k,c,v)
X_all_q_nCPU_invert=0 # [PARALLEL] CPUs for matrix inversion
SE_CPU= "2 32 4" # [PARALLEL] CPUs for each role
SE_ROLEs= "q qp b" # [PARALLEL] CPUs roles (q,qp,b)
(see also input attached)
These are all powers of 2, and each product equals 256 (4*8*8*1 = 256 and 2*32*4 = 256).
I still get this error:
<06s> P0001: CPU structure provided for the Response_G_space ENVIRONMENT is incomplete. Switching to defaults
P0001: [ERROR] STOP signal received while in :[05] Dynamic Dielectric Matrix (PPA)
P0001: [ERROR]Impossible to define an appropriate parallel structure
(see also log attached, the other 16 are similar)
Did I miss something in my CPU count? Is Yambo configured the right way?
BTW, is my distribution of processors over the different tasks good enough?
Thanks in advance
Thierry Clette
Re: Nodes, cores and yambo parallelization
Posted: Mon Jul 18, 2016 8:34 am
by amolina
Dear Thierry,
I am not sure, but it seems the job was not submitted correctly: Yambo is not seeing the 256 processes. Maybe you need to add some options/flags to mpirun, such as:
mpirun -np 256 yambo -F GW.in
Best,
Alejandro.
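The arithmetic behind the two posts above can be checked with a small script. This is only a sketch in Python, not anything yambo itself runs: the helper names `cpu_layout_matches` and `is_power_of_two` are made up for illustration, and the assumption (taken from the thread) is that the product of the per-role CPU counts should equal the number of MPI processes actually launched.

```python
from math import prod


def cpu_layout_matches(cpu_string, n_procs):
    """Return True when the per-role CPU counts multiply to n_procs.

    cpu_string is the quoted value from the input file, e.g. "4 8 8 1".
    """
    counts = [int(token) for token in cpu_string.split()]
    return prod(counts) == n_procs


def is_power_of_two(n):
    """Return True for 1, 2, 4, 8, ... and False otherwise."""
    return n > 0 and (n & (n - 1)) == 0


# Values copied from the first post of this thread.
layouts = {"X_all_q_CPU": "4 8 8 1", "SE_CPU": "2 32 4"}

for name, cpus in layouts.items():
    # Compare against the intended 256 processes and against a single
    # process (what the code would see if mpirun did not start 256 tasks).
    print(name, cpu_layout_matches(cpus, 256), cpu_layout_matches(cpus, 1))
```

Both layouts are consistent with 256 processes and inconsistent with a single process, so the numbers in the input are fine on their own; the mismatch has to come from how many processes the job really started.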
Re: Nodes, cores and yambo parallelization
Posted: Mon Jul 18, 2016 2:30 pm
by Daniele Varsano
Dear Thierry,
Besides that:
Flex wrote: BTW, is my processor distribution of the different tasks good enough ?
I would avoid parallelizing over q points, so I would put 1 for the q role in both X_all_q_CPU and SE_CPU, as parallelizing over q can strongly unbalance your calculation.
Moreover:
%QPkrange # [GW] QP generalized Kpoint/Band indices
1|121| 1|40|
%
Are you sure you need to calculate corrections for 40 bands at all 121 k points? That is 4840 corrections, which is a huge number.
If Ale's suggestion does not work, please post again, including your report file.
Best,
Daniele
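The count of 4840 follows directly from the QPkrange line quoted above. A minimal Python sketch of that arithmetic, assuming the four fields are first k point, last k point, first band, last band (the function name `count_qp_corrections` is hypothetical, not part of yambo):

```python
def count_qp_corrections(qpkrange_line):
    """Count (k point, band) pairs selected by one QPkrange line.

    The line is expected to look like "1|121| 1|40|": first and last
    k-point index, then first and last band index, both inclusive.
    """
    fields = [int(f) for f in qpkrange_line.replace("|", " ").split()]
    k_first, k_last, b_first, b_last = fields
    n_kpoints = k_last - k_first + 1
    n_bands = b_last - b_first + 1
    return n_kpoints * n_bands


# Range from the input discussed in this thread: 121 k points, 40 bands.
print(count_qp_corrections("1|121| 1|40|"))  # 4840
```

Restricting the band range shrinks the count proportionally, for example 121 k points with 10 bands gives 1210 corrections.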
Re: Nodes, cores and yambo parallelization
Posted: Tue Jul 19, 2016 6:29 pm
by Flex
Here is another calculation, this time a BSE (the first that started), but with the same idea. This time I used 4*4 CPUs and added the flag "-np 16".
I still get the same error.
The in, r, l, and submit files are attached (r in the next post).
Also, thanks for the tips about roles and numbers of CPUs.
Re: Nodes, cores and yambo parallelization
Posted: Tue Jul 19, 2016 6:33 pm
by Flex
Report file, in 2 parts.
Re: Nodes, cores and yambo parallelization
Posted: Thu Jul 21, 2016 12:49 pm
by amolina
Dear Thierry,
In your input file I don't see the variables for the parallelization of the dielectric function. You can see at the end of the report file that yambo complains when it starts on the dielectric function.
If you are running 16 processes, I recommend you try this:
X_all_q_CPU= "1 16 1 1" # [PARALLEL] CPUs for each role
X_all_q_ROLEs= "q k c v" # [PARALLEL] CPUs roles (q,k,c,v)
or this:
X_all_q_CPU= "1 8 2 1" # [PARALLEL] CPUs for each role
X_all_q_ROLEs= "q k c v" # [PARALLEL] CPUs roles (q,k,c,v)
Cheers,
Alejandro.
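The two layouts suggested above are both ways of splitting 16 processes between the k and c roles while keeping q and v at 1. A rough Python sketch that lists all such splits for a given process count is below. It assumes the role order "q k c v" used in this thread, keeps q = 1 as Daniele advised and v = 1 as in the examples above, and the function name `k_c_layouts` is invented for illustration. It says nothing about which split runs fastest.

```python
def k_c_layouts(n_procs):
    """List "q k c v" CPU strings with q = v = 1 and k * c = n_procs."""
    layouts = []
    for k in range(n_procs, 0, -1):
        if n_procs % k == 0:
            c = n_procs // k
            layouts.append(f"1 {k} {c} 1")
    return layouts


for cpus in k_c_layouts(16):
    # Print in the same form as the input-file lines quoted above.
    print(f'X_all_q_CPU= "{cpus}"')
    print('X_all_q_ROLEs= "q k c v"')
```

For 16 processes this gives five candidates, including the two from the post above ("1 16 1 1" and "1 8 2 1").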
Re: Nodes, cores and yambo parallelization
Posted: Sat Jul 23, 2016 11:55 am
by Flex
Hello all,
So far, the suggestions from amolina seem to work, thanks a lot for that.
I still have a question about q points.
Daniele Varsano wrote:Dear Thierry,
beside that:
BTW, is my processor distribution of the different tasks good enough ?
I would avoid to parallelize on q points so I would put 1 both in X_all_q_CPU and in SE_CPU, as it can strongly unbalance your calculation.
Moreover:
%QPkrange # [GW] QP generalized Kpoint/Band indices
1|121| 1|40|
%
are you sure you need to calculate corrections for 40 bands and all k points: these are 4840 corrections which is a huge number.
In the case the suggestion of Ale does not work, please post it again, including your report file.
Best,
Daniele
So, regarding this suggestion, I reduced the number of bands, but I don't know how to reduce the q points. Since they come from the QE grid, I can't remove q points without damaging the structure. I think I would instead have to reduce the grid in the QE calculation. Doesn't this reduce the precision of the scf and nscf calculations?
Thanks again for the answers
Re: Nodes, cores and yambo parallelization
Posted: Sat Jul 23, 2016 3:06 pm
by Daniele Varsano
Dear Flex,
As you argued, the q points cannot be reduced, otherwise the BZ integration would be wrong, and as you say it would also reduce the precision of the ground-state calculations. I did not mean to reduce the q points at all.
I was talking about:
1) Avoiding parallelization over q points. This does not mean discarding any of them; it is just a tuning of the parallelization strategy.
2) Asking whether you are really interested in calculating the GW corrections for all those bands. Please note that QPkrange is not a convergence parameter; it indicates the bands and k points for which you want to calculate the QP corrections. Usually one is interested in the bands around the Fermi energy, or in more bands if they are needed for the BSE calculations, but the deepest bands are usually not of much interest. Of course this is not a rule, and maybe you are interested in them for some reason. Calculating corrections for that number of points could be quite time consuming.
Best,
Daniele