Nodes, cores and yambo parallelization
Hello,
I finally updated to yambo 4.0.2 and I am trying to use the new parallelization system. I'm working on 16 nodes × 16 cores (so 256 CPUs?) and I plan on using more later. The batch script is attached.
So, I read carefully the tutorials and set the para variables (for a GW calculation) to
X_all_q_CPU= "4 8 8 1" # [PARALLEL] CPUs for each role
X_all_q_ROLEs= "q k c v" # [PARALLEL] CPUs roles (q,k,c,v)
X_all_q_nCPU_invert=0 # [PARALLEL] CPUs for matrix inversion
SE_CPU= "2 32 4" # [PARALLEL] CPUs for each role
SE_ROLEs= "q qp b" # [PARALLEL] CPUs roles (q,qp,b)
(see also input attached)
Those values should all be powers of 2, with the products equal to 256.
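Just as a sanity check, the products do come out right:

```shell
# Each role count multiplies together to the total number of MPI tasks (256).
echo $((4 * 8 * 8 * 1))   # X_all_q_CPU: q * k * c * v
echo $((2 * 32 * 4))      # SE_CPU:      q * qp * b
```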
I still get this error:
<06s> P0001: CPU structure provided for the Response_G_space ENVIRONMENT is incomplete. Switching to defaults
P0001: [ERROR] STOP signal received while in :[05] Dynamic Dielectric Matrix (PPA)
P0001: [ERROR]Impossible to define an appropriate parallel structure
(see also log attached, the other 16 are similar)
Did I miss something in my CPU count? Is Yambo configured the right way?
BTW, is my processor distribution over the different tasks good enough?
Thanks in advance
Thierry Clette
Thierry Clette
Student at Université Libre de Bruxelles, Belgium
- amolina
Re: Nodes, cores and yambo parallelization
Dear Thierry,
I am not sure, but it seems the submission is not done correctly: Yambo is not seeing the 256 processes. Maybe you need to add some options/flags to the mpirun command, e.g.:
mpirun -np 256 yambo -F GW.in
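For reference, in a batch script that might look something like this (the SLURM directives and input file name here are just an assumption about your setup):

```
#!/bin/bash
#SBATCH --nodes=16            # hypothetical scheduler directives:
#SBATCH --ntasks-per-node=16  # 16 nodes x 16 MPI tasks each = 256 tasks

# ask mpirun explicitly for 256 processes, matching the products of the CPU roles
mpirun -np 256 yambo -F GW.in
```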
Best,
Alejandro.
Alejandro Molina-Sánchez
Institute of Materials Science (ICMUV)
University of Valencia, Spain
- Daniele Varsano
Re: Nodes, cores and yambo parallelization
Dear Thierry,
besides that:

Thierry Clette wrote: BTW, is my processor distribution over the different tasks good enough?

I would avoid parallelizing over q points, so I would put 1 for the q role both in X_all_q_CPU and in SE_CPU, as it can strongly unbalance your calculation.
Moreover: are you sure you need to calculate corrections for 40 bands at all k points? These are 4840 corrections, which is a huge number:
Code: Select all
%QPkrange # [GW] QP generalized Kpoint/Band indices
1|121| 1|40|
%
In case the suggestion of Ale does not work, please post again, including your report file.
Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/
Re: Nodes, cores and yambo parallelization
Here is another calculation, this time a BSE run (the first that started), but with the same idea: this time 4 × 4 CPUs, and I added the flag "-np 16" to mpirun.
I still get the same error.
The in, r, l, and submit files are attached (r in next post)
Also, thanks for the tips about roles and numbers of CPUs.
Thierry Clette
Student at Université Libre de Bruxelles, Belgium
Re: Nodes, cores and yambo parallelization
Report file in 2 parts.
Thierry Clette
Student at Université Libre de Bruxelles, Belgium
- amolina
Re: Nodes, cores and yambo parallelization
Dear Thierry,
in your input file I don't see the variables for the parallelization of the dielectric function. You can see at the end of the report file that yambo complains when starting the dielectric function.
If you are running 16 processes, I recommend trying this:
X_all_q_CPU= "1 16 1 1" # [PARALLEL] CPUs for each role
X_all_q_ROLEs= "q k c v" # [PARALLEL] CPUs roles (q,k,c,v)
or this:
X_all_q_CPU= "1 8 2 1" # [PARALLEL] CPUs for each role
X_all_q_ROLEs= "q k c v" # [PARALLEL] CPUs roles (q,k,c,v)
Cheers,
Alejandro.
Alejandro Molina-Sánchez
Institute of Materials Science (ICMUV)
University of Valencia, Spain
Re: Nodes, cores and yambo parallelization
Hello all,
So far, the suggestions from amolina do seem to work, thanks a lot for that.
I still have a question about Q points. Regarding this suggestion:

Daniele Varsano wrote: are you sure you need to calculate corrections for 40 bands and all k points: these are 4840 corrections which is a huge number.

I reduced the band number, but I don't know how to reduce the Q points. Since they come from the QE grid, I can't remove Q points without damaging the structure. I think I would rather have to reduce the grid in the QE calculation instead. Doesn't this reduce the scf and nscf precision?
Thanks again for the answers,
Thierry Clette
Student at Université Libre de Bruxelles, Belgium
- Daniele Varsano
Re: Nodes, cores and yambo parallelization
Dear Flex,
The q points, as you argued, cannot be reduced, otherwise the BZ integration would be wrong, and as you say it would also reduce the precision of the ground state calculations. I did not mean to reduce the q points at all.
There I was talking about:
1) Avoiding parallelization over q points: this does not mean discarding any of them, it is just a tuning of the parallelization strategy.
2) Asking whether you are really interested in calculating the GW corrections for all those bands: please note that QPkrange is not a convergence parameter, but indicates the bands and k points for which you want to calculate the QP corrections. Usually one is interested in the bands around the Fermi energy, or in more bands if needed for a BSE calculation, but the deepest bands are usually not of much interest. Of course this is not a rule, and maybe you are interested in them for some reason. Calculating corrections for that number of points could be quite time consuming.
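For example, if only the bands around the gap were needed, the range could be restricted to something like this (the band window 20|30 is purely illustrative, not a recommendation for your system):
Code: Select all

```
%QPkrange # [GW] QP generalized Kpoint/Band indices
 1|121| 20|30|   # all 121 k points, but only bands 20-30
%
```

This alone would reduce the number of corrections from 4840 to 121 × 11 = 1331.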
Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/