Problem in parallelization
Moderators: Davide Sangalli, andrea.ferretti, myrta gruning, andrea marini, Daniele Varsano, Conor Hogan
-
- Posts: 88
- Joined: Sun Apr 11, 2021 3:02 pm
Problem in parallelization
Good morning,
I was trying to run Yambo in parallel, using in my input:
X_all_q_CPU= "1 1 8 4" # [PARALLEL] CPUs for each role
X_all_q_ROLEs= "q k c v" # [PARALLEL] CPUs roles (q,g,k,c,v)
SE_CPU= "1 1 32" # [PARALLEL] CPUs for each role
SE_ROLEs= "q qp b" # [PARALLEL] CPUs roles (q,qp,b)
and in my submitting script:
#!/bin/bash
# Submission script for Zenobe
...
#PBS -l select=32:mem=50000mb
...
mpirun -np 32 yambo -F gw_ppa_9_par.in -J gw_ppa_9
The calculation starts without errors, but I get these messages:
<12s> P1-node0767: Response_G_space_and_IO parallel ENVIRONMENT is incomplete. Switching to defaults
<12s> P1-node0767: [PARALLEL Response_G_space_and_IO for K(bz) on 2 CPU] Loaded/Total (Percentual):113/225(50%)
<12s> P1-node0767: [PARALLEL Response_G_space_and_IO for Q(ibz) on 2 CPU] Loaded/Total (Percentual):57/113(50%)
<12s> P1-node0767: [PARALLEL Response_G_space_and_IO for K-q(ibz) on 1 CPU] Loaded/Total (Percentual):113/113(100%)
<12s> P1-node0767: [LA] SERIAL linear algebra
<12s> P1-node0767: [PARALLEL Response_G_space_and_IO for K(ibz) on 1 CPU] Loaded/Total (Percentual):113/113(100%)
<12s> P1-node0767: [PARALLEL Response_G_space_and_IO for CON bands on 2 CPU] Loaded/Total (Percentual):107/214(50%)
<12s> P1-node0767: [PARALLEL Response_G_space_and_IO for VAL bands on 2 CPU] Loaded/Total (Percentual):18/36(50%)
<13s> P1-node0767: [PARALLEL distribution for RL vectors(X) on 2 CPU] Loaded/Total (Percentual):638675/2175625(29%)
<14s> P1-node0767: [DIP] Checking dipoles header
<20s> P1-node0767: [PARALLEL distribution for Wave-Function states] Loaded/Total(Percentual):14125/28250(50%)
Due to this, the computation is very slow.
I'm not an expert on parallelization strategies; I tried to follow the tutorial on parallelizing GW Yambo calculations and some tips on this forum, but I could not solve this. What could I do?
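One constraint worth checking with these variables is that the CPU counts listed for the roles must multiply to the total number of MPI tasks (here 1x1x8x4 = 32 for the response and 1x1x32 = 32 for the self-energy). A small standalone sketch of that check (illustrative only, not part of Yambo; the function name is made up):

```python
# Check that a Yambo-style parallel layout is consistent: the product of
# the CPUs assigned to each role must equal the total number of MPI tasks.
def layout_is_consistent(cpu_string, roles_string, mpi_tasks):
    cpus = [int(n) for n in cpu_string.split()]
    roles = roles_string.split()
    if len(cpus) != len(roles):        # one CPU count per declared role
        return False
    product = 1
    for n in cpus:
        product *= n
    return product == mpi_tasks

# The layouts from the post, run with mpirun -np 32:
print(layout_is_consistent("1 1 8 4", "q k c v", 32))   # True
print(layout_is_consistent("1 1 32", "q qp b", 32))     # True
```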
Laura Caputo
Ph.D. Student
Université Catholique de Louvain
https://uclouvain.be/fr/repertoires/laura.caputo
- Daniele Varsano
- Posts: 3816
- Joined: Tue Mar 17, 2009 2:23 pm
- Contact:
Re: Problem in parallelization
Dear Laura,
it seems to me you are using the names of old variables.
In order to activate the parallel variables, use the -V par verbosity (or -V all) when building the input file.
The correct syntax here is:
Code: Select all
X_and_IO_CPU= "" # [PARALLEL] CPUs for each role
X_and_IO_ROLEs= "" # [PARALLEL] CPUs roles (q,g,k,c,v)
Please note also that your input file is a full-frequency calculation and not a plasmon-pole (ppa) one as indicated in the input file.
Full-frequency GW is much more expensive than a ppa calculation, and the number of frequencies (ETStpsXd) should also be converged.
The syntax to build a ppa input file is e.g.:
Code: Select all
> yambo -p p -g n -r -V par
or just type:
Code: Select all
> yambo -h
for the full list of options.
Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/
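For example, with 32 MPI tasks the new variables could be filled as follows; the particular split among the five (q,g,k,c,v) roles is only an illustrative guess, not a tuned recommendation:

```
X_and_IO_CPU= "1 1 1 8 4"    # [PARALLEL] CPUs for each role (1*1*1*8*4 = 32)
X_and_IO_ROLEs= "q g k c v"  # [PARALLEL] CPUs roles (q,g,k,c,v)
SE_CPU= "1 1 32"             # [PARALLEL] CPUs for each role (1*1*32 = 32)
SE_ROLEs= "q qp b"           # [PARALLEL] CPUs roles (q,qp,b)
```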
Re: Problem in parallelization
Daniele Varsano wrote: ↑Thu Jun 10, 2021 11:12 am
Dear Laura,
it seems to me you are using the names of old variables.
In order to activate the parallel variables, use the -V par verbosity (or -V all) when building the input file.
The correct syntax here is:
X_and_IO_CPU= "" # [PARALLEL] CPUs for each role
X_and_IO_ROLEs= "" # [PARALLEL] CPUs roles (q,g,k,c,v)
Ok, thanks. Now it's working properly!
Daniele Varsano wrote:
Please note also that your input file is a full-frequency calculation and not a plasmon-pole (ppa) one as indicated in the input file.
Full-frequency GW is much more expensive than a ppa calculation, and the number of frequencies (ETStpsXd) should also be converged.
Thanks, I didn't notice that; maybe I mistakenly deleted the ppa line.
Thanks again for your support.
Laura
Laura Caputo
Ph.D. Student
Université Catholique de Louvain
https://uclouvain.be/fr/repertoires/laura.caputo
Re: Problem in parallelization
Dear Daniele,
I tried the parallelization method as you suggested yesterday. I got results quickly and had no problems with the parallelization. However, the GW gap I obtain is quite different from the one in the calculation without parallelization. Is there anything wrong? Could it be due to the parallelization?
Laura Caputo
Ph.D. Student
Université Catholique de Louvain
https://uclouvain.be/fr/repertoires/laura.caputo
- Daniele Varsano
Re: Problem in parallelization
Dear Laura,
what kind of difference are we talking about?
Can you post the reports of the two calculations?
Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/
Re: Problem in parallelization
Dear Daniele,
I have a difference in the final gap. Unfortunately, since I had to remove some files, I no longer have the report files. I did find the outputs of two calculations, one starting from an Abinit ground state and the other from a QE one. I of course expected similar results, but obtained quite different ones. I don't know whether the problem is visible in the output or not.
Laura Caputo
Ph.D. Student
Université Catholique de Louvain
https://uclouvain.be/fr/repertoires/laura.caputo
- Daniele Varsano
Re: Problem in parallelization
Dear Laura,
from the files you sent I can see differences in the settings of the two calculations:
1) number of bands in the polarisation (250 vs 300)
2) blocks of the screening matrix (9 Ry vs 1 RL)
3) direction of the field (110 vs 100)
The difference in the results comes from 1 and 2 in particular.
Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/
Re: Problem in parallelization
Dear Daniele,
After your reply, I noticed that, regardless of the value of the screening-matrix blocks in my input, Yambo keeps reading 1 RL.
In fact, in the input I had selected:
NGsBlkXd= 9 Ry # [Xd] Response block size
while in the output I find:
# | NGsBlkXp= 1 RL # [Xp] Response block size
I tried this value with the other calculation and obtained the same results, so this seems to be the problem.
Has anyone ever experienced such a thing?
Last edited by Laura Caputo on Fri Jun 11, 2021 12:29 pm, edited 1 time in total.
Laura Caputo
Ph.D. Student
Université Catholique de Louvain
https://uclouvain.be/fr/repertoires/laura.caputo
Re: Problem in parallelization
Dear Daniele,
After your reply, I noticed that the mistake was in the response-block-size value. When I automatically generate the input file, the variable is named 'NGsBlkXp', but in the description of the variable on the website the name is 'NGsBlkXd'. Which name is correct?
Laura Caputo
Ph.D. Student
Université Catholique de Louvain
https://uclouvain.be/fr/repertoires/laura.caputo
- Daniele Varsano
Re: Problem in parallelization
They are different variables:
NGsBlkXp: blocks for the dielectric matrix in the plasmon-pole approximation
NGsBlkXd: blocks for the dynamical dielectric matrix needed for a full-frequency calculation.
Note that before you had an input for a full-frequency calculation (dynamical matrix, number of frequencies, etc.).
Can you point out in which tutorial the error is present in order to correct it?
Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/