Error running parallel calculations with Coulomb cutoff

Post by **ldx03** » Sun May 10, 2015 8:58 pm

Dear developers,

I have recently run a few test calculations on monolayer BN with yambo-4.0.0-rev78. An error comes up when I try to run a parallel calculation with the Coulomb cutoff:

Code:

  [04.02] RIM integrals
  =====================
  [ERROR] STOP signal received while in :[04.02] RIM integrals
  [ERROR]Incomplete Parallel Index Filling

I tried a few times with different parallel parameter settings, including the defaults (with the parallel parameters commented out in the input files), but the error always appears. If I turn off the Coulomb cutoff or the RIM settings, the calculation runs without problems. The problem does not occur when I run the calculation in yambo-3.4.1-rev61 with the same input files. I've attached my input files and log files here. Could you check whether something is wrong with my input files (for the new version)? Thanks a lot!

Regards,
Lede

Post by **Daniele Varsano** » Sun May 10, 2015 9:14 pm

Dear Lede,
thanks for reporting.

What happens if you set:

Code:

X_all_q_CPU= "16 1 1 1"              # [PARALLEL] CPUs for each role
X_all_q_ROLEs= "q k c v"

i.e. please fill in a number for every role present in the list.
The same is true for the self-energy part.
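A quick way to check that every role has a number is to compare the two strings before launching the job. This is only a minimal sketch in Python, not part of Yambo; the helper name `missing_entries` is hypothetical.

```python
def missing_entries(cpu_line: str, roles_line: str) -> int:
    """Return how many roles have no CPU number (0 means fully filled).

    Hypothetical helper, not a Yambo function: it only compares the
    whitespace-separated entries of the two input strings.
    """
    cpus = cpu_line.split()
    roles = roles_line.split()
    # Each CPU entry must be an integer; int() raises ValueError otherwise.
    for entry in cpus:
        int(entry)
    return len(roles) - len(cpus)


print(missing_entries("16 1 1 1", "q k c v"))  # fully filled
print(missing_entries("16", "q k c v"))        # three roles left without a number
```

A result of 0 means each role in `X_all_q_ROLEs` has a matching entry in `X_all_q_CPU`.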

If it does not work, I will try to reproduce the error and fix it.
In the meantime, I suggest you run the Coulomb cutoff once in serial (yambo -c) and then continue with all the other runlevels in parallel, always including the -c option; this time the cutoff will be read and not recalculated.

Also, unrelated to your problem: keep in mind that there is a factor of two in the definition of the box side.

Code:

% CUTBox
 0.00     | 0.00     | 23.89726     |        # [CUT] [au] Box sides
%

In your input you are cutting the interaction at ~12 au, while I think you want to cut at around ~24 au, so you should set the z cutoff just a little smaller than your box side.
This is something I still need to fix in the code.
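The factor of two can be written out as simple arithmetic. The sketch below assumes, as described above for this version, that the interaction is cut at half of the value given in CUTBox; the function names are hypothetical and not part of Yambo.

```python
def effective_cutoff(box_side_au: float) -> float:
    """Distance (au) at which the interaction is actually cut,
    assuming the cut happens at half of the CUTBox value."""
    return box_side_au / 2.0


def box_side_for_cutoff(target_cutoff_au: float) -> float:
    """CUTBox value (au) needed to cut at the requested distance."""
    return 2.0 * target_cutoff_au


posted_value = 23.89726  # z entry of CUTBox in the posted input
print(round(effective_cutoff(posted_value), 2))  # roughly 12 au
print(box_side_for_cutoff(24.0))                 # value to enter for a ~24 au cut
```

With the posted value the cut falls at about 12 au, so cutting near 24 au needs a CUTBox entry of about 48 au, chosen slightly smaller than the cell side.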

Best,
Daniele

Post by **ldx03** » Sun May 10, 2015 11:02 pm

Dear Daniele,

Thanks for the reply. The error still comes up with your recommended settings. I have attached my abinit input files here in case you want to reproduce the error.

In the meantime, I suggest you run the Coulomb cutoff once in serial (yambo -c) and then continue with all the other runlevels in parallel, always including the -c option; this time the cutoff will be read and not recalculated.

Thanks for the tip. I tried it and it works. But the band gap at the Gamma point given by 4.0.0-rev78 differs from the one given by 3.4.1-rev61 by about 1 eV with the same input. Which result is more reliable (or is there something wrong with the inputs I attached previously)?

Also, unrelated to your problem: keep in mind that there is a factor of two in the definition of the box side.

Thanks for the reminder. I already set the cutoff box size to half of my cell size. I guess I could decrease the cutoff size a little further too.

Best,
Lede

Post by **Daniele Varsano** » Mon May 11, 2015 6:46 am

Dear Lede,
The two releases should give the same results. Can you post the two report files?

I already set the cutoff box size to half of my cell size. I guess I could decrease the cutoff size a little further too.

In the input you posted, it was set to 1/4 of the cell size. If you want it to be half of the cell size, you need to set the z cutoff in the input to about the cell size.

Best,

Daniele

Post by **ldx03** » Mon May 11, 2015 2:23 pm

Dear Daniele,

In the input you posted, it was set to 1/4 of the cell size. If you want it to be half of the cell size, you need to set the z cutoff in the input to about the cell size.

Oh, I see. I remembered there was a factor of two, but I got it the wrong way round. Thanks!

The two releases should give the same results. Can you post the two report files?

Yes, I found it very strange too. Maybe some of the default values that I didn't set are different in the two versions?

Here are the output files calculated with the two versions of Yambo with almost the same input. Please take a look. Thanks!

Best,
Lede

Post by **Daniele Varsano** » Mon May 11, 2015 2:48 pm

Dear Lede,
at first glance it is not easy to spot the problem. I think we will need to reproduce the error.
The only thing I do not like in the input for the 4.0 release is this variable:

Code:

X_all_q_CPU= "16 1 1 1"              # [PARALLEL] CPUs for each role
X_all_q_ROLEs= "c v k q"

where you changed the order to "c v k q" instead of "q k c v".
Moreover, I suppose that in your job you are using exactly 16 CPUs.
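To see why the order can matter, here is a minimal Python sketch. It rests on two assumptions that are not confirmed in this thread: that the i-th number in `X_all_q_CPU` is paired with the i-th role in `X_all_q_ROLEs`, and that the product of the numbers should equal the number of MPI tasks of the job. The function names are hypothetical.

```python
from math import prod


def role_assignment(cpu_line: str, roles_line: str) -> dict:
    """Pair each role with its CPU count, position by position (assumed rule)."""
    cpus = [int(x) for x in cpu_line.split()]
    roles = roles_line.split()
    if len(cpus) != len(roles):
        raise ValueError("each role needs exactly one CPU number")
    return dict(zip(roles, cpus))


def uses_all_tasks(assignment: dict, n_tasks: int) -> bool:
    """True if the product of the CPU counts equals the MPI task count (assumed rule)."""
    return prod(assignment.values()) == n_tasks


print(role_assignment("16 1 1 1", "q k c v"))  # all 16 CPUs go to q
print(role_assignment("16 1 1 1", "c v k q"))  # all 16 CPUs go to c
print(uses_all_tasks(role_assignment("16 1 1 1", "c v k q"), 16))
```

Under these assumptions, swapping the role order while keeping "16 1 1 1" moves all 16 CPUs from q to c, so the two inputs describe different parallel layouts.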

I will have a deeper look.

Daniele

Post by **ldx03** » Mon May 11, 2015 3:23 pm

Dear Daniele,

Does the order matter? If I use the default parallel settings by commenting out "X_all_q_CPU, X_all_q_ROLEs, ...", the order in the output file is "c.v.k", so I thought the order did not matter. But I can try again with the (q,k,c,v) order.

Moreover, I suppose that in your job you are using exactly 16 CPUs.

Yes, I am.

Thanks,
Lede

Post by **Daniele Varsano** » Tue May 12, 2015 2:38 pm

Dear Lede,
I tried to reproduce your error but was unable to. Here are the results using your same input and a slightly modified abinit input (an nscf calculation for computing the unoccupied states, as it is faster and more precise). My results coincide with your 3.4.1 output (with slight differences due to the different abinit input). You can find here the abinit input and the yambo report (which contains the QP energies and a copy of the input at the end); moreover, the cutoff is calculated in parallel without problems.

Just now I realized you are using the OpenMPI library, which has some issues that can be fixed at the compilation stage by adding this line:

Code:

 --enable-openmpi

in your configure script.

Do not forget to run make clean_all before reconfiguring and recompiling.
I hope that this will fix the problem.
Best,
Daniele

Post by **ldx03** » Tue May 12, 2015 5:41 pm

Dear Daniele,

Thanks a lot! After recompiling the code following your instructions, I can now reproduce the results obtained with 3.4.1 too. Everything is fine now. Thanks again for your time and your help.

Cheers,
Lede

Post by **Daniele Varsano** » Tue May 12, 2015 5:47 pm

Dear Lede,
Great that it works! The OpenMPI issue should be fixed in some automatic way to avoid situations like this, but I'm afraid it is not straightforward.

Best,
Daniele

Yambo Community Forum
