NETCDF error
Moderators: Davide Sangalli, andrea.ferretti, myrta gruning, andrea marini, Daniele Varsano, Conor Hogan
-
- Posts: 149
- Joined: Tue Apr 08, 2014 6:05 am
Re: NETCDF error
Dear Daniele,
Thank you very much for all your kind help.
Enjoy your vacation. I will be in Rome in about 3 weeks. Hopefully it will be a bit cooler.
Best wishes
Martin
Martin Spenke, PhD Student
Theoretisch-Physikalisches Institut
Universität Hamburg, Germany
- Davide Sangalli
- Posts: 640
- Joined: Tue May 29, 2012 4:49 pm
- Location: Via Salaria Km 29.3, CP 10, 00016, Monterotondo Stazione, Italy
Re: NETCDF error
Dear Martin,
Version 3.4.2 is still supported.
As usual, the best way to solve the problem is to reproduce it.
So if you can attach the input files for pwscf and yambo (both for 4.0.3 and 3.4.2), I'll run it.
It would also be interesting if you could find a run where 3.4.2 is faster than 4.0.3.
I'll check that as well.
Best,
D.
Davide Sangalli, PhD
CNR-ISM, Division of Ultrafast Processes in Materials (FLASHit) and MaX Centre
https://sites.google.com/view/davidesangalli
http://www.max-centre.eu/
-
- Posts: 149
- Joined: Tue Apr 08, 2014 6:05 am
Re: NETCDF error
Dear Davide,
The error:
Code:
P001: [ERROR] STOP signal received while in :[06.01] G0W0 on the real axis
P001: [ERROR][NetCDF] NetCDF: One or more variable sizes violate format constraints
is caused by the large ndb.W file (> 2.1 GB) and is related to the netcdf limits on variable sizes set in netcdf.inc. I set those limits to higher values and recompiled, but the problem still persists.
This is a heavy calculation, so you would probably not want to run it.
Now I am looking for a keyword in yambo_3.4.2 that avoids writing the W file, as in yambo_4.0.x.
If that is not possible, I would like to disable writing of the W file in the source code of yambo_3.4.2. I commented out a line related to W output in the iox.F file, but it did not work for me.
The other option is to find out where the bug related to SOC comes from, so that I can continue the calculations with yambo_3.4.1, which is not coupled to the netcdf library.
Actually, it was because of SOC that I moved to yambo_3.4.2 and 4.0.x.
Quote: "Also it's interesting if you find out a run where 3.4.2 is faster than 4.0.3"
In my system, in a (heavy) GW calculation, if I parallelize only over conduction bands and qp, then 4.0.3 is as fast as 3.4.2 or 3.4.1, but never faster. The other parallelization strategies are not worth considering.
Regarding BSE: yambo_4.0.3 with parallelization over empty bands and eh is really very slow relative to yambo_3.4.2.
Best wishes
Martin
Martin Spenke, PhD Student
Theoretisch-Physikalisches Institut
Universität Hamburg, Germany
- Davide Sangalli
- Posts: 640
- Joined: Tue May 29, 2012 4:49 pm
- Location: Via Salaria Km 29.3, CP 10, 00016, Monterotondo Stazione, Italy
Re: NETCDF error
OK, that is clear.
3.4 is supported in the sense that we fix bugs when they are found. New features, such as the ability to switch off the I/O, are not released there.
Moreover, we support only the latest patch level (currently 3.4.2); 3.4.1 is no longer supported.
This NETCDF limit is not strictly a bug, but if it prevents you from performing a run we would be glad to solve it.
If you just want to avoid writing ndb.W, you can open the file io_QP_and_GF.F and put
if(trim(what)=="W") return
at the very beginning of the file. I have never tried it, so I cannot say whether it will work in practice.
As for performance with parallelization: in general 4.0 is designed to scale better on a large number of CPUs (n>100), not to be faster on fewer CPUs. So if you find a strategy that gives the same speed as 3.4, that is what is expected.
However, we also improved some aspects of the code in serial, as well as the scaling of the memory.
For BSE, the suggested parallelization is over the kpt and then over the BSE blocks ("t" in input). If you would like to run on more CPUs, the suggestion is to parallelize on "t", i.e. on the number of BSE blocks. For 73 kpt there will be 73*(73+1)/2 blocks, if I'm not wrong. The "eh" parallelization works fine only when you have no kpts, i.e. no symmetry operations; ignore it otherwise.
With the BSE in 4.0 we still have problems with the I/O, so the suggestion is to switch off the I/O of the kernel ("BS" in input). With these indications 4.0 should be at least as fast as 3.4.
Finally, we suggest not compiling with OpenMP, since in BSE the hybrid MPI+OpenMP parallelization is not yet finalized.
D.
Davide Sangalli, PhD
CNR-ISM, Division of Ultrafast Processes in Materials (FLASHit) and MaX Centre
https://sites.google.com/view/davidesangalli
http://www.max-centre.eu/
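Editor's note: the block count quoted in the post above (73*(73+1)/2 for 73 k-points) can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming one block per pair (k, k') with k <= k'. The function names and the even split over tasks are hypothetical illustrations and do not describe how yambo actually distributes the blocks.

```python
def bse_block_count(n_kpt):
    """Number of k-point pairs (k, k') with k <= k', i.e. n*(n+1)/2."""
    return n_kpt * (n_kpt + 1) // 2


def blocks_per_task(n_blocks, n_tasks):
    """Spread n_blocks as evenly as possible over n_tasks (illustration only)."""
    base, extra = divmod(n_blocks, n_tasks)
    # The first `extra` tasks receive one additional block.
    return [base + 1 if rank < extra else base for rank in range(n_tasks)]


n_blocks = bse_block_count(73)
print(n_blocks)  # 2701

# With a hypothetical 128 MPI tasks on "t", each task gets 21 or 22 blocks.
load = blocks_per_task(n_blocks, 128)
print(min(load), max(load))  # 21 22
```

With a few thousand blocks available, the "t" role leaves room for many more tasks than the 73 k-points alone would.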
-
- Posts: 149
- Joined: Tue Apr 08, 2014 6:05 am
Re: NETCDF error
Dear Davide,
No, unfortunately
Code:
if(trim(what)=="W") return
did not help.
It does not write the W file, but the G0W0 calculation hangs forever.
By the way, I was unable to find a _GF.F file!
Quote: "However we improved some aspects of the code in serial and also a better scaling of the memory."
Indeed, memory consumption was drastically reduced relative to the 3.4.x versions. A very positive point.
Many thanks for the suggestions regarding BSE. I will check them.
I am using the pure MPI version of yambo_4.0.x.
Best wishes
Martin
Martin Spenke, PhD Student
Theoretisch-Physikalisches Institut
Universität Hamburg, Germany
- Davide Sangalli
- Posts: 640
- Joined: Tue May 29, 2012 4:49 pm
- Location: Via Salaria Km 29.3, CP 10, 00016, Monterotondo Stazione, Italy
Re: NETCDF error
Ok. I'll try to see if there is another way to avoid the IO of the ndb.W
D.
Davide Sangalli, PhD
CNR-ISM, Division of Ultrafast Processes in Materials (FLASHit) and MaX Centre
https://sites.google.com/view/davidesangalli
http://www.max-centre.eu/
-
- Posts: 149
- Joined: Tue Apr 08, 2014 6:05 am
Re: NETCDF error
Dear Davide,
I forgot to say that you can reproduce the netcdf error:
Code:
P001: [ERROR] STOP signal received while in :[06.01] G0W0 on the real axis
P001: [ERROR][NetCDF] NetCDF: One or more variable sizes violate format constraints
with any system that generates an ndb.W file larger than 2.1 GB, provided you are using the standard netcdf values in the netcdf.inc file.
Even with the netcdf large-file support option you will get the above error once the ndb.W file grows beyond a certain size.
I remember that with yambo_3.4.1 I generated an ndb.W of about 80 GB (to converge with respect to frequencies) without any problem, since it was not mandatory to link yambo against netcdf at that time.
Many thanks.
Best
Martin
Martin Spenke, PhD Student
Theoretisch-Physikalisches Institut
Universität Hamburg, Germany
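Editor's note: the two thresholds described in the post above match the per-variable size limits of the older netCDF file formats: roughly 2 GiB per fixed-size variable in the classic format and roughly 4 GiB in the 64-bit offset ("large file") format. The sketch below estimates the size of a single variable and reports which formats could hold it. It is a simplified illustration: the array shape is purely hypothetical, not the real layout of ndb.W, and the special cases netCDF allows for the last variable in a file are ignored.

```python
# Per-variable size limits in bytes (simplified; the netCDF exceptions for
# the last fixed-size variable and for record variables are ignored).
CLASSIC_MAX = 2**31 - 4    # classic format
OFFSET64_MAX = 2**32 - 4   # 64-bit offset ("large file") format


def variable_bytes(shape, bytes_per_element):
    """Size in bytes of one dense array with the given shape."""
    n_elements = 1
    for extent in shape:
        n_elements *= extent
    return n_elements * bytes_per_element


def formats_that_fit(nbytes):
    """List the formats whose per-variable limit admits nbytes."""
    fits = []
    if nbytes <= CLASSIC_MAX:
        fits.append("classic")
    if nbytes <= OFFSET64_MAX:
        fits.append("64-bit offset")
    # netCDF-4/HDF5 and CDF-5 do not have these per-variable limits.
    fits.append("netCDF-4/HDF5 or CDF-5")
    return fits


# Hypothetical shape: (re/im, rows, columns, frequencies), 4-byte reals.
size = variable_bytes((2, 3000, 3000, 40), 4)
print(size)                    # 2880000000
print(formats_that_fit(size))  # ['64-bit offset', 'netCDF-4/HDF5 or CDF-5']
```

A variable of about 80 GB, like the ndb.W mentioned in the post, would exceed both older limits in this estimate.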
-
- Posts: 149
- Joined: Tue Apr 08, 2014 6:05 am
Re: NETCDF error
Dear Davide,
I just wanted to ask again whether you have found a way to avoid writing the ndb.W and BS files in yambo_3.4.2.
This is really a big obstacle for calculating large systems.
Best wishes
Martin
Martin Spenke, PhD Student
Theoretisch-Physikalisches Institut
Universität Hamburg, Germany
- Davide Sangalli
- Posts: 640
- Joined: Tue May 29, 2012 4:49 pm
- Location: Via Salaria Km 29.3, CP 10, 00016, Monterotondo Stazione, Italy
Re: NETCDF error
Dear Martin,
Sorry, I have not managed to look into this yet.
It's on my to-do list.
Did you try version 4.0 instead, or was it not possible to do the same calculations with it?
We should release version 4.1 soon.
Indeed, a pre-release is already available through a hidden link on the yambo website:
http://www.yambo-code.org/testing-robot ... 109.tar.gz
Could you give that a try in case you have problems with 4.0?
Sorry again and thank you for using yambo.
Best,
D.
Davide Sangalli, PhD
CNR-ISM, Division of Ultrafast Processes in Materials (FLASHit) and MaX Centre
https://sites.google.com/view/davidesangalli
http://www.max-centre.eu/
-
- Posts: 149
- Joined: Tue Apr 08, 2014 6:05 am
Re: NETCDF error
Dear Davide,
It's my pleasure to use Yambo.
I can run the same GW+BSE calculations with version 4.0.x too; however,
diagonalization of the BS matrix (4000x4000, which is not at all large) becomes
very slow in comparison to yambo_3.4.1.
Is there any remedy for negative eigenvalues in the BS matrix, other than switching back to the TD approximation?
Is the truncation of empty bands in Yambo still experimental?
I will certainly check the 4.1 version.
Many thanks and best wishes
Martin
Martin Spenke, PhD Student
Theoretisch-Physikalisches Institut
Universität Hamburg, Germany