NETCDF error
Moderators: Davide Sangalli, andrea.ferretti, myrta gruning, andrea marini, Daniele Varsano, Conor Hogan
- Vatsal A. Jhalani
- Posts: 17
- Joined: Mon Jan 22, 2018 8:23 pm
NETCDF error
Hi all,
I'd like to run an RPA calculation on a 50^3 k-grid in GaAs to check convergence, but I am running into an issue. To get the yambo setup to run, I had to add the DBsFRAGpm= "+QINDX" option. When I then run an RPA calculation, I get the following error:
P0001: [ERROR] File ./SAVE//ndb.kindx; Variable Sindx; NetCDF: Variable not found
This was the second time I tried; the first time, I got an error related to "Variable Sindx; NetCDF: One or more variable sizes violate format constraints".
Looking in the forum, I saw that this is related to NetCDF databases larger than 2 GB, and that the solution is to compile with large-file support or, alternatively, to add the -S option to yambo to fragment the databases. The -S option doesn't seem to be a supported feature anymore, right?
To clarify: I ran the yambo initialization successfully only after adding DBsFRAGpm= "+QINDX". Then, running RPA, I got the format-constraint error. I then tried adding "-S", which obviously didn't work, and when I ran RPA again to reproduce the error I got the variable-not-found error.
Thanks,
Vatsal
Vatsal A. Jhalani
Postdoctoral Scholar | Department of Applied Physics
California Institute of Technology
- Davide Sangalli
- Posts: 640
- Joined: Tue May 29, 2012 4:49 pm
- Location: Via Salaria Km 29.3, CP 10, 00016, Monterotondo Stazione, Italy
- Contact:
Re: NETCDF error
Ciao Vatsal,
you are really pushing yambo with such big k-point grids.
First, as you noticed, the -S option does not exist anymore; it was replaced by the option
Code: Select all
DBsFRAGpm= "+QINDX"
which is DB-specific and thus more flexible.
The QINDX database is needed for any run, so if you put this option in the input for the setup, you need to keep it in all subsequent input files.
Otherwise yambo checks ndb.kindx, finds that it is fragmented while the input file does not ask for fragmentation, and tries to recompute it.
Accordingly, this is my guess as to what is happening:
a) you generate ndb.kindx with fragments;
b) you run the RPA calculation without explicitly asking for fragmentation, so yambo tries to regenerate ndb.kindx and fails with the format-constraint violation;
c) running RPA again, yambo finds the corrupted ndb.kindx and gives the "Variable not found" message.
In summary, just erase all ndb.kindx files and run first the setup and then the subsequent RPA calculation with
Code: Select all
DBsFRAGpm= "+QINDX"
in the input in both cases.
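For concreteness, the whole sequence on disk could look something like the minimal sketch below; the input-file names (setup.in, rpa.in) and the -i run-level flag are only my illustration, so adapt them to your own scripts:
Code: Select all
# remove the fragmented / corrupted k-index database and all its fragments
rm -f ./SAVE/ndb.kindx*

# generate the initialization input and append the fragmentation option
yambo -i -F setup.in                    # may open the input in an editor; just save and quit
echo 'DBsFRAGpm= "+QINDX"' >> setup.in
yambo -F setup.in

# keep the same option in the (already prepared) RPA input before running it
echo 'DBsFRAGpm= "+QINDX"' >> rpa.in
yambo -F rpa.in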
An alternative is to recompile yambo using netcdf v4, adding to the configure the option
Code: Select all
--enable-netcdf-hdf5
It has fewer format constraints and should work even without fragmentation.
Yambo should still be able to read all the databases in the older format, but it will generate all new databases in HDF5 format.
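In practice this just means re-running configure with the extra switch (keeping whatever other options you already use) and rebuilding; roughly:
Code: Select all
./configure <your usual options> --enable-netcdf-hdf5
make yambo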
Both solutions should be fine for your case, from what I understand.
If you push yambo even further, you may need netcdf v4 and fragmentation together.
Best,
D.
Davide Sangalli, PhD
CNR-ISM, Division of Ultrafast Processes in Materials (FLASHit) and MaX Centre
https://sites.google.com/view/davidesangalli
http://www.max-centre.eu/
- Vatsal A. Jhalani
- Posts: 17
- Joined: Mon Jan 22, 2018 8:23 pm
Re: NETCDF error
Ciao,
Yeah, I can tell I am pushing yambo, but I know it can do it!!!!
Anyway, thanks for the suggestion. I did both the fragmentation and the recompilation with netcdf v4, and I was able to get past the error.
However... now I get another one. Yambo crashes at the dipoles, and some of the log files end with the error:
P0303: [ERROR] STOP signal received while in :[04] Dipoles
P0303: [ERROR]Allocation attempt of DIP_S of negative size.
I seem to get this type of negative-size allocation attempt for different variables quite often when pushing to big k-point grids. Note that I am using the covariant dipole approach.
Grazie!
Vatsal
Vatsal A. Jhalani
Postdoctoral Scholar | Department of Applied Physics
California Institute of Technology
- Haseeb Ahmad
- Posts: 169
- Joined: Sat Aug 17, 2019 2:48 pm
Re: NETCDF error
Dear developers,
I am facing a NetCDF error while doing BSE. Everything was going well as I increased BSEbands, but suddenly at 130-220 bands the following error started to appear: "P1: [ERROR] File ./BSEBnds_130-225//ndb.BS_Q1_CPU_0; Variable W_DbGd; NetCDF: Invalid dimension size"
Is it that the dimension of the matrix has become too large to handle without the proper NetCDF libraries? Files are attached.
I then tried to compile Yambo 4.5.1 (in another folder) with just the addition of
Code: Select all
--enable-netcdf-hdf5
in the previous ./configure command. It compiled but, interestingly, it compiled in the serial version even though ifort and mpi were detected successfully.
My configure command:
Code: Select all
./configure FC=ifort F77=ifort CC=icc PFC=mpiifort --with-blas-libs="-L$MKLROOT/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core" --with-lapack-libs="-L$MKLROOT/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core" --with-scalapack-libs="-L$MKLROOT/lib/intel64 -lmkl_scalapack_lp64" --with-blacs-libs="-L$MKLROOT/lib/intel64 -lmkl_blacs_intelmpi_lp64" --with-fft-libs="-mkl" --enable-time-profile --enable-memory-profile --enable-open-mp --enable-msgs-comps --enable-netcdf-hdf5
So, was the error really due to compiling the code without those NetCDF options? If yes, could you please suggest how it should be compiled with those libraries?
On the Yambo wiki pages some commands are given for specifying the library paths manually in configure. But is there a way to install them on the fly, just like Yambo does with the other needed packages?
Thanks,
Haseeb Ahmad
MS - Physics,
LUMS - Pakistan
- Daniele Varsano
- Posts: 4198
- Joined: Tue Mar 17, 2009 2:23 pm
- Contact:
Re: NETCDF error
Dear Haseeb,
in Yambo all the database I/O is handled by the netcdf libraries, which are downloaded and compiled when running the makefile.
What you are trying to do now is to add the hdf5 support. I do not know if this will make things work.
Anyway, the problem seems to be related to the double grid variable (are you using it?), and it is not easy to understand what is going wrong.
Note that your matrix is extremely large; are you sure you need such a large matrix to have converged results?
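Just as a rough, generic estimate of the scaling (not specific to your system): the BSE kernel is a matrix over the electron-hole transition space, so
$$ N_T \simeq N_k\, N_v\, N_c , \qquad \text{memory} \sim 8\, N_T^{2}\ \text{bytes (single-precision complex)} , $$
which grows very quickly with the number of bands and k-points.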
About the compilation, we would need to have a look at the config.log file.
Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/
- Davide Sangalli
- Posts: 640
- Joined: Tue May 29, 2012 4:49 pm
- Location: Via Salaria Km 29.3, CP 10, 00016, Monterotondo Stazione, Italy
- Contact:
Re: NETCDF error
Dear Haseeb,
let me specify that by using
Code: Select all
--enable-netcdf-hdf5
you switched from netcdf3 to netcdf4, which is based on HDF5.
It may be that this change solved the problem.
In general, for big BSE simulations I suggest using
Code: Select all
--enable-hdf5-par-io
If you want to dig deeper into the issue you were having, try to recompile the netcdf library.
You need to add
Code: Select all
--enable-logging
to the configure of the netcdf library, i.e. in
Code: Select all
lib/netcdf/Makefile.loc
lib/netcdff/Makefile.loc
add this flag to the AUXFLAGS variable:
Code: Select all
AUXFLAGS=--prefix=$(LIBPATH) \
--without-pic --enable-static --disable-shared \
--disable-dap --enable-logging $(netcdf_opt)
Moreover, you need to add the following lines in src/setup/setup.F
Code: Select all
#if defined _HDF5_IO
! For I/O debug, see below
!use IO_m, ONLY:netcdf_call,nf90_set_log_level
#endif
before the include memory.h, and the following lines
Code: Select all
#if defined _HDF5_IO
! This is very useful for I/O debug
! The NETCDF library needs to be compiled with the --enable-logging flag
call netcdf_call(nf90_set_log_level(3),1)
#endif
just at the beginning of the file (before the call section line).
Repeating the run, the NETCDF library will be much more verbose, possibly giving useful information on the source of the NETCDF error.
Best,
D.
Davide Sangalli, PhD
CNR-ISM, Division of Ultrafast Processes in Materials (FLASHit) and MaX Centre
https://sites.google.com/view/davidesangalli
http://www.max-centre.eu/
- Haseeb Ahmad
- Posts: 169
- Joined: Sat Aug 17, 2019 2:48 pm
Re: NETCDF error
Dear developers, thanks for your kind responses!
Since I have not installed the NetCDF or HDF5 libraries myself, I wanted to use the internal libraries during compilation.
The problem with the non-MPI build was the new Intel installation, whose MPI has to be sourced separately from the ifort compiler. So the new compilation is now parallel, as it should be, but using just --enable-netcdf-hdf5 (without --enable-hdf5-par-io) the original error is replaced by:
P1: [ERROR] File ./BSEBnds_130-225//ndb.BS_Q1_CPU_0; Variable BSE_RESONANT; NetCDF: HDF error
After using --enable-hdf5-par-io the error no longer appears, but the kernel takes a long time to even start. Is that okay? I think it is doing heavy disk I/O, since it has created a massive file (ndb.BS_PAR_Q1) of around 180 GB!
Dear Daniele, regarding "Note that your matrix is extremely large, are you sure you need such a large matrix to have your results converged?": I think I need to. I have attached some results of the BSEbands convergence testing. As you can see, the results change as I increase the bands; I would be happy if you could look at them and comment.
Thank you,
Haseeb Ahmad
MS - Physics,
LUMS - Pakistan
- Daniele Varsano
- Posts: 4198
- Joined: Tue Mar 17, 2009 2:23 pm
- Contact:
Re: NETCDF error
Dear Haseeb,
convergence really depends on which part of the spectrum you are interested in:
of course, the more bands you include, the more the high-energy part of the spectrum will change; the question is whether it makes sense to look above the continuum region. I do not know what system you are looking at, but usually one is interested in excitations below the quasi-particle gap (bound excitons), and the 130-200 range seems to me to provide converged results up to 6 eV.
Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/
- Haseeb Ahmad
- Posts: 169
- Joined: Sat Aug 17, 2019 2:48 pm
Re: NETCDF error
Dear Daniele,
thanks for looking into it.
Actually, I am interested in the visible portion (it is a photocatalytic system). But even at low energies, how can I trust the 130-200 result, since the real part of the dielectric function changes so much (even beyond 200 bands)? For the imaginary part it makes sense that more bands mean more transitions, so the imaginary part should increase, as it does. But how can I be certain about the real part, since I need to report the refractive index, which in turn is related to the static dielectric function?
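For reference, the standard relation I have in mind between the complex dielectric function \epsilon = \epsilon_1 + i\epsilon_2 and the refractive index is
$$ n(\omega) = \sqrt{\frac{\sqrt{\epsilon_1(\omega)^2 + \epsilon_2(\omega)^2} + \epsilon_1(\omega)}{2}} . $$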
Moreover, I could also stay within the QP band gap, but since I have experimental data up to 6.2 eV, I am calculating the results up to 6.5 eV for comparison with different theoretical models.
Thanking you,
Haseeb Ahmad
MS - Physics,
LUMS - Pakistan
- Daniele Varsano
- Posts: 4198
- Joined: Tue Mar 17, 2009 2:23 pm
- Contact:
Re: NETCDF error
Dear Haseeb,
I was referring to the excitation energies, i.e. your spectrum seems to be converged quantitatively in the low-energy part and qualitatively at higher energies.
The real and imaginary parts are related by the Kramers-Kronig relations, so a change in the high-energy part of the imaginary part also reflects in a change in the real part at all energies. In any case, the change you see with the higher number of bands is around 1% of the value of the function.
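For reference, the relation in question is the standard Kramers-Kronig formula (P denotes the principal value)
$$ \mathrm{Re}\,\epsilon(\omega) = 1 + \frac{2}{\pi}\, P\!\int_0^{\infty} \frac{\omega'\, \mathrm{Im}\,\epsilon(\omega')}{\omega'^{2} - \omega^{2}}\, d\omega' , $$
so the imaginary part at every frequency, including the high-energy tail, contributes to the real part at any given \omega.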
Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/