intel mkl crashes after reading wfcs (yambo 4.4.1 / qe 6.4.1)

Posted: Thu Jan 09, 2020 10:43 am
by wachr
Dear all,

while calculating a 2D heterostack (converging the k-points for BSE), I ran into instabilities with the Intel MKL (2019) linked to QE and yambo.
In DFT (QE), when increasing the k-point density from 12x12x1 to 18x18x1 (hexagonal unit cell), I had to change the parallelization strategy to avoid segfaults (running QE with mpirun pw.x -nk 2, i.e. two pools of processors for k-point parallelization).
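
The pool setup mentioned above can be sketched as a small shell helper. It only assembles and prints the command line. The helper name pw_pool_cmd, the process count 8 and the input file name scf.in are assumptions for illustration; -nk is the pw.x option quoted in the post. As far as I know, QE expects the number of MPI processes to be divisible by the number of pools, so the helper checks that first.

```shell
# Hypothetical helper: print the pw.x command line for a given pool count.
# Usage: pw_pool_cmd <mpi_processes> <pools> <input_file>
pw_pool_cmd() {
    nproc=$1
    npool=$2
    input=$3
    # Each pool gets nproc/npool processes, so the division must be exact.
    if [ $((nproc % npool)) -ne 0 ]; then
        echo "error: $nproc processes cannot be split into $npool pools" >&2
        return 1
    fi
    echo "mpirun -np $nproc pw.x -nk $npool -i $input"
}

# Example values (assumed): 8 MPI processes split into 2 pools.
pw_pool_cmd 8 2 scf.in
```

The printed line can be pasted into a job script once the process count matches the allocation.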

Independently of the k-point density (also for 12x12x1), yambo segfaulted while reading the last part of the wavefunctions on the first processor:

Code:

*** Process received signal ***
Signal: Segmentation fault (11)
Signal code: Address not mapped (1)
Failing at address: 0x13787
[ 0] /usr/lib64/libpthread.so.0(+0xf5e0)[0x7f1d77d065e0]
[ 1] /beegfs-home/modules/intelmkl/compilers_and_libraries_2019/linux/mkl/lib/intel64_lin/libmkl_intel_lp64.so(cdotc+0xba)[0x7f1d7f1eaeda]

I therefore tried an older version of yambo built without the MKL (4.2.1) to process the database. This produced a new error: "./SAVE//ndb.gops; Variable GROT; NetCDF: Start+count exceeds dimension bound". To resolve it, I removed ndb.gops and reran the yambo initialization step, again with the older version, and it runs. This is somewhat unsatisfactory, though, because the numerical routines of the system are loaded instead, which strongly decreases the performance.

So my question is: what can I do to get yambo running efficiently with the MKL? Based on the QE experience, I also changed the parallelization strategy in yambo, without success. Does anyone have an idea? Could this be a compilation issue?

Thank you very much!
Christian

P.S. Some I/O from the 6.4.1 run is attached.
edit: typos.

Re: intel mkl crashes after reading wfcs (yambo 4.4.1 / qe 6.4.1)

Posted: Thu Jan 09, 2020 10:59 am
by Daniele Varsano
Dear Christian,
we will have a look at that. Can you also post your config.log file?
What seems strange to me is that the crash happens while reading the wfs, so it does not seem related to the linear algebra operations.
In the meantime, you can try to compile yambo-4.4 using the internal linear algebra:
./configure --enable-int-linalg
and see whether it runs and whether the performance is not severely compromised.
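
After such a rebuild, one way to check which linear-algebra shared libraries a binary still loads is to filter its ldd output. The sketch below is only an illustration: the helper name linalg_libs and the name patterns it looks for (MKL, BLAS, LAPACK, ScaLAPACK) are assumptions, not part of yambo.

```shell
# Hypothetical helper: read `ldd` output on stdin and list the
# linear-algebra shared libraries it mentions (one name per line).
linalg_libs() {
    # Keep lines naming MKL, *BLAS or *LAPACK shared objects; `|| true`
    # keeps the pipeline successful when nothing matches.
    { grep -i -E 'lib(mkl|[a-z]*blas|[a-z]*lapack)' || true; } |
        awk '{print $1}' |   # first field of an ldd line is the library name
        sort -u
}

# Usage (assumes the yambo executable is on PATH):
#   ldd "$(command -v yambo)" | linalg_libs
```

An empty result would suggest that no external BLAS/LAPACK shared library is loaded by the binary.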

Best,
Daniele

Re: intel mkl crashes after reading wfcs (yambo 4.4.1 / qe 6.4.1)

Posted: Thu Jan 09, 2020 11:12 am
by wachr
Dear Daniele,

thank you very much for the extremely quick answer! I will try your hint in order to resolve the issue.

I used the following configure call to produce the attached config.log file:

Code:

./configure --prefix=$(pwd) --enable-iotk --with-iotk-path=$(pwd)/../iotk --enable-mpi --enable-uspp --with-fft-path=/beegfs-home/modules/fftw3/3.3.8/ --with-lapack-libs="-lmkl_intel_lp64 -lmkl_sequential -lmkl_core" --with-blas-libs="-lmkl_intel_lp64 -lmkl_sequential -lmkl_core" --with-scalapack-libs=-lscalapack --with-blacs-libs=-lblacs

Best regards!
Christian

Re: intel mkl crashes after reading wfcs (yambo 4.4.1 / qe 6.4.1)

Posted: Thu Jan 09, 2020 12:21 pm
by wachr
I compiled yambo with the internal linear algebra and ran the job again: it works. So this appears to be a working setup so far.

If the performance of the internal BLAS/LAPACK is similar to that of the MKL, I will not put more energy into the MKL version. However, this error may hit somebody else, so if you can reproduce it, I would be happy to receive feedback :).

All the best and thank you for your effort!
Christian

Re: intel mkl crashes after reading wfcs (yambo 4.4.1 / qe 6.4.1)

Posted: Fri Jan 10, 2020 10:17 am
by andrea.ferretti
Dear Christian,

please note that the USPP implementation in yambo is still a beta version. If you are not using USPP pseudopotentials, I would remove
--enable-uspp from the configure line (it may trigger unwanted behaviour).

Moreover, the Intel19 compiler has been found to miscompile (or mis-optimize) a parallelism-related library of yambo, leading to random crashes.
This problem is compiler-related (e.g. it does not show up when using the gnu compiler), and was worked around in yambo-4.5 (just released).
If you have to recompile the code, I would check out this version in order to get rid of this possible compiler issue.

take care
Andrea

Re: intel mkl crashes after reading wfcs (yambo 4.4.1 / qe 6.4.1)

Posted: Tue Jan 14, 2020 11:49 am
by wachr
Dear Andrea,

thank you very much! In fact, I used the gnu7 compiler (information I forgot to mention, although it is visible in the configure output).

I will remove the --enable-uspp flag when compiling with the Intel MKL and report whether this was the cause. I may also try yambo 4.5 (is it fully compatible with the wavefunctions and databases from yambo 4.4?).

Best regards,
Christian

Re: intel mkl crashes after reading wfcs (yambo 4.4.1 / qe 6.4.1)

Posted: Tue Jan 14, 2020 11:56 am
by andrea.ferretti
Dear Christian,

thanks for reporting.
Yes, wave functions converted from 4.4 should be compatible with 4.5
(worst case scenario, delete the SAVE/ndb* files and re-do the initialisation).
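
The worst-case clean-up can be sketched as a short shell helper. The function name reset_yambo_dbs and the default directory name SAVE are assumptions for illustration; it deletes only files whose names start with ndb and leaves everything else in the directory (such as the converted wavefunctions) in place.

```shell
# Hypothetical helper: remove the ndb* databases from a SAVE directory,
# leaving all other files untouched.
# Usage: reset_yambo_dbs [SAVE_directory]
reset_yambo_dbs() {
    save_dir=${1:-SAVE}
    if [ ! -d "$save_dir" ]; then
        echo "error: $save_dir is not a directory" >&2
        return 1
    fi
    # rm -f does not complain if no ndb* file exists.
    rm -f "$save_dir"/ndb*
}
```

After calling reset_yambo_dbs SAVE, the initialisation step has to be rerun with the new yambo version so that the databases are regenerated.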

Andrea