BSE diagonalization solver error

Posted: Wed Sep 18, 2019 1:05 pm
by will
Hi all,
I used Yambo-4.4.0 to run BSE calculations for a bismuth-based perovskite, with spin-orbit coupling (SOC) included. When I used the Diago solver, I got the following error:
______________________________
[05.01] Diago solver
====================


[ERROR] STOP signal received while in :[05.01] Diago solver

[ERROR]Allocation attempt of BS_mat of negative size.
_______________________________________________

Input and report files are attached.
Thanks!
Best
Xiaowei

Re: BSE diagonalization solver error

Posted: Wed Sep 18, 2019 1:16 pm
by Daniele Varsano
Dear Xiaowei,
most probably you are running into an out-of-memory issue.
Given the size of your BSE matrix:

 |Dimension               :  49856
which needs roughly 37 GB.
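
The 37 GB figure can be reproduced with a short calculation. This is only a minimal sketch: it assumes the matrix is stored densely with 16 bytes per element (double-precision complex), and the function name `bse_matrix_gib` is made up for illustration, not part of Yambo.

```python
def bse_matrix_gib(dimension: int, bytes_per_element: int = 16) -> float:
    """Memory of a dense dimension x dimension matrix, in GiB.

    Assumption: 16 bytes per element (double-precision complex).
    """
    return dimension * dimension * bytes_per_element / 1024**3


# Dimension taken from the report file quoted above.
size = bse_matrix_gib(49856)
print(f"Dense BSE matrix: {size:.1f} GiB")
```

This counts the matrix only; any workspace the solver needs comes on top of it.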

My suggestions are:
1) Use fewer CPUs per node (but first check that your node has enough memory to hold the matrix)
2) Compile the code using the ScaLAPACK libraries, if not already done, and assign more CPUs to the linear algebra in the input file:

BS_nCPU_LinAlg_DIAGO= 
I am not sure this will work.
3) Solve the BSE with a recursive method (Haydock, -y h option). This avoids allocating the entire matrix, but has the drawback that it does not calculate the BSE eigenstates needed for a real-space representation of the excitonic wavefunction.

Best,
Daniele

Re: BSE diagonalization solver error

Posted: Wed Sep 18, 2019 2:38 pm
by will
Dear Daniele,
Thanks for your reply. I'll try your suggestions and let you know the results. The Haydock method works and produces a reasonable spectrum, but I want to obtain the oscillator strengths.
Best,
Xiaowei

Re: BSE diagonalization solver error

Posted: Thu Sep 19, 2019 4:00 am
by will
Dear Daniele,
I tried your suggestions and used the biggest node, which has 4 TB of memory, but the error remains. I have attached the job script.

Thanks
Best,
Xiaowei

Re: BSE diagonalization solver error

Posted: Thu Sep 19, 2019 10:00 am
by Daniele Varsano
Dear Xiaowei,
"I used the biggest node with 4 TB memory,"
Here I am talking about RAM, not storage. Do you know how much RAM your nodes have?
As you already have the BSE matrix and do not need to recalculate it, I suggest you try a serial run for the diagonalization, so that you can use all the memory available on the node.
Best,
Daniele

Re: BSE diagonalization solver error

Posted: Fri Sep 20, 2019 1:59 am
by will
Dear Daniele,
I see! Here the "4 TB memory" does mean RAM. But this node has 144 cores, so the RAM per core will be less than 37 GB.
There is only one such node. Each of the other nodes has 32 cores with 8 GB of RAM per core. So I should assign more RAM per core to solve the problem.

Thanks
Best,
Xiaowei

Re: BSE diagonalization solver error

Posted: Fri Oct 04, 2019 1:10 am
by sdwang
Dear Daniele,
I got the same problem when running BSE. My node has 1 TB of memory and I used 32 cores (I reduced the number of cores, but it still does not work). The previous calculation worked; now I have only increased the number of k-points, keeping the other parameters the same as before, and it stops.
Attached are the input and log files.

Thanks!

Shudong

Re: BSE diagonalization solver error

Posted: Fri Oct 04, 2019 8:30 am
by Daniele Varsano
Hi Shudong,
I think it is a memory issue.
You have a matrix of large dimension (Nd ~ 49000). A rough estimate of the memory:
Nd x Nd x 16 / (1024^3) ~ 36 GB per core
1 TB / 32 cores ~ 31 GB per core

Try using fewer cores, so that each one has more than 36 GB of memory available.
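
The comparison above can be turned into a quick check of how many processes a node can host. This is a rough sketch under stated assumptions: every process holds its own full copy of the matrix, 1 TB is taken as 1024 GiB, and the helper `max_processes_per_node` is hypothetical, not a Yambo tool.

```python
import math


def max_processes_per_node(node_ram_gib: float, per_process_gib: float) -> int:
    """Upper bound on processes per node if each one needs per_process_gib."""
    return math.floor(node_ram_gib / per_process_gib)


# Assumed values: Nd ~ 49000, 16 bytes per element, 1 TB node = 1024 GiB.
matrix_gib = 49000**2 * 16 / 1024**3
node_ram_gib = 1024.0

share_with_32 = node_ram_gib / 32        # memory share per core with 32 cores
fits_with_32 = share_with_32 > matrix_gib
limit = max_processes_per_node(node_ram_gib, matrix_gib)

print(f"Matrix: {matrix_gib:.1f} GiB, share with 32 cores: {share_with_32:.1f} GiB")
print(f"Fits with 32 cores: {fits_with_32}; upper bound on processes: {limit}")
```

The bound is optimistic, since it ignores everything else the code allocates, so in practice the usable number of cores will be lower.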

Best,
Daniele

Re: BSE diagonalization solver error

Posted: Fri Oct 04, 2019 9:43 am
by sdwang
Dear Daniele,
I have reduced the number of cores to 24, which gives about 42 GB of memory per core, but it still stops with the same error.


Best

Shudong

Re: BSE diagonalization solver error

Posted: Fri Oct 04, 2019 9:47 am
by Daniele Varsano
Dear Shudong,
What I gave was just an estimate of the memory needed for the matrix; other memory may be allocated on top of it.
In order to check it you should recompile Yambo with the option:
--enable-memory-profile
As a test, also try removing the ScaLAPACK distribution. I am not sure whether there is memory duplication in this case, although this should not occur.
Best,
Daniele