Frequent 'memory overflow' when calculating quasi-particle energies with yambo in GW calculations
Posted: Tue Apr 21, 2020 8:20 am
Dear Experts,
I am starting to use yambo by following example 23 of the wannier90 tutorial, and the QP energy calculation step with the yambo code appears to be a memory eater. A pure carbon system I calculated previously kept getting terminated until I moved it to a node with 128 GB of memory. This issue has now reappeared with my present doped system, and this time even the 128 GB node does not work. My computational environment is 128 GB of memory per node with 24 cores. I have tried one node as well as two and three nodes in parallel, and the feedback from technical support is always a memory problem. I really have no idea how to solve this; the only hint I found in the log file is this line: <01m-19s> P0010: Self_Energy parallel ENVIRONMENT is incomplete. Switching to defaults.
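From what I gathered on the wiki, that warning seems to mean the self-energy parallel structure is not specified in the input file, so yambo falls back to its defaults. If I understand correctly, the relevant input block would look something like the sketch below (the variable names are from the yambo parallelization wiki page; the values are only a guess for a 24-core run, not tested settings):

```
# Hypothetical parallel-structure block for the yambo GW input
# (values are illustrative, not verified on my system)
X_all_q_CPU= "1 1 24 1"      # CPUs per role for the response function
X_all_q_ROLEs= "q k c v"     # roles: q-points, k-points, conduction, valence bands
SE_CPU= "1 1 24"             # CPUs per role for the self-energy
SE_ROLEs= "q qp b"           # roles: q-points, QP states, bands
```

If anyone can confirm whether distributing over bands ("b"/"c"/"v") rather than q-points is the right way to reduce the per-task memory, that would be very helpful.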
I have learned a little from here: http://www.yambo-code.org/wiki/index.ph ... n_parallel. Technical support suggests hybrid MPI+OpenMP parallelism. In our scheduler's convention, -N, -c, and -n stand for the number of nodes, the CPUs per task, and the number of tasks, respectively. The first problem is that no matter what number I give to -c, the thread count is always 1 (the report file r_em1d_ppa_HF_and_locXC_gw0 shows "* THREADS (max): 1"; for example, with -N 1 -c 24 -n 1, the thread count stays at 1 instead of reaching 24). The second problem is the memory overflow described in the first paragraph. The job does run when I use only one CPU on one node, but that is obviously very inefficient. I am still unfamiliar with this code and sincerely seek help. I have uploaded the directory of my jobs; it does not include the SAVE directory from the previous step because of its large size.
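In case it helps diagnose the threading issue, my submission script looks roughly like the sketch below (module names and the yambo invocation are placeholders for my actual setup). My understanding, which I would appreciate someone confirming, is that OMP_NUM_THREADS must be exported explicitly, since reserving cores with -c alone does not make yambo spawn threads, and that yambo itself must have been compiled with OpenMP support:

```shell
#!/bin/bash
#SBATCH -N 1          # one node
#SBATCH -n 4          # 4 MPI tasks
#SBATCH -c 6          # 6 CPUs per task -> 4 x 6 = 24 cores

# Export the OpenMP thread count from the scheduler's per-task
# CPU reservation; without this the report keeps showing
# "* THREADS (max): 1".
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}

# Placeholder launch line; -F names the input, -J the job directory.
srun yambo -F yambo.in -J GW_run
```

With fewer MPI tasks per node and more threads per task, I would expect the per-task memory footprint to shrink, which is what the MPI+OpenMP suggestion from support seems to be aiming at.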