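TERM_UNKNOWN: LSF cannot determine a termination reason; 0 is logged but TERM_UNKNOWN is not displayed (integer value 0)
TERM_ORPHAN_SYSTEM: The orphan job was automatically terminated by LSF (integer value 27)
TERM_WINDOW: Job killed after queue run window closed (integer value 2)
TERM_ZOMBIE: Job exited while LSF is not available (integer value 19)
Tip: The integer values logged to the JOB_FINISH event in the lsb.acct file and the termination reason keywords are mapped in the lsbatch.h file.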
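The keyword-to-integer mapping above can be sketched as a small lookup table. This is a minimal illustration covering only the four reasons listed in this excerpt; the names `TERM_REASONS` and `term_reason_keyword` are hypothetical and are not part of LSF. The complete mapping is the one in lsbatch.h.

```python
# Only the four termination reasons shown in this excerpt.
# The complete mapping is defined in lsbatch.h and is not reproduced here.
TERM_REASONS = {
    0: "TERM_UNKNOWN",
    2: "TERM_WINDOW",
    19: "TERM_ZOMBIE",
    27: "TERM_ORPHAN_SYSTEM",
}


def term_reason_keyword(value):
    """Return the keyword for an integer termination reason.

    Hypothetical helper: values outside this excerpt yield a placeholder
    string instead of a guessed keyword.
    """
    return TERM_REASONS.get(value, "(not listed in this excerpt: %d)" % value)


if __name__ == "__main__":
    for code in (0, 2, 19, 27, 5):
        print(code, term_reason_keyword(code))
```

A value that is not in the table (such as 5 in the loop above) is reported as unlisted, which avoids inventing keywords the excerpt does not give.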
% bjobs -sum
RUN   SSUSP   USUSP   UNKNOWN   PEND   FWD_PEND   PSUSP
123   456     789     5         5      3          3
To filter the -sum results to display job slot counts for user user1 only, run bjobs -sum -u user1:
% bjobs -sum -u user1
RUN   SSUSP   USUSP   UNKNOWN   PEND   FWD_PEND   PSUSP
20    10      10      0         5      0          2
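For scripting, the bjobs -sum output can be turned into a dictionary of slot counts. The sketch below assumes the output is one header line followed by one line of whitespace-separated integers, as in the sample above, and that the fused trailing columns in the sample split as FWD_PEND and PSUSP. The function name `parse_bjobs_sum` is hypothetical; it parses text only and does not call LSF.

```python
def parse_bjobs_sum(output):
    """Parse `bjobs -sum` text into {state_name: slot_count}.

    Assumes a header line followed by a single line of integer counts.
    """
    # Keep non-empty lines, dropping any echoed "% bjobs ..." prompt line.
    rows = [
        line.split()
        for line in output.splitlines()
        if line.strip() and not line.lstrip().startswith("%")
    ]
    if len(rows) < 2:
        raise ValueError("expected a header line and a value line")
    header, values = rows[0], rows[1]
    if len(header) != len(values):
        raise ValueError("header and value column counts differ")
    return {name: int(count) for name, count in zip(header, values)}


# Sample text mirroring the user1 example (column split is an assumption).
sample = """% bjobs -sum -u user1
RUN   SSUSP   USUSP   UNKNOWN   PEND   FWD_PEND   PSUSP
20    10      10      0         5      0          2
"""

counts = parse_bjobs_sum(sample)

if __name__ == "__main__":
    print(counts)
```

Raising on a column-count mismatch is a deliberate choice here: it surfaces a changed output layout instead of silently pairing counts with the wrong state names.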
TERM_UNKNOWN: LSF cannot determine a termination reason; 0 is logged but TERM_UNKNOWN is not displayed (integer value 0)
TERM_ORPHAN_SYSTEM: The orphan job was automatically terminated by LSF (integer value 27)
TERM_WINDOW: Job killed after queue run window closed (integer value 2)
TERM_ZOMBIE: Job exited while LSF is not available (integer value 19)
Tip: ...
UNKNOWN: mbatchd has lost contact with the sbatchd on the host where the job was running.
PEND: The job is pending, which may include PSUSP and chunk job WAIT. When -sum is used with -p in the LSF multicluster capability, WAIT jobs are not counted as PEND or FWD_PEND. When -sum...
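Note: For jobs in the UNKNOWN state, the job run time estimate is based on the internal count kept by mbatchd for the job.
Resource usage: For the LSF multicluster capability job forwarding model, this information is not displayed if LSF multicluster capability resource usage updating is disabled. Use LSF_HPC_EXTENSIONS="HOST_RUSAGE" in lsf.conf to specify host-based resource usage.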