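Alibaba Cloud's Prometheus uses container_memory_working_set_bytes for alert configuration. container_memory_working_set_bytes is the amount of memory a container actually uses, and it is also the basis for deciding whether to restart the container when a resource limit is set. # Enter the pod's container and read the pod memory data: cat /sys/fs/cgroup/memory/memory.stat # Important notes: total_cache: the amount of cache memory currently used by the pod total_rss...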
Expression 1: sum (container_memory_working_set_bytes{container !="",container!="POD"}) by (container, pod) / sum(container_spec_memory_limit_bytes{container !="",container!="POD"}) by (container, pod) * 100 != +Inf Expression 2: round(sum by(name, id, job, node) (container_memory_rss{image!
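container_memory_working_set_bytes gives a better picture of memory usage, and the container OOM killer also relies on container_memory_working_set_bytes to decide whether to OOM-kill. The memory reported by container_memory_usage_bytes includes container_memory_cache, so it does not accurately reflect the container's real memory usage. Computing the k8s cluster's total mem...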
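container_memory_working_set_bytes is the amount of memory a container actually uses, and it is also the basis for deciding whether to restart the container when a resource limit is set.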
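The relationship between the memory.stat fields and the working set described above can be sketched in Python. This is a minimal sketch, not cAdvisor's code: it assumes cgroup v1 and assumes the working set is derived as total usage minus total_inactive_file, floored at zero. The function names and all sample numbers are hypothetical and chosen only for illustration.

```python
def parse_memory_stat(text):
    """Parse the content of a cgroup v1 memory.stat file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def working_set_bytes(usage_bytes, stats):
    """Assumed derivation: usage minus inactive file cache, never below zero."""
    inactive_file = stats.get("total_inactive_file", 0)
    return max(usage_bytes - inactive_file, 0)


# Made-up sample values (not taken from a real pod).
SAMPLE_STAT = """\
total_cache 209715200
total_rss 314572800
total_inactive_file 157286400
total_active_file 52428800
"""

stats = parse_memory_stat(SAMPLE_STAT)
sample_usage = 524288000  # hypothetical memory.usage_in_bytes (500 MiB)
print(working_set_bytes(sample_usage, stats))  # 500 MiB - 150 MiB of inactive file cache
```

Under these assumptions, part of the page cache (the active file pages) still counts toward the working set, which is why it usually sits between the RSS and the total usage figure.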
{PROM_OCP_ROUTE}" TOP10_MEM_BANNER="Top 10 memory-consuming pods, ${PROJECT_CPD_INST_OPERANDS} namespace: <pod>, <memory GB>" TOP10_MEM_QUERY="topk(10, max(container_memory_working_set_bytes{namespace=\"${PROJECT_CPD_INST_OPERANDS}\",container!=\"\",pod!=\"\"}) by (pod) )...
sum(container_memory_working_set_bytes{job="kubelet", metrics_path="/metrics/cadvisor", image!="", container!="POD"}) by (cluster, namespace, pod, container) /sum(kube_pod_container_resource_limits_memory_bytes) by (cluster, namespace, pod, container) > 0.75 ...
node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate pre-aggregated metric kube_pod_container_resource_limits_cpu_cores kube-state-metrics Memory Usage (working_set) sum(container_memory_working_set_bytes{cluster="$cluster", container!="", container!="POD"}) by (namespace)...
- alert: HostOutOfMemory expr: node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes * 100 < 10 for: 5m labels: severity: warning annotations: summary: Host out of memory (instance {{ $labels.instance }}) description: Node memory is filling up (< 10% left)\n VALUE = {{ $valu...
sum(namespace:kube_pod_container_resource_requests_memory_bytes:sum{}) / sum(kube_node_status_allocatable_memory_bytes) > (count(kube_node_status_allocatable_memory_bytes)-1) / count(kube_node_status_allocatable_memory_bytes) 5 Storage overload. KubeCPUQuotaOvercommit sum(kube_resourcequota{job="kube...
- alert: pod memory usage expr: (sum(container_memory_working_set_bytes{container!="POD",name!=""}) BY (instance, namespace, pod) / sum(container_spec_memory_limit_bytes > 0) BY (instance, namespace, pod) * 100) > 95 for: 2m labels: severity: warning annotations: summary: Container Memory usage(instance...
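The percent-of-limit expressions above all follow one pattern: divide the working set by the memory limit, multiply by 100, and drop containers that have no limit. A container without a limit reports a limit of 0, so in PromQL the division yields +Inf, which the `!= +Inf` and `> 0` filters exclude. A minimal Python sketch of that arithmetic follows; the function names, pod names, and byte values are hypothetical, and the 95 threshold mirrors the alert rule above.

```python
import math


def memory_percent_of_limit(working_set, limit):
    """Working set as a percentage of the limit; +Inf when no limit is set."""
    if limit == 0:
        return math.inf  # mirrors PromQL division by zero for a positive value
    return working_set / limit * 100


def over_threshold(samples, threshold=95.0):
    """Return {(pod, container): percent} for entries above the threshold.

    samples maps (pod, container) -> (working_set_bytes, limit_bytes).
    Entries without a limit are skipped, like the `!= +Inf` filter.
    """
    firing = {}
    for key, (working_set, limit) in samples.items():
        percent = memory_percent_of_limit(working_set, limit)
        if math.isinf(percent):
            continue
        if percent > threshold:
            firing[key] = percent
    return firing


# Made-up samples: one near its limit, one healthy, one with no limit.
samples = {
    ("web-0", "app"): (992, 1024),
    ("db-0", "db"): (512, 1024),
    ("job-1", "worker"): (300, 0),
}

firing = over_threshold(samples)
print(firing)
```

Only the first entry exceeds 95 percent here. The unlimited container is ignored rather than reported as infinitely over its limit, which is the behavior the filters in the expressions above are meant to give.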