
Low Available Memory on Host (TrueScan-12)

Overview
An alert was triggered indicating that host TrueScan-12 has low available memory.

The alert fires when the system.mem.pct_usable metric averages less than 10% usable memory over the last 5 minutes.

This condition typically means that running processes are consuming most of the system RAM, leaving too little memory for additional workloads or spikes in demand.

Symptoms
·       Slower application performance due to paging or swapping
·       Increased service latency
·       Service crashes or unexpected restarts
·       Background job failures
·       System instability or degraded responsiveness
·       Out-of-memory (OOM) kill events in logs

Impact
If not resolved, low memory can affect user-facing applications, backend services, scheduled jobs, monitoring agents, and overall host stability.

Alert Details
Metric: system.mem.pct_usable
Threshold: Less than 10%
Evaluation Window: 5-minute average
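
The evaluation described above can be illustrated with a minimal sketch. It assumes the metric is reported as a fraction between 0 and 1 (so 10% corresponds to 0.10) and that the monitor compares the plain average of the samples in the window against the threshold. The function name and the sample values are hypothetical and are not taken from the actual monitor definition.

```python
from statistics import mean

# Assumption: system.mem.pct_usable is a 0-1 fraction, so 10% is 0.10.
THRESHOLD = 0.10


def is_low_memory(samples, threshold=THRESHOLD):
    """Return True when the window's average usable fraction is below threshold."""
    if not samples:
        return False  # no data in the window: this sketch does not alert
    return mean(samples) < threshold


# Hypothetical samples covering one 5-minute window.
window = [0.12, 0.09, 0.08, 0.07, 0.09]
print(is_low_memory(window))
```

Because the check uses an average, a single brief dip below 10% does not trigger the alert if the rest of the window stays well above the threshold.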

Initial Troubleshooting Steps

  1. Review Metrics in Datadog
    Navigate to Infrastructure → Host Map → Select Host → Metrics Tab.
    Review:
    - system.mem.pct_usable
    - system.mem.total
    - system.mem.used
    - system.swap.used
    - process.mem.rss
    - process.mem.real

  2. Check Live Processes
    Use Infrastructure → Live Processes and sort by memory usage (RSS) to identify high memory-consuming processes.

  3. SSH Into the Host
    Run the following commands:
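    ·       top -o %MEM (live process list sorted by memory share)
    ·       htop (interactive process viewer, if installed)
    ·       free -m (total, used, and available memory and swap, in MiB)
    ·       ps aux --sort=-%mem | head -10 (header plus the nine largest processes by memory share)
    ·       vmstat 1 5 (five one-second samples; sustained non-zero si/so values indicate active swapping)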
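
To cross-check the alert against what the host itself reports, the usable fraction can be approximated from /proc/meminfo. The sketch below assumes a Linux host and approximates "usable" as MemAvailable divided by MemTotal. Whether the Datadog Agent computes system.mem.pct_usable in exactly this way is an assumption, and the sample values are made up for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value in kB}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def usable_fraction(meminfo):
    """Approximate usable memory as MemAvailable / MemTotal (an assumption)."""
    return meminfo["MemAvailable"] / meminfo["MemTotal"]


# Illustrative sample, not real output from TrueScan-12.
SAMPLE = """\
MemTotal:       16384000 kB
MemFree:          512000 kB
MemAvailable:    1228800 kB
SwapTotal:       2097152 kB
SwapFree:        1048576 kB
"""

info = parse_meminfo(SAMPLE)
print(f"usable: {usable_fraction(info):.1%}")
```

On the host, replace SAMPLE with the contents of /proc/meminfo, for example open("/proc/meminfo").read(). A result below 10% is consistent with the alert condition.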

Common Causes & Resolutions
1.     Memory leak or runaway process: Restart or stop the problematic process.
2.     Memory-intensive workload: Scale vertically (increase RAM) or redistribute workloads.
3.     Concurrent heavy jobs: Stagger or reschedule batch tasks.
4.     Misconfigured JVM/application heap: Tune heap size, buffers, or memory limits.
5.     Insufficient instance size: Resize instance or migrate to higher memory class.
6.     Swap disabled or insufficient: Enable/configure swap (short-term mitigation only).
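
To help tell a memory leak (cause 1) apart from a workload that is simply large, it can be useful to look at how a process's RSS changes over time. The following is a rough heuristic sketch, not a definitive leak detector: the function name and both thresholds are hypothetical, and it assumes RSS samples in MB collected at regular intervals.

```python
def looks_like_leak(rss_mb, min_growth_mb=100.0, min_rising_share=0.8):
    """Heuristic: RSS grew by at least min_growth_mb overall and rose in
    at least min_rising_share of the intervals between samples."""
    if len(rss_mb) < 3:
        return False  # too few samples to judge a trend
    deltas = [b - a for a, b in zip(rss_mb, rss_mb[1:])]
    rising = sum(1 for d in deltas if d > 0)
    total_growth = rss_mb[-1] - rss_mb[0]
    return total_growth >= min_growth_mb and rising / len(deltas) >= min_rising_share


# Hypothetical samples: steady growth versus spiky usage that returns to baseline.
print(looks_like_leak([500, 560, 610, 700, 790]))
print(looks_like_leak([500, 900, 520, 880, 510]))
```

Steady growth across most intervals suggests a leak and points toward a restart plus a fix. Spiky usage that returns to baseline points more toward heavy or concurrent workloads (causes 2 and 3).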

Preventive Measures
·       Configure early warning alerts at 20–25% usable memory.
·       Implement autoscaling where applicable.
·       Enable process-level memory monitoring.
·       Conduct regular capacity planning reviews.
·       Monitor OOM events and logs.
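
As one way to monitor OOM events, kernel log output can be scanned for OOM kill messages. This sketch assumes the Linux kernel wording "Killed process <pid> (<name>)", which can vary between kernel versions. The sample log lines are illustrative and are not taken from TrueScan-12.

```python
import re

# Matches kernel lines such as:
#   "Out of memory: Killed process 4242 (java) total-vm:..."
OOM_KILL = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(log_lines):
    """Return a (pid, process_name) tuple for each OOM kill line found."""
    kills = []
    for line in log_lines:
        match = OOM_KILL.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills


# Illustrative kernel log lines.
sample_log = [
    "kernel: Out of memory: Killed process 4242 (java) total-vm:8123456kB, anon-rss:6291456kB",
    "kernel: usb 1-1: new high-speed USB device number 2",
    "kernel: Out of memory: Killed process 5151 (python3) total-vm:2097152kB, anon-rss:1048576kB",
]
print(find_oom_kills(sample_log))
```

On a host, the input lines could come from the output of dmesg or journalctl -k.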