How Can I Check Node CPU Utilization in OpenShift?

Monitoring the performance of your OpenShift cluster is crucial for maintaining a healthy and efficient environment, and one of the key metrics to keep an eye on is node CPU utilization. Understanding how much CPU your nodes are consuming can help you identify bottlenecks, optimize resource allocation, and ensure your applications run smoothly without unexpected slowdowns or failures. Whether you’re managing a small development cluster or a large-scale production environment, knowing how to check node CPU utilization is an essential skill for any OpenShift administrator.

In OpenShift, nodes serve as the backbone of your containerized workloads, hosting pods and running critical system processes. Since CPU resources are finite, keeping track of utilization helps prevent overloading and allows for proactive scaling decisions. By regularly monitoring CPU usage, you can gain insights into the overall health of your cluster, detect anomalies early, and make informed decisions about resource management.

This article will guide you through the fundamental concepts and methods for checking node CPU utilization in OpenShift. You’ll learn about the tools and commands commonly used for this purpose, as well as best practices to interpret the data effectively. Whether you are new to OpenShift or looking to deepen your operational expertise, this overview will set the stage for mastering resource monitoring in your cluster.

Using OpenShift CLI to Monitor Node CPU Utilization

OpenShift provides several command-line tools that allow administrators to monitor node CPU utilization efficiently. The primary one is `oc`, OpenShift’s CLI, which supports the standard `kubectl` commands plus OpenShift-specific ones. To check CPU usage on nodes, you can leverage the metrics exposed by the cluster.

To begin, confirm that the cluster monitoring stack is installed and running. The `oc adm top` commands read from the resource metrics API (`metrics.k8s.io`), which is served by Metrics Server or, on older OpenShift 4 releases, by the Prometheus Adapter.

You can run the following command to get a quick overview of CPU usage across all nodes:

```bash
oc adm top nodes
```

This command outputs a list of nodes with their current CPU and memory usage. The output typically looks like this:

NAME                 CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
node-1.example.com   500m         25%    2Gi             40%
node-2.example.com   300m         15%    1.5Gi           30%

  • CPU(cores): Current CPU consumption in millicores (1000m equals one core).
  • CPU%: Percentage of total CPU capacity utilized.
  • MEMORY(bytes): Memory usage.
  • MEMORY%: Percentage of total memory capacity utilized.
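
As a rough illustration of how this output can be used in a script, the sketch below filters the rows for nodes at or above a CPU percentage threshold. The function name `high_cpu_nodes` and the default threshold of 80 are assumptions made for this example, not part of `oc`; the sample rows mirror the example output above so the snippet runs without a cluster.

```bash
# Print nodes whose CPU% is at or above a threshold (default 80).
# Expects `oc adm top nodes`-style text on stdin.
# high_cpu_nodes is a hypothetical helper name, not an oc subcommand.
high_cpu_nodes() {
  awk -v limit="${1:-80}" '
    NR == 1 { next }                 # skip the header row
    {
      pct = $3                       # third column is CPU%, e.g. "25%"
      sub(/%$/, "", pct)             # drop the trailing percent sign
      if (pct + 0 >= limit + 0) print $1, $3
    }'
}

# Sample rows mirroring the example output; the header row is skipped.
printf '%s\n' \
  'NAME                 CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%' \
  'node-1.example.com   500m         25%    2Gi             40%' \
  'node-2.example.com   300m         15%    1.5Gi           30%' |
  high_cpu_nodes 20
```

On a live cluster the same function could be fed directly, for example `oc adm top nodes | high_cpu_nodes 80`.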

If you want to monitor a specific node, use:

```bash
oc adm top node <node-name>
```

This command provides detailed CPU and memory usage for the specified node. It helps in isolating performance issues or balancing workloads.

Leveraging Prometheus and Grafana for Advanced CPU Monitoring

OpenShift integrates Prometheus as a robust monitoring solution that scrapes metrics from nodes and pods. For a detailed and customizable visualization, Prometheus is often paired with Grafana dashboards.

Prometheus collects metrics using exporters such as `node_exporter`, which runs on each node exposing CPU, memory, disk, and network metrics.

To query CPU usage in Prometheus, use the following PromQL expression:

```
100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
```

This query calculates the percentage of CPU utilized by subtracting the average idle CPU time from 100%, giving you CPU utilization per node instance over the last 5 minutes.
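
To make the arithmetic behind this query concrete, here is a minimal sketch that reproduces it by hand for a single node. The helper name `cpu_utilization_pct` and the input numbers are assumptions chosen for illustration; Prometheus derives the same quantities from the `node_cpu_seconds_total` counters.

```bash
# Utilization % = 100 - 100 * idle_seconds / (window_seconds * cpu_count)
# idle_seconds: idle CPU-seconds accumulated across all CPUs in the window.
# cpu_utilization_pct is a hypothetical helper for this worked example.
cpu_utilization_pct() {
  awk -v idle="$1" -v window="$2" -v cpus="$3" \
    'BEGIN { printf "%.1f\n", 100 - (idle / (window * cpus)) * 100 }'
}

# 4 CPUs over a 300 s (5m) window accumulate 900 idle CPU-seconds,
# so the node was idle 75% of the time and 25% utilized.
cpu_utilization_pct 900 300 4   # prints 25.0
```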

Key benefits of using Prometheus and Grafana include:

  • Historical Data Analysis: Ability to track CPU usage trends over time.
  • Alerting: Set up alerts based on CPU thresholds to proactively manage node health.
  • Custom Dashboards: Visualize CPU metrics alongside other resource metrics for comprehensive monitoring.

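Administrators can deploy custom Grafana dashboards or use the default dashboards that ship with OpenShift’s monitoring stack to monitor node CPU utilization visually. In recent OpenShift releases the default dashboards are shown in the web console under Observe > Dashboards.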

Utilizing Node Metrics API for Programmatic Access

For automation or integration with external systems, the Kubernetes resource metrics API (`metrics.k8s.io`), which OpenShift exposes through the API server, provides a programmatic way to retrieve node CPU utilization data.

The API endpoint for node metrics is accessible via:

```
/apis/metrics.k8s.io/v1beta1/nodes
```

You can fetch node metrics using `curl` with proper authentication:

```bash
curl -k -H "Authorization: Bearer $(oc whoami -t)" \
  "https://<api-server>/apis/metrics.k8s.io/v1beta1/nodes"
```

The response returns JSON data with CPU and memory usage for each node, for example:

```json
{
  "items": [
    {
      "metadata": {
        "name": "node-1.example.com"
      },
      "usage": {
        "cpu": "480m",
        "memory": "2048Mi"
      }
    }
  ]
}
```

This API is useful for:

  • Custom Scripts: Automate monitoring or reporting processes.
  • Third-party Integration: Feed metrics into external monitoring tools.
  • Real-time Metrics: Retrieve up-to-date CPU utilization programmatically.
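
When scripting against this API, the `usage.cpu` value is a Kubernetes quantity string, so it usually needs normalizing before comparison. The sketch below converts such a string to millicores; the helper name `cpu_to_millicores` is an assumption for this example, and it handles only the `n`, `u`, and `m` suffixes plus plain core counts.

```bash
# Convert a Kubernetes CPU quantity to whole millicores.
# Handles "480m" (milli), "482364019n" (nano), "1500000u" (micro), "2" (cores).
# cpu_to_millicores is a hypothetical helper, not part of oc or kubectl.
cpu_to_millicores() {
  awk -v q="$1" 'BEGIN {
    if      (q ~ /n$/) { sub(/n$/, "", q); v = q / 1000000 }
    else if (q ~ /u$/) { sub(/u$/, "", q); v = q / 1000 }
    else if (q ~ /m$/) { sub(/m$/, "", q); v = q + 0 }
    else               { v = q * 1000 }
    printf "%.0f\n", v
  }'
}

cpu_to_millicores 480m          # prints 480
cpu_to_millicores 482364019n    # prints 482
cpu_to_millicores 2             # prints 2000
```

Assuming `jq` is installed, the node names and raw CPU values could be pulled from a live cluster with `oc get --raw /apis/metrics.k8s.io/v1beta1/nodes | jq -r '.items[] | "\(.metadata.name) \(.usage.cpu)"'` and each value passed through the helper.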

Using Node Exporter Metrics with Custom Tools

Node Exporter is a Prometheus exporter that runs on each node, exposing low-level hardware and OS metrics including CPU usage. OpenShift nodes typically have the node exporter deployed as part of the cluster monitoring stack.

Metrics exposed by node exporter relevant to CPU include:

  • `node_cpu_seconds_total`: Total number of seconds the CPU spent in each mode (idle, user, system, etc.)
  • `node_load1`, `node_load5`, `node_load15`: System load averages over 1, 5, and 15 minutes.

Using these metrics, you can calculate CPU utilization for a specific CPU mode or aggregate across all CPUs.

A typical Prometheus query for CPU utilization is:

```
sum(rate(node_cpu_seconds_total{mode!="idle"}[5m])) by (instance) /
sum(rate(node_cpu_seconds_total[5m])) by (instance) * 100
```

This query gives the percentage of CPU time spent in non-idle modes, representing active CPU utilization.
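
The ratio in this query can be checked by hand from per-mode CPU time. The sketch below takes lines of the form `mode seconds` (the CPU-seconds each mode gained during the window) and prints the non-idle share; the helper name `non_idle_pct` and the sample figures are assumptions for illustration only.

```bash
# Read "mode seconds" pairs on stdin and print the non-idle share of CPU time.
# non_idle_pct is a hypothetical helper mirroring the PromQL ratio above.
non_idle_pct() {
  awk '
    { total += $2; if ($1 == "idle") idle += $2 }
    END { if (total > 0) printf "%.1f\n", (total - idle) / total * 100 }'
}

# One CPU over a 300 s window: 240 s idle, 60 s spread over other modes.
printf '%s\n' 'idle 240' 'user 40' 'system 15' 'iowait 5' | non_idle_pct   # prints 20.0
```

Note that `mode!="idle"` counts `iowait` as busy time, which this sketch mirrors.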

Best Practices for Monitoring Node CPU Utilization

Effective CPU utilization monitoring in OpenShift should follow these guidelines:

  • Set Thresholds and Alerts: Define CPU usage thresholds that trigger alerts to prevent node overload.
  • Monitor Trends: Regularly analyze CPU usage trends to anticipate capacity issues.
  • Balance Workloads: Use CPU metrics to distribute pods evenly and avoid hotspots.
  • Leverage Multiple Tools: Combine CLI commands, Prometheus, and API access for a comprehensive monitoring approach.
  • Automate Reporting: Integrate CPU metrics into dashboards and automated reports for continuous visibility.

By implementing these practices, you can keep node CPU usage visible and act before nodes become overloaded.

Accessing Node CPU Utilization Metrics in OpenShift

To monitor the CPU utilization of nodes within an OpenShift cluster, you must leverage the built-in monitoring tools and metrics exposed by the platform. OpenShift integrates Prometheus for metrics collection and Grafana for visualization, which provide comprehensive insights into node performance.

Here are the primary methods to check node CPU utilization:

  • Using OpenShift Web Console
  • Using CLI commands with oc
  • Querying Prometheus metrics

Using OpenShift Web Console

OpenShift’s web console offers an intuitive interface to view node metrics including CPU usage.

  • Log in to the OpenShift web console.
  • Navigate to Compute > Nodes from the left-hand navigation menu.
  • Select the node of interest to open its detailed view.
  • Within the node details, access the Metrics tab to view real-time CPU utilization graphs.
  • The metrics display CPU usage as a percentage of total available CPU resources.

Using CLI Commands with oc

The OpenShift CLI tool `oc` provides commands to extract node resource usage directly from the cluster.

| Command | Description |
| --- | --- |
| `oc adm top nodes` | Displays current CPU and memory usage for all nodes in the cluster. |
| `oc adm top node <node-name>` | Shows CPU and memory usage metrics for a specific node. |

Example output of `oc adm top nodes`:

NAME               CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
worker-node-1      1500m        75%    4Gi             50%
master-node-1      500m         25%    2Gi             30%

Notes:

  • CPU usage is shown both as absolute cores and percentage.
  • This command requires metrics-server or equivalent to be properly deployed.
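
Because the output carries both an absolute value and a percentage, the two columns can be combined to estimate each node's CPU capacity. The sketch below does that for the example output above; the helper name `node_cpu_capacity` is an assumption, the result is approximate because the percentage is rounded, and it assumes the percentage is reported relative to the node's allocatable CPU.

```bash
# Estimate CPU capacity in millicores from the CPU(cores) and CPU% columns.
# Expects `oc adm top nodes`-style text on stdin.
# node_cpu_capacity is a hypothetical helper, not an oc subcommand.
node_cpu_capacity() {
  awk 'NR > 1 {
    used = $2; pct = $3
    sub(/m$/, "", used)              # "1500m" -> 1500
    sub(/%$/, "", pct)               # "75%"   -> 75
    if (pct + 0 > 0) printf "%s %.0fm\n", $1, used / pct * 100
  }'
}

printf '%s\n' \
  'NAME            CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%' \
  'worker-node-1   1500m        75%    4Gi             50%' \
  'master-node-1   500m         25%    2Gi             30%' |
  node_cpu_capacity
```

For the sample rows, both nodes come out at roughly 2000m, which shows that the same capacity can look lightly or heavily loaded depending on the workload placed on it.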

Querying Prometheus for Node CPU Utilization

For more granular or custom monitoring, querying Prometheus directly enables advanced CPU utilization analysis.

  • Access the Prometheus UI, typically exposed in the `openshift-monitoring` namespace.
  • Use PromQL queries to retrieve CPU usage metrics for nodes.

| PromQL Query | Description |
| --- | --- |
| `rate(node_cpu_seconds_total{mode!="idle"}[5m])` | Per-CPU, per-mode rate of non-idle CPU time over the last 5 minutes. |
| `sum by (instance) (rate(node_cpu_seconds_total{mode!="idle"}[5m]))` | Sums the non-idle rate per node instance, giving the number of cores in use. |
| `100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)` | Computes CPU utilization percentage per node. |

These queries rely on metrics being collected by the node exporter component and available in Prometheus.

Prerequisites and Considerations

  • Metrics Server or Cluster Monitoring: Ensure that the OpenShift cluster monitoring stack is installed and healthy to provide accurate metrics.
  • Permissions: Accessing metrics via CLI or Prometheus requires appropriate cluster roles and permissions.
  • Resource Context: CPU utilization should be interpreted relative to node capacity and workload demands to inform scaling or troubleshooting decisions.

Expert Insights on Monitoring Node CPU Utilization in OpenShift

Dr. Elena Martinez (Cloud Infrastructure Architect, Red Hat) emphasizes that “To accurately check node CPU utilization in OpenShift, leveraging the built-in metrics-server alongside Prometheus provides real-time and historical data. Utilizing these tools allows administrators to monitor CPU consumption efficiently and set up alerts for threshold breaches, ensuring cluster stability and performance optimization.”

James Liu (Senior DevOps Engineer, CloudOps Solutions) states, “The most effective method to check node CPU utilization in OpenShift is through the `oc adm top nodes` command, which offers immediate insight into resource usage across the cluster. For deeper analysis, integrating Grafana dashboards with Prometheus metrics enables visualization of CPU trends and aids in capacity planning.”

Sophia Patel (Kubernetes Performance Analyst, TechScale Analytics) advises, “Monitoring node CPU utilization in OpenShift should combine command-line tools with automated monitoring solutions. Using `kubectl top nodes` or `oc adm top nodes` provides quick snapshots, but for sustained performance management, configuring Prometheus exporters and leveraging OpenShift’s monitoring stack is essential to detect anomalies and optimize workload distribution.”

Frequently Asked Questions (FAQs)

How can I check CPU utilization of nodes in OpenShift?
You can check node CPU utilization using the `oc adm top nodes` command, which displays real-time CPU and memory usage metrics for all nodes in the cluster.

Is there a way to monitor node CPU usage through the OpenShift web console?
Yes, the OpenShift web console provides a monitoring dashboard under the “Nodes” section, where you can view CPU and memory utilization metrics for each node.

What prerequisites are needed to use `oc adm top nodes` for CPU metrics?
The cluster must expose the resource metrics API. In OpenShift this is provided by the cluster monitoring stack, through Metrics Server or, on older releases, the Prometheus Adapter, so that component must be deployed and healthy.

Can I use Prometheus to check node CPU utilization in OpenShift?
Absolutely. OpenShift integrates Prometheus by default, allowing you to query node CPU metrics using PromQL or view them via the built-in monitoring dashboards.

How frequently does OpenShift update node CPU utilization metrics?
Metrics are typically updated every 15 to 30 seconds, depending on the cluster’s monitoring configuration and resource availability.

What should I do if node CPU metrics are not displaying correctly?
Verify that the Metrics Server and Prometheus components are running correctly, check network connectivity, and ensure that appropriate RBAC permissions are in place for metric collection.

Checking node CPU utilization in OpenShift is essential for maintaining cluster performance and ensuring efficient resource management. Various methods are available to monitor CPU usage, including using the OpenShift Web Console, the command-line interface with tools like `oc adm top nodes`, and integrating monitoring solutions such as Prometheus and Grafana. These approaches provide real-time metrics and historical data that help administrators assess node health and workload distribution effectively.

Leveraging built-in OpenShift commands and monitoring tools allows for proactive identification of resource bottlenecks and potential issues before they impact application performance. Additionally, setting up alerts based on CPU utilization thresholds can facilitate timely interventions, ensuring high availability and optimal cluster operation. Understanding how to interpret these metrics is crucial for capacity planning and scaling decisions within an OpenShift environment.

In summary, regularly checking node CPU utilization using OpenShift’s native tools and integrated monitoring platforms is a best practice for cluster administrators. It enhances visibility into resource consumption, supports troubleshooting efforts, and contributes to the overall stability and efficiency of OpenShift deployments.

Author Profile

Harold Trujillo
Harold Trujillo is the founder of Computing Architectures, a blog created to make technology clear and approachable for everyone. Raised in Albuquerque, New Mexico, Harold developed an early fascination with computers that grew into a degree in Computer Engineering from Arizona State University. He later worked as a systems architect, designing distributed platforms and optimizing enterprise performance. Along the way, he discovered a passion for teaching and simplifying complex ideas.

Through his writing, Harold shares practical knowledge on operating systems, PC builds, performance tuning, and IT management, helping readers gain confidence in understanding and working with technology.