Kubernetes resource tuning is a vital aspect of managing workloads in cloud-native ecosystems, particularly when integrated within GitOps pipelines. This blog post looks at some of the challenges of Kubernetes resource management, such as reserving too many resources, wasting valuable quota, or blocking deployments, and highlights some zero-cost approaches that reduce the frustration it causes.
Resource Tuning Overview and Challenges
At the heart of Kubernetes resource tuning lie two fundamental concepts: Resource Quotas and Pod Requests & Limits. These mechanisms are designed to manage how applications consume resources, ensuring that they operate within defined parameters to maintain cluster health and efficiency.
Resource Quotas
Resource Quotas are Kubernetes objects that define the total amount of resources a namespace can consume. This includes CPU, memory, and even the number of objects, like Pods, Services, or PersistentVolumeClaims, that can be created. They ensure fair resource distribution, preventing any single team or project from consuming excessive cluster resources and maintaining predictable performance across services.
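For example, a minimal ResourceQuota might cap compute and Pod counts for a namespace. The names and numbers below are purely illustrative:

```yaml
# Illustrative ResourceQuota capping compute and object counts for a namespace
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota        # hypothetical quota name
  namespace: team-a         # hypothetical namespace
spec:
  hard:
    requests.cpu: "4"       # total CPU requested across all pods
    requests.memory: 8Gi    # total memory requested across all pods
    limits.cpu: "8"         # total CPU limits across all pods
    limits.memory: 16Gi     # total memory limits across all pods
    pods: "20"              # maximum number of Pods in the namespace
```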
Challenges with Resource Quotas that directly affect Developers and DevOps engineers can include:
- Finding the optimal quota settings that ensure resources meet application needs without leading to resource starvation or inefficiency.
- Avoiding over-provisioning that could increase costs or reduce overall platform efficiency.
- Interacting with various teams to adjust and tune Resource Quota settings, ensuring that development and deployments aren’t blocked by Resource Quota configurations.
Pod Requests & Limits
Pod Requests and Limits specify the minimum and maximum amount of CPU and memory that a container can use. Requests ensure that a container has enough resources to start and run, while limits prevent it from consuming resources excessively, which could affect other containers running on the same node.
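As a quick illustration, here is a container spec with requests and limits set (the pod name, image, and values are placeholders, not a recommendation):

```yaml
# Illustrative pod: requests are guaranteed at scheduling time,
# limits cap what the container may consume at runtime
apiVersion: v1
kind: Pod
metadata:
  name: sample-app          # hypothetical pod name
spec:
  containers:
    - name: app
      image: nginx:1.25     # placeholder image
      resources:
        requests:
          cpu: 100m         # 0.1 CPU reserved for scheduling
          memory: 128Mi
        limits:
          cpu: 500m         # CPU is throttled above this
          memory: 256Mi     # container is OOM-killed above this
```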
Challenges with Pod Requests and Limits that directly affect Developers and DevOps engineers can include:
- Setting them too low can lead to:
  - Slow startup times or high latency due to inadequate CPU allocation.
  - Potential crashes from memory saturation.
- Setting them too high might:
  - Waste resources or exceed the allocated ResourceQuota.
  - Prevent new pods from being scheduled due to exceeded quotas or insufficient resources on nodes.
  - Affect other applications through resource deprivation, throttling, or out-of-memory (OOM) kills.
Addressing Deployment Delays and Resource Allocation Challenges
Determining if a ResourceQuota is Blocking Deployments
A frequent bottleneck faced in Kubernetes deployments is the failure of pods to deploy due to hitting resource quota limits. This scenario is a major pain point for developers, particularly during critical updates or rolling deployments. A deeper understanding of how resource quotas interact with deployment strategies allows developers to anticipate and mitigate these issues, streamlining the deployment process.
For example, imagine there is a Deployment set to perform a rolling update, but the latest version is not rolled out. Something like this might show up in the events:
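The event below is illustrative (quota name, pod name, and quantities are made up), but it follows the shape of the message the ReplicaSet controller emits when the quota admission check rejects a new pod:

```
Warning  FailedCreate  replicaset/my-app-7d9c6b7f9b
Error creating: pods "my-app-7d9c6b7f9b-" is forbidden: exceeded quota: team-a-quota,
requested: limits.memory=512Mi, used: limits.memory=8Gi, limited: limits.memory=8Gi
```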
When this event is identified, someone must analyze the quota usage and negotiate resource adjustments, a process that can delay deployment timelines significantly.
Stop Guessing With Container Resource Requests
“What should I be setting this to?” is a recurring question that challenges developers, especially in the absence of comprehensive performance data for new or evolving applications. Initial resource allocations are often guesswork, aimed at ensuring availability and performance without a solid basis in historical usage data.
For example, imagine a Deployment that is running without any awareness of its resource usage. Using kubectl top, one can see that the pod is using 5 millicores (CPU) and 15 megabytes (Memory) at this point in time. Sadly, this says nothing about its usage under any sort of load, yet it indicates that it could be up to 25x over-provisioned.
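That point-in-time reading looks something like this (the pod name is illustrative; the numbers match the example above):

```
$ kubectl top pod
NAME                        CPU(cores)   MEMORY(bytes)
my-app-7d9c6b7f9b-x2x7l     5m           15Mi
```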
In order to fine-tune these resources, the pods need to run under a normal load for some time, and performance metrics need to be collected, reviewed, analyzed, and then applied to the pod configuration. It's no wonder that so many Kubernetes clusters are underutilized: it's easier to request more resources than needed than to consistently fine-tune the configuration.
Making It Easier
With a little bit of scripting and some existing Kubernetes capabilities, the challenges above can be eased.
Script: Check ResourceQuota Utilization in Namespace
This script is something that a developer or platform user could frequently run when looking for ResourceQuota issues. It will inspect all ResourceQuotas in the namespace, determine their utilization, and provide recommendations as to which ones may need an increase.
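A simplified sketch of the idea is shown below; the 80% threshold and the unit handling are assumptions here, and the full script linked below is more robust (it requires kubectl and jq):

```bash
#!/usr/bin/env bash
# Sketch only: summarize ResourceQuota utilization in a namespace and flag
# resources above an assumed 80% threshold. Requires kubectl and jq.
# Fractional quantities (e.g. "0.5" CPU) are not handled in this sketch.
set -euo pipefail
NAMESPACE="${1:-default}"
THRESHOLD=80

to_millicores() { case "$1" in *m) echo "${1%m}";; *) echo $(( $1 * 1000 ));; esac; }
to_bytes() {
  case "$1" in
    *Ki) echo $(( ${1%Ki} * 1024 ));;
    *Mi) echo $(( ${1%Mi} * 1024 * 1024 ));;
    *Gi) echo $(( ${1%Gi} * 1024 * 1024 * 1024 ));;
    *k)  echo $(( ${1%k} * 1000 ));;
    *M)  echo $(( ${1%M} * 1000 * 1000 ));;
    *G)  echo $(( ${1%G} * 1000 * 1000 * 1000 ));;
    *)   echo "$1";;
  esac
}
normalize() { # $1 = resource name, $2 = quantity
  case "$1" in
    *cpu*)    to_millicores "$2";;
    *memory*) to_bytes "$2";;
    *)        echo "$2";;
  esac
}

kubectl get resourcequota -n "$NAMESPACE" -o json \
  | jq -r '.items[] | .metadata.name as $q | .status.used as $used
      | .status.hard | to_entries[] | "\($q) \(.key) \($used[.key] // "0") \(.value)"' \
  | while read -r quota resource used hard; do
      u=$(normalize "$resource" "$used")
      h=$(normalize "$resource" "$hard")
      [ "$h" -eq 0 ] && continue
      pct=$(( 100 * u / h ))
      printf '%-20s %-25s %s/%s (%d%%)\n' "$quota" "$resource" "$used" "$hard" "$pct"
      if [ "$pct" -ge "$THRESHOLD" ]; then
        echo "  -> Consider increasing $resource on ResourceQuota/$quota"
      fi
    done
```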
This script is also available in our open source RunWhen Local Troubleshooting Cheat Sheet — and is available to view in our demo instance here.
Automating Pod Resource Request Recommendations
In order to figure out the right amount of resources to request for a pod, it must be monitored over time, analyzed, and then adjusted.
Luckily, this capability is freely available for Kubernetes with the Vertical Pod Autoscaler!
Kubernetes has two types of autoscaling capabilities for pods:
- Horizontal Pod Autoscaler — Adds or removes pod replicas based on resource usage and a defined configuration (not applicable in this use case).
- Vertical Pod Autoscaler — Monitors a pod's usage and determines if it needs more or fewer resources. It can automatically make the change, or simply provide a recommendation.
In order to address the question of "What should I be setting this to?", the Vertical Pod Autoscaler can be used. While automatically resizing a running pod may not be a fit for many environments (and won't work well for GitOps environments), the Vertical Pod Autoscaler can run in recommendation mode. Using the built-in metrics service (and optionally a Prometheus instance), the VPA can provide a desired target configuration for pods, taking out the guesswork.
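Once the recommender has gathered enough history, the recommendation appears in the status of the VerticalPodAutoscaler object. A hypothetical excerpt (all values are illustrative):

```yaml
# Excerpt of a VerticalPodAutoscaler status (illustrative values)
status:
  recommendation:
    containerRecommendations:
      - containerName: app
        lowerBound:          # minimum the container is likely to need
          cpu: 10m
          memory: 32Mi
        target:              # the recommended requests to configure
          cpu: 15m
          memory: 48Mi
        upperBound:          # upper estimate; useful when sizing limits
          cpu: 200m
          memory: 256Mi
        uncappedTarget:      # target ignoring any min/max policy on the VPA
          cpu: 15m
          memory: 48Mi
```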
Note: More advanced recommendation algorithms than what the VPA provides likely exist in other paid or open source products, and may be of interest.
Deploying the VPA with Recommendations Only
The VPA can easily be deployed into a running Kubernetes cluster by following the documentation, or from the Fairwinds Helm chart. Using the Helm chart allows you to easily enable the recommender and disable the updater and admissionController components (if those aren't of interest).
For example, this is a sample FluxCD HelmRelease manifest that is used to deploy the VPA into a sandbox cluster:
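A sketch of such a HelmRelease is shown below. It assumes a HelmRepository named fairwinds-stable already exists in the flux-system namespace; the release name, target namespace, and chart version are illustrative:

```yaml
# Illustrative HelmRelease deploying the Fairwinds VPA chart in recommender-only mode
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: vpa
  namespace: vpa                 # hypothetical target namespace
spec:
  interval: 10m
  chart:
    spec:
      chart: vpa
      version: "4.x"             # pin an appropriate chart version
      sourceRef:
        kind: HelmRepository
        name: fairwinds-stable   # assumed to already exist
        namespace: flux-system
  values:
    recommender:
      enabled: true              # generate recommendations
    updater:
      enabled: false             # do not evict/resize pods automatically
    admissionController:
      enabled: false             # do not mutate pod specs at admission time
```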
Configuring the VPA to Provide Recommendations for a Deployment
With the VPA deployed, a VerticalPodAutoscaler manifest can be created to specify which resources to generate recommendations for.
For example, this VPA configuration will target a specific deployment and generate recommendations only — ensuring it will not attempt to automatically update the pod. This is critical for pods that are managed by GitOps tooling.
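A minimal example targeting a hypothetical Deployment named my-app:

```yaml
# VPA in recommendation-only mode: updateMode "Off" means it never evicts or
# resizes pods; it only publishes recommendations in its status
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa           # hypothetical name
  namespace: team-a          # hypothetical namespace
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app             # the Deployment to observe
  updatePolicy:
    updateMode: "Off"        # recommendations only; safe for GitOps-managed pods
```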
Script: Identify Pod Resource Recommendations in Namespace
With the VPA in place, this script can help developers automatically query the VPA for recommendations for each resource.
Note: While the VPA itself can be queried directly, its output must still be interpreted and translated into actionable next steps. This script outlines the specific details and configuration changes in a way that can be automated.
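A minimal sketch of that approach, querying each VPA in a namespace with kubectl and jq and printing the recommended requests next to what is currently configured (the full cheat-sheet script goes further and emits structured next steps):

```bash
#!/usr/bin/env bash
# Sketch only: print the VPA target recommendation for each container in a
# namespace alongside the currently configured requests. Requires kubectl and jq.
# VPAs with no recommendation yet (recommender still gathering data) are skipped.
set -euo pipefail
NAMESPACE="${1:-default}"

kubectl get vpa -n "$NAMESPACE" -o json \
  | jq -r '.items[]
      | .spec.targetRef.name as $target
      | (.status.recommendation.containerRecommendations // [])[]
      | "\($target) \(.containerName) \(.target.cpu) \(.target.memory)"' \
  | while read -r target container cpu memory; do
      echo "Deployment/$target container=$container"
      echo "  recommended requests: cpu=$cpu memory=$memory"
      # Current requests from the live Deployment, for comparison
      current=$(kubectl get deployment "$target" -n "$NAMESPACE" -o json \
        | jq -r --arg c "$container" \
            '.spec.template.spec.containers[] | select(.name == $c)
             | .resources.requests // {} | "cpu=\(.cpu // "unset") memory=\(.memory // "unset")"')
      echo "  currently configured:  $current"
    done
```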
This script is also available in our open source RunWhen Local Troubleshooting Cheat Sheet — and is available to view in our demo instance here.
Integrating Resource Tuning into GitOps Pipelines
With the scripts above determining what resources need to be updated and how they should be configured, the final step is to include these recommendations into a GitOps pipeline.
While each team's GitOps pipeline will vary slightly, the output from each script above can be trimmed so that the JSON content under "Recommended Next Steps" is fed into a script that takes the necessary actions to update the GitOps repository, notifying the code owners of the desired change through the existing GitOps approval procedures.
For example, this script (sketched after the list below) is used for GitOps repositories stored in GitHub and works with both ArgoCD and FluxCD GitOps engines. The script:
- Identifies the GitOps controller (FluxCD or ArgoCD) that manages the resource.
- Fetches details about the associated Git repository and path for manifest files.
- Processes input JSON to identify remediation actions required for Kubernetes resources, categorizing them by type (e.g., resource quota updates, resource request recommendations, PVC size increases).
- For each identified remediation action, it dynamically constructs modifications to apply to the resource’s YAML manifest based on the specified changes.
- Clones the relevant Git repository, creates a new branch for the changes, updates the YAML files with the constructed modifications, and commits these changes.
- Generates a detailed Pull Request (PR) body, including the rationale for changes and links to relevant details, and creates a PR against the main branch of the Git repository to apply the changes.
- Outputs the link to the Pull Request for review and approval processes.
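The full script is more involved, but a stripped-down sketch of its final steps might look like the following. It assumes the repository URL, manifest path, container name, and recommended values have already been extracted from the "Recommended Next Steps" JSON, and that yq and the GitHub CLI (gh) are installed and authenticated:

```bash
#!/usr/bin/env bash
# Sketch only: apply a recommended resource change to a GitOps repo and open a PR.
# Inputs are assumed to be pre-extracted from the "Recommended Next Steps" JSON.
set -euo pipefail
REPO_URL="$1"          # e.g. git@github.com:org/gitops-repo.git (placeholder)
MANIFEST_PATH="$2"     # path to the Deployment manifest inside the repo
CONTAINER="$3"         # container name to update
CPU="$4"               # recommended cpu request, e.g. 15m
MEMORY="$5"            # recommended memory request, e.g. 48Mi

WORKDIR=$(mktemp -d)
BRANCH="resource-tuning-$(date +%Y%m%d%H%M%S)"

git clone "$REPO_URL" "$WORKDIR"
cd "$WORKDIR"
git checkout -b "$BRANCH"

# Update the container's requests in place with yq (v4 syntax)
yq -i "(.spec.template.spec.containers[] | select(.name == \"$CONTAINER\") | .resources.requests.cpu) = \"$CPU\"" "$MANIFEST_PATH"
yq -i "(.spec.template.spec.containers[] | select(.name == \"$CONTAINER\") | .resources.requests.memory) = \"$MEMORY\"" "$MANIFEST_PATH"

git commit -am "chore: apply VPA resource recommendations for $CONTAINER"
git push origin "$BRANCH"

# Open a PR so the change flows through the usual GitOps review/approval process
gh pr create \
  --title "Apply VPA resource recommendations for $CONTAINER" \
  --body "Recommended by the VPA: cpu=$CPU, memory=$MEMORY. Generated automatically." \
  --base main --head "$BRANCH"
```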
Putting it Together with RunWhen (a shameless plug)
RunWhen provides a troubleshooting platform for developers and platform users. It combines open source scripts (like those above) with Digital Assistants to find, suggest, and autonomously run the right troubleshooting scripts at the right time. Sharing these capabilities with an entire team of developers and DevOps engineers, without the need to find and configure the scripts, can save developers and platform engineers countless hours of resource-tuning effort.
To see how the above scenarios would be handled in the RunWhen Platform, see the following YouTube videos:
To experience this yourself, feel free to log into the RunWhen Platform and check out our tutorials documentation.
Final Thoughts
Resource management, and finding the balance between performance and cost, can be a complex topic on large (non-demo) platforms. This article was written to share some scripts and tools that are available at no cost and might be a good starting point for some platform or app teams. These were deployed and demonstrated in the RunWhen Sandbox cluster, a cluster used to validate open source integrations with other tools and solutions.
Many companies have invested in providing deeper insights into application usage and resource tuning recommendations (such as KubeCost, Densify, CAST.AI, StormForge, and many more), and these should be considered for larger and more complex environments. Understanding the importance of comprehensive insights, RunWhen can easily build and support open source integrations with all of these software providers, adding their detailed insights as a useful step in the overall troubleshooting and root cause analysis workflow.
What other resource management frustrations have you experienced? Could we automate this toil away? Drop us a message in the comments or in our Slack instance — we’d love to share your experience.