Kubernetes Integration Guide

1. Overview

The Vega Platform collects detailed data and metrics from Kubernetes clusters through the Vega Kubernetes Metrics Agent. The agent collects, processes, and transmits the metrics listed below to the Vega Platform, giving users visibility into resource capacity, requests, limits, and usage across their Kubernetes environments.

Key Features

  • Node Metrics: Comprehensive data for each node, including capacity, allocatable resources, and utilization
  • Pod Metrics: Resource metrics for all pods, including requests, limits, and actual usage
  • Cluster Metrics: Aggregated cluster-wide statistics
  • Storage Metrics: Detailed metrics for persistent volumes (PVs) and claims (PVCs)
  • Namespace Metrics: Resource quotas, limit ranges, and consumption
  • Workload Metrics: Monitoring for deployments, stateful sets, and daemon sets
  • Network Metrics: Services and ingress metrics
  • Autoscaling Metrics: HPA performance and scaling metrics
  • Replication Metrics: Data for controllers and replica sets

2. Prerequisites

System Requirements

  • Kubernetes cluster version 1.30 or higher
  • Helm v3.0 or higher
  • Outbound access to:
    • Container image repository (public.ecr.aws/c0f8b9o4/vegacloud)
    • api.vegacloud.io (port 443) — for pre-signed URL retrieval.
    • vegametricsocean.s3.us-west-2.amazonaws.com (port 443) — for uploading data.

Access Requirements

  • Kubernetes Administrator privileges
  • RBAC permissions for:
    • Creating deployments
    • Creating cluster roles
    • Reading metrics
  • API access credentials (clientId and clientSecret)

Pre-Installation Checks

  1. Verify cluster access:
kubectl cluster-info
kubectl auth can-i create deployment
kubectl auth can-i create clusterrole
  2. Verify Helm installation:
helm version
  3. Check network connectivity to the Vega API:
curl -v https://api.vegacloud.io/health
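
To cover every host listed under System Requirements in one pass, the connectivity check can be wrapped in a small loop. The following is a minimal sketch, not part of the Vega tooling: the function names are hypothetical, and public.ecr.aws is assumed to be the registry host because it is the host part of the image repository path. Any HTTP response counts as reachable; only a failed connection is reported.

```shell
# Hosts taken from the System Requirements list (port 443 for each).
VEGA_ENDPOINTS="api.vegacloud.io vegametricsocean.s3.us-west-2.amazonaws.com public.ecr.aws"

check_endpoint() {
  # Succeeds if an HTTPS request to the host completes within 10 seconds.
  # Without --fail, curl returns 0 for any HTTP status code.
  curl --silent --show-error --output /dev/null --max-time 10 "https://$1/"
}

check_all_endpoints() {
  failed=0
  for host in $VEGA_ENDPOINTS; do
    if check_endpoint "$host"; then
      echo "OK   $host"
    else
      echo "FAIL $host"
      failed=1
    fi
  done
  # Non-zero exit status if at least one host was unreachable.
  return "$failed"
}
```

Run check_all_endpoints from a machine or pod that shares the cluster's egress path. A FAIL line points to a firewall, proxy, or DNS rule that needs attention before installation.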

3. Installation

Quick Start

  1. Add the Vega Helm repository:
helm repo add vegacloud https://vegacloud.github.io/charts/
helm repo update
  2. Verify repository addition:
helm search repo vegacloud
  3. Install the agent:
helm install vega-metrics vegacloud/vega-metrics-agent \
--set vega.clientId="your-client-id" \
--set vega.clientSecret="your-client-secret" \
--set vega.orgSlug="your-org-slug" \
--set vega.clusterName="your-cluster-name"

Note: Insecure mode is enabled by default so that the agent installs on a wide range of Kubernetes clusters without extra setup. We strongly recommend deploying and configuring internal TLS certificates in your cluster and then enabling secure mode for the agent by setting env.VEGA_INSECURE=false.
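
Passing credentials with --set leaves them in shell history and process listings. One possible alternative, sketched below, writes the same settings to a private values file and passes that file to Helm. The VEGA_* environment variable names and the function name are hypothetical; the YAML keys mirror the --set flags and the env.VEGA_INSECURE setting shown above. The sketch assumes the values contain no double quotes.

```shell
write_vega_values() {
  # $1: path of the values file to create.
  (
    umask 077  # a newly created file is readable by the current user only
    # Abort before writing anything if a required variable is missing.
    : "${VEGA_CLIENT_ID:?set VEGA_CLIENT_ID}"
    : "${VEGA_CLIENT_SECRET:?set VEGA_CLIENT_SECRET}"
    : "${VEGA_ORG_SLUG:?set VEGA_ORG_SLUG}"
    : "${VEGA_CLUSTER_NAME:?set VEGA_CLUSTER_NAME}"
    cat > "$1" <<EOF
vega:
  clientId: "${VEGA_CLIENT_ID}"
  clientSecret: "${VEGA_CLIENT_SECRET}"
  orgSlug: "${VEGA_ORG_SLUG}"
  clusterName: "${VEGA_CLUSTER_NAME}"
env:
  VEGA_INSECURE: ${VEGA_INSECURE:-true}
EOF
  )
}

# Usage (not run here): export the four required variables, then
#   write_vega_values vega-values.yaml
# and pass "-f vega-values.yaml" to helm install in place of the --set flags.
```

VEGA_INSECURE defaults to true to match the chart default described above; export VEGA_INSECURE=false before calling the function to request secure mode.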

4. Configuration

Basic Configuration

Essential Parameters

| Parameter | Description | Default | Required |
| --- | --- | --- | --- |
| vega.clientId | Client ID from API registration | "" | Yes |
| vega.clientSecret | Client Secret from API registration | "" | Yes |
| vega.orgSlug | Organization slug | "" | Yes |
| vega.clusterName | Kubernetes cluster name | "" | Yes |
| env.VEGA_INSECURE | Enable/disable insecure mode | true | No |

5. Verification & Monitoring

Installation Verification

  1. Check pod status:
kubectl get pods -n vegacloud -l app=metrics-agent
  2. View agent logs:
kubectl logs -f deployment/metrics-agent -n vegacloud
  3. Verify metrics collection:
kubectl top nodes
kubectl top pods -A
  4. Check agent connectivity:
kubectl exec -it deployment/metrics-agent -n vegacloud -- curl -v https://api.vegacloud.io/health
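
For use in a deployment script, the first verification steps can be combined into a single pass/fail check. The sketch below is one way to do that, using the deployment name, label, and namespace from the commands above. The function names are hypothetical, and the two-minute timeout is an arbitrary choice.

```shell
# Succeeds only when every phase in the space-separated list is "Running".
all_running() {
  [ -n "$1" ] || return 1   # an empty list (no pods found) counts as failure
  for phase in $1; do
    [ "$phase" = "Running" ] || return 1
  done
}

verify_agent() {
  ns="${1:-vegacloud}"
  # Wait for the deployment to finish rolling out (up to two minutes).
  kubectl rollout status deployment/metrics-agent -n "$ns" --timeout=120s || return 1
  # Collect the phase of each agent pod as a space-separated list.
  phases=$(kubectl get pods -n "$ns" -l app=metrics-agent \
    -o jsonpath='{.items[*].status.phase}') || return 1
  all_running "$phases"
}
```

Calling verify_agent returns a non-zero status when the rollout times out or any agent pod is not in the Running phase, so it can gate later steps in a pipeline.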

Health Monitoring

  1. Monitor agent health:
kubectl describe pod -l app=metrics-agent -n vegacloud
  2. Check resource usage:
kubectl top pod -l app=metrics-agent -n vegacloud
  3. View recent events:
kubectl get events -n vegacloud --field-selector involvedObject.name=vega-metrics

6. Maintenance

Upgrades

Option 1: Upgrade via Helm Chart

# Update repository
helm repo update

# Upgrade with existing values
helm upgrade vega-metrics vegacloud/vega-metrics-agent \
--namespace vegacloud \
--reuse-values

Option 2: Update Container Image

  1. Pull the image and list the tags available locally:
docker pull public.ecr.aws/c0f8b9o4/vegacloud/vega-metrics-agent
docker images public.ecr.aws/c0f8b9o4/vegacloud/vega-metrics-agent
  2. Update the deployment with the new image:
kubectl set image deployment/metrics-agent \
vega-metrics-agent=public.ecr.aws/c0f8b9o4/vegacloud/vega-metrics-agent:latest \
-n vegacloud
  3. Monitor the rolling update:
kubectl rollout status deployment/metrics-agent -n vegacloud

Note: Replace :latest with a specific version tag so that the deployed version is explicit and each rollout is reproducible.
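
The note above can be enforced with a small guard that builds the image reference and rejects an empty or :latest tag. This is a sketch only: the function name is hypothetical, and 1.2.3 in the usage comment is a placeholder rather than a known image tag.

```shell
VEGA_IMAGE_REPO="public.ecr.aws/c0f8b9o4/vegacloud/vega-metrics-agent"

pinned_image() {
  # $1: image tag. Empty and "latest" are rejected because they do not
  # identify a fixed version.
  case "$1" in
    ""|latest)
      echo "refusing to use an unpinned tag: '$1'" >&2
      return 1
      ;;
  esac
  printf '%s:%s\n' "$VEGA_IMAGE_REPO" "$1"
}

# Usage (not run here; 1.2.3 is a placeholder tag):
#   image=$(pinned_image 1.2.3) &&
#   kubectl set image deployment/metrics-agent vega-metrics-agent="$image" -n vegacloud
# If the new image misbehaves, return to the previous revision:
#   kubectl rollout undo deployment/metrics-agent -n vegacloud
```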

Version-Specific Helm Upgrade

helm upgrade vega-metrics vegacloud/vega-metrics-agent \
--version 1.2.3 \
--namespace vegacloud \
--reuse-values

Backup and Recovery

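  1. Export current configuration (-o yaml omits the "USER-SUPPLIED VALUES:" header so that the file is plain YAML):
helm get values vega-metrics -n vegacloud -o yaml > vega-metrics-backup.yaml
  2. Export secrets (the output contains base64-encoded credentials, so store the file securely):
kubectl get secret -n vegacloud vega-metrics-agent-secret -o yaml > vega-metrics-agent-secret-backup.yaml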
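
The steps above cover the backup half only. A possible recovery step, sketched here under stated assumptions, feeds the exported values back into Helm. It assumes the release lives in the vegacloud namespace (as in the upgrade commands), that the namespace already exists, and that the exported values include the credentials because they were supplied as Helm values at install time. The function names are hypothetical. The header line that helm get values prints in its default output format is stripped if present.

```shell
clean_values() {
  # Drop the "USER-SUPPLIED VALUES:" header, leaving plain YAML.
  grep -v '^USER-SUPPLIED VALUES:$' "$1"
}

restore_vega_agent() {
  backup="${1:-vega-metrics-backup.yaml}"
  [ -s "$backup" ] || { echo "backup file missing or empty: $backup" >&2; return 1; }
  # --install creates the release if it no longer exists; otherwise it upgrades.
  # "-f -" makes Helm read the values from standard input.
  clean_values "$backup" | helm upgrade --install vega-metrics vegacloud/vega-metrics-agent \
    --namespace vegacloud -f -
}
```

The exported secret file is not re-applied in this sketch. A raw kubectl export carries cluster-specific metadata, so treat it as a reference copy of the credentials rather than a manifest to apply unchanged.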

Uninstallation

helm uninstall vega-metrics -n vegacloud
kubectl delete namespace vegacloud # Optional: removes the namespace

7. Troubleshooting

Common Issues

  1. Pod in CrashLoopBackOff

    • Check logs: kubectl logs -f deployment/metrics-agent -n vegacloud
    • Verify credentials: kubectl get secret vega-metrics-agent-secret -n vegacloud
    • Check resource limits: kubectl describe pod -l app=metrics-agent -n vegacloud
  2. Connection Issues

    • Verify network policies allow outbound traffic
    • Check if proxy configuration is needed
    • Ensure correct orgSlug is configured
    • Verify DNS resolution: kubectl run test-dns --image=busybox:1.28 --rm -it --restart=Never -- nslookup api.vegacloud.io
  3. Authentication Failures

    • Verify clientId and clientSecret are correct
    • Check secret creation: kubectl describe secret vega-metrics-agent-secret -n vegacloud
    • Validate API access: curl -v -H "Authorization: Bearer $TOKEN" https://api.vegacloud.io/health

Debugging Steps

  1. Enable debug logging (--reuse-values keeps the credentials and other settings from the current release; without it, values not passed on the command line revert to chart defaults):
helm upgrade vega-metrics vegacloud/vega-metrics-agent \
--set env.LOG_LEVEL=DEBUG \
--namespace vegacloud \
--reuse-values
  2. Check RBAC permissions:
kubectl auth can-i --list --as system:serviceaccount:vegacloud:vega-metrics-agent
  3. Verify network connectivity:
kubectl run test-net --image=busybox:1.28 --rm -it --restart=Never -- wget -q -O- https://api.vegacloud.io/health
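
When an issue needs to be escalated, it helps to capture the output of the commands above in one place. The sketch below collects logs, pod descriptions, and events into a directory. The function name and output file names are hypothetical. The --previous flag retrieves logs from the last terminated container, which is usually the relevant one for a pod in CrashLoopBackOff.

```shell
collect_vega_diagnostics() {
  ns="${1:-vegacloud}"
  out_dir="${2:-vega-diagnostics}"
  mkdir -p "$out_dir" || return 1
  # Each command may fail without aborting, so one missing resource
  # does not prevent the rest of the bundle from being written.
  kubectl logs deployment/metrics-agent -n "$ns" --tail=500 \
    > "$out_dir/logs.txt" 2>&1 || true
  kubectl logs deployment/metrics-agent -n "$ns" --previous --tail=500 \
    > "$out_dir/logs-previous.txt" 2>&1 || true
  kubectl describe pod -l app=metrics-agent -n "$ns" \
    > "$out_dir/describe.txt" 2>&1 || true
  kubectl get events -n "$ns" --sort-by='.lastTimestamp' \
    > "$out_dir/events.txt" 2>&1 || true
  echo "diagnostics written to $out_dir"
}
```

Review the collected files before sharing them, since debug-level logs and pod descriptions can include environment details.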

8. Reference

Resource Recommendations

Based on the number of nodes in your cluster, here are the suggested CPU and memory requirements for the agent:

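| Nodes | CPU Request | CPU Limit | Memory Request | Memory Limit |
| --- | --- | --- | --- | --- |
| < 100 | 500m | 1000m | 2Gi | 4Gi |
| 100-200 | 1000m | 1500m | 4Gi | 8Gi |
| 200-500 | 1500m | 2000m | 8Gi | 16Gi |
| 500-1000 | 2000m | 3000m | 16Gi | 24Gi |
| 1000+ | 3000m | - | 24Gi | - |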

To apply these recommendations, set the resource values at install time. For example, for a cluster with fewer than 100 nodes:

helm install vega-metrics vegacloud/vega-metrics-agent \
--set vega.clientId="your-client-id" \
--set vega.clientSecret="your-client-secret" \
--set vega.orgSlug="your-org-slug" \
--set vega.clusterName="your-cluster-name" \
--set resources.requests.cpu="500m" \
--set resources.requests.memory="2Gi" \
--set resources.limits.cpu="1000m" \
--set resources.limits.memory="4Gi"
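
The table lookup can also be scripted so that the flags follow the node count. The sketch below is illustrative: the function name is hypothetical, and because the table's ranges share their boundary values, it assumes each range includes its lower bound (so 200 nodes falls in the 200-500 row and 1000 nodes in the 1000+ row). For the 1000+ row, where the table lists no limits, only requests are emitted.

```shell
vega_resource_flags() {
  nodes="$1"
  case "$nodes" in
    ''|*[!0-9]*) echo "node count must be a non-negative integer" >&2; return 2 ;;
  esac
  if [ "$nodes" -lt 100 ]; then
    cpu_req=500m;  cpu_lim=1000m; mem_req=2Gi;  mem_lim=4Gi
  elif [ "$nodes" -lt 200 ]; then
    cpu_req=1000m; cpu_lim=1500m; mem_req=4Gi;  mem_lim=8Gi
  elif [ "$nodes" -lt 500 ]; then
    cpu_req=1500m; cpu_lim=2000m; mem_req=8Gi;  mem_lim=16Gi
  elif [ "$nodes" -lt 1000 ]; then
    cpu_req=2000m; cpu_lim=3000m; mem_req=16Gi; mem_lim=24Gi
  else
    cpu_req=3000m; cpu_lim="";    mem_req=24Gi; mem_lim=""
  fi
  flags="--set resources.requests.cpu=$cpu_req --set resources.requests.memory=$mem_req"
  if [ -n "$cpu_lim" ]; then
    flags="$flags --set resources.limits.cpu=$cpu_lim --set resources.limits.memory=$mem_lim"
  fi
  printf '%s\n' "$flags"
}

# Usage (not run here):
#   nodes=$(kubectl get nodes --no-headers | wc -l | tr -d ' ')
#   helm upgrade vega-metrics vegacloud/vega-metrics-agent \
#     --namespace vegacloud --reuse-values $(vega_resource_flags "$nodes")
```

The command substitution in the usage example is left unquoted on purpose so that the shell splits the output into separate --set arguments.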

Common Commands

Operational Commands

# Check agent status
kubectl get pods -n vegacloud -l app=metrics-agent

# View agent configuration
helm get values vega-metrics -n vegacloud

# Force agent restart
kubectl rollout restart deployment/metrics-agent -n vegacloud

# Scale agent
kubectl scale deployment/metrics-agent -n vegacloud --replicas=2

Debugging Commands

# Check agent permissions
kubectl auth can-i --list --as system:serviceaccount:vegacloud:vega-metrics-agent

# View agent events
kubectl get events -n vegacloud --sort-by='.lastTimestamp'

# Check resource usage
kubectl top pod -l app=metrics-agent -n vegacloud

# View detailed pod information
kubectl describe pod -l app=metrics-agent -n vegacloud