Prometheus Metrics Reference

      This page lists the metrics that the Couchbase Autonomous Operator exposes to Prometheus, and links to reference pages for additional metrics exported by third-party libraries.

      Operator Metrics

      | Metric | Description | Type | Unit | Labels | Optional Labels | Stability | Added |
      |---|---|---|---|---|---|---|---|
      | backup_jobs_created_total | Total number of backup jobs created by the operator | counter | | namespace,backup_type | cluster_uuid,cluster_name | committed | 2.8.0 |
      | cpu_under_management | Total CPU requests for operator-managed pods, in Kubernetes CPU units | gauge | | namespace,name | cluster_uuid,cluster_name | committed | 2.8.0 |
      | in_place_upgrade_failures | Number of times in-place upgrades have failed | counter | | name | cluster_uuid,cluster_name | committed | 2.7.0 |
      | in_place_upgrades_total | Total number of in-place upgrades performed by the operator | counter | | name | cluster_uuid,cluster_name | committed | 2.7.0 |
      | kubernetes_api_request_failures | Total failed requests to the Kubernetes API by the operator | counter | | method,host,path | | committed | 2.8.0 |
      | kubernetes_api_requests_time_milliseconds | Length of time per request to the Kubernetes API | histogram | milliseconds | method,host,path | | committed | 2.8.0 |
      | kubernetes_api_requests_total | Total requests made to the Kubernetes API by the operator | counter | | method,host,path | | committed | 2.8.0 |
      | memory_under_management_bytes | Total memory requests for operator-managed pods, in bytes | gauge | bytes | namespace,name | cluster_uuid,cluster_name | committed | 2.8.0 |
      | pod_readiness_duration | Time taken for a pod to enter a ready state | gauge | milliseconds | name,serverClass | cluster_uuid,cluster_name | committed | 2.7.0 |
      | pod_recoveries_total | Total number of times the operator has recovered a pod that was down | counter | | name,podName | cluster_uuid,cluster_name | committed | 2.7.0 |
      | pod_recovery_failures_total | Total number of times the operator has failed to recover a pod | counter | | name,podName | cluster_uuid,cluster_name | committed | 2.7.0 |
      | pod_replacements_failed | Total number of times pods have failed to be recovered by the operator | counter | | name | cluster_uuid,cluster_name | committed | 2.7.0 |
      | pod_replacements_total | Number of times the operator has replaced a Couchbase Server pod due to a change in a Couchbase cluster resource | counter | | name | cluster_uuid,cluster_name | committed | 2.7.0 |
      | reconcile_failures | Total failed reconcile operations performed on a specific cluster | counter | | namespace,name | cluster_uuid,cluster_name | committed | 2.3.0 |
      | reconcile_time_seconds | Length of time per reconcile for a specific cluster | histogram | seconds | namespace,name | cluster_uuid,cluster_name | committed | 2.3.0 |
      | reconcile_total | Total reconcile operations performed on a specific cluster | counter | | namespace,name,result | cluster_uuid,cluster_name | committed | 2.3.0 |
      | server_http_request_codes_total | Total HTTP requests to Couchbase Server for a specific cluster, method, and returned status code | counter | | name,method,code,service,host | name,namespace | committed | 2.3.0 |
      | server_http_request_failures | Total failed HTTP requests to Couchbase Server for a specific cluster | counter | | name,method,service,host | name,namespace | committed | 2.3.0 |
      | server_http_requests_time_milliseconds | Length of time per request for a specific cluster | histogram | milliseconds | name,method,service,host | name,namespace | committed | 2.3.0 |
      | server_http_requests_total | Total HTTP requests to Couchbase Server for a specific cluster | counter | | name,method,service,host | name,namespace | committed | 2.3.0 |
      | swap_rebalance_failures | Total number of times swap rebalances have failed | counter | | name | cluster_uuid,cluster_name | committed | 2.7.0 |
      | swap_rebalances_total | Total number of swap rebalances performed by the operator | counter | | name | cluster_uuid,cluster_name | committed | 2.7.0 |
      | upgrade_duration | Time taken to perform an upgrade | | milliseconds | name | cluster_uuid,cluster_name | committed | 2.7.0 |
      | volume_expansions_total | Total number of times the size of volumes under management has been increased | counter | | name,volumeName | cluster_uuid,cluster_name | committed | 2.7.0 |
      | volume_size_under_management_bytes | Total memory claimed by volumes under management by the operator, in bytes | gauge | bytes | namespace,name | cluster_uuid,cluster_name | committed | 2.8.0 |
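
The counter and histogram metrics above are typically read through PromQL rather than as raw values. The following is a minimal sketch of how such queries might be assembled. It assumes the metric names are exposed to Prometheus exactly as listed here (your deployment may add a prefix), that histograms follow the standard Prometheus convention of a `_bucket` series with an `le` label, and that a 5-minute rate window is appropriate. The helper names `failure_ratio` and `latency_quantile` are hypothetical and exist only for illustration.

```python
def failure_ratio(failures: str, total: str, window: str = "5m") -> str:
    """Build a PromQL expression for the fraction of operations that failed.

    Both arguments are counter names; rate() turns them into per-second
    rates over the window before dividing.
    """
    return f"sum(rate({failures}[{window}])) / sum(rate({total}[{window}]))"


def latency_quantile(histogram: str, q: float, window: str = "5m") -> str:
    """Build a PromQL expression estimating quantile q from histogram buckets.

    Assumes the histogram exposes the conventional <name>_bucket series.
    """
    return (
        f"histogram_quantile({q}, "
        f"sum by (le) (rate({histogram}_bucket[{window}])))"
    )


if __name__ == "__main__":
    # Share of Kubernetes API requests that failed (assumed unprefixed names).
    print(failure_ratio("kubernetes_api_request_failures",
                        "kubernetes_api_requests_total"))
    # Estimated 95th percentile reconcile time, in seconds.
    print(latency_quantile("reconcile_time_seconds", 0.95))
```

The resulting strings can be pasted into the Prometheus expression browser or used in an alerting rule. Adjust the window and any label grouping to suit your environment.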

      Additional Metrics