Set up a prometheus by a prometheus operator in a kubernetes cluster

This document is written by following this document.

Install a prometheus operator

First, create CRDs and resources in the default namespace.

set LATEST (curl -s https://api.github.com/repos/prometheus-operator/prometheus-operator/releases/latest | jq -cr .tag_name)
curl -sLO https://github.com/prometheus-operator/prometheus-operator/releases/download/$LATEST/bundle.yaml
kubectl apply -f bundle.yaml

Note that I failed to create a prometheus resource because the CRD definition was too big like next.

The CustomResourceDefinition "prometheuses.monitoring.coreos.com" is invalid: metadata.annotations: Too long: must have at most 262144 bytes

Hence, I created the resource by the next command.

kubectl apply --server-side=true -f https://raw.githubusercontent.com/prometheus-operator/kube-prometheus/main/manifests/setup/0prometheusCustomResourceDefinition.yaml

Create a Prometheus resource

Create a prometheus server with exposing its admin API

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
spec:
  serviceAccountName: prometheus
  podMonitorSelector:
    matchLabels:
      team: frontend
  resources:
    requests:
      memory: 400Mi
  enableAdminAPI: true
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources:
  - nodes
  - nodes/metrics
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources:
  - configmaps
  verbs: ["get"]
- apiGroups:
  - networking.k8s.io
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: default

An application monitor

On an application to be monitored, create a ServiceMonitor CRD. For example, if there is the next example app,

apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
      - name: example-app
        image: fabxc/instrumented_app
        ports:
        - name: web
          containerPort: 8080
---
kind: Service
apiVersion: v1
metadata:
  name: example-app
  labels:
    app: example-app
spec:
  selector:
    app: example-app
  ports:
  - name: web
    port: 8080

Then deploy the ServiceMonitor like next

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-app
  labels:
    team: frontend
spec:
  selector:
    matchLabels:
      app: example-app
  endpoints:
  - port: web

Confirm if this configuration works. At first, confirm what kind of metrics are output by example-app.

kubectl port-forward service/example-app 18080:8080
curl localhost:18080/metrics

Then access to the prometheus and see if the query runs

kubectl port-forward svc/prometheus-operated 9091:9090

Then open the browser and see if prometheus query shows the correct data.

If you can’t, see the troubleshooting page for more details.

Install Node Exporter

To utilize some metrics of nodes, follow this article.

At first, deploy

---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    app: node-exporter
  name: node-exporter
  namespace: prometheus
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      annotations:
        cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
      labels:
        app: node-exporter
    spec:
      containers:
      - args:
        - --web.listen-address=0.0.0.0:9100
        - --path.procfs=/host/proc
        - --path.sysfs=/host/sys
        image: quay.io/prometheus/node-exporter:v0.18.1
        imagePullPolicy: IfNotPresent
        name: node-exporter
        ports:
        - containerPort: 9100
          hostPort: 9100
          name: metrics
          protocol: TCP
        resources:
          limits:
            cpu: 200m
            memory: 50Mi
          requests:
            cpu: 100m
            memory: 30Mi
        volumeMounts:
        - mountPath: /host/proc
          name: proc
          readOnly: true
        - mountPath: /host/sys
          name: sys
          readOnly: true
      hostNetwork: true
      hostPID: true
      restartPolicy: Always
      tolerations:
      - effect: NoSchedule
        operator: Exists
      - effect: NoExecute
        operator: Exists
      volumes:
      - hostPath:
          path: /proc
          type: ""
        name: proc
      - hostPath:
          path: /sys
          type: ""
        name: sys
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: node-exporter
  name: node-exporter
  namespace: prometheus
spec:
  ports:
  - name: node-exporter
    port: 9100
    protocol: TCP
    targetPort: 9100
  selector:
    app: node-exporter
  sessionAffinity: None
  type: ClusterIP
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: node-exporter
    serviceMonitorSelector: prometheus
  name: node-exporter
  namespace: prometheus
spec:
  endpoints:
  - honorLabels: true
    interval: 30s
    path: /metrics
    targetPort: 9100
  jobLabel: node-exporter
  namespaceSelector:
    matchNames:
    - prometheus
  selector:
    matchLabels:
      app: node-exporter

Then change the Prometheus resources

-  serviceMonitorSelector:
-    matchLabels:
-      team: frontend
+  serviceMonitorSelector: {}

Install Kube State Metrics

Install from helm charts following the official document.

At first, install kube-state-metrics into a namespace, let’s say kube-state-metrics. Then set up a ServiceMonitor

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app: kube-state-metrics
    serviceMonitorSelector: prometheus
  name: kube-state-metrics
  namespace: prometheus
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: kube-state-metrics
  jobLabel: kube-state-metrics
  namespaceSelector:
    matchNames:
    - kube-state-metrics
  endpoints:
  - port: http

Then I was able to see some metrics exported by kube-state-metrics.

2024

Back to Top ↑

2023

Follow Kubernetes the Hard way

4 minute read

This article was written by just following Kelsey Hightower’s Kubernetes Hardway document to understand Kubernetes internal architecture.

Gcp Billing Analyze

less than 1 minute read

There are a few documents to manage billing data in BigQuery Attribution of committed use discount fees and credits How to export to BigQuery Structur...

Prometheus Metrics Overview on Grafana

1 minute read

In this post, some variables defined in Grafana are used for Prometheus metrics, including $__rate_interval: This article describes the benefit of this va...

Use Google Secret Manager in a GKE cluster

3 minute read

There are an awesome article about the options to use the Google Secret Manager and their pros and cons. In this article, use Secrets Store CSI Driver by fol...

Working around MySQL lock metadata

2 minute read

There are multiple documents about innodb locks on MySQL 5.7: InnoDB locking Locks Set by Different SQL Statements in InnoDB Using InnoDB Transaction ...

Upgrade Windows 10 to Windows 11

3 minute read

I used to use Windows 11, but for some reasons, the OS stopped working and I needed to clean-install it from Windows 10 from windows recovery environment.

Back to Top ↑

2022

MySQL backup and restore

1 minute read

In this article, explain how to backup MySQL database using Percona Xtrabackup. There are two binaries, innobackupex and xtrabackup. innobackupex is a wrappe...

tmux

1 minute read

Basic configuration

Back to Top ↑

2021

MySQL Replication

1 minute read

This configuration is for the version 5.7 and it’s minimum configuration in the official document.

jq cheetsheet

less than 1 minute read

jq is used to parse JSON result, format and output on the cli.

Compare static site generator

less than 1 minute read

There are many web sites to compare static site generator, but they miss some explanations that require to me. For some people, these features are important ...

Back to Top ↑

2020

Getting Started with Kubernetes Deployment

less than 1 minute read

The deployment is many use cases and in this page, they’re not described. For the details for those use cases or the concept of deployment, see official page.

Overview about MySQL Lock

2 minute read

This document is written for MySQL 5.7, so these contents may be not correct for other versions.

MySQL Performance

2 minute read

This document is written for MySQL 5.7, so these contents may be not correct for other versions. In this page, performance_schema is mainly discussed.

Git hooks

less than 1 minute read

Configurations

gitHub pages

3 minute read

Getting Started See Official tutorial for detail steps.

Gitconfig

1 minute read

Configuration The detail for gitconfig is written in official page.

git cli

less than 1 minute read

Written in March 2020.

MySQL Tuner

less than 1 minute read

MySQL Tuner tool This is a tool to review a configuration for MySQL server.

kubectl cheetsheet

less than 1 minute read

Collect recent error logs If the logs are outputted by zap, error messages are aggregated by checking level = error. This log does not work very well if the ...

Introduction to GCP Cloud endpoints

less than 1 minute read

The Cloud endpoint is actually the NGINX proxy which offers the following features on GCP. Authentication and validation Logging and monitoring in GCP

HTTP/2 for Go

1 minute read

http package in golang supports HTTP/2 protocols. It’s automatically configured.

Back to Top ↑

2019

Terraform overview

1 minute read

Basic concepts There are some basic components for terraform.

Protocol Buffers for Go with Gadgets

less than 1 minute read

gogo/protobuf is the library to store some extensions from golang/protobuf in this repository. There are some useful packages that golang/protobuf does not p...

Introduction to GCP Cloud CDN

less than 1 minute read

Target upstream services Cloud CDN can have only GCP load balancer as the upstream services. And GCP load balancer can configure one of followings for backen...

Getting Started with Google closure library

less than 1 minute read

Some JavaScript library depends on Google Closure. If you need to understand the behavior of such a library, you have to know closure. The official document ...

Back to Top ↑