Monitoring a k8s cluster: deploying cAdvisor + Prometheus + Grafana
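Reference article:
k8s集群部署cadvisor node-exporter prometheus grafana监控系统 - cyh00001 - 博客园
Preparation:
Cluster nodes:
master: 192.168.136.21 (all of the following steps are run on this node)
worker: 192.168.136.22
worker: 192.168.136.23
## If vim mangles the indentation when you paste: in command-line (colon) mode, :set paste enters paste mode and :set nopaste leaves it (the default). ##
1. Create the monitor namespace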
kubectl create ns monitor
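Pull the cAdvisor image. The official image is hosted on Google's registry, which cannot be reached from mainland China, so I use a third-party copy here and pull it directly. Note that the image name is lagoudocker/cadvisor:v0.37.0.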
docker pull lagoudocker/cadvisor:v0.37.0
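2. Deployment
Create the directory /opt/cadvisor_prome_gra. There are quite a few config files, so they get a directory of their own.
2.1 Deploy cAdvisor
Deploy cAdvisor as a DaemonSet. A DaemonSet ensures that every node in the cluster runs one copy of the same pod; nodes that join later automatically get the pod as well.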
vim case1-daemonset-deploy-cadvisor.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: cadvisor
  namespace: monitor
spec:
  selector:
    matchLabels:
      app: cAdvisor
  template:
    metadata:
      labels:
        app: cAdvisor
    spec:
      tolerations:    # tolerate the master's NoSchedule taint
        - effect: NoSchedule
          key: node-role.kubernetes.io/master
      hostNetwork: true
      restartPolicy: Always    # restart policy
      containers:
        - name: cadvisor
          image: lagoudocker/cadvisor:v0.37.0
          imagePullPolicy: IfNotPresent    # image pull policy
          ports:
            - containerPort: 8080
          volumeMounts:
            - name: root
              mountPath: /rootfs
            - name: run
              mountPath: /var/run
            - name: sys
              mountPath: /sys
            - name: docker
              mountPath: /var/lib/containerd
      volumes:
        - name: root
          hostPath:
            path: /
        - name: run
          hostPath:
            path: /var/run
        - name: sys
          hostPath:
            path: /sys
        - name: docker
          hostPath:
            path: /var/lib/containerd
kubectl apply -f case1-daemonset-deploy-cadvisor.yaml
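Check the pods:
kubectl get pod -n monitor -o wide
The cluster has three nodes, so there are three pods. If worker nodes are added later, the DaemonSet creates a pod on them automatically.
To test cAdvisor, open http://<masterIP>:8080 in a browser.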
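Besides the web UI, cAdvisor serves Prometheus text-format metrics under /metrics on the same port. The following is a minimal sketch, not part of the original walkthrough, that counts how many samples each metric name has. The SAMPLE text is made up for illustration, and fetch_metrics is a hypothetical helper that needs network access to a node.

```python
from collections import Counter
from urllib.request import urlopen


def count_samples(exposition_text: str) -> Counter:
    """Count samples per metric name in Prometheus text-format output."""
    counts = Counter()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The metric name ends at the first "{" (labels) or space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        counts[name] += 1
    return counts


def fetch_metrics(host: str, port: int = 8080, timeout: float = 5.0) -> str:
    """Fetch /metrics from one node (hypothetical helper, needs network access)."""
    with urlopen(f"http://{host}:{port}/metrics", timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Illustrative sample only; real cAdvisor output carries many more labels.
SAMPLE = """\
# HELP container_cpu_usage_seconds_total Cumulative cpu time consumed in seconds.
# TYPE container_cpu_usage_seconds_total counter
container_cpu_usage_seconds_total{id="/",cpu="total"} 1234.5
container_cpu_usage_seconds_total{id="/kubepods",cpu="total"} 321.0
container_memory_usage_bytes{id="/"} 987654321
"""

if __name__ == "__main__":
    for metric, n in count_samples(SAMPLE).items():
        print(metric, n)
```

Against a live node you would pass fetch_metrics("192.168.136.21") to count_samples instead of SAMPLE.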
2.2 Deploy node_exporter
Deploy node-exporter as a DaemonSet together with a Service.
vim case2-daemonset-deploy-node-exporter.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: monitor
  labels:
    k8s-app: node-exporter
spec:
  selector:
    matchLabels:
      k8s-app: node-exporter
  template:
    metadata:
      labels:
        k8s-app: node-exporter
    spec:
      tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/master
      containers:
        - image: prom/node-exporter:v1.3.1
          imagePullPolicy: IfNotPresent
          name: prometheus-node-exporter
          ports:
            - containerPort: 9100
              hostPort: 9100
              protocol: TCP
              name: metrics
          volumeMounts:
            - mountPath: /host/proc
              name: proc
            - mountPath: /host/sys
              name: sys
            - mountPath: /host
              name: rootfs
          args:
            - --path.procfs=/host/proc
            - --path.sysfs=/host/sys
            - --path.rootfs=/host
      volumes:
        - name: proc
          hostPath:
            path: /proc
        - name: sys
          hostPath:
            path: /sys
        - name: rootfs
          hostPath:
            path: /
      hostNetwork: true
      hostPID: true
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: "true"
  labels:
    k8s-app: node-exporter
  name: node-exporter
  namespace: monitor
spec:
  type: NodePort
  ports:
    - name: http
      port: 9100
      nodePort: 39100    # outside the default NodePort range (30000-32767); needs a widened --service-node-port-range on kube-apiserver
      protocol: TCP
  selector:
    k8s-app: node-exporter
kubectl apply -f case2-daemonset-deploy-node-exporter.yaml
kubectl get pod -n monitor
Verify the node-exporter data. Note that the port is 9100: http://<nodeIP>:9100
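One thing worth checking before applying the Services in this article is whether each nodePort falls inside the range the API server accepts. The sketch below assumes the kube-apiserver default of 30000-32767 (the --service-node-port-range flag can widen it) and uses the nodePort values from the manifests in this article. Note that node-exporter remains reachable on <nodeIP>:9100 either way, because its DaemonSet uses hostNetwork and hostPort.

```python
# Default NodePort range of kube-apiserver; an assumption unless your cluster
# was started with a different --service-node-port-range.
DEFAULT_NODE_PORT_RANGE = (30000, 32767)


def node_port_in_range(port: int, port_range=DEFAULT_NODE_PORT_RANGE) -> bool:
    """Return True when the port lies inside the inclusive NodePort range."""
    low, high = port_range
    return low <= port <= high


# nodePort values taken from the Service manifests in this article.
node_ports = {
    "node-exporter": 39100,
    "prometheus": 30090,
    "kube-state-metrics": 31666,
    "grafana": 31000,
}

out_of_range = sorted(
    name for name, port in node_ports.items() if not node_port_in_range(port)
)

if __name__ == "__main__":
    print("Services needing a wider NodePort range:", out_of_range)
```

If the list is not empty, either pick a nodePort inside the range or widen the range on the API server.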
2.3 Deploy Prometheus
The Prometheus resources consist of a ConfigMap, a Deployment, and a Service.
vim case3-1-prometheus-cfg.yaml
---
kind: ConfigMap
apiVersion: v1
metadata:
  labels:
    app: prometheus
  name: prometheus-config
  namespace: monitor
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      scrape_timeout: 10s
      evaluation_interval: 1m
    scrape_configs:
    - job_name: 'kubernetes-node'
      kubernetes_sd_configs:
      - role: node
      relabel_configs:
      - source_labels: [__address__]
        regex: '(.*):10250'
        replacement: '${1}:9100'
        target_label: __address__
        action: replace
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
    - job_name: 'kubernetes-node-cadvisor'
      kubernetes_sd_configs:
      - role: node
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
    - job_name: 'kubernetes-apiserver'
      kubernetes_sd_configs:
      - role: endpoints
      scheme: https
      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      relabel_configs:
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https
    - job_name: 'kubernetes-service-endpoints'
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
        action: replace
        target_label: __scheme__
        regex: (https?)
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
        target_label: kubernetes_namespace
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
        target_label: kubernetes_service_name
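The relabel rules in this ConfigMap are easier to follow with concrete values. The sketch below imitates the "replace" action in plain Python. It is an approximation for illustration, not Prometheus code: Prometheus uses RE2 and anchors the regex at both ends, which re.fullmatch stands in for here. The pod address 10.244.1.5:8080 is a made-up example value.

```python
import re


def relabel_replace(source_values, regex, replacement, separator=";"):
    """Imitate a relabel 'replace' action.

    Returns the new label value, or None when the regex does not match
    (in which case Prometheus leaves the target label untouched).
    """
    # Source label values are joined with the separator before matching.
    joined = separator.join(source_values)
    # Relabel regexes are anchored at both ends, hence fullmatch.
    match = re.fullmatch(regex, joined)
    if match is None:
        return None
    # Translate Prometheus-style $1 / ${1} references into Python's \g<1>.
    template = re.sub(r"\$\{?(\d+)\}?", r"\\g<\1>", replacement)
    return match.expand(template)


if __name__ == "__main__":
    # kubernetes-node job: kubelet address -> node-exporter address.
    print(relabel_replace(["192.168.136.22:10250"], r"(.*):10250", "${1}:9100"))
    # kubernetes-node-cadvisor job: node name -> API server proxy path.
    print(relabel_replace(["k8s-master"], r"(.+)",
                          "/api/v1/nodes/${1}/proxy/metrics/cadvisor"))
    # kubernetes-service-endpoints job: swap in the annotated port.
    print(relabel_replace(["10.244.1.5:8080", "9100"],
                          r"([^:]+)(?::\d+)?;(\d+)", "$1:$2"))
```

The third call shows why the source labels are listed in that order: the address and the prometheus.io/port annotation are joined with ";" and the regex picks the host from the first part and the port from the second.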
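kubectl apply -f case3-1-prometheus-cfg.yaml
Note: remember to change nodeName k8s-master in the case3-2 file to match your own cluster. Use the node name as listed by kubectl get nodes; replacing it with the host IP did not work for me (reason unknown).
Prometheus data is stored on node 192.168.136.21 (k8s-master) under /data/prometheusdata, the hostPath used in case3-2. That volume is declared with type: Directory, so the directory must already exist on the node, and the Prometheus container user needs write access to it:
mkdir -p /data/prometheusdata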
vim case3-2-prometheus-deployment.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus-server
  namespace: monitor
  labels:
    app: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
      component: server
    #matchExpressions:
    #- {key: app, operator: In, values: [prometheus]}
    #- {key: component, operator: In, values: [server]}
  template:
    metadata:
      labels:
        app: prometheus
        component: server
      annotations:
        prometheus.io/scrape: 'false'
    spec:
      nodeName: k8s-master
      serviceAccountName: monitor
      containers:
        - name: prometheus
          image: prom/prometheus:v2.31.2
          imagePullPolicy: IfNotPresent
          command:
            - prometheus
            - --config.file=/etc/prometheus/prometheus.yml
            - --storage.tsdb.path=/prometheus
            - --storage.tsdb.retention=720h
          ports:
            - containerPort: 9090
              protocol: TCP
          volumeMounts:
            - mountPath: /etc/prometheus/prometheus.yml
              name: prometheus-config
              subPath: prometheus.yml
            - mountPath: /prometheus/
              name: prometheus-storage-volume
      volumes:
        - name: prometheus-config
          configMap:
            name: prometheus-config
            items:
              - key: prometheus.yml
                path: prometheus.yml
                mode: 0644
        - name: prometheus-storage-volume
          hostPath:
            path: /data/prometheusdata
            type: Directory
Create the ServiceAccount and the ClusterRoleBinding:
kubectl create serviceaccount monitor -n monitor
kubectl create clusterrolebinding monitor-clusterrolebinding -n monitor --clusterrole=cluster-admin --serviceaccount=monitor:monitor
kubectl apply -f case3-2-prometheus-deployment.yaml
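The case3-2 step has a big pitfall: nodeName "k8s-master" works, but "192.168.136.21" does not. With the IP, the Deployment and the pod never came up, and the pod's log said the host "192.168.136.21" could not be found. Changing it back to "k8s-master" did not help at first either; a few days later it suddenly worked, and the machines had been shut down in between (cause unknown). A likely explanation for the IP failure is that nodeName has to equal the node's registered name as listed by kubectl get nodes, so an IP address only works if the node was registered under that IP.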
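The nodeName pitfall can be caught before applying the manifest. The sketch below is only an illustration: the set of node names is an assumption (only k8s-master appears in this article; the two worker names are hypothetical) and in practice would come from kubectl get nodes.

```python
import ipaddress


def looks_like_ip(value: str) -> bool:
    """Return True when the string parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
    except ValueError:
        return False
    return True


def check_node_name(node_name: str, known_nodes: set) -> str:
    """Classify a spec.nodeName value against the registered node names."""
    if node_name in known_nodes:
        return "ok"
    if looks_like_ip(node_name):
        # An IP is only valid if a node is registered under that exact name.
        return "ip-address"
    return "unknown-node"


# Assumed output of `kubectl get nodes`; the worker names are hypothetical.
KNOWN_NODES = {"k8s-master", "k8s-node1", "k8s-node2"}

if __name__ == "__main__":
    for candidate in ("k8s-master", "192.168.136.21", "master"):
        print(candidate, "->", check_node_name(candidate, KNOWN_NODES))
```

Anything other than "ok" means the pod would be bound to a node name the cluster does not know.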
vim case3-3-prometheus-svc.yaml
---
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: monitor
  labels:
    app: prometheus
spec:
  type: NodePort
  ports:
    - port: 9090
      targetPort: 9090
      nodePort: 30090
      protocol: TCP
  selector:
    app: prometheus
    component: server
kubectl apply -f case3-3-prometheus-svc.yaml
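2.4 Deploy the RBAC permissions
This step creates a Secret, a ServiceAccount, a ClusterRole, and a ClusterRoleBinding. The ServiceAccount is the service identity, the ClusterRole holds the permission rules, and the ClusterRoleBinding binds the ServiceAccount to the ClusterRole.
The credentials a pod uses to authenticate against the apiserver are defined in a Secret. Because they are sensitive, they are stored in a Secret resource and mounted into the pod as a volume, so that the application running in the pod can connect to the apiserver and authenticate with the information in that Secret.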
vim case4-prom-rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
  namespace: monitor
---
apiVersion: v1
kind: Secret
type: kubernetes.io/service-account-token
metadata:
  name: monitor-token
  namespace: monitor
  annotations:
    kubernetes.io/service-account.name: "prometheus"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus
rules:
  - apiGroups:
      - ""
    resources:
      - nodes
      - services
      - endpoints
      - pods
      - nodes/proxy
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - "extensions"
    resources:
      - ingresses
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - configmaps
      - nodes/metrics
    verbs:
      - get
  - nonResourceURLs:
      - /metrics
    verbs:
      - get
---
#apiVersion: rbac.authorization.k8s.io/v1beta1
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: prometheus
subjects:
  - kind: ServiceAccount
    name: prometheus
    namespace: monitor
kubectl apply -f case4-prom-rbac.yaml
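To see what the prometheus ClusterRole actually grants, its resource rules can be modelled as data and queried. This is a simplified sketch for illustration only: it ignores wildcards ("*"), resourceNames, and the nonResourceURLs rule, all of which real RBAC evaluation handles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One resource rule of a ClusterRole (simplified, no wildcards)."""
    api_groups: tuple
    resources: tuple
    verbs: tuple


# The three resource rules of the prometheus ClusterRole in this article.
PROMETHEUS_RULES = [
    Rule(("",), ("nodes", "services", "endpoints", "pods", "nodes/proxy"),
         ("get", "list", "watch")),
    Rule(("extensions",), ("ingresses",), ("get", "list", "watch")),
    Rule(("",), ("configmaps", "nodes/metrics"), ("get",)),
]


def is_allowed(rules, api_group: str, resource: str, verb: str) -> bool:
    """RBAC is additive: a request passes if any single rule covers it."""
    return any(
        api_group in rule.api_groups
        and resource in rule.resources
        and verb in rule.verbs
        for rule in rules
    )


if __name__ == "__main__":
    checks = [("", "nodes/proxy", "get"), ("", "pods", "delete"),
              ("", "configmaps", "list")]
    for group, resource, verb in checks:
        print(verb, resource, "->", is_allowed(PROMETHEUS_RULES, group, resource, verb))
```

On a live cluster, kubectl auth can-i with --as=system:serviceaccount:monitor:prometheus answers the same question authoritatively.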
2.5 Deploy kube-state-metrics
This step creates a Deployment, a Service, a ServiceAccount, a ClusterRole, and a ClusterRoleBinding.
Note that these are deployed in the kube-system namespace!
vim case5-kube-state-metrics-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-state-metrics
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kube-state-metrics
  template:
    metadata:
      labels:
        app: kube-state-metrics
    spec:
      serviceAccountName: kube-state-metrics
      containers:
        - name: kube-state-metrics
          image: registry.cn-hangzhou.aliyuncs.com/zhangshijie/kube-state-metrics:v2.6.0
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-state-metrics
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kube-state-metrics
rules:
  - apiGroups: [""]
    resources: ["nodes", "pods", "services", "resourcequotas", "replicationcontrollers", "limitranges", "persistentvolumeclaims", "persistentvolumes", "namespaces", "endpoints"]
    verbs: ["list", "watch"]
  # "apps" added here: on current clusters these resources are served from apps/v1, not extensions
  - apiGroups: ["extensions", "apps"]
    resources: ["daemonsets", "deployments", "replicasets"]
    verbs: ["list", "watch"]
  - apiGroups: ["apps"]
    resources: ["statefulsets"]
    verbs: ["list", "watch"]
  - apiGroups: ["batch"]
    resources: ["cronjobs", "jobs"]
    verbs: ["list", "watch"]
  - apiGroups: ["autoscaling"]
    resources: ["horizontalpodautoscalers"]
    verbs: ["list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kube-state-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-state-metrics
subjects:
  - kind: ServiceAccount
    name: kube-state-metrics
    namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: 'true'
  name: kube-state-metrics
  namespace: kube-system
  labels:
    app: kube-state-metrics
spec:
  type: NodePort
  ports:
    - name: kube-state-metrics
      port: 8080
      targetPort: 8080
      nodePort: 31666
      protocol: TCP
  selector:
    app: kube-state-metrics
kubectl apply -f case5-kube-state-metrics-deploy.yaml
2.6 Deploy Grafana
The Grafana web UI connects to Prometheus as its data source. This step creates a Deployment and a Service.
vim grafana-enterprise.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana-enterprise
  namespace: monitor
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana-enterprise
  template:
    metadata:
      labels:
        app: grafana-enterprise
    spec:
      containers:
        - image: grafana/grafana
          imagePullPolicy: Always
          #command:
          #  - "tail"
          #  - "-f"
          #  - "/dev/null"
          securityContext:
            allowPrivilegeEscalation: false
            runAsUser: 0
          name: grafana
          ports:
            - containerPort: 3000
              protocol: TCP
          volumeMounts:
            - mountPath: "/var/lib/grafana"
              name: data
          resources:
            requests:
              cpu: 100m
              memory: 100Mi
            limits:
              cpu: 500m
              memory: 2500Mi
      volumes:
        - name: data
          emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: monitor
spec:
  type: NodePort
  ports:
    - port: 80
      targetPort: 3000
      nodePort: 31000
  selector:
    app: grafana-enterprise
kubectl apply -f grafana-enterprise.yaml
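Open Grafana at http://<nodeIP>:31000 and log in with username admin and password admin.
Add a data source (Data sources) and name it prometheus. Note that the port is 30090, so the URL is http://<nodeIP>:30090.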
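The same data source can also be described as a JSON document, which is what Grafana's HTTP API (POST /api/datasources) accepts. The sketch below only builds and prints the payload and does not send it. The helper name is hypothetical, and the node IP is the master from this article.

```python
import json


def prometheus_datasource(node_ip: str, node_port: int = 30090,
                          name: str = "prometheus") -> dict:
    """Build a Grafana data source definition for the Prometheus NodePort."""
    return {
        "name": name,
        "type": "prometheus",
        # "proxy" means the Grafana server, not the browser, queries Prometheus.
        "access": "proxy",
        "url": f"http://{node_ip}:{node_port}",
        "isDefault": True,
    }


payload = prometheus_datasource("192.168.136.21")

if __name__ == "__main__":
    print(json.dumps(payload, indent=2))
```

Because Grafana runs inside the cluster, an in-cluster Service address for Prometheus should work as the URL as well; the NodePort URL is used here to match the step above.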
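Import dashboard template 13332. Other templates can be added too, for example 14981, 13824, and 14518.
Click the "+" icon on the left and choose "Import" to import a template.
Template 13332
The cAdvisor template is number 14282. There is one bug I have not solved yet: it can monitor the resource usage of all containers in the cluster, but once a single container is selected no data is shown. (This should be fixable.)
By default the dashboard shows pod IDs, which is inconvenient for administrators. To show pod names instead, click the settings icon at the right of the dashboard, choose "Variables", select the second variable, and change "name" to "pod".
Every panel on the dashboard needs the same change: click the panel title, choose "Edit", and change "name" to "pod".
3. Test the monitoring
Create a Deployment named nginx01 and check that it shows up in the monitoring.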
vim nginx01.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx01
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx01
  template:
    metadata:
      labels:
        app: nginx01
    spec:
      containers:
        - name: nginx
          image: nginx:1.7.9
kubectl apply -f nginx01.yaml
Two nginx01 pods appear, because the Deployment is configured with 2 replicas.
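The replica count can also be read back from Prometheus rather than from the dashboard. The sketch below builds the URL for an instant query against the Prometheus HTTP API (/api/v1/query) using the kube-state-metrics metric kube_deployment_status_replicas. It only constructs the URL; the base address assumes the NodePort Service from section 2.3.

```python
from urllib.parse import urlencode


def instant_query_url(base_url: str, promql: str) -> str:
    """Build the URL of a Prometheus instant query for the given PromQL."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Assumed address: the master node plus the prometheus Service nodePort.
PROMETHEUS = "http://192.168.136.21:30090"
QUERY = 'kube_deployment_status_replicas{deployment="nginx01"}'

url = instant_query_url(PROMETHEUS, QUERY)

if __name__ == "__main__":
    print(url)
```

Opening the printed URL in a browser, or fetching it with curl, returns a JSON document whose result value should match the replica count once kube-state-metrics is being scraped.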
With that, the cAdvisor + Prometheus + Grafana cluster monitoring deployment is complete.