【kubernetes】k8s-glp

Preface

We often need to collect logs, and once they are collected we want a central Web UI where the logs of many different nodes and services can be viewed conveniently.

The traditional choice is ELK, i.e. Elasticsearch + Logstash + Kibana. But the JVM is resource-hungry, and Elasticsearch tokenizes text to build inverted indexes for full-text search, capabilities we hardly ever need for logs: queries can usually be satisfied by attaching a few conventional labels and searching on those, with no tokenization required at all. Because Logstash itself also consumed too many resources, its author later wrote Filebeat in Go to support the log-collection pipeline; after he joined Elastic it was folded into the Beats project, which is why the stack is also called EFK or EBK.

Hence GLP: Grafana + Loki + Promtail. The whole stack is written in the Go ecosystem and is much closer to cloud native, all of it born at Grafana Labs. Low resource usage, high efficiency, it addresses the pain points above, and it supports Kubernetes natively: all of this has made it a rising star.

Kubernetes Logs

(figure: glp-1)

By default, container logs are stored under /var/log/pods.

Each directory corresponds to one Pod; the next level down is named after the container, and below that are the container's log files.

tree kube-system_kube-flannel-ds-amd64-9x66j_28e71490-d614-4cd8-9ea7-af23cc7b9bff/

kube-system_kube-flannel-ds-amd64-9x66j_28e71490-d614-4cd8-9ea7-af23cc7b9bff/
├── install-cni
│   └── 3.log -> /data/docker/containers/6accaa2d6890df8ca05d1f40aaa9b8da69ea0a00a8e4b07a0949cdc067843e37/6accaa2d6890df8ca05d1f40aaa9b8da69ea0a00a8e4b07a0949cdc067843e37-json.log
└── kube-flannel
    ├── 2.log -> /data/docker/containers/9e8eea717cc3efd0804900a53244a32286d9e04767f76d9c8a8cc3701c83ece5/9e8eea717cc3efd0804900a53244a32286d9e04767f76d9c8a8cc3701c83ece5-json.log
    └── 3.log -> /data/docker/containers/06389981d26cbe60328cd5a46af7b003c8d687d1c411704784aa12d4d82672b8/06389981d26cbe60328cd5a46af7b003c8d687d1c411704784aa12d4d82672b8-json.log

The log file kube-flannel/3.log is just a symlink to /var/lib/docker/containers/***/***.log (here /data/docker/containers; see the daemon.json note below). The logs are really maintained by Docker; Kubernetes merely references them.
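If you want to confirm where Docker writes a given container's log, docker inspect exposes it directly (the container ID here is one from the tree above, shortened):

docker inspect --format '{{.LogPath}}' 06389981d26c
# prints the full path of the corresponding *-json.log file under the Docker data root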

The logs are in JSON format; each line carries three fields (see the sample line below):

  • log: the log content
  • stream: stderr (error output) or stdout (normal output)
  • time: the timestamp
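A single line from one of these -json.log files looks like this (the message and timestamp are illustrative):

{"log":"I0512 10:20:30.123456 1 main.go:518] Starting flannel\n","stream":"stdout","time":"2021-05-12T10:20:30.123456789Z"}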

/var/lib/docker/containers is the default path; it is configured through /etc/docker/daemon.json (which is how it became /data/docker on this machine).
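For reference, a minimal /etc/docker/daemon.json that relocates the data root (data-root is the actual daemon option; the value matches the /data/docker paths seen above):

{
  "data-root": "/data/docker"
}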

grafana

Because of how Kubernetes networking is structured, components are reached through their Service names, which is a little different from how access works under docker-compose.

For example:

➜  whiteccinn.github.io git:(master) ✗ kubectl get svc
NAME           TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
dev-grafana    LoadBalancer   10.102.248.247   localhost     3000:32695/TCP   17h
dev-loki       ClusterIP      10.106.32.224    <none>        3100/TCP         17h
dev-promtail   ClusterIP      10.108.116.190   <none>        9080/TCP         17h
kubernetes     ClusterIP      10.96.0.1        <none>        443/TCP          3d9h

So from inside a container, the Pods are reached through dev-loki, dev-promtail, and dev-grafana, and each Service port is then mapped onto the corresponding container port.
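For instance, from any Pod in the default namespace the Loki Service resolves by name through cluster DNS; /ready is Loki's standard readiness endpoint:

curl http://dev-loki:3100/ready
# from another namespace, the fully qualified form works too:
curl http://dev-loki.default.svc.cluster.local:3100/ready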

deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: grafana
  name: grafana
spec:
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      securityContext:
        fsGroup: 472
        supplementalGroups:
          - 0
      containers:
        - name: grafana
          image: grafana/grafana:7.5.2
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 3000
              name: http-grafana
              protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /robots.txt
              port: 3000
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 30
            successThreshold: 1
            timeoutSeconds: 2
          livenessProbe:
            failureThreshold: 3
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            tcpSocket:
              port: 3000
            timeoutSeconds: 1
          resources:
            requests:
              cpu: 250m
              memory: 750Mi
          volumeMounts:
            - mountPath: /var/lib/grafana
              name: grafana-pv
      volumes:
        - name: grafana-pv
          persistentVolumeClaim:
            claimName: grafana-pvc

pvc

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

service

apiVersion: v1
kind: Service
metadata:
  name: grafana
spec:
  ports:
    - port: 3000
      protocol: TCP
      targetPort: http-grafana
  selector:
    app: grafana
  sessionAffinity: None
  type: LoadBalancer

loki

config-map

apiVersion: v1
kind: ConfigMap
metadata:
  name: loki-config
data:
  loki-config.yml: |
    auth_enabled: false

    server:
      http_listen_port: 3100

    ingester:
      lifecycler:
        address: 127.0.0.1
        ring:
          kvstore:
            store: inmemory
          replication_factor: 1
        final_sleep: 0s
      chunk_idle_period: 5m
      chunk_retain_period: 30s

    schema_config:
      configs:
        - from: 2020-05-15
          store: boltdb
          object_store: filesystem
          schema: v11
          index:
            prefix: index_
            period: 168h

    storage_config:
      boltdb:
        directory: /tmp/loki/index

      filesystem:
        directory: /tmp/loki/chunks

    limits_config:
      enforce_metric_name: false
      reject_old_samples: true
      reject_old_samples_max_age: 168h
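Once Loki is running with this configuration, its HTTP API gives a quick sanity check; the labels endpoint is part of Loki's standard API and lists whatever labels have been ingested:

curl http://dev-loki:3100/loki/api/v1/labels
# responds with {"status":"success","data":[...]} naming every indexed label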

pvc

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: loki-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

service

kind: Service
apiVersion: v1
metadata:
  name: loki
spec:
  ports:
    - port: 3100
      targetPort: http-loki
  selector:
    app: loki

statefulset

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: loki-statefulset
spec:
  selector:
    matchLabels:
      app: loki
  replicas: 2
  serviceName: loki
  template:
    metadata:
      labels:
        app: loki
    spec:
      containers:
        - name: loki
          image: grafana/loki:2.3.0
          imagePullPolicy: IfNotPresent
          args:
            - -config.file=/mnt/config/loki-config.yml
          ports:
            - containerPort: 3100
              name: http-loki
          volumeMounts:
            - mountPath: /tmp/loki
              name: storage-volume
            - mountPath: /mnt/config
              name: config-volume
          securityContext:
            runAsUser: 0
            runAsGroup: 0
      volumes:
        - name: storage-volume
          persistentVolumeClaim:
            claimName: loki-pvc
        - name: config-volume
          configMap:
            name: loki-config
            items:
              - key: loki-config.yml
                path: loki-config.yml

promtail

config-map

apiVersion: v1
kind: ConfigMap
metadata:
  name: promtail-config
  namespace: default
data:
  promtail-config.yml: |
    server:
      http_listen_port: 9080
      grpc_listen_port: 0

    positions:
      filename: /tmp/positions.yaml

    # clients:
    #   - url: http://dev-loki:3100/loki/api/v1/push

    scrape_configs:
      - job_name: containers
        static_configs:
          - targets:
              - localhost
            labels:
              log_from: static_pods
              __path__: /var/log/pods/*/*/*.log
        pipeline_stages:
          - docker: {}
          - match:
              selector: '{log_from="static_pods"}'
              stages:
                - regex:
                    source: filename
                    expression: "(?:pods)/(?P<namespace>\\S+?)_(?P<pod>\\S+)-\\S+?-\\S+?_\\S+?/(?P<container>\\S+?)/"
                - labels:
                    namespace:
                    pod:
                    container:
          - match:
              selector: '{namespace!~"(default|kube-system)"}'
              action: drop
              drop_counter_reason: no_use
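To make the regex stage concrete, apply it to one of the paths from the tree earlier:

# filename:  /var/log/pods/kube-system_kube-flannel-ds-amd64-9x66j_28e71490-d614-4cd8-9ea7-af23cc7b9bff/kube-flannel/3.log
# namespace = kube-system
# pod       = kube-flannel-ds   (the -\S+?-\S+? portion strips the two generated suffixes)
# container = kube-flannel

The final match stage then keeps only the default and kube-system namespaces, dropping every stream whose namespace label does not match.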

daemonset

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: promtail
spec:
  selector:
    matchLabels:
      app: promtail
  template:
    metadata:
      labels:
        app: promtail
    spec:
      containers:
        - name: promtail
          image: grafana/promtail:2.3.0
          imagePullPolicy: IfNotPresent
          args:
            - -config.file=/mnt/config/promtail-config.yml
            - -client.url=http://dev-loki:3100/loki/api/v1/push
            - -client.external-labels=hostname=$(NODE_NAME)
          ports:
            - containerPort: 9080
              name: http-promtail
          volumeMounts:
            - mountPath: /var/lib/docker/containers
              name: containers-volume
            - mountPath: /var/log/pods
              name: pods-volume
            - mountPath: /mnt/config
              name: config-volume
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          securityContext:
            runAsUser: 0
            runAsGroup: 0
      volumes:
        - name: containers-volume
          hostPath:
            path: /var/lib/docker/containers
        - name: pods-volume
          hostPath:
            path: /var/log/pods
        - name: config-volume
          configMap:
            name: promtail-config
            items:
              - key: promtail-config.yml
                path: promtail-config.yml
      tolerations:
        - key: node-role.kubernetes.io/master
          operator: Exists
          effect: NoSchedule

Note: as mentioned above, the logs under /var/log/pods are only symlinks to the logs under /var/lib/docker/containers, so the Promtail deployment has to mount both directories.
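A quick way to verify that Promtail has actually picked the files up (a sketch; the dev- resource names come from the kustomize overlay used below):

kubectl port-forward daemonset/dev-promtail 9080:9080
curl http://localhost:9080/targets
# Promtail's built-in HTTP page listing the files it is actively tailing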

service

kind: Service
apiVersion: v1
metadata:
  name: promtail
spec:
  ports:
    - port: 9080
      targetPort: http-promtail
  selector:
    app: promtail

I won't walk through every parameter; the above is the complete set of GLP manifests for Kubernetes. Since I deploy them with kustomize, the repository has a layered structure:

➜  kustomize git:(main) tree
.
├── base
│   ├── grafna
│   │   ├── deployment.yaml
│   │   ├── kustomization.yml
│   │   ├── pvc.yaml
│   │   └── service.yaml
│   ├── kustomization.yml
│   ├── loki
│   │   ├── config-map.yaml
│   │   ├── kustomization.yml
│   │   ├── pvc.yaml
│   │   ├── service.yaml
│   │   └── statefulset.yaml
│   └── promtail
│       ├── config-map.yaml
│       ├── daemonset.yaml
│       ├── kustomization.yml
│       └── service.yaml
└── overlays
    ├── dev
    │   ├── kustomization.yml
    │   └── patch.yaml
    └── prod
        ├── kustomization.yml
        └── patch.yaml
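The kustomization files themselves are not reproduced here, but a plausible sketch of overlays/dev/kustomization.yml, inferred from the rendered output below (the dev- name prefix, the common labels, and the note annotation), would be:

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namePrefix: dev-            # produces dev-grafana, dev-loki, ...
commonLabels:
  app: amyris
  org: unknow-x
  variant: dev
commonAnnotations:
  note: Hello, I am dev!
resources:
  - ../../base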

Rendering the whole set gives:

➜  kustomize git:(main) kustomize build overlays/dev
apiVersion: v1
data:
  promtail-config.yml: |
    server:
      http_listen_port: 9080
      grpc_listen_port: 0

    positions:
      filename: /tmp/positions.yaml

    # clients:
    #   - url: http://dev-loki:3100/loki/api/v1/push

    scrape_configs:
      - job_name: containers
        static_configs:
          - targets:
              - localhost
            labels:
              log_from: static_pods
              __path__: /var/log/pods/*/*/*.log
        pipeline_stages:
          - docker: {}
          - match:
              selector: '{log_from="static_pods"}'
              stages:
                - regex:
                    source: filename
                    expression: "(?:pods)/(?P<namespace>\\S+?)_(?P<pod>\\S+)-\\S+?-\\S+?_\\S+?/(?P<container>\\S+?)/"
                - labels:
                    namespace:
                    pod:
                    container:
          - match:
              selector: '{namespace!~"(default|kube-system)"}'
              action: drop
              drop_counter_reason: no_use
kind: ConfigMap
metadata:
  annotations:
    note: Hello, I am dev!
  labels:
    app: amyris
    org: unknow-x
    variant: dev
  name: dev-promtail-config
  namespace: default
---
apiVersion: v1
data:
  loki-config.yml: |
    auth_enabled: false

    server:
      http_listen_port: 3100

    ingester:
      lifecycler:
        address: 127.0.0.1
        ring:
          kvstore:
            store: inmemory
          replication_factor: 1
        final_sleep: 0s
      chunk_idle_period: 5m
      chunk_retain_period: 30s

    schema_config:
      configs:
        - from: 2020-05-15
          store: boltdb
          object_store: filesystem
          schema: v11
          index:
            prefix: index_
            period: 168h

    storage_config:
      boltdb:
        directory: /tmp/loki/index

      filesystem:
        directory: /tmp/loki/chunks

    limits_config:
      enforce_metric_name: false
      reject_old_samples: true
      reject_old_samples_max_age: 168h
kind: ConfigMap
metadata:
  annotations:
    note: Hello, I am dev!
  labels:
    app: amyris
    org: unknow-x
    variant: dev
  name: dev-loki-config
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    note: Hello, I am dev!
  labels:
    app: amyris
    org: unknow-x
    variant: dev
  name: dev-grafana
spec:
  ports:
    - port: 3000
      protocol: TCP
      targetPort: http-grafana
  selector:
    app: amyris
    org: unknow-x
    variant: dev
  sessionAffinity: None
  type: LoadBalancer
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    note: Hello, I am dev!
  labels:
    app: amyris
    org: unknow-x
    variant: dev
  name: dev-loki
spec:
  ports:
    - port: 3100
      targetPort: http-loki
  selector:
    app: amyris
    org: unknow-x
    variant: dev
---
apiVersion: v1
kind: Service
metadata:
  annotations:
    note: Hello, I am dev!
  labels:
    app: amyris
    org: unknow-x
    variant: dev
  name: dev-promtail
spec:
  ports:
    - port: 9080
      targetPort: http-promtail
  selector:
    app: amyris
    org: unknow-x
    variant: dev
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    note: Hello, I am dev!
  labels:
    app: amyris
    org: unknow-x
    variant: dev
  name: dev-grafana-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    note: Hello, I am dev!
  labels:
    app: amyris
    org: unknow-x
    variant: dev
  name: dev-loki-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    note: Hello, I am dev!
  labels:
    app: amyris
    org: unknow-x
    variant: dev
  name: dev-grafana
spec:
  selector:
    matchLabels:
      app: amyris
      org: unknow-x
      variant: dev
  template:
    metadata:
      annotations:
        note: Hello, I am dev!
      labels:
        app: amyris
        org: unknow-x
        variant: dev
    spec:
      containers:
        - image: grafana/grafana:7.5.2
          imagePullPolicy: IfNotPresent
          livenessProbe:
            failureThreshold: 3
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            tcpSocket:
              port: 3000
            timeoutSeconds: 1
          name: grafana
          ports:
            - containerPort: 3000
              name: http-grafana
              protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /robots.txt
              port: 3000
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 30
            successThreshold: 1
            timeoutSeconds: 2
          resources:
            requests:
              cpu: 250m
              memory: 750Mi
          volumeMounts:
            - mountPath: /var/lib/grafana
              name: grafana-pv
      securityContext:
        fsGroup: 472
        supplementalGroups:
          - 0
      volumes:
        - name: grafana-pv
          persistentVolumeClaim:
            claimName: dev-grafana-pvc
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  annotations:
    note: Hello, I am dev!
  labels:
    app: amyris
    org: unknow-x
    variant: dev
  name: dev-loki-statefulset
spec:
  replicas: 2
  selector:
    matchLabels:
      app: amyris
      org: unknow-x
      variant: dev
  serviceName: dev-loki
  template:
    metadata:
      annotations:
        note: Hello, I am dev!
      labels:
        app: amyris
        org: unknow-x
        variant: dev
    spec:
      containers:
        - args:
            - -config.file=/mnt/config/loki-config.yml
          image: grafana/loki:2.3.0
          imagePullPolicy: IfNotPresent
          name: loki
          ports:
            - containerPort: 3100
              name: http-loki
          securityContext:
            runAsGroup: 0
            runAsUser: 0
          volumeMounts:
            - mountPath: /tmp/loki
              name: storage-volume
            - mountPath: /mnt/config
              name: config-volume
      volumes:
        - name: storage-volume
          persistentVolumeClaim:
            claimName: dev-loki-pvc
        - configMap:
            items:
              - key: loki-config.yml
                path: loki-config.yml
            name: dev-loki-config
          name: config-volume
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  annotations:
    note: Hello, I am dev!
  labels:
    app: amyris
    org: unknow-x
    variant: dev
  name: dev-promtail
spec:
  selector:
    matchLabels:
      app: amyris
      org: unknow-x
      variant: dev
  template:
    metadata:
      annotations:
        note: Hello, I am dev!
      labels:
        app: amyris
        org: unknow-x
        variant: dev
    spec:
      containers:
        - args:
            - -config.file=/mnt/config/promtail-config.yml
            - -client.url=http://dev-loki:3100/loki/api/v1/push
            - -client.external-labels=hostname=$(NODE_NAME)
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          image: grafana/promtail:2.3.0
          imagePullPolicy: IfNotPresent
          name: promtail
          ports:
            - containerPort: 9080
              name: http-promtail
          securityContext:
            runAsGroup: 0
            runAsUser: 0
          volumeMounts:
            - mountPath: /var/lib/docker/containers
              name: containers-volume
            - mountPath: /var/log/pods
              name: pods-volume
            - mountPath: /mnt/config
              name: config-volume
      tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/master
          operator: Exists
      volumes:
        - hostPath:
            path: /var/lib/docker/containers
          name: containers-volume
        - hostPath:
            path: /var/log/pods
          name: pods-volume
        - configMap:
            items:
              - key: promtail-config.yml
                path: promtail-config.yml
            name: dev-promtail-config
          name: config-volume

Run kustomize build overlays/dev | kubectl apply -f - to launch every service of our dev environment on Kubernetes.

Use kubectl port-forward deployment.apps/dev-grafana 3000:3000 to forward the port.

➜  whiteccinn.github.io git:(master) ✗ kubectl port-forward deployment.apps/dev-grafana 3000:3000
Forwarding from 127.0.0.1:3000 -> 3000
Forwarding from [::1]:3000 -> 3000

Use kubectl get svc to check the Service port mappings.

➜  whiteccinn.github.io git:(master) ✗ kubectl get svc
NAME           TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
dev-grafana    LoadBalancer   10.102.248.247   localhost     3000:32695/TCP   17h
dev-loki       ClusterIP      10.106.32.224    <none>        3100/TCP         17h
dev-promtail   ClusterIP      10.108.116.190   <none>        9080/TCP         17h
kubernetes     ClusterIP      10.96.0.1        <none>        443/TCP          3d10h

Finally, kubectl get pods -o wide shows that all Pods are up and running.

➜  whiteccinn.github.io git:(master) ✗ kubectl get pods -o wide
NAME                           READY   STATUS    RESTARTS   AGE   IP          NODE             NOMINATED NODE   READINESS GATES
dev-grafana-7cd4c89fd4-wdkpb   1/1     Running   0          17h   10.1.0.16   docker-desktop   <none>           <none>
dev-loki-statefulset-0         1/1     Running   0          17h   10.1.0.18   docker-desktop   <none>           <none>
dev-loki-statefulset-1         1/1     Running   0          17h   10.1.0.19   docker-desktop   <none>           <none>
dev-promtail-n6jgs             1/1     Running   0          17h   10.1.0.17   docker-desktop   <none>           <none>

Then open localhost:3000 in a browser to reach Grafana.

Grafana's default username and password are both admin.

1. First, configure the data source for Grafana's dashboard (a provisioning-file sketch follows the figure).

(figure: data source)
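The essential field is the URL, which points at the Loki Service: http://dev-loki:3100. For reference, the provisioning-file equivalent of this UI step would look roughly like this (a sketch, not a file used in this setup):

apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://dev-loki:3100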

2. Configure the log visualization.

(figure: log panel)

3. Configure the search bar (see the variable-query note below).

(figure: search bar)
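The dropdowns here are dashboard variables; Grafana's Loki data source can populate a variable from label values with a query such as:

label_values(namespace)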

4. The search bar now shows up, and the query expression still needs updating (an example follows the figure).

(figure: query expression)
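Given the labels this pipeline attaches (namespace, pod, container, plus the hostname external label set in the DaemonSet args), a typical expression is something like this (label values are examples):

{namespace="default", container="grafana"} |= "error"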

And that is the goal: use GLP on Kubernetes to collect everything my Pods write to standard output and aggregate it behind a central log-query Web UI.

(figure: docker-for-mac-6)