How to create dynamic indexes with K8s metadata? #334

Open
kaiohenricunha opened this issue Apr 5, 2023 · 0 comments

I was trying to add a namespace label to logstash_prefix like this:

apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterOutput
metadata:
  name: cluster-fluentd-output-os
  labels:
    output.fluentd.fluent.io/scope: "cluster"
    output.fluentd.fluent.io/enabled: "true"
spec:
  outputs:
    - customPlugin:
        config: |
          <match **>
            @type opensearch
            host "${FLUENT_OPENSEARCH_HOST}"
            port 443
            logstash_format  true
            logstash_prefix logs-$${record['kubernetes']['namespace_name']['labels']['org']}
            scheme https
            <endpoint>
              url "https://${FLUENT_OPENSEARCH_HOST}"
              region "${FLUENT_OPENSEARCH_REGION}"
              assume_role_arn "#{ENV['AWS_ROLE_ARN']}"
              assume_role_web_identity_token_file "#{ENV['AWS_WEB_IDENTITY_TOKEN_FILE']}"
            </endpoint>
          </match>
      logLevel: info

But it ended up creating an index with the literal, unevaluated Ruby expression in its name:

logs-${record['kubernetes']['namespace_name']['labels']['org']}-2023.04.04

This is because Fluent Bit doesn't enrich the logs with any namespace metadata other than the namespace name. Here's an example log:

{
  "_index": "logs-${record['kubernetes']['pod_name']}-2023.04.04",
  "_id": "XXX-XXX",
  "_version": 1,
  "_score": null,
  "_source": {
    "log": "2023-04-04T22:19:06.572970297Z stdout F Notice: XXXX for 'XXXX'",
    "kubernetes": {
      "pod_name": "dockerhub-limit-exporter-XXX-XXX",
      "namespace_name": "observability-system",
      "labels": {
        "app.kubernetes.io/instance": "dockerhub-limit-exporter",
        "app.kubernetes.io/name": "dockerhub-limit-exporter",
        "pod-template-hash": "XXX"
      },
      "annotations": {
        "kubernetes.io/psp": "eks.privileged"
      },
      "container_name": "dockerhub-limit-exporter",
      "docker_id": "XXXX",
      "container_image": "harbor.ext.XXX.com/XXX/docker-hub-limit-exporter:0.0.0"
    },
    "@timestamp": "2023-04-04T22:19:06.576573487+00:00"
  },
  "fields": {
    "@timestamp": [
      "2023-04-04T22:19:06.576Z"
    ]
  },
  "highlight": {
    "kubernetes.namespace_name": [
      "@opensearch-dashboards-highlighted-field@observability@/opensearch-dashboards-highlighted-field@-@opensearch-dashboards-highlighted-field@system@/opensearch-dashboards-highlighted-field@"
    ]
  },
  "sort": [
    1680646746576
  ]
}
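
Since namespace_name is the only namespace-level field in the record above, a prefix keyed on it is the most these records could support as they stand. Below is a minimal, untested sketch of that variant. It assumes fluent-plugin-opensearch expands Fluentd buffer placeholders in logstash_prefix the same way fluent-plugin-elasticsearch documents, and that "$$" is the escape for a literal "$" in these manifests, as in the config above. The buffer type is an arbitrary choice for illustration.

```yaml
apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterOutput
metadata:
  name: cluster-fluentd-output-os
  labels:
    output.fluentd.fluent.io/scope: "cluster"
    output.fluentd.fluent.io/enabled: "true"
spec:
  outputs:
    - customPlugin:
        config: |
          <match **>
            @type opensearch
            host "${FLUENT_OPENSEARCH_HOST}"
            port 443
            scheme https
            logstash_format true
            # Assumption: "$$" escapes a literal "$", as in the original config.
            # ${$.kubernetes.namespace_name} is a Fluentd chunk-key placeholder
            # (record accessor syntax), not a Ruby expression.
            logstash_prefix logs-$${$.kubernetes.namespace_name}
            # A placeholder is only expanded when the same key is declared
            # as a buffer chunk key.
            <buffer tag, $.kubernetes.namespace_name>
              @type memory
            </buffer>
            <endpoint>
              url "https://${FLUENT_OPENSEARCH_HOST}"
              region "${FLUENT_OPENSEARCH_REGION}"
              assume_role_arn "#{ENV['AWS_ROLE_ARN']}"
              assume_role_web_identity_token_file "#{ENV['AWS_WEB_IDENTITY_TOKEN_FILE']}"
            </endpoint>
          </match>
      logLevel: info
```

This would still not give a per-org index, because the org label is not in the record. Records without a kubernetes.namespace_name field (for example, the service.* systemd logs) would presumably need their own match block with a fixed prefix.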

This is my actual Fluent Bit and Fluentd configuration:

apiVersion: fluentbit.fluent.io/v1alpha2
kind: FluentBit
metadata:
  labels:
    app.kubernetes.io/name: fluent-bit
  name: fluent-bit
  namespace: fluent-system
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: node-role.kubernetes.io/edge
            operator: DoesNotExist
  fluentBitConfigName: fluent-bit-config
  image: kubesphere/fluent-bit:${FLUENTBIT_IMAGE_TAG:=v2.0.9}
  imagePullSecrets:
    - name: image-pull-secret
  positionDB:
    hostPath:
      path: /var/lib/fluent-bit/
  resources:
    limits:
      cpu: ${FLUENTBIT_CPU_LIMIT:=500m}
      memory: ${FLUENTBIT_MEMORY_LIMIT:=200Mi}
    requests:
      cpu: ${FLUENTBIT_CPU_REQUEST:=10m}
      memory: ${FLUENTBIT_MEMORY_REQUEST:=25Mi}
  tolerations:
  - operator: Exists
---
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterInput
metadata:
  labels:
    fluentbit.fluent.io/component: logging
    fluentbit.fluent.io/enabled: "true"
  name: tail
spec:
  tail:
    db: /fluent-bit/tail/pos.db
    dbSync: Normal
    memBufLimit: ${FLUENTBIT_MEM_BUF_LIMIT:=5MB}
    parser: docker
    path: /var/log/containers/*.log
    refreshIntervalSeconds: 10
    skipLongLines: true
    tag: kube.*
---
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterInput
metadata:
  labels:
    fluentbit.fluent.io/component: logging
    fluentbit.fluent.io/enabled: "true"
  name: docker
spec:
  systemd:
    db: /fluent-bit/tail/systemd.db
    dbSync: Normal
    path: /var/log/journal
    systemdFilter:
    - _SYSTEMD_UNIT=docker.service
    - _SYSTEMD_UNIT=kubelet.service
    tag: service.*
---
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterFluentBitConfig
metadata:
  labels:
    app.kubernetes.io/name: fluent-bit
  name: fluent-bit-config
spec:
  filterSelector:
    matchLabels:
      fluentbit.fluent.io/enabled: "true"
  inputSelector:
    matchLabels:
      fluentbit.fluent.io/enabled: "true"
  outputSelector:
    matchLabels:
      fluentbit.fluent.io/enabled: "true"
  service:
    parsersFile: parsers.conf
---
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterFilter
metadata:
  labels:
    fluentbit.fluent.io/component: logging
    fluentbit.fluent.io/enabled: "true"
  name: kubernetes
spec:
  filters:
  - kubernetes:
      annotations: true
      kubeCAFile: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      kubeTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
      kubeURL: https://kubernetes.default.svc:443
      labels: true
  - nest:
      addPrefix: kubernetes_
      nestedUnder: kubernetes
      operation: lift
  - modify:
      rules:
      - remove: stream
      - remove: kubernetes_pod_id
      - remove: kubernetes_host
      - remove: kubernetes_container_hash
  - nest:
      nestUnder: kubernetes
      operation: nest
      removePrefix: kubernetes_
      wildcard:
      - kubernetes_*
  match: kube.*
---
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterOutput
metadata:
  labels:
    fluentbit.fluent.io/component: logging
    fluentbit.fluent.io/enabled: "true"
  name: fluentd
spec:
  forward:
    host: fluentd.fluent-system.svc
    port: 24224
  matchRegex: (?:kube|service)\.(.*)
---
apiVersion: fluentd.fluent.io/v1alpha1
kind: Fluentd
metadata:
  name: fluentd
  namespace: fluent-system
  labels:
    app.kubernetes.io/name: fluentd
spec:
  globalInputs:
    - forward:
        bind: 0.0.0.0
        port: 24224
  replicas: 1
  image: kubesphere/fluentd:${FLUENTD_IMAGE_TAG:=v1.15.3}
  imagePullSecrets:
    - name: image-pull-secret
  resources:
    limits:
      cpu: ${FLUENTD_CPU_LIMIT:=500m}
      memory: ${FLUENTD_MEMORY_LIMIT:=500Mi}
    requests:
      cpu: ${FLUENTD_CPU_REQUEST:=100m}
      memory: ${FLUENTD_MEMORY_REQUEST:=128Mi}
  fluentdCfgSelector:
    matchLabels:
      config.fluentd.fluent.io/enabled: "true"
---
apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterFluentdConfig
metadata:
  labels:
    config.fluentd.fluent.io/enabled: "true"
  name: fluentd-config
spec:
  clusterFilterSelector:
    matchLabels:
      filter.fluentd.fluent.io/enabled: "true"
  clusterOutputSelector:
    matchLabels:
      output.fluentd.fluent.io/enabled: "true"
  watchedNamespaces: [] # watches all namespaces when empty
---
apiVersion: fluentd.fluent.io/v1alpha1
kind: ClusterOutput
metadata:
  name: cluster-fluentd-output-os
  labels:
    output.fluentd.fluent.io/scope: "cluster"
    output.fluentd.fluent.io/enabled: "true"
spec:
  outputs:
    - customPlugin:
        config: |
          <match **>
            @type opensearch
            host "${FLUENT_OPENSEARCH_HOST}"
            port 443
            logstash_format  true
            logstash_prefix logs-$${record['kubernetes']['namespace_name']['labels']['org']}
            scheme https
            <endpoint>
              url "https://${FLUENT_OPENSEARCH_HOST}"
              region "${FLUENT_OPENSEARCH_REGION}"
              assume_role_arn "#{ENV['AWS_ROLE_ARN']}"
              assume_role_web_identity_token_file "#{ENV['AWS_WEB_IDENTITY_TOKEN_FILE']}"
            </endpoint>
          </match>
      logLevel: info

How can dynamic indexing be achieved using fluent-operator features?
