From dedb585bb5dcf64d5a700f126860d54a924d0722 Mon Sep 17 00:00:00 2001 From: sh0rez Date: Mon, 5 Aug 2019 19:23:28 +0200 Subject: [PATCH] docs: general documentation rework - restructures the docs to make them easier to explore - rewrites promtail docs - unifies, shortens and extends docs --- docs/api.md | 291 ------------------ docs/{design => design-documents}/labels.md | 0 docs/index.md | 34 ++ docs/logcli.md | 35 ++- docs/loki/api.md | 114 +++++++ docs/{ => loki}/operations.md | 152 --------- docs/loki/storage.md | 158 ++++++++++ docs/promtail/api.md | 6 +- docs/promtail/configuration.md | 185 +++++++++++ docs/promtail/deployment.md | 150 +++++++++ docs/promtail/examples.md | 92 ++++++ docs/promtail/overview.md | 41 +++ .../parsing.md} | 2 +- docs/querying.md | 111 +++++++ docs/usage.md | 136 -------- mkdocs.yml | 10 + 16 files changed, 918 insertions(+), 599 deletions(-) delete mode 100644 docs/api.md rename docs/{design => design-documents}/labels.md (100%) create mode 100644 docs/index.md create mode 100644 docs/loki/api.md rename docs/{ => loki}/operations.md (53%) create mode 100644 docs/loki/storage.md create mode 100644 docs/promtail/configuration.md create mode 100644 docs/promtail/deployment.md create mode 100644 docs/promtail/examples.md create mode 100644 docs/promtail/overview.md rename docs/{logentry/processing-log-lines.md => promtail/parsing.md} (99%) create mode 100644 docs/querying.md delete mode 100644 docs/usage.md create mode 100644 mkdocs.yml diff --git a/docs/api.md b/docs/api.md deleted file mode 100644 index 4b0000449719d..0000000000000 --- a/docs/api.md +++ /dev/null @@ -1,291 +0,0 @@ -# Loki API - -The Loki server has the following API endpoints (_Note:_ Authentication is out of scope for this project): - -- `POST /api/prom/push` - - For sending log entries, expects a snappy compressed proto in the HTTP Body: - - - [ProtoBuffer definition](/pkg/logproto/logproto.proto) - - [Golang client library](/pkg/promtail/client/client.go) - - Also accepts JSON formatted requests when the header `Content-Type: application/json` is sent. Example of the JSON format: - - ```json - { - "streams": [ - { - "labels": "{foo=\"bar\"}", - "entries": [{ "ts": "2018-12-18T08:28:06.801064-04:00", "line": "baz" }] - } - ] - } - - ``` - -- `GET /api/v1/query` - - For doing instant queries at a single point in time, accepts the following parameters in the query-string: - - - `query`: a logQL query - - `limit`: max number of entries to return (not used for metric queries) - - `time`: the evaluation time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always now. - - `direction`: `forward` or `backward`, useful when specifying a limit. Default is backward. - - Loki needs to query the index store in order to find log streams for particular labels and the store is spread out by time, - so you need to specify the time and labels accordingly. Querying a long time into the history will cause additional - load to the index server and make the query slower. 
- - Responses looks like this: - - ```json - { - "resultType": "vector" | "streams", - "result": - } - ``` - - Examples: - - ```bash - $ curl -G -s "http://localhost:3100/api/v1/query" --data-urlencode 'query=sum(rate({job="varlogs"}[10m])) by (level)' | jq - { - "resultType": "vector", - "result": [ - { - "metric": {}, - "value": [ - 1559848867745737, - "1267.1266666666666" - ] - }, - { - "metric": { - "level": "warn" - }, - "value": [ - 1559848867745737, - "37.77166666666667" - ] - }, - { - "metric": { - "level": "info" - }, - "value": [ - 1559848867745737, - "37.69" - ] - } - ] - } - ``` - - ```bash - curl -G -s "http://localhost:3100/api/v1/query" --data-urlencode 'query={job="varlogs"}' | jq - { - "resultType": "streams", - "result": [ - { - "labels": "{filename=\"/var/log/myproject.log\", job=\"varlogs\", level=\"info\"}", - "entries": [ - { - "ts": "2019-06-06T19:25:41.972739Z", - "line": "foo" - }, - { - "ts": "2019-06-06T19:25:41.972722Z", - "line": "bar" - } - ] - } - ] - ``` - -- `GET /api/v1/query_range` - - For doing queries over a range of time, accepts the following parameters in the query-string: - - - `query`: a logQL query - - `limit`: max number of entries to return (not used for metric queries) - - `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always one hour ago. - - `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always now. - - `step`: query resolution step width in seconds. Default 1 second. - - `direction`: `forward` or `backward`, useful when specifying a limit. Default is backward. - - Loki needs to query the index store in order to find log streams for particular labels and the store is spread out by time, - so you need to specify the time and labels accordingly. Querying a long time into the history will cause additional - load to the index server and make the query slower. - - Responses looks like this: - - ```json - { - "resultType": "matrix" | "streams", - "result": - } - ``` - - Examples: - - ```bash - $ curl -G -s "http://localhost:3100/api/v1/query_range" --data-urlencode 'query=sum(rate({job="varlogs"}[10m])) by (level)' --data-urlencode 'step=300' | jq - { - "resultType": "matrix", - "result": [ - { - "metric": { - "level": "info" - }, - "values": [ - [ - 1559848958663735, - "137.95" - ], - [ - 1559849258663735, - "467.115" - ], - [ - 1559849558663735, - "658.8516666666667" - ] - ] - }, - { - "metric": { - "level": "warn" - }, - "values": [ - [ - 1559848958663735, - "137.27833333333334" - ], - [ - 1559849258663735, - "467.69" - ], - [ - 1559849558663735, - "660.6933333333334" - ] - ] - } - ] - } - ``` - - ```bash - curl -G -s "http://localhost:3100/api/v1/query_range" --data-urlencode 'query={job="varlogs"}' | jq - { - "resultType": "streams", - "result": [ - { - "labels": "{filename=\"/var/log/myproject.log\", job=\"varlogs\", level=\"info\"}", - "entries": [ - { - "ts": "2019-06-06T19:25:41.972739Z", - "line": "foo" - }, - { - "ts": "2019-06-06T19:25:41.972722Z", - "line": "bar" - } - ] - } - ] - ``` - -- `GET /api/prom/query` - - For doing queries, accepts the following parameters in the query-string: - - - `query`: a [logQL query](./usage.md) (eg: `{name=~"mysql.+"}` or `{name=~"mysql.+"} |= "error"`) - - `limit`: max number of entries to return - - `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970) or as RFC3339Nano (eg: "2006-01-02T15:04:05.999999999-07:00"). Default is always one hour ago. 
- - `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970) or as RFC3339Nano (eg: "2006-01-02T15:04:05.999999999-07:00"). Default is current time. - - `direction`: `forward` or `backward`, useful when specifying a limit. Default is backward. - - `regexp`: a regex to filter the returned results - - Loki needs to query the index store in order to find log streams for particular labels and the store is spread out by time, - so you need to specify the start and end labels accordingly. Querying a long time into the history will cause additional - load to the index server and make the query slower. - - > This endpoint will be deprecated in the future you should use `api/v1/query_range` instead. - > You can only query for logs, it doesn't accept [queries returning metrics](./usage.md#counting-logs). - - Responses looks like this: - - ```json - { - "streams": [ - { - "labels": "{instance=\"...\", job=\"...\", namespace=\"...\"}", - "entries": [ - { - "ts": "2018-06-27T05:20:28.699492635Z", - "line": "..." - }, - ... - ] - }, - ... - ] - } - ``` - -- `GET /api/prom/label` - - For doing label name queries, accepts the following parameters in the query-string: - - - `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always 6 hour ago. - - `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is current time. - - Responses looks like this: - - ```json - { - "values": [ - "instance", - "job", - ... - ] - } - ``` - -- `GET /api/prom/label//values` - - For doing label values queries, accepts the following parameters in the query-string: - - - `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always 6 hour ago. - - `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is current time. - - Responses looks like this: - - ```json - { - "values": [ - "default", - "cortex-ops", - ... - ] - } - ``` - -- `GET /ready` - - This endpoint returns 200 when Loki ingester is ready to accept traffic. If you're running Loki on Kubernetes, this endpoint can be used as readiness probe. - -- `GET /flush` - - This endpoint triggers a flush of all in memory chunks in the ingester. Mainly used for local testing. - -- `GET /metrics` - - This endpoint returns Loki metrics for Prometheus. See "[Operations > Observability > Metrics](./operations.md)" to have a list of exported metrics. - - -## Examples of using the API in a third-party client library - -1) Take a look at this [client](https://github.com/afiskon/promtail-client), but be aware that the API is not stable yet (Golang). -2) Example on [Python3](https://github.com/sleleko/devops-kb/blob/master/python/push-to-loki.py) diff --git a/docs/design/labels.md b/docs/design-documents/labels.md similarity index 100% rename from docs/design/labels.md rename to docs/design-documents/labels.md diff --git a/docs/index.md b/docs/index.md new file mode 100644 index 0000000000000..2ee9a91518217 --- /dev/null +++ b/docs/index.md @@ -0,0 +1,34 @@ +

+<!-- Loki Logo -->
+*Like Prometheus, but for logs!*
+

+ +Grafana Loki is a set of components, that can be composed into a fully featured logging stack. + +It builds around the idea of treating a single log line as-is. This means that +instead of full-text indexing them, related logs are grouped using the same labels +as in Prometheus. This is much more efficient and scales better. + +## Components +- **[Loki](loki/overview.md)**: The main server component is called Loki. It is responsible for + permanently storing the logs it is being shipped and it executes the LogQL + queries from clients. + Loki shares its high-level architecture with Cortex, a highly scalable + Prometheus backend. +- **[Promtail](promtail/overview.md)**: To ship logs to a central place, an agent is required. Promtail + is deployed to every node that should be monitored and sends the logs to Loki. + It also does important task of pre-processing the log lines, including + attaching labels to them for easier querying. +- *Grafana*: The *Explore* feature of Grafana 6.0+ is the primary place of + contact between a human and Loki. It is used for discovering and analyzing logs. + +Alongside these main components, there are some other ones as well: + +- **[LogCLI](logcli.md)**: A command line interface to query logs and labels from Loki +- **Canary**: An audit utility to analyze the log-capturing performance of Loki. + Ingests data into Loki and immediately reads it back to check for latency and loss. +- **Docker Driver**: A Docker [log driver](https://docs.docker.com/config/containers/logging/configure/) to ship logs captured by Docker + directly to Loki, without the need of an agent. +- **Fluentd Plugin**: An Fluentd [output + plugin](https://docs.fluentd.org/output), to use Fluentd for shipping logs + into Loki diff --git a/docs/logcli.md b/docs/logcli.md index 4408d894c283b..4a137365ca89d 100644 --- a/docs/logcli.md +++ b/docs/logcli.md @@ -1,23 +1,25 @@ -# Log CLI usage Instructions +# LogCLI -Loki's main query interface is Grafana; however, a basic CLI is provided as a proof of concept. - -Once you have Loki running in a cluster, you can query logs from that cluster. +LogCLI is a handy tool to query logs from Loki without having to run a full Grafana instance. ## Installation -### Get latest version +### Binary (Recommended) +Head over to the [Releases](https://github.com/grafana/loki/releases) and download the `logcli` binary for your OS: +```bash +# download a binary (adapt app, os and arch as needed) +# installs v0.2.0. For up to date URLs refer to the release's description +$ curl -fSL -o "/usr/local/bin/logcli.gz" "https://github.com/grafana/logcli/releases/download/v0.2.0/logcli-linux-amd64.gz" +$ gunzip "/usr/local/bin/logcli.gz" -``` -$ go get github.com/grafana/loki/cmd/logcli +# make sure it is executable +$ chmod a+x "/usr/local/bin/logcli" ``` -### Build from source +### From source ``` -$ go get github.com/grafana/loki -$ cd $GOPATH/src/github.com/grafana/loki -$ go build ./cmd/logcli +$ go get github.com/grafana/loki/cmd/logcli ``` Now `logcli` is in your current directory. @@ -36,14 +38,15 @@ Otherwise, when running e.g. [locally](https://github.com/grafana/loki/tree/mast ``` $ export GRAFANA_ADDR=http://localhost:3100 ``` -> Note: If you are running loki behind a proxy server and have an authentication setup. You will have to pass URL, username and password accordingly. Please refer to the [docs](https://github.com/adityacs/loki/blob/master/docs/operations.md) for more info. 
+> Note: If you are running loki behind a proxy server and have an authentication setup, you will have to pass URL, username and password accordingly. Please refer to [Authentication](loki/operations.md#authentication) for more info. -``` +```bash $ logcli labels job https://logs-dev-ops-tools1.grafana.net/api/prom/label/job/values cortex-ops/consul cortex-ops/cortex-gw ... + $ logcli query '{job="cortex-ops/consul"}' https://logs-dev-ops-tools1.grafana.net/api/prom/query?query=%7Bjob%3D%22cortex-ops%2Fconsul%22%7D&limit=30&start=1529928228&end=1529931828&direction=backward®exp= Common labels: {job="cortex-ops/consul", namespace="cortex-ops"} @@ -55,14 +58,14 @@ Common labels: {job="cortex-ops/consul", namespace="cortex-ops"} Configuration values are considered in the following order (lowest to highest): -- environment value -- command line +- Environment variables +- Command line flags The URLs of the requests are printed to help with integration work. ### Details -```console +```bash $ logcli help usage: logcli [] [ ...] diff --git a/docs/loki/api.md b/docs/loki/api.md new file mode 100644 index 0000000000000..db116017df23d --- /dev/null +++ b/docs/loki/api.md @@ -0,0 +1,114 @@ +# API + +The Loki server has the following API endpoints (_Note:_ Authentication is out of scope for this project): + +### `POST /api/prom/push` + +For sending log entries, expects a snappy compressed proto in the HTTP Body: + +- [ProtoBuffer definition](/pkg/logproto/logproto.proto) +- [Golang client library](/pkg/promtail/client/client.go) + +Also accepts JSON formatted requests when the header `Content-Type: application/json` is sent. Example of the JSON format: + +```json +{ + "streams": [ + { + "labels": "{foo=\"bar\"}", + "entries": [{ "ts": "2018-12-18T08:28:06.801064-04:00", "line": "baz" }] + } + ] +} +``` + +### `GET /api/prom/query` + +For doing queries, accepts the following parameters in the query-string: + +- `query`: a [logQL query](./usage.md) (eg: `{name=~"mysql.+"}` or `{name=~"mysql.+"} |= "error"`) +- `limit`: max number of entries to return +- `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970) or as RFC3339Nano (eg: "2006-01-02T15:04:05.999999999-07:00"). Default is always one hour ago. +- `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970) or as RFC3339Nano (eg: "2006-01-02T15:04:05.999999999-07:00"). Default is current time. +- `direction`: `forward` or `backward`, useful when specifying a limit. Default is backward. +- `regexp`: a regex to filter the returned results + +Loki needs to query the index store in order to find log streams for particular labels and the store is spread out by time, +so you need to specify the start and end labels accordingly. Querying a long time into the history will cause additional +load to the index server and make the query slower. + +Responses looks like this: + +```json +{ + "streams": [ + { + "labels": "{instance=\"...\", job=\"...\", namespace=\"...\"}", + "entries": [ + { + "ts": "2018-06-27T05:20:28.699492635Z", + "line": "..." + }, + ... + ] + }, + ... + ] +} +``` + +### `GET /api/prom/label` + +For doing label name queries, accepts the following parameters in the query-string: + +- `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always 6 hour ago. +- `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is current time. 
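+For example, label names could be queried with a plain HTTP request (a minimal sketch, assuming Loki is reachable on `localhost:3100`):
+
+```bash
+# list label names over the default time range (last 6 hours)
+$ curl -G -s "http://localhost:3100/api/prom/label" | jq
+```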
+ +Responses looks like this: + +```json +{ + "values": [ + "instance", + "job", + ... + ] +} +``` + +`GET /api/prom/label//values` + +For doing label values queries, accepts the following parameters in the query-string: + +- `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always 6 hour ago. +- `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is current time. + +Responses looks like this: + +```json +{ + "values": [ + "default", + "cortex-ops", + ... + ] +} +``` + +### `GET /ready` + +This endpoint returns 200 when Loki ingester is ready to accept traffic. If you're running Loki on Kubernetes, this endpoint can be used as readiness probe. + +### `GET /flush` + +This endpoint triggers a flush of all in memory chunks in the ingester. Mainly used for local testing. + +### `GET /metrics` + +This endpoint returns Loki metrics for Prometheus. See "[Operations > Observability > Metrics](./operations.md)" to have a list of exported metrics. + + +## Examples of using the API in a third-party client library + +1. Take a look at this [client](https://github.com/afiskon/promtail-client), but be aware that the API is not stable yet (Golang). +2. Example on [Python3](https://github.com/sleleko/devops-kb/blob/master/python/push-to-loki.py) diff --git a/docs/operations.md b/docs/loki/operations.md similarity index 53% rename from docs/operations.md rename to docs/loki/operations.md index d6807745f0c3d..11938e8ab4e10 100644 --- a/docs/operations.md +++ b/docs/loki/operations.md @@ -86,155 +86,3 @@ When scaling Loki, consider running several Loki processes with their respective Take a look at their respective `.libsonnet` files in [our production setup](../production/ksonnet/loki) to get an idea about resource usage. We're happy to get feedback about your resource usage. - -## Storage - -Loki needs two stores: an index store and a chunk store. -Loki receives logs in separate streams. -Each stream is identified by a set of labels. -As the log entries from a stream arrive, they are gzipped as chunks and saved in the chunks store. -The index then stores the stream's label set, and links them to the chunks. -The chunk format refer to [doc](../pkg/chunkenc/README.md) - -### Local storage - -By default, Loki stores everything on disk. -The index is stored in a BoltDB under `/tmp/loki/index`. -The chunks are stored under `/tmp/loki/chunks`. - -### Google Cloud Storage - -Loki has support for Google Cloud storage. -Take a look at our [production setup](https://github.com/grafana/loki/blob/a422f394bb4660c98f7d692e16c3cc28747b7abd/production/ksonnet/loki/config.libsonnet#L55) for the relevant configuration fields. - -### Cassandra - -Loki can use Cassandra for the index storage. Please pull the **latest** Loki docker image or build from **latest** source code. 
Example config for using Cassandra: - -```yaml -schema_config: - configs: - - from: 2018-04-15 - store: cassandra - object_store: filesystem - schema: v9 - index: - prefix: cassandra_table - period: 168h - -storage_config: - cassandra: - username: cassandra - password: cassandra - addresses: 127.0.0.1 - auth: true - keyspace: lokiindex - - filesystem: - directory: /tmp/loki/chunks -``` - -### AWS S3 & DynamoDB - -Example config for using S3 & DynamoDB: - -```yaml -schema_config: - configs: - - from: 2018-04-15 - store: aws - object_store: s3 - schema: v9 - index: - prefix: dynamodb_table_name - period: 0 -storage_config: - aws: - s3: s3://access_key:secret_access_key@region/bucket_name - dynamodbconfig: - dynamodb: dynamodb://access_key:secret_access_key@region -``` - -You can also use an EC2 instance role instead of hard coding credentials like in the above example. -If you wish to do this the storage_config example looks like this: - -```yaml -storage_config: - aws: - s3: s3://region/bucket_name - dynamodbconfig: - dynamodb: dynamodb://region -``` - -#### S3 - -Loki is using S3 as object storage. It stores log within directories based on -[`OrgID`](./operations.md#Multi-tenancy). For example, Logs from org `faker` -will stored in `s3://BUCKET_NAME/faker/`. - -The S3 configuration is setup with url format: `s3://access_key:secret_access_key@region/bucket_name`. - -For custom S3 endpoint (like Ceph Object Storage with S3 Compatible API), if it's using path-style url rather than -virtual hosted bucket addressing, please set config like below: - -```yaml -storage_config: - aws: - s3: s3://access_key:secret_access_key@custom_endpoint/bucket_name - s3forcepathstyle: true -``` - -To write to S3, Loki will require the following permissions on the bucket: - -* s3:ListBucket -* s3:PutObject -* s3:GetObject - -#### DynamoDB - -Loki uses DynamoDB for the index storage. It is used for querying logs, make -sure you adjust your throughput to your usage. - -DynamoDB access is very similar to S3, however you do not need to specify a -table name in the storage section, as Loki will calculate that for you. -You will need to set the table name prefix inside schema config section, -and ensure the `index.prefix` table exists. - -You can setup DynamoDB by yourself, or have `table-manager` setup for you. -You can find out more info about table manager at -[Cortex project](https://github.com/cortexproject/cortex). -There is an example table manager deployment inside the ksonnet deployment method. You can find it [here](../production/ksonnet/loki/table-manager.libsonnet) -The table-manager allows deleting old indices by rotating a number of different dynamodb tables and deleting the oldest one. If you choose to -create the table manually you cannot easily erase old data and your index just grows indefinitely. - -If you set your DynamoDB table manually, ensure you set the primary index key to `h` -(string) and use `r` (binary) as the sort key. Also set the "period" attribute in the yaml to zero. -Make sure adjust your throughput base on your usage. - -DynamoDB's table manager client defaults provisioning capacity units read to 300 and writes to 3000. 
-If you wish to override these defaults the config section should include: - -```yaml -table_manager: - index_tables_provisioning: - provisioned_write_throughput: 10 - provisioned_read_throughput: 10 - chunk_tables_provisioning: - provisioned_write_throughput: 10 - provisioned_read_throughput: 10 -``` - -For DynamoDB, Loki will require the following permissions on the table: - -* dynamodb:BatchGetItem -* dynamodb:BatchWriteItem -* dynamodb:DeleteItem -* dynamodb:DescribeTable -* dynamodb:GetItem -* dynamodb:ListTagsOfResource -* dynamodb:PutItem -* dynamodb:Query -* dynamodb:TagResource -* dynamodb:UntagResource -* dynamodb:UpdateItem -* dynamodb:UpdateTable diff --git a/docs/loki/storage.md b/docs/loki/storage.md new file mode 100644 index 0000000000000..8bfcc520e77f9 --- /dev/null +++ b/docs/loki/storage.md @@ -0,0 +1,158 @@ +# Storage + +Loki needs to store two different types of data: **Chunks** and **Indexes**. + +Loki receives logs in separate streams. Each stream is identified by a set of labels. +As the log entries from a stream arrive, they are gzipped as chunks and saved in +the chunks store. The chunk format is documented in [`pkg/chunkenc`](../pkg/chunkenc/README.md). + +On the other hand, the index stores the stream's label set and links them to the +individual chunks. + +### Local storage + +By default, Loki stores everything on disk. The index is stored in a BoltDB under +`/tmp/loki/index` and the chunks are stored under `/tmp/loki/chunks`. + +### Google Cloud Storage + +Loki supports Google Cloud Storage. Refer to Grafana Labs' +[production setup](https://github.com/grafana/loki/blob/a422f394bb4660c98f7d692e16c3cc28747b7abd/production/ksonnet/loki/config.libsonnet#L55) +for the relevant configuration fields. + +### Cassandra + +Loki can use Cassandra for the index storage. Example config using Cassandra: + +```yaml +schema_config: + configs: + - from: 2018-04-15 + store: cassandra + object_store: filesystem + schema: v9 + index: + prefix: cassandra_table + period: 168h + +storage_config: + cassandra: + username: cassandra + password: cassandra + addresses: 127.0.0.1 + auth: true + keyspace: lokiindex + + filesystem: + directory: /tmp/loki/chunks +``` + +### AWS S3 & DynamoDB + +Example config for using S3 & DynamoDB: + +```yaml +schema_config: + configs: + - from: 0 + store: dynamo + object_store: s3 + schema: v9 + index: + prefix: dynamodb_table_name + period: 0 +storage_config: + aws: + s3: s3://access_key:secret_access_key@region/bucket_name + dynamodbconfig: + dynamodb: dynamodb://access_key:secret_access_key@region +``` + +If you don't wish to hard-code S3 credentials, you can also configure an +EC2 instance role by changing the `storage_config` section: + +```yaml +storage_config: + aws: + s3: s3://region/bucket_name + dynamodbconfig: + dynamodb: dynamodb://region +``` + +#### S3 + +Loki can use S3 as object storage, storing logs within directories based on +the [OrgID](./operations.md#Multi-tenancy). For example, logs from the `faker` +org will be stored in `s3://BUCKET_NAME/faker/`. + +The S3 configuration is set up using the URL format: +`s3://access_key:secret_access_key@region/bucket_name`. + +S3-compatible APIs (e.g., Ceph Object Storage with an S3-compatible API) can +be used. 
If the API supports path-style URL rather than virtual hosted bucket +addressing, configure the URL in `storage_config` with the custom endpoint: + +```yaml +storage_config: + aws: + s3: s3://access_key:secret_access_key@custom_endpoint/bucket_name + s3forcepathstyle: true +``` + +Loki needs the following permissions to write to an S3 bucket: + +* s3:ListBucket +* s3:PutObject +* s3:GetObject + +#### DynamoDB + +Loki can use DynamoDB for storing the index. The index is used for querying +logs. Throughput to the index should be adjusted to your usage. + +Access to DynamoDB is very similar to S3; however, a table name does not +need to be specified in the storage section, as Loki calculates that for +you. The table name prefix will need to be configured inside `schema_config` +for Loki to be able to create new tables. + +DynamoDB can be set up manually or automatically through `table-manager`. +The `table-manager` allows deleting old indices by rotating a number of +different DynamoDB tables and deleting the oldest one. An example deployment +of the `table-manager` using ksonnet can be found +[here](../production/ksonnet/loki/table-manager.libsonnet) and more information +about it can be find at the +[Cortex project](https://github.com/cortexproject/cortex). + +DynamoDB's `table-manager` client defaults provisioning capacity units +read to 300 and writes to 3000. The defaults can be overwritten in the +config: + +```yaml +table_manager: + index_tables_provisioning: + provisioned_write_throughput: 10 + provisioned_read_throughput: 10 + chunk_tables_provisioning: + provisioned_write_throughput: 10 + provisioned_read_throughput: 10 +``` + +If DynamoDB is set up manually, old data cannot be easily erased and the index +will grow indefinitely. Manual configurations should ensure that the primary +index key is set to `h` (string) and the sort key is set to `r` (binary). The +"period" attribute in the yaml should be set to zero. + +Loki needs the following permissions to write to DynamoDB: + +* dynamodb:BatchGetItem +* dynamodb:BatchWriteItem +* dynamodb:DeleteItem +* dynamodb:DescribeTable +* dynamodb:GetItem +* dynamodb:ListTagsOfResource +* dynamodb:PutItem +* dynamodb:Query +* dynamodb:TagResource +* dynamodb:UntagResource +* dynamodb:UpdateItem +* dynamodb:UpdateTable diff --git a/docs/promtail/api.md b/docs/promtail/api.md index 2d1cf892ab620..a60d3b663a1e6 100644 --- a/docs/promtail/api.md +++ b/docs/promtail/api.md @@ -1,12 +1,12 @@ -# Promtail API +# API Promtail features an embedded web server exposing a web console at `/` and the following API endpoints: -- `GET /ready` +### `GET /ready` This endpoint returns 200 when Promtail is up and running, and there's at least one working target. -- `GET /metrics` +### `GET /metrics` This endpoint returns Promtail metrics for Prometheus. See "[Operations > Observability > Metrics](./operations.md)" to have a list of exported metrics. diff --git a/docs/promtail/configuration.md b/docs/promtail/configuration.md new file mode 100644 index 0000000000000..e86a9a8cc7488 --- /dev/null +++ b/docs/promtail/configuration.md @@ -0,0 +1,185 @@ +# Configuration + +## `scrape_configs` (Target Discovery) +The way how Promtail finds out the log locations and extracts the set of labels +is by using the `scrape_configs` section in the `promtail.yaml` configuration +file. The syntax is equal to what [Prometheus +uses](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config). 
+ +The `scrape_configs` contains one or more *entries* which are all executed for +each discovered target (read each container in each new pod running in the instance): +```yaml +scrape_configs: + - job_name: local + static_configs: + - ... + + - job_name: kubernetes + kubernetes_sd_config: + - ... +``` + +If more than one entry matches your logs, you will get duplicates as the logs are +sent in more than one stream, likely with a slightly different labels. + +There are different types of labels present in Promtail: + +* Labels starting with `__` (two underscores) are internal labels. They usually + come from dynamic sources like the service discovery. Once relabeling is done, + they are removed from the label set. To persist those, rename them to + something not starting with `__`. +* Labels starting with `__meta_kubernetes_pod_label_*` are "meta labels" which + are generated based on your kubernetes pod labels. + Example: If your kubernetes pod has a label `name` set to `foobar` then the + `scrape_configs` section will have a label `__meta_kubernetes_pod_label_name` + with value set to `foobar`. +* There are other `__meta_kubernetes_*` labels based on the Kubernetes + metadadata, such as the namespace the pod is running in + (`__meta_kubernetes_namespace`) or the name of the container inside the pod + (`__meta_kubernetes_pod_container_name`). Refer to [the Prometheus + docs](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config) + for the full list. +* The label `__path__` is a special label which Promtail will use afterwards to + figure out where the file to be read is located. Wildcards are allowed. +* The label `filename` is added for every file found in `__path__` to ensure + uniqueness of the streams. It contains the absolute path of the file the line + was read from. + +## `relabel_configs` (Relabeling) +The most important part of each entry is the `relabel_configs` stanza, which is a list +of operations to create, rename, modify or alter the labels. + +A single `scrape_config` can also reject logs by doing an `action: drop` if a label value +matches a specified regex, which means that this particular `scrape_config` will +not forward logs from a particular log source. +This does not mean that other `scrape_config`'s might not do, though. + +Many of the `scrape_configs` read labels from `__meta_kubernetes_*` meta-labels, +assign them to intermediate labels such as `__service__` based on +different logic, possibly drop the processing if the `__service__` was empty +and finally set visible labels (such as `job`) based on the `__service__` +label. + +In general, all of the default Promtail `scrape_configs` do the following: + + * They read pod logs from under `/var/log/pods/$1/*.log`. + * They set `namespace` label directly from the `__meta_kubernetes_namespace`. 
+ * They expect to see your pod name in the `name` label + * They set a `job` label which is roughly `namespace/job` + +#### Examples + +* Drop the processing if a label is empty: +```yaml + - action: drop + regex: ^$ + source_labels: + - __service__ +``` +* Drop the processing if any of these labels contains a value: +```yaml + - action: drop + regex: .+ + separator: '' + source_labels: + - __meta_kubernetes_pod_label_name + - __meta_kubernetes_pod_label_app +``` +* Rename a metadata label into another so that it will be visible in the final log stream: +```yaml + - action: replace + source_labels: + - __meta_kubernetes_namespace + target_label: namespace +``` +* Convert all of the Kubernetes pod labels into visible labels: +```yaml + - action: labelmap + regex: __meta_kubernetes_pod_label_(.+) +``` + + +Additional reading: + + * [Julien Pivotto's slides from PromConf Munich, 2017](https://www.slideshare.net/roidelapluie/taking-advantage-of-prometheus-relabeling-109483749) + +## `client_option` (HTTP Client) +Promtail uses the Prometheus HTTP client implementation for all calls to Loki. +Therefore, you can configure it using the `client` stanza: +```yaml +client: [ ] +``` + +Reference for `client_option`: +```yaml +# Sets the `url` of loki api push endpoint +url: http[s]://:/api/prom/push + +# Sets the `Authorization` header on every promtail request with the +# configured username and password. +# password and password_file are mutually exclusive. +basic_auth: + username: + password: + password_file: + +# Sets the `Authorization` header on every promtail request with +# the configured bearer token. It is mutually exclusive with `bearer_token_file`. +bearer_token: + +# Sets the `Authorization` header on every promtail request with the bearer token +# read from the configured file. It is mutually exclusive with `bearer_token`. +bearer_token_file: /path/to/bearer/token/file + +# Configures the promtail request's TLS settings. +tls_config: + # CA certificate to validate API server certificate with. + # If not provided Trusted CA from sytem will be used. + ca_file: + + # Certificate and key files for client cert authentication to the server. + cert_file: + key_file: + + # ServerName extension to indicate the name of the server. + # https://tools.ietf.org/html/rfc4366#section-3.1 + server_name: + + # Disable validation of the server certificate. + insecure_skip_verify: + +# Optional proxy URL. +proxy_url: + +# Maximum wait period before sending batch +batchwait: 1s + +# Maximum batch size to accrue before sending, unit is byte +batchsize: 102400 + +# Maximum time to wait for server to respond to a request +timeout: 10s + +backoff_config: + # Initial backoff time between retries + minbackoff: 100ms + # Maximum backoff time between retries + maxbackoff: 5s + # Maximum number of retires when sending batches, 0 means infinite retries + maxretries: 5 + +# The labels to add to any time series or alerts when communicating with loki +external_labels: {} +``` + +#### Ship to multiple Loki Servers +Promtail is able to push logs to as many different Loki servers as you like. Use +`clients` instead of `client` if needed: +```yaml +# Single Loki +client: [ ] + +# Multiple Loki instances +clients: + - [ ] +``` diff --git a/docs/promtail/deployment.md b/docs/promtail/deployment.md new file mode 100644 index 0000000000000..4a95805edf03c --- /dev/null +++ b/docs/promtail/deployment.md @@ -0,0 +1,150 @@ +# Installation +Promtail is distributed in binary and in container form. 
+ +Once it is installed, you have basically two options for operating it: +Either as a daemon sitting on every node, or as a sidecar for the application. + +This usually only depends on the configuration though. +## Binary +Every release includes binaries: + +```bash +# download a binary (adapt app, os and arch as needed) +# installs v0.2.0. Go to the releases page for up to date URLs +$ curl -fSL -o "/usr/local/bin/promtail.gz" "https://github.com/grafana/promtail/releases/download/v0.2.0/promtail-linux-amd64.gz" +$ gunzip "/usr/local/bin/promtail.gz" + +# make sure it is executable +$ chmod a+x "/usr/local/bin/promtail" +``` + +Binaries for macOS and Windows are also provided at the [releases page](https://github.com/grafana/loki/releases). + +## Docker +```bash +# adapt tag to most recent version +$ docker pull grafana/promtail:v0.2.0 +``` + +## Kubernetes +On Kubernetes, you will use the Docker container above. However, you have too +choose whether you want to run in daemon mode (`DaemonSet`) or sidecar mode +(`Pod container`) in before. +### Daemonset method (Recommended) + +A `DaemonSet` will deploy `promtail` on every node within the Kubernetes cluster. + +This deployment is great to collect the logs of all containers within the +cluster. It is the best solution for a single tenant. + +```yaml +---Daemonset.yaml +apiVersion: extensions/v1beta1 +kind: Daemonset +metadata: + name: promtail-daemonset + ... +spec: + ... + template: + spec: + serviceAccount: SERVICE_ACCOUNT + serviceAccountName: SERVICE_ACCOUNT + volumes: + - name: logs + hostPath: HOST_PATH + - name: promtail-config + configMap + name: promtail-configmap + containers: + - name: promtail-container + args: + - -config.file=/etc/promtail/promtail.yaml + volumeMounts: + - name: logs + mountPath: MOUNT_PATH + - name: promtail-config + mountPath: /etc/promtail + ... + +---configmap.yaml +apiVersion: v1 +kind: ConfigMap +metadata: + name: promtail-config + ... +data: + promtail.yaml: YOUR CONFIG + +---Clusterrole.yaml +apiVersion: rbac.authorization.k8s.io/v1 +kind: ClusterRole +metadata: + name: promtail-clusterrole +rules: + - apiGroups: + resources: + - nodes + - services + - pod + verbs: + - get + - watch + - list +---ServiceAccount.yaml +apiVersion: v1 +kind: ServiceAccount +metadata: + name: promtail-serviceaccount + +---Rolebinding +apiVersion: rbac.authorization.k9s.io/v1 +kind: ClusterRoleBinding +metadata: + name: promtail-clusterrolebinding +subjects: + - kind: ServiceAccount + name: promtail-serviceaccount +roleRef: + kind: ClusterRole + name: promtail-clusterrole + apiGroup: rbac.authorization.k8s.io +``` + +### Sidecar Method +This method will deploy `promtail` as a sidecar container within a pod. +In a multi-tenant environment, this enables teams to aggregate logs +for specific pods and deployments for example for all pods in a namespace. + +```yaml +---Deployment.yaml +apiVersion: extensions/v1beta1 +kind: Deployment +metadata: + name: my_test_app + ... +spec: + ... + template: + spec: + serviceAccount: SERVICE_ACCOUNT + serviceAccountName: SERVICE_ACCOUNT + volumes: + - name: logs + hostPath: HOST_PATH + - name: promtail-config + configMap + name: promtail-configmap + containers: + - name: promtail-container + args: + - -config.file=/etc/promtail/promtail.yaml + volumeMounts: + - name: logs + mountPath: MOUNT_PATH + - name: promtail-config + mountPath: /etc/promtail + ... + ... 
+ +``` diff --git a/docs/promtail/examples.md b/docs/promtail/examples.md new file mode 100644 index 0000000000000..110515cb15496 --- /dev/null +++ b/docs/promtail/examples.md @@ -0,0 +1,92 @@ +# Examples + +This document shows some example use-cases for promtail and their configuration. + +## Local Config +Using this configuration, all files in `/var/log` and `/srv/log/someone_service` are ingested into Loki. +The labels `job` and `host` are set using `static_configs`. + +When using this configuration with Docker, do not forget to mount the configuration, `/var/log` and `/src/log/someone_service` using [volumes](https://docs.docker.com/storage/volumes/). + +```yaml +server: + http_listen_port: 9080 + grpc_listen_port: 0 + +positions: + filename: /tmp/positions.yaml # progress of the individual files + +client: + url: http://ip_or_hostname_where_loki_runs:3100/api/prom/push + +scrape_configs: + - job_name: system + pipeline_stages: + - docker: # Docker wraps logs in json. Undo this. + static_configs: # running locally here, no need for service discovery + - targets: + - localhost + labels: + job: varlogs + host: yourhost + __path__: /var/log/*.log # tail all files under /var/log + + - job_name: someone_service + pipeline_stages: + - docker: # Docker wraps logs in json. Undo this. + static_configs: # running locally here, no need for service discovery + - targets: + - localhost + labels: + job: someone_service + host: yourhost + __path__: /srv/log/someone_service/*.log # tail all files under /srv/log/someone_service + +``` + +## Systemd Journal +This example shows how to ship the `systemd` journal to Loki. + +Just like the Docker example, the `scrape_configs` section holds various +jobs for parsing logs. A job with a `journal` key configures it for systemd +journal reading. + +`path` is an optional string specifying the path to read journal entries +from. If unspecified, defaults to the system default (`/var/log/journal`). + +`labels`: is a map of string values specifying labels that should always +be associated with each log entry being read from the systemd journal. +In our example, each log will have a label of `job=systemd-journal`. + +Every field written to the systemd journal is available for processing +in the `relabel_configs` section. Label names are converted to lowercase +and prefixed with `__journal_`. After `relabel_configs` processes all +labels for a job entry, any label starting with `__` is deleted. + +Our example renames the `_SYSTEMD_UNIT` label (available as +`__journal__systemd_unit` in promtail) to `unit** so it will be available +in Loki. All other labels from the journal entry are dropped. + +When running using Docker, **remember to bind the journal into the container**. + +```yaml +server: + http_listen_port: 9080 + grpc_listen_port: 0 + +positions: + filename: /tmp/positions.yaml + +clients: + - url: http://ip_or_hostname_where_loki_runns:3100/api/prom/push + +scrape_configs: + - job_name: journal + journal: + path: /var/log/journal + labels: + job: systemd-journal + relabel_configs: + - source_labels: ['__journal__systemd_unit'] + target_label: 'unit' +``` diff --git a/docs/promtail/overview.md b/docs/promtail/overview.md new file mode 100644 index 0000000000000..e6b55305dcc5d --- /dev/null +++ b/docs/promtail/overview.md @@ -0,0 +1,41 @@ +# Overview +Promtail is an agent which ships the content of local log files to Loki. It is +usually deployed to every machine that has applications needed to be monitored. 
+ +It primarily **discovers** targets, attaches **labels** to log streams and +**pushes** them to the Loki instance. + +### Discovery +Before Promtail is able to ship anything to Loki, it needs to find about its +environment. This specifically means discovering applications emitting log lines +that need to be monitored. + +Promtail borrows the [service discovery mechanism from +Prometheus](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config), +although it currently only supports `static` and `kubernetes` service discovery. +This is due to the fact that `promtail` is deployed as a daemon to every local +machine and does not need to discover labels from other systems. `kubernetes` +service discovery fetches required labels from the api-server, `static` usually +covers the other use cases. + +Just like Prometheus, `promtail` is configured using a `scrape_configs` stanza. +`relabel_configs` allows fine-grained control of what to ingest, what to drop +and the final metadata attached to the log line. Refer to the +[configuration](configuration.md) for more details. + +### Labeling and Parsing +During service discovery, metadata is determined (pod name, filename, etc.) that +may be attached to the log line as a label for easier identification afterwards. +Using `relabel_configs`, those discovered labels can be mutated into the form +they should have for querying. + +To allow more sophisticated filtering afterwards, Promtail allows to set labels +not only from service discovery, but also based on the contents of the log +lines. The so-called `pipeline_stages` can be used to add or update labels, +correct the timestamp or rewrite the log line entirely. Refer to the [log +parsing documentation](parsing.md) for more details. + +### Shipping +Once Promtail is certain about what to ingest and all labels are set correctly, +it starts *tailing* (continuously reading) the log files from the applications. +Once enough data is read into memory, it is flushed in as a batch to Loki. diff --git a/docs/logentry/processing-log-lines.md b/docs/promtail/parsing.md similarity index 99% rename from docs/logentry/processing-log-lines.md rename to docs/promtail/parsing.md index 8b5c0ca3e11ab..1b876847c20ca 100644 --- a/docs/logentry/processing-log-lines.md +++ b/docs/promtail/parsing.md @@ -1,4 +1,4 @@ -# Processing Log Lines +# Log Parsing A detailed look at how to setup promtail to process your log lines, including extracting metrics and labels. diff --git a/docs/querying.md b/docs/querying.md new file mode 100644 index 0000000000000..562f8f90d6c97 --- /dev/null +++ b/docs/querying.md @@ -0,0 +1,111 @@ +# Querying + +To get the previously ingested logs back from Loki for analysis, you need a +client that supports LogQL. +Grafana will be the first choice for most users, +nevertheless [LogCLI](logcli.md) represents a viable standalone alternative. + +## Clients +### Grafana + +Grafana ships with built-in support for Loki for versions greater than +[6.0](https://grafana.com/grafana/download). + +1. Log into your Grafana, e.g, `http://localhost:3000` (default username: + `admin`, default password: `admin`) +2. Go to `Configuration` > `Data Sources` via the cog icon on the left side bar. +3. Click the big + Add data source button. +4. Choose Loki from the list. +5. The http URL field should be the address of your Loki server e.g. + `http://localhost:3100` when running locally or with docker, + `http://loki:3100` when running with docker-compose or kubernetes. +6. 
To see the logs, click Explore on the sidebar, select the Loki + datasource, and then choose a log stream using the Log labels + button. + +Read more about the Explore feature in the [Grafana +docs](http://docs.grafana.org/features/explore) and on how to search and filter +logs with Loki. + +> To configure the datasource via provisioning see [Configuring Grafana via +> Provisioning](http://docs.grafana.org/features/datasources/loki/#configure-the-datasource-with-provisioning) +> and make sure to adjust the URL similarly as shown above. + +### LogCLI +If you do not want (or can) use a full Grafana instance, [LogCLI](logcli.md) is +a small command line application to run LogQL queries against a Loki server. +Refer to its [documentation](logcli.md) for reference. + +## LogQL +Loki has it's very own language for querying logs from the Loki server called *LogQL*. Think of +it as distributed `grep` with labels for selection. + +A log query consists of two parts: **log stream selector**, and a **filter +expression**. For performance reasons you need to start by choosing a set of log +streams using a Prometheus-style log stream selector. + +The log stream selector will reduce the number of log streams to a manageable +volume and then the regex search expression is used to do a distributed grep +over those log streams. + +### Log Stream Selector + +For the label part of the query expression, wrap it in curly braces `{}` and +then use the key value syntax for selecting labels. Multiple label expressions +are separated by a comma: + +`{app="mysql",name="mysql-backup"}` + +The following label matching operators are currently supported: + +- `=` exactly equal. +- `!=` not equal. +- `=~` regex-match. +- `!~` do not regex-match. + +Examples: + +- `{name=~"mysql.+"}` +- `{name!~"mysql.+"}` + +The same rules that apply for [Prometheus Label +Selectors](https://prometheus.io/docs/prometheus/latest/querying/basics/#instant-vector-selectors) +apply for Loki Log Stream Selectors. + +### Filter Expression + +After writing the Log Stream Selector, you can filter the results further by +writing a search expression. The search expression can be just text or a regex +expression. + +Example queries: + +- `{job="mysql"} |= "error"` +- `{name="kafka"} |~ "tsdb-ops.*io:2003"` +- `{instance=~"kafka-[23]",name="kafka"} != kafka.server:type=ReplicaManager` + +Filter operators can be chained and will sequentially filter down the +expression - resulting log lines will satisfy _every_ filter. Eg: + +`{job="mysql"} |= "error" != "timeout"` + +The following filter types have been implemented: + +- `|=` line contains string. +- `!=` line does not contain string. +- `|~` line matches regular expression. +- `!~` line does not match regular expression. + +The regex expression accepts [RE2 +syntax](https://github.com/google/re2/wiki/Syntax). The matching is +case-sensitive by default and can be switched to case-insensitive prefixing the +regex with `(?i)`. 
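+
+For instance, `{job="mysql"} |~ "(?i)error"` combines a stream selector with a case-insensitive regex filter and matches lines containing `error`, `Error`, `ERROR`, and so on (an illustrative query, not one of the examples above).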
+ +### Query Language Extensions + +The query language is still under development to support more features, e.g.,: + +- `AND` / `NOT` operators +- Number extraction for timeseries based on number in log messages +- JSON accessors for filtering of JSON-structured logs +- Context (like `grep -C n`) diff --git a/docs/usage.md b/docs/usage.md deleted file mode 100644 index 70547a7f07ccc..0000000000000 --- a/docs/usage.md +++ /dev/null @@ -1,136 +0,0 @@ -# Using Grafana to Query your logs - -To query and display your logs you need to configure your Loki to be a datasource in your Grafana. - -> _Note_: Querying your logs without Grafana is possible by using [logcli](./logcli.md). - -## Configuring the Loki Datasource in Grafana - -Grafana ships with built-in support for Loki as part of its [latest release (6.0)](https://grafana.com/grafana/download). - -1. Log into your Grafana, e.g, http://localhost:3000 (default username: `admin`, default password: `admin`) -1. Go to `Configuration` > `Data Sources` via the cog icon on the left side bar. -1. Click the big `+ Add data source` button. -1. Choose Loki from the list. -1. The http URL field should be the address of your Loki server e.g. `http://localhost:3100` when running locally or with docker, `http://loki:3100` when running with docker-compose or kubernetes. -1. To see the logs, click "Explore" on the sidebar, select the Loki datasource, and then choose a log stream using the "Log labels" button. - -Read more about the Explore feature in the [Grafana docs](http://docs.grafana.org/features/explore) and on how to search and filter logs with Loki. - -> To configure the datasource via provisioning see [Configuring Grafana via Provisioning](http://docs.grafana.org/features/datasources/loki/#configure-the-datasource-with-provisioning) and make sure to adjust the URL similarly as shown above. - -## Searching with Labels and Distributed Grep - -A log filter query consists of two parts: **log stream selector**, and a **filter expression**. For performance reasons you need to start by choosing a set of log streams using a Prometheus-style log stream selector. - -The log stream selector will reduce the number of log streams to a manageable volume and then the regex search expression is used to do a distributed grep over those log streams. - -### Log Stream Selector - -For the label part of the query expression, wrap it in curly braces `{}` and then use the key value syntax for selecting labels. Multiple label expressions are separated by a comma: - -`{app="mysql",name="mysql-backup"}` - -The following label matching operators are currently supported: - -- `=` exactly equal. -- `!=` not equal. -- `=~` regex-match. -- `!~` do not regex-match. - -Examples: - -- `{name=~"mysql.+"}` -- `{name!~"mysql.+"}` - -The [same rules that apply for Prometheus Label Selectors](https://prometheus.io/docs/prometheus/latest/querying/basics/#instant-vector-selectors) apply for Loki Log Stream Selectors. - -### Filter Expression - -After writing the Log Stream Selector, you can filter the results further by writing a search expression. The search expression can be just text or a regex expression. - -Example queries: - -- `{job="mysql"} |= "error"` -- `{name="kafka"} |~ "tsdb-ops.*io:2003"` -- `{instance=~"kafka-[23]",name="kafka"} != kafka.server:type=ReplicaManager` - -Filter operators can be chained and will sequentially filter down the expression - resulting log lines will satisfy _every_ filter. 
Eg: - -`{job="mysql"} |= "error" != "timeout"` - -The following filter types have been implemented: - -- `|=` line contains string. -- `!=` line does not contain string. -- `|~` line matches regular expression. -- `!~` line does not match regular expression. - -The regex expression accepts [RE2 syntax](https://github.com/google/re2/wiki/Syntax). The matching is case-sensitive by default and can be switched to case-insensitive prefixing the regex with `(?i)`. - -### Query Language Extensions - -The query language is still under development to support more features, e.g.,: - -- `AND` / `NOT` operators -- Number extraction for timeseries based on number in log messages -- JSON accessors for filtering of JSON-structured logs -- Context (like `grep -C n`) - -## Counting logs - -Loki's LogQL support sample expression allowing to count entries per stream after the regex filtering stage. - -### Range Vector aggregation - -The language shares the same [range vector](https://prometheus.io/docs/prometheus/latest/querying/basics/#range-vector-selectors) concept from Prometheus, except that the selected range of samples contains a value of one for each log entry. You can then apply an aggregation over the selected range to transform it into an instant vector. - -`rate` calculates the number of entries per second and `count_over_time` count of entries for the each log stream within the range. - -In this example, we count all the log lines we have recorded within the last 5min for the mysql job. - -> `count_over_time({job="mysql"}[5m])` - -A range vector aggregation can also be applied to a [Filter Expression](#filter-expression), allowing you to select only matching log entries. - -> `rate( ( {job="mysql"} |= "error" != "timeout)[10s] ) )` - -The query above will compute the per second rate of all errors except those containing `timeout` within the last 10 seconds. - -You can then use aggregation operators over the range vector aggregation. - -### Aggregation operators - -Like [PromQL](https://prometheus.io/docs/prometheus/latest/querying/operators/#aggregation-operators), Loki's LogQL support a subset of built-in aggregation operators that can be used to aggregate the element of a single vector, resulting in a new vector of fewer elements with aggregated values: - -- `sum` (calculate sum over dimensions) -- `min` (select minimum over dimensions) -- `max` (select maximum over dimensions) -- `avg` (calculate the average over dimensions) -- `stddev` (calculate population standard deviation over dimensions) -- `stdvar` (calculate population standard variance over dimensions) -- `count` (count number of elements in the vector) -- `bottomk` (smallest k elements by sample value) -- `topk` (largest k elements by sample value) - -These operators can either be used to aggregate over all label dimensions or preserve distinct dimensions by including a without or by clause. - -> `([parameter,] ) [without|by (