[PoC] A socket-based tracing system for discovering network service dependencies. (renamed from transtracer)

Shawk


Shawk is a socket-based tracing infrastructure for discovering network dependencies among processes in distributed applications. It monitors network sockets, which are the endpoints of TCP connections, to trace these dependencies.

Contributions

  • Shawk discovers the dependencies of any application that uses the TCP protocol stack in the Linux kernel.
  • The monitoring does not add network delay to the applications, because it only reads connection information from network sockets and runs independently of the applications' communication.

System Overview

System structure

This figure shows the system configuration for matching connection information across multiple hosts and for creating a dependency graph. The Tracer running on each host sends connection information to the central Connection Management DataBase (CMDB).
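The matching step described above can be illustrated with a minimal sketch. It is not Shawk's implementation or its CMDB schema: the `Flow` record, its field names, and the `build_graph` helper are hypothetical, and the sketch assumes each Tracer reports whether its host initiated a connection ("active") or accepted it ("passive").

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One connection as seen from a single host (hypothetical record)."""
    direction: str   # "active": this host connected; "passive": it accepted
    local_ip: str
    local_port: int
    peer_ip: str
    peer_port: int


def to_edge(flow):
    """Normalize a flow to (client_ip, (server_ip, server_port)).

    The client's ephemeral port is dropped so that many short-lived
    connections from one client collapse into a single edge.
    """
    if flow.direction == "active":
        return flow.local_ip, (flow.peer_ip, flow.peer_port)
    return flow.peer_ip, (flow.local_ip, flow.local_port)


def build_graph(flows):
    """Map each server endpoint to the set of client IPs that depend on it."""
    graph = {}
    for flow in flows:
        client, server = to_edge(flow)
        graph.setdefault(server, set()).add(client)
    return graph


# The same TCP connection reported by both of its endpoints.
flows = [
    Flow("passive", "10.0.0.10", 80, "10.0.0.11", 50000),  # seen on 10.0.0.10
    Flow("active", "10.0.0.11", 50000, "10.0.0.10", 80),   # seen on 10.0.0.11
]
graph = build_graph(flows)
```

Because both reports normalize to the same edge, the two views of one connection merge into a single dependency, which is the property a central store needs to combine data from many hosts.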

Socket diagnosis in polling mode

This figure shows how socket information for TCP connections is retrieved. The Tracer process on each host queries the Linux kernel and obtains a snapshot of the active TCP connection status from the socket corresponding to each connection. At the same time, it acquires the process information corresponding to each connection, and then links each connection to its process.
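As a rough illustration of taking such a snapshot, the sketch below parses text in the format of `/proc/net/tcp`. This is a swapped-in technique chosen for brevity, not necessarily the kernel interface Shawk uses. The function names and the sample data are hypothetical, and the address decoding assumes a little-endian host.

```python
import socket
import struct

# Subset of the TCP state codes printed in /proc/net/tcp.
TCP_STATES = {"01": "ESTABLISHED", "0A": "LISTEN"}


def decode_ipv4(hex_endpoint):
    """Decode 'AABBCCDD:PPPP' into (dotted_ip, port)."""
    addr_hex, port_hex = hex_endpoint.split(":")
    # Assumption: little-endian host, so the address bytes appear reversed.
    ip = socket.inet_ntoa(struct.pack("<I", int(addr_hex, 16)))
    return ip, int(port_hex, 16)


def parse_proc_net_tcp(text):
    """Return one dict per socket listed in /proc/net/tcp-style text."""
    conns = []
    for line in text.splitlines()[1:]:  # first line is the column header
        fields = line.split()
        if len(fields) < 10:
            continue  # skip blank or truncated lines
        conns.append({
            "local": decode_ipv4(fields[1]),
            "remote": decode_ipv4(fields[2]),
            "state": TCP_STATES.get(fields[3], fields[3]),
            "inode": int(fields[9]),  # key for finding the owning process
        })
    return conns


# Made-up sample: a listener on 10.0.0.10:80 and one established connection.
SAMPLE = """  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0A00000A:0050 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 20001 1 0000000000000000 100 0 0 10 0
   1: 0A00000A:0050 0B00000A:C350 01 00000000:00000000 00:00000000 00000000    33        0 20002 1 0000000000000000 20 4 30 10 -1
"""
conns = parse_proc_net_tcp(SAMPLE)
```

On a live Linux host, the socket inode can be matched against the `socket:[inode]` symlinks under `/proc/<pid>/fd` to link a connection to its process, which corresponds to the linking step described above.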

Requirements

  • OS: Linux
  • RDBMS: PostgreSQL 10+

Usage

$ shawk --help
Usage: shawk [options]

  A socket-based tracing system for discovering network dependencies in distributed applications.

Commands:
  look           show dependencies starting from a specified node.
  probe          start agent for collecting flows and processes.
  create-scheme  create CMDB scheme.

Options:
  --version         print version
  --credits         print credits
  --help, -h        print help

shawk probe

Run a daemon process that scans connections in polling mode (the default).

# SHAWK_PROBE_MODE=polling SHAWK_PROBE_INTERVAL=1s SHAWK_FLUSH_INTERVAL=10s SHAWK_CMDB_URL=postgres://shawk:[email protected]:5432/shawk?sslmode=disable&connect_timeout=1 shawk probe

Run a daemon process in streaming mode, which internally uses eBPF.

# SHAWK_PROBE_MODE=streaming SHAWK_PROBE_INTERVAL=1s SHAWK_CMDB_URL=postgres://shawk:[email protected]:5432/shawk?sslmode=disable&connect_timeout=1 shawk probe

Scan connections only once and exit.

# SHAWK_PROBE_MODE=streaming SHAWK_PROBE_INTERVAL=1s SHAWK_CMDB_URL=postgres://shawk:[email protected]:5432/shawk?sslmode=disable shawk probe --once
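The polling-mode command above sets both a probe interval and a flush interval. The exact semantics of these settings are not spelled out here, so the following is only a sketch under one plausible assumption: connections are scanned once per probe interval, buffered in memory, and written to the CMDB once per flush interval. The `poll_and_flush` function is hypothetical and simulates time with counters rather than sleeping.

```python
def poll_and_flush(snapshots, probe_interval, flush_interval):
    """Simulate polling: one snapshot per probe tick, one batch per flush.

    snapshots: iterable of sets of connection tuples, one set per scan.
    Returns the batches that would be written to the central store.
    """
    buffer = set()
    batches = []
    since_flush = 0
    for snapshot in snapshots:
        buffer |= snapshot             # repeated sightings collapse to one entry
        since_flush += probe_interval
        if since_flush >= flush_interval:
            batches.append(sorted(buffer))
            buffer.clear()
            since_flush = 0
    return batches


# Twenty scans at 1 s each with a 10 s flush interval; the same
# long-lived connection is seen on every scan.
conn = ("10.0.0.11", "10.0.0.10", 80)
batches = poll_and_flush([{conn}] * 20, probe_interval=1, flush_interval=10)
```

Under this assumption, buffering between flushes keeps the write rate to the database independent of how often the host is scanned.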

shawk look

# SHAWK_CMDB_URL=postgres://shawk:[email protected]:5432/shawk?sslmode=disable shawk look --ipv4 10.0.0.10
10.0.0.10:80 (’nginx’, pgid=4656)
└<-- 10.0.0.11:many (’wrk’, pgid=5982)
10.0.0.10:80 (’nginx’, pgid=4656)
└--> 10.0.0.12:8080 (’python’, pgid=6111)
10.0.0.10:many (’fluentd’, pgid=2127)
└--> 10.0.0.13:24224 (’fluentd’, pgid=2001)

Papers (including proceedings)

  1. Yuuki Tsubouchi, Masahiro Furukawa, Ryosuke Matsumoto, Transtracer: Automatically Tracing for Processes Dependencies in Distributed Systems by Monitoring Endpoints of TCP/UDP, IPSJ Internet and Operation Technology Symposium (IOTS2019), Vol. 2019, pp. 64-71, 2019. [paper] [slide]

License

MIT

Author

yuuki