Parsing metrics #219
-
This project seems perfect for what I'm working on right now. I'm trying to write a parser for OpenTSDB-formatted metrics and have hit a dead end. I've been at it for a few hours and it's still not working. I think I'm close, but I'm not sure what I'm missing. Any help is greatly appreciated.
package main

import (
	"fmt"
	"os"

	"github.com/alecthomas/participle/v2"
	"github.com/alecthomas/participle/v2/lexer"
)

// An OpenTSDB-format metric:
// put <metric> <timestamp> <value> <tagk1=tagv1[ tagk2=tagv2 ...tagkN=tagvN]>
type Point struct {
	prefix    string      `@"put"`
	Name      string      `@(Name|Qname)`
	Timestamp int64       `@Timestamp?`
	Value     string      `@Value`
	Tags      []*PointTag `@@*`
}

type PointTag struct {
	Name  string `(@Tagname "="`
	Value string `@(Tagvalue|Qtagvalue))`
}

var metricLexer = lexer.MustSimple([]lexer.Rule{
	{"Name", `[a-zA-Z0-9\-_\.]+`, nil},
	{"Qname", `"[a-zA-Z0-9\-_\.]"`, nil},
	{"Value", `[+-]?[0-9\.eE]+`, nil},
	{"Timestamp", `[0-9]+`, nil},
	{"Tagname", `[a-zA-Z0-9_\.\-]+`, nil},
	{"Tagvalue", `[a-zA-Z0-9_\.\-]+`, nil},
	{"Qtagvalue", `".*"`, nil},
	{"Whitespace", `[\s]+`, nil},
})

func NewParser() *participle.Parser {
	return participle.MustBuild(&Point{},
		participle.Lexer(metricLexer),
		participle.Unquote("Qname", "Qtagvalue"),
		participle.Trace(os.Stdout),
	)
}

var tests = []string{
	`put some.test.metric 21.349 1649450391 host=laptop test="5"`,
	`put some.test.metric 2.1e19 1649450391 source=some-testpod-332bcf423b-mcdke test="5"`,
}

func main() {
	var p = NewParser()
	var point Point
	for _, test := range tests {
		err := p.ParseString("tsdb", test, &point)
		if err != nil {
			fmt.Println(err)
		}
		fmt.Printf("%#v\n", point)
	}
}

With that, I get this output:
Replies: 3 comments 3 replies
-
Related: you seem to have tokens with overlapping patterns, e.g.
And also
Traditional lexers (which Participle uses) operate in a completely separate pass from the parser. They walk over the text, splitting it into tokens, and those tokens are then passed to the parser. So in the case of the tokens I mention above, the second token will never appear because the first one will always "win".
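To make the "first one wins" point concrete, here is a minimal sketch using only Go's standard regexp package. It is not Participle's actual implementation: the rule type and the newRule and firstMatch helpers are made up for illustration, and it assumes only that rules are tried in declaration order with each pattern anchored at the current position. The three patterns are copied from the lexer in the question.

```go
package main

import (
	"fmt"
	"regexp"
)

// rule pairs a token name with a compiled pattern (hypothetical helper type).
type rule struct {
	name string
	re   *regexp.Regexp
}

// newRule anchors the pattern so it can only match at the start of the input.
func newRule(name, pattern string) rule {
	return rule{name, regexp.MustCompile(`^(?:` + pattern + `)`)}
}

// firstMatch returns the name of the first rule, in declaration order,
// that matches at the start of input, or "" if no rule matches.
func firstMatch(rules []rule, input string) string {
	for _, r := range rules {
		if r.re.MatchString(input) {
			return r.name
		}
	}
	return ""
}

func main() {
	// Same order and patterns as three of the rules in the question.
	rules := []rule{
		newRule("Name", `[a-zA-Z0-9\-_\.]+`),
		newRule("Value", `[+-]?[0-9\.eE]+`),
		newRule("Timestamp", `[0-9]+`),
	}
	for _, s := range []string{"some.test.metric", "21.349", "1649450391"} {
		fmt.Printf("%-18q -> %s\n", s, firstMatch(rules, s))
	}
}
```

Every input comes back as Name, including the bare integer, so a grammar that waits for a Value or Timestamp token can never be satisfied with this rule order.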
-
Hope this works for getting back my identity
"invalid input text" means that the lexer didn't match anything. It looks like you don't have a lexer rule for "="?
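One way to check this without the parser in the loop is to walk the input with the same patterns and see where tokenizing gets stuck. The sketch below uses only Go's standard regexp package and is an approximation of a rule-based lexer, not Participle's own code. The firstUnmatched helper is hypothetical, and the extra "=" pattern at the end is just one possible fix, not something taken from the original post.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstUnmatched walks input the way a rule-based lexer would, always taking
// the first pattern that matches at the current offset. It returns the offset
// where no pattern matches, or -1 if the whole input can be tokenized.
func firstUnmatched(patterns []string, input string) int {
	res := make([]*regexp.Regexp, len(patterns))
	for i, p := range patterns {
		res[i] = regexp.MustCompile(`^(?:` + p + `)`)
	}
	for pos := 0; pos < len(input); {
		matched := false
		for _, re := range res {
			if m := re.FindString(input[pos:]); m != "" {
				pos += len(m)
				matched = true
				break
			}
		}
		if !matched {
			return pos
		}
	}
	return -1
}

func main() {
	// Distinct patterns from the question's lexer, in the same order.
	patterns := []string{
		`[a-zA-Z0-9\-_\.]+`,  // Name (Tagname and Tagvalue accept the same characters)
		`"[a-zA-Z0-9\-_\.]"`, // Qname
		`[+-]?[0-9\.eE]+`,    // Value
		`[0-9]+`,             // Timestamp
		`".*"`,               // Qtagvalue
		`[\s]+`,              // Whitespace
	}
	input := `put some.test.metric 21.349 1649450391 host=laptop test="5"`

	pos := firstUnmatched(patterns, input)
	fmt.Printf("stuck at offset %d: %q\n", pos, input[pos:])

	// Hypothetical fix: one extra rule that matches the "=" separator.
	fmt.Println(firstUnmatched(append(patterns, `=`), input))
}
```

With the original patterns the walk stops at the "=" after host. With the extra pattern it reaches the end of the line and returns -1.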