hdq - HTML DOM Query Language for Go+

Summary about hdq

hdq is a Go+ package for processing HTML documents.

Tutorials

Collect links of an HTML page

How do you collect all the links in an HTML page? With hdq, it is very easy.

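import "github.com/qiniu/hdq"

// links returns every non-empty href value found in the document at url.
func links(url interface{}) []string {
	doc := hdq.Source(url)
	// hrefVal?:"" falls back to "" when an a element has no usable href.
	return [link for a <- doc.any.a, link := a.hrefVal?:""; link != ""]
}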

First, we call hdq.Source(url) to create a node set named doc. This node set contains only one node, the root node.

Next, we select all a elements with doc.any.a. Here doc.any means all nodes in the HTML document.

Then we visit each of these a elements, read its href attribute value, and assign it to the variable link. If link is not empty, we collect it.

Finally, we return all collected links. See tutorial/01-Links for the full source code.
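
For readers who do not know Go+ list comprehensions, the steps above can be sketched in plain Go. This is only an illustration of the logic, not hdq's implementation: the Node type and the collectLinks function are hypothetical stand-ins, and the tree is built by hand instead of being parsed from a page.

```go
package main

import "fmt"

// Node is a hypothetical, simplified stand-in for an HTML element.
// It is not hdq's real node type.
type Node struct {
	Tag      string
	Attrs    map[string]string
	Children []*Node
}

// collectLinks visits root and all of its descendants (roughly what doc.any
// selects), keeps the a elements, and gathers their non-empty href values.
func collectLinks(root *Node) []string {
	var links []string
	var walk func(n *Node)
	walk = func(n *Node) {
		if n == nil {
			return
		}
		if n.Tag == "a" {
			// A missing href yields "" from the map lookup,
			// which plays the role of hrefVal?:"" in the Go+ version.
			if link := n.Attrs["href"]; link != "" {
				links = append(links, link)
			}
		}
		for _, c := range n.Children {
			walk(c)
		}
	}
	walk(root)
	return links
}

func main() {
	// A tiny hand-built document with three a elements, one without href.
	doc := &Node{Tag: "html", Children: []*Node{
		{Tag: "body", Children: []*Node{
			{Tag: "a", Attrs: map[string]string{"href": "https://example.com/"}},
			{Tag: "p", Children: []*Node{
				{Tag: "a"}, // no href: skipped
				{Tag: "a", Attrs: map[string]string{"href": "/docs"}},
			}},
		}},
	}}
	fmt.Println(collectLinks(doc))
}
```

The Go+ one-liner folds the traversal, the attribute lookup with its default, and the filter into a single expression; the sketch spells each of them out.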
