bilibili: remove the "活动作品" ("event entry") label from titles (iawia002#1001)
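Current behavior: files downloaded from Bilibili always have the four characters "活动作品" in their names.
Desired behavior: the file name does not contain "活动作品".
Analysis: Bilibili uploaders often submit videos as event entries for the revenue, and on those pages the h1 contains an "活动作品" label.
Solution: where possible, take the title attribute of the h1 instead of its text.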

Co-authored-by: xiaochi <[email protected]>
picasso250 and xiaochi authored Jan 18, 2022
1 parent 46f6796 commit a4aac9c
Showing 1 changed file with 6 additions and 3 deletions.
9 changes: 6 additions & 3 deletions parser/parser.go
@@ -40,9 +40,12 @@ func GetImages(html, imgClass string, urlHandler func(string) string) (string, [
 // Title get title
 func Title(doc *goquery.Document) string {
 	var title string
-	title = strings.Replace(
-		strings.TrimSpace(doc.Find("h1").First().Text()), "\n", "", -1,
-	)
+	h1Elem := doc.Find("h1").First()
+	h1Title, found := h1Elem.Attr("title")
+	if !found {
+		h1Title = h1Elem.Text()
+	}
+	title = strings.Replace(strings.TrimSpace(h1Title), "\n", "", -1)
 	if title == "" {
 		// Bilibili: Some movie page got no h1 tag
 		title, _ = doc.Find("meta[property=\"og:title\"]").Attr("content")
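
The selection rule in this change can be tried without goquery. The following is a minimal sketch, not code from the repository: `pickTitle` is a hypothetical helper, its three parameters stand in for the results of `h1Elem.Attr("title")` and `h1Elem.Text()`, and the sample strings are made up. It uses `strings.ReplaceAll`, which is equivalent to `strings.Replace` with a count of `-1`.

```go
package main

import (
	"fmt"
	"strings"
)

// pickTitle is a hypothetical helper (not part of parser/parser.go).
// It mirrors the rule in the diff: use the h1 title attribute when it is
// present, otherwise fall back to the h1 text, then trim surrounding
// whitespace and drop embedded newlines.
func pickTitle(attrTitle string, attrFound bool, h1Text string) string {
	raw := attrTitle
	if !attrFound {
		raw = h1Text
	}
	return strings.ReplaceAll(strings.TrimSpace(raw), "\n", "")
}

func main() {
	// Attribute present: the label embedded in the h1 text is ignored.
	fmt.Println(pickTitle("My Video", true, "活动作品My Video")) // My Video

	// Attribute absent: fall back to the cleaned h1 text.
	fmt.Println(pickTitle("", false, "  Plain\nTitle ")) // PlainTitle
}
```

An attribute that is present but empty yields an empty string here; in `Title`, that case then falls through to the existing `og:title` fallback.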