Ignore certain MIME types in fixCharset
Some MIME types (image, video, audio, font) carry no text data and should
not be processed in fixCharset. Skipping them there is preferable to
forcing the user to toggle the Collector.DetectCharset flag, which could
prove problematic.
Fix gocolly#312
vosmith committed Mar 28, 2019
1 parent 9192be9 commit b71a630
Showing 1 changed file with 10 additions and 0 deletions.
10 changes: 10 additions & 0 deletions response.go
@@ -68,6 +68,16 @@ func (r *Response) fixCharset(detectCharset bool, defaultEncoding string) error
 		return nil
 	}
 	contentType := strings.ToLower(r.Headers.Get("Content-Type"))
+
+	if strings.Contains(contentType, "image/") ||
+		strings.Contains(contentType, "video/") ||
+		strings.Contains(contentType, "audio/") ||
+		strings.Contains(contentType, "font/") {
+		// These MIME types should not have textual data.
+
+		return nil
+	}
+
 	if !strings.Contains(contentType, "charset") {
 		if !detectCharset {
 			return nil
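
The check added in this hunk can be tried in isolation. The following is a minimal standalone sketch, not code from the repository: the helper name `skipCharsetFix` and the sample Content-Type values are assumptions made for illustration. It mirrors the substring match used in the diff, lowercasing the header value first as `fixCharset` does.

```go
package main

import (
	"fmt"
	"strings"
)

// skipCharsetFix is a hypothetical helper (it does not exist in colly).
// It reports whether a Content-Type value names one of the MIME families
// that the commit treats as non-textual: image, video, audio, font.
func skipCharsetFix(contentType string) bool {
	// Lowercase first, as fixCharset does before inspecting the header.
	ct := strings.ToLower(contentType)
	for _, family := range []string{"image/", "video/", "audio/", "font/"} {
		// Same substring match as the diff (strings.Contains, not a prefix check).
		if strings.Contains(ct, family) {
			return true
		}
	}
	return false
}

func main() {
	// Sample header values chosen for illustration only.
	samples := []string{
		"image/png",
		"Video/MP4",
		"font/woff2",
		"text/html; charset=utf-8",
		"application/json",
	}
	for _, ct := range samples {
		fmt.Printf("%-26s skip=%v\n", ct, skipCharsetFix(ct))
	}
}
```

Because the match is on the substring `font/`, a legacy type such as `application/font-woff` would not be skipped by this sketch; whether that matters depends on the servers being crawled.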