diff --git a/README.md b/README.md
index 667dafa147..958df8d596 100644
--- a/README.md
+++ b/README.md
@@ -6,12 +6,9 @@ We use this library at GitHub to detect blob languages, highlight code, ignore b
 
 ### Language detection
 
-Linguist defines a list of all languages known to GitHub in a [yaml file](https://github.com/github/linguist/blob/master/lib/linguist/languages.yml). In order for a file to be highlighted, a language and lexer must be defined there.
+Linguist defines a list of all languages known to GitHub in a [yaml file](https://github.com/github/linguist/blob/master/lib/linguist/languages.yml). In order for a file to be highlighted, a language and a lexer must be defined there.
 
-Most languages are detected by their file extension. This is the fastest and most common situation.
-
-For disambiguating between files with common extensions, we first apply
-some common-sense heuristics to pick out obvious languages. After that, we use a
+Most languages are detected by their file extension. For disambiguating between files with common extensions, we first apply some common-sense heuristics to pick out obvious languages. After that, we use a
 [statistical classifier](https://github.com/github/linguist/blob/master/lib/linguist/classifier.rb). This process can help us tell the difference between, for example, `.h` files which could be either C, C++, or Obj-C.