Merge branch 'master' into 950
Conflicts:
	lib/linguist/languages.yml
arfon committed May 5, 2014
2 parents e2b1fe3 + a2f721d commit 0a54df3
Showing 193 changed files with 65,249 additions and 717 deletions.
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1 +1,3 @@
Gemfile.lock
.bundle/
vendor/
4 changes: 1 addition & 3 deletions .travis.yml
@@ -2,10 +2,8 @@ before_install:
- sudo apt-get install libicu-dev -y
- gem update --system 2.1.11
rvm:
- 1.8.7
- 1.9.2
- 1.9.3
- 2.0.0
- ree
- 2.1.1
notifications:
disabled: true
5 changes: 0 additions & 5 deletions Gemfile
@@ -1,7 +1,2 @@
source 'https://rubygems.org'
gemspec

if RUBY_VERSION < "1.9.3"
# escape_utils 1.0.0 requires 1.9.3 and above
gem "escape_utils", "0.3.2"
end
23 changes: 23 additions & 0 deletions README.md
@@ -106,6 +106,29 @@ To update the `samples.json` after adding new files to [`samples/`](https://gith

bundle exec rake samples

### A note on language extensions

Linguist has several methods for identifying the language of a particular file. The initial lookup is based on the file's extension. The possible extensions for each language are defined in an array called `extensions`. Take a look at this example for `Perl`:

```
Perl:
  type: programming
  ace_mode: perl
  color: "#0298c3"
  extensions:
  - .pl
  - .PL
  - .perl
  - .ph
  - .plx
  - .pm
  - .pod
  - .psgi
  interpreters:
  - perl
```
Any of the listed extensions is valid, but the first entry in the array should be the most popular one.
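
As a rough illustration of the ordering convention, the Ruby sketch below parses a trimmed copy of the Perl entry and reads the first extension. The constant `PERL_ENTRY` and the helper `most_popular_extension` are hypothetical names used only for this example, and this is not how Linguist itself loads `languages.yml`.

```ruby
require 'yaml'

# A trimmed copy of the Perl entry (assumption: same layout as languages.yml).
PERL_ENTRY = <<~YAML_DOC
  Perl:
    type: programming
    extensions:
    - .pl
    - .PL
    - .perl
YAML_DOC

# Hypothetical helper: returns the first listed extension for a language,
# which by convention should be the most popular one.
def most_popular_extension(yaml_text, language)
  YAML.load(yaml_text).fetch(language).fetch('extensions').first
end

puts most_popular_extension(PERL_ENTRY, 'Perl')
```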

### Testing

Sometimes getting the tests running can be too much work, especially if you don't have much Ruby experience. It's okay: be lazy and let our build bot [Travis](http://travis-ci.org/#!/github/linguist) run the tests for you. Just open a pull request and the bot will start cranking away.
6 changes: 4 additions & 2 deletions github-linguist.gemspec
@@ -1,6 +1,8 @@
require File.expand_path('../lib/linguist/version', __FILE__)

Gem::Specification.new do |s|
s.name = 'github-linguist'
s.version = '2.10.11'
s.version = Linguist::VERSION
s.summary = "GitHub Language detection"
s.description = 'We use this library at GitHub to detect blob languages, highlight code, ignore binary files, suppress generated files in diffs, and generate language breakdown graphs.'

@@ -12,7 +14,7 @@ Gem::Specification.new do |s|
s.executables << 'linguist'

s.add_dependency 'charlock_holmes', '~> 0.6.6'
s.add_dependency 'escape_utils', '>= 0.3.1'
s.add_dependency 'escape_utils', '~> 1.0.1'
s.add_dependency 'mime-types', '~> 1.19'
s.add_dependency 'pygments.rb', '~> 0.5.4'

1 change: 1 addition & 0 deletions lib/linguist.rb
@@ -4,3 +4,4 @@
require 'linguist/language'
require 'linguist/repository'
require 'linguist/samples'
require 'linguist/version'
2 changes: 1 addition & 1 deletion lib/linguist/blob_helper.rb
@@ -291,7 +291,7 @@ def language
data = lambda { (binary_mime_type? || binary?) ? "" : self.data }
end

@language = binary? ? nil : Language.detect(name.to_s, data, mode)
@language = Language.detect(name.to_s, data, mode)
end

# Internal: Get the lexer of the blob.
13 changes: 12 additions & 1 deletion lib/linguist/generated.rb
@@ -62,7 +62,8 @@ def generated?
generated_protocol_buffer? ||
generated_jni_header? ||
composer_lock? ||
node_modules?
node_modules? ||
vcr_cassette?
end

# Internal: Is the blob an XCode project file?
@@ -235,5 +236,15 @@ def node_modules?
def composer_lock?
!!name.match(/composer.lock/)
end

# Is the blob a VCR Cassette file?
#
# Returns true or false
def vcr_cassette?
return false unless extname == '.yml'
return false unless lines.count > 2
# VCR cassettes have "recorded_with: VCR" on the second-to-last line.
return lines[-2].include?("recorded_with: VCR")
end
end
end
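
The `vcr_cassette?` check above can be tried in isolation. The sketch below is a standalone approximation, not the library's method: `name` and `lines` are passed in as plain arguments, and it assumes the lines come from splitting on newlines while keeping the trailing empty string, which is why the marker sits at index `-2`. The sample cassette text is made up for illustration.

```ruby
# Standalone sketch: `name` and `lines` stand in for the blob's attributes.
def vcr_cassette?(name, lines)
  return false unless File.extname(name) == '.yml'
  return false unless lines.count > 2
  # The marker is expected on the second-to-last line.
  lines[-2].include?('recorded_with: VCR')
end

# Illustrative data only; a limit of -1 keeps the trailing empty string.
sample = "---\nhttp_interactions: []\nrecorded_with: VCR 2.9.0\n"
puts vcr_cassette?('cassette.yml', sample.split("\n", -1))
```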
10 changes: 10 additions & 0 deletions lib/linguist/heuristics.rb
@@ -28,6 +28,9 @@ def self.find_by_heuristics(data, languages)
if languages.all? { |l| ["Common Lisp", "OpenCL"].include?(l) }
disambiguate_cl(data, languages)
end
if languages.all? { |l| ["Rebol", "R"].include?(l) }
disambiguate_r(data, languages)
end
end
end

@@ -73,6 +76,13 @@ def self.disambiguate_cl(data, languages)
matches
end

def self.disambiguate_r(data, languages)
matches = []
matches << Language["Rebol"] if /\bRebol\b/i.match(data)
matches << Language["R"] if data.include?("<-")
matches
end

def self.active?
!!ACTIVE
end
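
To see how the Rebol/R heuristic behaves, here is a minimal sketch of the same two tests. It is an approximation under stated assumptions: it returns language names as plain strings rather than `Language` objects, and it drops the `languages` argument, which the method in the diff accepts but does not use.

```ruby
# Sketch of the Rebol/R heuristic, returning names instead of Language objects.
def disambiguate_r(data)
  matches = []
  # A case-insensitive whole-word "Rebol" suggests a Rebol header.
  matches << 'Rebol' if /\bRebol\b/i.match(data)
  # The assignment arrow suggests R.
  matches << 'R' if data.include?('<-')
  matches
end

p disambiguate_r("x <- c(1, 2, 3)")
p disambiguate_r("REBOL [Title: \"demo\"]")
```

Because the two tests are independent, a file containing both the word "Rebol" and an `<-` arrow yields both candidates, so the heuristic narrows the choice without always settling it.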
59 changes: 24 additions & 35 deletions lib/linguist/language.rb
@@ -24,7 +24,6 @@ class Language
@extension_index = Hash.new { |h,k| h[k] = [] }
@interpreter_index = Hash.new { |h,k| h[k] = [] }
@filename_index = Hash.new { |h,k| h[k] = [] }
@primary_extension_index = {}

# Valid Languages types
TYPES = [:data, :markup, :programming, :prose]
@@ -80,12 +79,6 @@ def self.create(attributes = {})
@extension_index[extension] << language
end

if @primary_extension_index.key?(language.primary_extension)
raise ArgumentError, "Duplicate primary extension: #{language.primary_extension}"
end

@primary_extension_index[language.primary_extension] = language

language.interpreters.each do |interpreter|
@interpreter_index[interpreter] << language
end
@@ -191,8 +184,7 @@ def self.find_by_alias(name)
# Returns all matching Languages or [] if none were found.
def self.find_by_filename(filename)
basename, extname = File.basename(filename), File.extname(filename)
langs = [@primary_extension_index[extname]] +
@filename_index[basename] +
langs = @filename_index[basename] +
@extension_index[extname]
langs.compact.uniq
end
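
With the primary extension index gone, the lookup combines only the filename index and the extension index. The sketch below mimics that shape with two plain hashes. The constants `EXTENSION_INDEX` and `FILENAME_INDEX` and their contents are illustrative assumptions, and language names are strings rather than `Language` objects.

```ruby
# Illustrative indexes; each missing key defaults to an empty array.
EXTENSION_INDEX = Hash.new { |h, k| h[k] = [] }
FILENAME_INDEX  = Hash.new { |h, k| h[k] = [] }

EXTENSION_INDEX['.rb'] << 'Ruby'
EXTENSION_INDEX['.pl'] << 'Perl' << 'Prolog'
FILENAME_INDEX['Rakefile'] << 'Ruby'

# Filename matches come first, then extension matches, without duplicates.
def find_by_filename(filename)
  basename, extname = File.basename(filename), File.extname(filename)
  (FILENAME_INDEX[basename] + EXTENSION_INDEX[extname]).compact.uniq
end

p find_by_filename('lib/tasks/Rakefile')
p find_by_filename('script.pl')
```

An extension shared by several languages returns every candidate, which is where content-based heuristics would have to take over.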
@@ -299,15 +291,6 @@ def initialize(attributes = {})
@interpreters = attributes[:interpreters] || []
@filenames = attributes[:filenames] || []

unless @primary_extension = attributes[:primary_extension]
raise ArgumentError, "#{@name} is missing primary extension"
end

# Prepend primary extension unless it's already included
if primary_extension && !extensions.include?(primary_extension)
@extensions = [primary_extension] + extensions
end

# Set popular, and searchable flags
@popular = attributes.key?(:popular) ? attributes[:popular] : false
@searchable = attributes.key?(:searchable) ? attributes[:searchable] : true
@@ -395,20 +378,6 @@ def initialize(attributes = {})
# Returns the extensions Array
attr_reader :extensions

# Deprecated: Get primary extension
#
# Defaults to the first extension but can be overridden
# in the languages.yml.
#
# The primary extension can not be nil. Tests should verify this.
#
# This attribute is only used by app/helpers/gists_helper.rb for
# creating the language dropdown. It really should be using `name`
# instead. Would like to drop primary extension.
#
# Returns the extension String.
attr_reader :primary_extension

# Public: Get interpreters
#
# Examples
@@ -426,6 +395,27 @@ def initialize(attributes = {})
#
# Returns the filenames Array
attr_reader :filenames

# Public: Return all possible extensions for language
def all_extensions
(extensions + [primary_extension]).uniq
end

# Deprecated: Get primary extension
#
# Defaults to the first extension but can be overridden
# in the languages.yml.
#
# The primary extension can not be nil. Tests should verify this.
#
# This method is only used by app/helpers/gists_helper.rb for creating
# the language dropdown. It really should be using `name` instead.
# Would like to drop primary extension.
#
# Returns the extension String.
def primary_extension
extensions.first
end

# Public: Get URL escaped name.
#
@@ -485,7 +475,7 @@ def searchable?
#
# Returns html String
def colorize(text, options = {})
lexer.highlight(text, options = {})
lexer.highlight(text, options)
end
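
The one-line change to `colorize` matters because `options = {}` inside a call is an assignment expression: it rebinds `options` to an empty hash and passes that, discarding whatever the caller supplied. The sketch below shows the difference with a hypothetical `FakeLexer` that simply echoes its arguments. The names `colorize_before` and `colorize_after` are made up for this comparison.

```ruby
# Hypothetical stand-in for a lexer; it returns what it was given.
class FakeLexer
  def highlight(text, options = {})
    [text, options]
  end
end

# Old behaviour: the assignment replaces the caller's options with {}.
def colorize_before(lexer, text, options = {})
  lexer.highlight(text, options = {})
end

# New behaviour: the caller's options are forwarded unchanged.
def colorize_after(lexer, text, options = {})
  lexer.highlight(text, options)
end

p colorize_before(FakeLexer.new, 'puts 1', { :linenos => true })
p colorize_after(FakeLexer.new, 'puts 1', { :linenos => true })
```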

# Public: Return name as String representation
@@ -568,9 +558,8 @@ def inspect
:group_name => options['group'],
:searchable => options.key?('searchable') ? options['searchable'] : true,
:search_term => options['search_term'],
:extensions => options['extensions'].sort,
:extensions => [options['extensions'].first] + options['extensions'][1..-1].sort,
:interpreters => options['interpreters'].sort,
:primary_extension => options['primary_extension'],
:filenames => options['filenames'],
:popular => popular.include?(name)
)
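
The new `:extensions` expression keeps the first entry in place and sorts only the rest, so the first extension survives as the primary one. A minimal sketch of that expression follows. The helper name `order_extensions` and the sample list are assumptions made for the example.

```ruby
# Keep the first extension where it is; sort the remaining ones.
def order_extensions(extensions)
  [extensions.first] + extensions[1..-1].sort
end

p order_extensions(['.rb', '.ru', '.rake', '.gemspec'])
```

Written this way, the expression expects at least one extension: for an empty array, `extensions[1..-1]` is `nil` and the call to `sort` raises `NoMethodError`.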