Expose methods for determining MIME type
It's useful for developers building a file attachment flow to be able to
extract the MIME type from an IO object and use that value for their own
purposes. Therefore we expose the `Shrine.determine_mime_type` and
`Shrine.mime_type_analyzers` methods:

  Shrine.determine_mime_type(io) # calls the defined analyzer
  #=> "image/jpeg"

  Shrine.mime_type_analyzers[:file].call(io) # calls a built-in analyzer
  #=> "image/jpeg"

As part of the refactoring we extract MIME type analysis from the uploader
into a separate class, to avoid polluting the uploader with methods.

We also remove the :magic_header plugin option, because users shouldn't
have to know about it. If the magic header size turns out to be too small
for certain file types, we'll bump the number up in Shrine directly. The
:magic_header option wasn't documented.
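
To illustrate why a fixed-size "magic header" is usually enough, here is a minimal sketch of content sniffing in plain Ruby. It is not how Shrine does it (Shrine hands the header to `file`, ruby-filemagic, or MimeMagic); the `SIGNATURES` table and the `sniff_mime_type` helper are hypothetical names introduced only for this example. The header size mirrors the `MAGIC_NUMBER` constant in this commit.

```ruby
require "stringio"

# Hypothetical signature table for illustration; Shrine itself delegates
# detection to external tools instead of keeping a table like this.
SIGNATURES = {
  "\xFF\xD8\xFF".b      => "image/jpeg",
  "\x89PNG\r\n\x1A\n".b => "image/png",
  "GIF8".b              => "image/gif",
  "%PDF-".b             => "application/pdf",
}.freeze

# Same size as MAGIC_NUMBER in this commit (256KB).
HEADER_SIZE = 256 * 1024

# Hypothetical helper: reads at most HEADER_SIZE bytes, rewinds the IO so
# later readers start from the beginning, and matches known signatures.
def sniff_mime_type(io)
  header = io.read(HEADER_SIZE).to_s.b # read returns nil at EOF, hence to_s
  io.rewind

  match = SIGNATURES.find { |magic, _type| header.start_with?(magic) }
  match && match.last
end

jpeg = StringIO.new("\xFF\xD8\xFF\xE0".b + "rest of file")
sniff_mime_type(jpeg)                       # a JPEG signature is recognized
sniff_mime_type(StringIO.new("plain text")) # unknown content yields nil
```

Only the first bytes are inspected, so the cost stays bounded regardless of file size, and the rewind keeps the IO usable for the actual upload.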
janko committed Apr 2, 2017
1 parent 5d0c393 commit b509e42
Showing 3 changed files with 99 additions and 54 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,9 @@
## HEAD

* Remove the undocumented `:magic_header` option from `determine_mime_type` plugin (@janko-m)

* Expose `Shrine.determine_mime_type` and `Shrine.mime_type_analyzers` in `determine_mime_type` plugin (@janko-m)

* Add `signature` plugin for calculating a SHA{1,256,384,512}/MD5/CRC32 hash of a file (@janko-m)

* Return the resolved plugin module when calling `Shrine.plugin` (@janko-m)
133 changes: 82 additions & 51 deletions lib/shrine/plugins/determine_mime_type.rb
@@ -1,16 +1,17 @@
class Shrine
module Plugins
# The `determine_mime_type` plugin stores the actual MIME type of the
# uploaded file.
# The `determine_mime_type` plugin allows you to determine and store the
# actual MIME type of the file analyzed from file content.
#
# plugin :determine_mime_type
#
# By default the UNIX [file] utility is used to determine the MIME type, but
# you can change it:
# By default the UNIX [file] utility is used to determine the MIME type,
# and the result is automatically written to the `mime_type` metadata
# field. You can choose a different built-in MIME type analyzer:
#
# plugin :determine_mime_type, analyzer: :filemagic
#
# The plugin accepts the following analyzers:
# The following analyzers are accepted:
#
# :file
# : (Default). Uses the [file] utility to determine the MIME type from file
@@ -33,32 +34,30 @@ module Plugins
# guaranteed to return the actual MIME type of the file.
#
# :default
# : Uses the default way of extracting the MIME type, and that is from the
# "Content-Type" request header, which might not hold the actual MIME type
# of the file.
#
# The `:mimemagic` analyzer can work on the IO object directly, so it will
# read however many bytes it needs, but to `:file` and `:filemagic`
# analyzers a fixed number of bytes is given (256KB by default), which can
# be changed with the `:magic_header` option.
#
# plugin :determine_mime_type, magic_header: 500*1024 # 500KB
# : Uses the default way of extracting the MIME type, and that is reading
# the `#content_type` attribute of the IO object, which might not hold
# the actual MIME type of the file.
#
# A single analyzer is not going to properly recognize all types of files,
# so you can build your own custom analyzer for your requirements, where
# you can combine the built-in analyzers.
#
# For example, if you want to accept .css, .js, .json, .csv, .xml, or
# similar text-based files, the `file` analyzer will detect all of these
# files as `text/plain`. So in that case you can additionally call the
# `mime_types` analyzer to determine the MIME type from file extension.
# you can combine the built-in analyzers. For example, if you want to
# correctly determine MIME type of .css, .js, .json, .csv, .xml, or similar
# text-based files, you can combine `file` and `mime_types` analyzers:
#
# plugin :determine_mime_type, analyzer: ->(io, analyzers) do
# mime_type = analyzers[:file].call(io)
# mime_type = analyzers[:mime_types].call(io) if mime_type == "text/plain"
# mime_type
# end
#
# You can also use methods for determining the MIME type directly:
#
# Shrine.determine_mime_type(io) # calls the defined analyzer
# #=> "image/jpeg"
#
# Shrine.mime_type_analyzers[:file].call(io) # calls a built-in analyzer
# #=> "image/jpeg"
#
# [file]: http://linux.die.net/man/1/file
# [Windows equivalent]: http://gnuwin32.sourceforge.net/packages/file.htm
# [ruby-filemagic]: https://github.com/blackwinter/ruby-filemagic
@@ -67,11 +66,26 @@ module Plugins
module DetermineMimeType
def self.configure(uploader, opts = {})
uploader.opts[:mime_type_analyzer] = opts.fetch(:analyzer, uploader.opts.fetch(:mime_type_analyzer, :file))
uploader.opts[:mime_type_magic_header] = opts.fetch(:magic_header, uploader.opts.fetch(:mime_type_magic_header, MAGIC_NUMBER))
end

# How many bytes we need to read in order to determine the MIME type.
MAGIC_NUMBER = 256 * 1024
module ClassMethods
def determine_mime_type(io)
analyzer = opts[:mime_type_analyzer]
analyzer = mime_type_analyzers[analyzer] if analyzer.is_a?(Symbol)
args = [io, mime_type_analyzers].take(analyzer.arity.abs)

mime_type = analyzer.call(*args)
io.rewind

mime_type
end

def mime_type_analyzers
@mime_type_analyzers ||= MimeTypeAnalyzer::SUPPORTED_TOOLS.inject({}) do |hash, tool|
hash.merge!(tool => MimeTypeAnalyzer.new(tool).method(:call))
end
end
end

module InstanceMethods
private
@@ -80,27 +94,43 @@ module InstanceMethods
# that value was already determined by this analyzer. Otherwise it calls
# a built-in analyzer or a custom one.
def extract_mime_type(io)
analyzer = opts[:mime_type_analyzer]
return super if analyzer == :default
if opts[:mime_type_analyzer] == :default
super
else
self.class.determine_mime_type(io)
end
end

analyzer = mime_type_analyzers[analyzer] if analyzer.is_a?(Symbol)
args = [io, mime_type_analyzers].take(analyzer.arity.abs)
def mime_type_analyzers
self.class.mime_type_analyzers
end
end

mime_type = analyzer.call(*args)
io.rewind
class MimeTypeAnalyzer
SUPPORTED_TOOLS = [:file, :filemagic, :mimemagic, :mime_types]
MAGIC_NUMBER = 256 * 1024

mime_type
attr_reader :tool

def initialize(tool)
raise ArgumentError, "unsupported mime type analyzer tool: #{tool}" unless SUPPORTED_TOOLS.include?(tool)

@tool = tool
end

def mime_type_analyzers
Hash.new { |hash, key| method(:"_extract_mime_type_with_#{key}") }
def call(io)
mime_type = send(:"extract_with_#{tool}", io)
io.rewind
mime_type
end

def _extract_mime_type_with_file(io)
private

def extract_with_file(io)
require "open3"

cmd = ["file", "--mime-type", "--brief", "-"]
options = {stdin_data: magic_header(io), binmode: true}
options = {stdin_data: io.read(MAGIC_NUMBER), binmode: true}

begin
stdout, stderr, status = Open3.capture3(*cmd, options)
@@ -114,26 +144,24 @@ def _extract_mime_type_with_file(io)
stdout.strip
end

def _extract_mime_type_with_mimemagic(io)
require "mimemagic"

mime = MimeMagic.by_magic(io)
io.rewind

mime.type if mime
end

def _extract_mime_type_with_filemagic(io)
def extract_with_filemagic(io)
require "filemagic"

filemagic = FileMagic.new(FileMagic::MAGIC_MIME_TYPE)
mime_type = filemagic.buffer(magic_header(io))
mime_type = filemagic.buffer(io.read(MAGIC_NUMBER))
filemagic.close

mime_type
end

def _extract_mime_type_with_mime_types(io)
def extract_with_mimemagic(io)
require "mimemagic"

mime = MimeMagic.by_magic(io)
mime.type if mime
end

def extract_with_mime_types(io)
begin
require "mime/types/columnar"
rescue LoadError
@@ -146,12 +174,15 @@ def _extract_mime_type_with_mime_types(io)
end
end

def magic_header(io)
content = io.read(opts[:mime_type_magic_header])
io.rewind
content
def extract_filename(io)
if io.respond_to?(:original_filename)
io.original_filename
elsif io.respond_to?(:path)
File.basename(io.path)
end
end
end

end

register_plugin(:determine_mime_type, DetermineMimeType)
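
The dispatch in `determine_mime_type` relies on `analyzer.arity.abs` to decide whether a custom analyzer receives only the IO or the IO plus the hash of built-in analyzers. The following standalone sketch reproduces that idea outside Shrine. The `ANALYZERS` constant and its stubbed lambdas are assumptions made for the example; the real built-in analyzers call out to `file` or the mime-types gem.

```ruby
require "stringio"

# Hypothetical stand-ins for the built-in analyzers, returning fixed values.
ANALYZERS = {
  file:       ->(io) { "text/plain" },
  mime_types: ->(io) { "application/json" },
}.freeze

# Mirrors the shape of the class method added in this commit: symbols are
# looked up in the registry, and callables get as many arguments as they
# declare (one or two).
def determine_mime_type(analyzer, io)
  analyzer = ANALYZERS.fetch(analyzer) if analyzer.is_a?(Symbol)
  args = [io, ANALYZERS].take(analyzer.arity.abs)

  mime_type = analyzer.call(*args)
  io.rewind

  mime_type
end

# A one-argument analyzer only sees the IO.
one_arg = ->(io) { "foo/bar" }

# A two-argument analyzer can fall back to another built-in analyzer.
two_args = ->(io, analyzers) do
  mime_type = analyzers[:file].call(io)
  mime_type = analyzers[:mime_types].call(io) if mime_type == "text/plain"
  mime_type
end

upload = StringIO.new('{"a": 1}')
determine_mime_type(:file, upload)    # uses the stubbed :file analyzer
determine_mime_type(one_arg, upload)  # called with (io)
determine_mime_type(two_args, upload) # called with (io, analyzers)
```

Trimming the argument list with `take` lets strict lambdas of either arity be used without raising `ArgumentError`.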
16 changes: 13 additions & 3 deletions test/plugin/determine_mime_type_test.rb
@@ -1,6 +1,7 @@
require "test_helper"
require "shrine/plugins/determine_mime_type"
require "stringio"
require "open3"

describe Shrine::Plugins::DetermineMimeType do
describe ":file analyzer" do
@@ -19,13 +20,11 @@
end

it "raises error if file command is not found" do
require "open3"
Open3.stubs(:capture3).raises(Errno::ENOENT)
assert_raises(Shrine::Error) { @uploader.send(:extract_mime_type, image) }
end

it "raises error if file command failed" do
require "open3"
failed_result = Open3.capture3("file", "--foo")
Open3.stubs(:capture3).returns(failed_result)
assert_raises(Shrine::Error) { @uploader.send(:extract_mime_type, image) }
@@ -34,7 +33,6 @@
it "fowards any warnings to stderr" do
assert_output(nil, "") { @uploader.send(:extract_mime_type, image) }

require "open3"
stderr_result = Open3.capture3("echo stderr 1>&2")
Open3.stubs(:capture3).returns(stderr_result)
assert_output(nil, "stderr\n") { @uploader.send(:extract_mime_type, image) }
@@ -118,4 +116,16 @@
@uploader.send(:extract_mime_type, file = image)
assert_equal 0, file.pos
end

it "provides class-level methods for extracting metadata" do
@uploader = uploader { plugin :determine_mime_type, analyzer: ->(io) { "foo/bar" } }
mime_type = @uploader.class.determine_mime_type(fakeio)
assert_equal "foo/bar", mime_type

analyzers = @uploader.class.mime_type_analyzers
mime_type = analyzers[:file].call(fakeio(filename: "file.json"))
assert_equal "text/plain", mime_type
mime_type = analyzers[:mime_types].call(fakeio(filename: "file.json"))
assert_equal "application/json", mime_type
end
end
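
The last test expects the same `file.json` upload to come back as `text/plain` from the `:file` analyzer and as `application/json` from `:mime_types`, because the latter looks only at the filename. The sketch below shows that filename-based path in isolation. `extract_filename` follows the method added in this commit; `EXTENSION_TYPES`, `FakeUpload`, and `mime_type_from_extension` are hypothetical and stand in for the mime-types gem and the test suite's `fakeio` helper.

```ruby
require "stringio"

# Hypothetical, deliberately tiny table standing in for the mime-types gem.
EXTENSION_TYPES = {
  ".json" => "application/json",
  ".css"  => "text/css",
  ".csv"  => "text/csv",
}.freeze

# Hypothetical upload object exposing #original_filename, similar in spirit
# to what the test suite's fakeio helper provides.
class FakeUpload < StringIO
  attr_reader :original_filename

  def initialize(content, filename: nil)
    super(content)
    @original_filename = filename
  end
end

# Same lookup order as the extract_filename method in this commit.
def extract_filename(io)
  if io.respond_to?(:original_filename)
    io.original_filename
  elsif io.respond_to?(:path)
    File.basename(io.path)
  end
end

# Hypothetical helper: resolves the MIME type from the extension alone and
# never reads the content.
def mime_type_from_extension(io)
  filename = extract_filename(io)
  return nil unless filename

  EXTENSION_TYPES[File.extname(filename).downcase]
end

mime_type_from_extension(FakeUpload.new("{}", filename: "file.json"))
mime_type_from_extension(StringIO.new("{}")) # no filename available, so nil
```

Because the content is never inspected, a renamed file is reported by its new extension, which is why the plugin documentation suggests using this kind of lookup only as a fallback when content analysis returns `text/plain`.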
