2.2.1. Saving the filename and content-type

When a file is uploaded, there’s more involved than just the data inside the file. You might be interested in the filename and the content-type as well.

The uploaded file contains all the extras we’re looking for; we just have to know where to find the information and what to do with it.

The object returned by CGI has two methods that we’re particularly interested in: original_filename() and content_type().
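To get a feel for these two methods without a running application, here is a minimal sketch that uses a stand-in object. FakeUpload is a hypothetical class invented for this illustration; in a real request the framework supplies the uploaded-file object for you.

```ruby
require 'stringio'

# Hypothetical stand-in for the uploaded-file object. A real upload is
# supplied by the framework, not built by hand like this.
class FakeUpload < StringIO
  attr_reader :original_filename, :content_type

  def initialize(data, original_filename, content_type)
    super(data)
    @original_filename = original_filename
    @content_type = content_type
  end
end

upload = FakeUpload.new("fake image bytes", "My File.jpg", "image/jpeg")

puts upload.original_filename  # the name the browser sent
puts upload.content_type       # the MIME type the browser claimed
puts upload.read               # the data itself is read like any other IO
```

Both values come from the client, so they are best treated as hints rather than verified facts.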

Example

class PictureController < ApplicationController
  ...
  def new
    @picture = Picture.new()
    @picture.uploaded_file = @params['picture_file']
    @picture.save
    redirect_to :action => 'index'
  end
  ...
end

class Picture < ActiveRecord::Base
  ...
  def uploaded_file=(incoming_file)
    self.file_name = incoming_file.original_filename
    self.content_type = incoming_file.content_type
    self.file = incoming_file
  end

  def file_name=(new_file_name)
    write_attribute("file_name", sanitize_filename(new_file_name))
  end

 private

  def sanitize_filename(file_name)
    # get only the filename, not the whole path (from IE)
    just_filename = File.basename(file_name) 
    # replace everything except word characters, periods and hyphens with underscores
    just_filename.gsub(/[^\w\.\-]/,'_')
  end
  ...
end

The approach taken in the solution above encapsulates the file-handling logic in the model instead of the controller. The controller’s only duty is to instantiate a new Picture and hand over the uploaded file.

  # in PictureController
  @picture = Picture.new()
  @picture.uploaded_file = @params['picture_file']

Special Treatments

Because an uploaded file is not a simple object, and because we want to retrieve certain information from it, a custom setter method was written in the Picture model to handle processing the uploaded file.

  # in Picture class
  def uploaded_file=(incoming_file)
    self.file_name = incoming_file.original_filename
    self.content_type = incoming_file.content_type
    self.file = incoming_file
  end

This setter does three things:
  • retrieve and assign the original filename (using the original_filename method)
  • retrieve and assign the content-type (using the content_type method)
  • assign the file itself (to be processed elsewhere)
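The three steps can be tried outside Rails with a small sketch. Upload and PictureSketch are hypothetical stand-ins (a Struct instead of a real upload, a plain class instead of the ActiveRecord model), so this only illustrates the shape of the setter.

```ruby
# Hypothetical stand-ins: a Struct instead of a real upload, and a plain
# class with attr_accessor instead of the ActiveRecord model.
Upload = Struct.new(:original_filename, :content_type, :data)

class PictureSketch
  attr_accessor :file_name, :content_type, :file

  def uploaded_file=(incoming_file)
    # The explicit self. matters: a bare `file_name = ...` would create
    # a local variable instead of calling the setter.
    self.file_name = incoming_file.original_filename
    self.content_type = incoming_file.content_type
    self.file = incoming_file
  end
end

picture = PictureSketch.new
picture.uploaded_file = Upload.new("holiday.jpg", "image/jpeg", "bytes")

puts picture.file_name
puts picture.content_type
```

Because the assignments go through self, overriding file_name= later (as the model does) changes what gets stored without touching uploaded_file=.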

We’re not quite there yet. The incoming filename needs some attention, so another setter method (file_name=) cleans up the filename before assigning it to the model.

The sanitize_filename method does two things:
  • trim the filename down (removing any leading path; see below)
  • replace any funny characters with an underscore
  • replace any funny characters with an underscore
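To make the two steps concrete, here is the method run as a standalone function on a few sample names. The input strings are made up for illustration.

```ruby
def sanitize_filename(file_name)
  # step 1: keep only the last path component
  just_filename = File.basename(file_name)
  # step 2: anything that is not a word character, period or hyphen
  # becomes an underscore
  just_filename.gsub(/[^\w\.\-]/, '_')
end

p sanitize_filename("My File.jpg")         # => "My_File.jpg"
p sanitize_filename("../../etc/passwd")    # => "passwd"
p sanitize_filename("report (final).pdf")  # => "report__final_.pdf"
```

Note that each offending character gets its own underscore, so runs of punctuation turn into runs of underscores.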

Files uploaded from Internet Explorer

Internet Explorer includes the entire path of a file in the filename sent, so the original_filename routine will return something like:

C:\Documents and Files\user_name\Pictures\My File.jpg 
instead of just:
My File.jpg

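File.basename handles this by stripping everything before the filename, provided Ruby treats the backslash as a path separator, which it does on Windows. On other platforms File.basename splits only on forward slashes, so the path is left intact and its backslashes, colon and spaces are later replaced with underscores.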

def sanitize_filename(file_name)
  # get only the filename, not the whole path (from IE)
  just_filename = File.basename(file_name)
  # replace everything except word characters, periods and hyphens with underscores
  just_filename.gsub(/[^\w\.\-]/,'_')
end
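Since the way File.basename treats backslashes depends on the platform the server runs on, a variant that strips both kinds of separator explicitly may be safer. The following is a minimal sketch, not a tested replacement; sanitize_filename_portable is a hypothetical name.

```ruby
def sanitize_filename_portable(file_name)
  # Keep only what follows the last slash or backslash, regardless of
  # which platform the server runs on.
  just_filename = file_name.to_s.split(/[\\\/]/).last.to_s
  # Same character clean-up as before.
  just_filename.gsub(/[^\w\.\-]/, '_')
end

ie_name = 'C:\Documents and Files\user_name\Pictures\My File.jpg'

p sanitize_filename_portable(ie_name)                   # => "My_File.jpg"
p sanitize_filename_portable("/home/user/My File.jpg")  # => "My_File.jpg"
p sanitize_filename_portable("plain.txt")               # => "plain.txt"
```

The trade-off is that a backslash can no longer appear in a stored filename at all, which is usually what you want for uploads.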

In the solution above, this is a method of the Picture class. In practice, if several classes deal with uploaded files, the method is better placed in a helper or library.
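One way to share the method is a module that each class mixes in. This is only a sketch: FilenameSanitizer and Document are hypothetical names, and Document is a plain class standing in for a second model.

```ruby
# Hypothetical module; it could live in a library file that the
# application loads.
module FilenameSanitizer
  private

  def sanitize_filename(file_name)
    just_filename = File.basename(file_name)
    just_filename.gsub(/[^\w\.\-]/, '_')
  end
end

# Hypothetical second class that also deals with uploaded files.
class Document
  include FilenameSanitizer

  attr_reader :file_name

  def file_name=(new_file_name)
    @file_name = sanitize_filename(new_file_name)
  end
end

doc = Document.new
doc.file_name = "Annual Report.pdf"
p doc.file_name  # => "Annual_Report.pdf"
```

Keeping the method private in the module means it stays an implementation detail of each including class, as it was in Picture.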

What’s missing?

  • validation
  • saving the file content, either to the filesystem or a database
  • a bulletproof sanitize_filename method. This one was hastily written and is not well tested.