It's common to write scripts that convert files from one format to another. Everybody has written tons of utilities with an interface like water_to_wine_converter source.water target.wine or some such. That's the easy part. Pretty much every time, the next step follows immediately - well, what if I want to convert a whole directory of those?
I keep running into this problem over and over, and the solutions I end up writing have converged on a fairly similar form every time, so I thought I'd write about it.
Well, first, we're going to require the pathname library. Once upon a time I used to just use Strings for file paths, but the more I use Pathname the more I like it. It lacks a bunch of methods I often need, and could definitely get better, but it's still a big improvement over using raw Strings.
I'll assume for simplicity we don't need any fancy command line argument processing, but if you do, it doesn't change the rest of the pattern.
require "pathname"

class WaterToWineConverter
  def initialize(input_path, output_path)
    @input_path = input_path
    @output_path = output_path
  end
  # Actual code
end

unless ARGV.size == 2
  STDERR.puts "Converts water to wine format"
  STDERR.puts "Usage #{$0} deck.water deck.wine"
  STDERR.puts " or #{$0} water_folder/ wine_folder/"
  exit 1
end

input_path = Pathname(ARGV[0])
output_path = Pathname(ARGV[1])
WaterToWineConverter.new(input_path, output_path).run!
So far so good. Alternatively, we could pass raw Strings to the constructor and convert them to Pathname there:
require "pathname"

class WaterToWineConverter
  def initialize(input_path, output_path)
    @input_path = Pathname(input_path)
    @output_path = Pathname(output_path)
  end
  # Actual code
end

unless ARGV.size == 2
  STDERR.puts "Converts water to wine format"
  STDERR.puts "Usage #{$0} deck.water deck.wine"
  STDERR.puts " or #{$0} water_folder/ wine_folder/"
  exit 1
end

WaterToWineConverter.new(ARGV[0], ARGV[1]).run!
You might even do both just for extra robustness - passing a Pathname object to Pathname() works just fine.
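A quick way to convince yourself of that, as a minimal sketch (the path is made up for illustration):

```ruby
require "pathname"

# Pathname() accepts either a String or an existing Pathname,
# so wrapping a value twice is harmless.
from_string   = Pathname("water_folder/deck.water")
from_pathname = Pathname(from_string)

puts from_string == from_pathname
puts from_pathname.class
```

So the constructor can call Pathname() unconditionally, whatever the caller passed in.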
Use of Pathname is usually a matter of preference, but in this case it's part of the pattern.
Well, let's write the #run! method:
class WaterToWineConverter
  def run!
    if @input_path.directory?
      @input_path.find do |source_path|
        next if source_path.directory?
        target_path = map_path(source_path)
        next if target_path.exist?
        target_path.parent.mkpath
        convert!(source_path, target_path)
      end
    else
      convert!(@input_path, @output_path)
    end
  end
end
That's some nice code. If the input path is a file, we just call the convert! method.
If it's a directory, we use #find to walk every file in the input directory, use map_path to decide where each file goes, and create the folder for that file if it doesn't exist yet. target_path.parent.mkpath is an extremely common pattern that frees you from ever worrying about whether directories exist or not. Just do that before you open any file for writing and you're good to go.
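Here's the mkpath pattern on its own, as a small sketch. It runs in a temporary directory, and the nested folder names are invented for the example:

```ruby
require "pathname"
require "tmpdir"

contents = Dir.mktmpdir do |dir|
  # Hypothetical nested target; none of the intermediate folders exist yet.
  target_path = Pathname(dir) + "wine_folder/reds/2015/deck.wine"

  # Creates wine_folder/reds/2015 along with any missing parents.
  target_path.parent.mkpath
  # Calling it again when the folders already exist is safe.
  target_path.parent.mkpath

  target_path.write("converted contents")
  target_path.read
end

puts contents
```

Because mkpath doesn't complain about folders that already exist, there's no need to check for them first.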
In this example we decided to skip (next) any file whose target already exists - this is common if you're trying to synchronize two directories, let's say converting your .epubs to .mobis, and you don't want to redo this work. But we could just as well decide to overwrite, raise an exception, print a warning, or whatever makes the most sense.
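If you want that choice to be configurable, one option is to move it into a small helper. This is only a sketch: write_target? and the policy names (:skip, :overwrite, :warn, :raise) are hypothetical and not part of the converter above.

```ruby
require "pathname"
require "tmpdir"

# Hypothetical helper: decides whether a target should be (re)written.
# The policy names are illustrative, pick whatever suits your tool.
def write_target?(target_path, policy)
  return true unless target_path.exist?

  case policy
  when :skip      then false
  when :overwrite then true
  when :warn
    warn "#{target_path} already exists, skipping"
    false
  when :raise
    raise "#{target_path} already exists"
  else
    raise ArgumentError, "unknown policy: #{policy.inspect}"
  end
end

results = Dir.mktmpdir do |dir|
  existing = Pathname(dir) + "deck.wine"
  existing.write("old contents")
  missing = Pathname(dir) + "other.wine"

  {
    skip_existing:      write_target?(existing, :skip),
    overwrite_existing: write_target?(existing, :overwrite),
    skip_missing:       write_target?(missing, :skip),
  }
end

p results
```

In #run!, the line `next if target_path.exist?` would then become something like `next unless write_target?(target_path, policy)`.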
convert!(source_path, target_path) is just a straightforward method that doesn't need to care about any of that - by the time it's called, the target is safe to write, its directory has been created, and so on.
Now the last remaining part of the pattern is to write the #map_path(path) method. If both source and target use the same extension, it's really simple thanks to the power of Pathname:
class WaterToWineConverter
  def map_path(path)
    @output_path + path.relative_path_from(@input_path)
  end
end
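To see what that mapping does, here's a short sketch with made-up folder names. Nothing is read from or written to disk:

```ruby
require "pathname"

# Hypothetical folders, used only to illustrate the mapping.
input_path  = Pathname("water_folder")
output_path = Pathname("wine_folder")
source_path = Pathname("water_folder/reds/2015/deck.water")

# The part of the source path below the input folder...
relative = source_path.relative_path_from(input_path)
# ...re-rooted under the output folder.
target_path = output_path + relative

puts relative     # reds/2015/deck.water
puts target_path  # wine_folder/reds/2015/deck.water
```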
Unfortunately it's not quite as simple if we need to change the extension as well. I feel like Pathname should gain a few more methods, especially for file extension manipulation, but we'll avoid monkeypatching and do it the hard way.
Fortunately it's not too messy if we're only working with one extension type, and the somewhat ugly bit is encapsulated in one method:
class WaterToWineConverter
  def map_path(path)
    @output_path +
      path.relative_path_from(@input_path).dirname +
      "#{path.basename(".water")}.wine"
  end
end
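For what it's worth, Pathname does ship a sub_ext method that replaces a path's extension, so a shorter map_path may be an option. The sketch below compares the two approaches on made-up paths. Treat it as an alternative to consider, not a drop-in replacement:

```ruby
require "pathname"

# Hypothetical folders and file, used only for illustration.
input_path  = Pathname("water_folder")
output_path = Pathname("wine_folder")
source_path = Pathname("water_folder/reds/deck.water")

# The approach above: strip ".water" from the basename, then append ".wine".
by_basename = output_path +
  source_path.relative_path_from(input_path).dirname +
  "#{source_path.basename(".water")}.wine"

# Alternative: Pathname#sub_ext swaps whatever extension the path has.
by_sub_ext = output_path + source_path.relative_path_from(input_path).sub_ext(".wine")

puts by_basename
puts by_sub_ext
```

The two differ for files that don't end in .water: the basename version would turn notes.txt into notes.txt.wine, while sub_ext would produce notes.wine. Which one you want depends on what should happen to stray files in the input folder.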
And that's it. It's fairly short, elegant (except for that extension changing part), and robust code that's easy to adapt to pretty much every converter's needs.
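Putting the pieces together, here's a self-contained sketch that runs the whole pattern against a temporary directory. The convert! body (replacing the word "water" with "wine") and the sample file names are placeholders invented for the demo, so swap in your real conversion logic:

```ruby
require "pathname"
require "tmpdir"

class WaterToWineConverter
  def initialize(input_path, output_path)
    @input_path = Pathname(input_path)
    @output_path = Pathname(output_path)
  end

  def run!
    if @input_path.directory?
      @input_path.find do |source_path|
        next if source_path.directory?
        target_path = map_path(source_path)
        next if target_path.exist?
        target_path.parent.mkpath
        convert!(source_path, target_path)
      end
    else
      convert!(@input_path, @output_path)
    end
  end

  def map_path(path)
    @output_path +
      path.relative_path_from(@input_path).dirname +
      "#{path.basename(".water")}.wine"
  end

  # Placeholder conversion, purely an assumption for this demo.
  def convert!(source_path, target_path)
    target_path.write(source_path.read.gsub("water", "wine"))
  end
end

converted = Dir.mktmpdir do |dir|
  input = Pathname(dir) + "water_folder"
  output = Pathname(dir) + "wine_folder"

  # Build a tiny input tree with one nested folder.
  (input + "reds").mkpath
  (input + "deck.water").write("plain water")
  (input + "reds/cellar.water").write("more water")

  WaterToWineConverter.new(input, output).run!

  # Collect [relative path, contents] for every file that was produced.
  files = []
  output.find do |path|
    next if path.directory?
    files << [path.relative_path_from(output).to_s, path.read]
  end
  files.sort
end

p converted
```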
Instead of using + use File.join.
Anonymous: Never ever use File.join, forget it even exists. Pathname#+ is superior to methods like File.join in every way.