Here's the sequel to my "collection of small Unix utilities written in Ruby" post and GitHub repository.
Useful technique - Pathname
One thing I forgot to mention last time is the Pathname library. Pathname is an object-oriented way to look at paths in a file system. A Pathname object is not the same as a File or Directory object since it's not opened - and might not even exist yet. It's also not like String since it has all the filesystem awareness.
For very simple scripts it's fine to use just plain Strings to represent filesystem paths, but once it gets a bit more complicated your script will get a lot more readable with Pathname - and it costs you nothing.
Let's just look at the fix_permissions utility. Here's the core part:
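require 'pathname'
require 'shellwords'

class Pathname
  # True if the file starts with a shebang.
  def script?
    read(2) == "#!"
  end

  # Description of the file as reported by the file utility.
  def file_type
    `file -b #{self.to_s.shellescape}`.chomp
  end

  def should_be_executable?
    script? or file_type =~ /\b(Mach-O|executable)\b/
  end
end

# Remove the +x flag from every regular file under path that should not have it.
def fix_permissions(path)
  Pathname(path).find do |fn|
    next if fn.directory?
    next if fn.symlink?
    next unless fn.executable?
    fn.chmod(0644) unless fn.should_be_executable?
  end
end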
Pathname converts itself to a String path implicitly (through #to_str in Ruby 1.8, #to_path in 1.9 and later), so it can be used transparently in most contexts where a String is expected - including printing it and file operations. You'll rarely need to use #to_s - mostly when you want to regexp it or, on newer Rubies, when passing it as an argument to system/exec commands.
I feel Pathname#shellescape should exist, but since it doesn't that's one place where you need to use .to_s.shellescape for now.
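Here's a minimal sketch of that transparency. The helper name and the file name are made up for the illustration, and it assumes a Ruby recent enough to have File.write:

```ruby
require 'pathname'
require 'tmpdir'

# Hypothetical helper, only for this illustration.
def first_two_bytes(dir)
  path = Pathname(dir) + "hello.sh"   # Pathname + String is still a Pathname
  File.write(path, "#!/bin/sh\n")     # File methods take the Pathname as-is
  puts "wrote #{path}"                # interpolation calls #to_s for us
  puts "looks like a shell script" if path.to_s =~ /\.sh\z/  # regexps need #to_s
  File.read(path, 2)
end

Dir.mktmpdir { |dir| puts first_two_bytes(dir) }
```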
So what does this script do? First we add a few methods to the Pathname class. It already knows if something is a directory?, symlink?, and executable? (that is - has the +x flag).
We want to know if it is a script. And that's easy - just read(2) as if it were a File to read the first two bytes. It looks much more elegant than the File.read(path, 2) == "#!" we'd need if we used Strings - not to mention that the String class is really no place for a #script? method, so we'd probably use a standalone procedure.
Next let's make file_type method - and use #shellescape to do it safely. Unfortunately that one is only defined on Strings.
After that it's just one regexp away from should_be_executable?.
Once that's defined, notice how easy it is to dig into directory trees with Pathname#find, then use a few #query? methods to ask the path what it is about, and #chmod to set up proper flags.
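The same find-plus-queries pattern works for a read-only pass too. As a rough sketch, here's a hypothetical executable_files helper (not part of the repository) that only lists what has the +x flag, tried on a throwaway directory. It assumes Ruby 2.1 or later for Pathname#write:

```ruby
require 'pathname'
require 'tmpdir'

# Hypothetical helper: regular files under root that have the +x flag.
def executable_files(root)
  found = []
  Pathname(root).find do |fn|
    next if fn.directory?
    next if fn.symlink?
    found << fn if fn.executable?
  end
  found
end

Dir.mktmpdir do |dir|
  root = Pathname(dir)
  (root + "bin").mkpath
  (root + "bin/run.sh").write("#!/bin/sh\necho hi\n")
  (root + "bin/run.sh").chmod(0755)
  (root + "notes.txt").write("plain text\n")
  (root + "notes.txt").chmod(0644)

  # Print paths relative to the root we started from.
  executable_files(root).each { |fn| puts fn.relative_path_from(root) }
end
```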
Other very useful methods not present in the script are + for adding relative paths, #basename/#dirname for splitting it into components, and #relative_path_from for creating relative paths.
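A quick illustration of those methods, using made-up paths (nothing here touches the disk):

```ruby
require 'pathname'

base = Pathname("/home/user/projects")
tool = base + "tools/bin/colcut"            # + joins a relative path

puts tool                                   # /home/user/projects/tools/bin/colcut
puts tool.basename                          # colcut
puts tool.dirname                           # /home/user/projects/tools/bin
puts tool.relative_path_from(base)          # tools/bin/colcut
puts base.relative_path_from(tool.dirname)  # ../..
```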
While I'm at it, use URI objects for URIs you want to do something complicated with rather than regexping them - usually your code will look better too.
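For example, something along these lines instead of a regexp. The post_id helper is hypothetical, and it assumes the id is the path segment right after "post", as in the URL used further down:

```ruby
require 'uri'

# Hypothetical helper: pull the id out of a .../post/<id>/... URL.
def post_id(url)
  segments = URI.parse(url).path.split("/").reject(&:empty?)
  segments[0] == "post" ? segments[1] : nil
end

uri = URI.parse("http://taw.soup.io/post/307955954/Image")
puts uri.host   # taw.soup.io
puts uri.path   # /post/307955954/Image
puts post_id("http://taw.soup.io/post/307955954/Image")   # 307955954
```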
Individual commands
colcut
Cuts long lines to a specific number of characters for easy previewing.
Usage example:
colcut 80 < file.xml
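The idea fits in a couple of lines. This is a sketch of the technique, not the script from the repository, and the colcut method name here is just for illustration:

```ruby
# Hypothetical re-implementation: keep at most width characters of each line.
def colcut(lines, width)
  lines.map { |line| line.chomp[0, width] }
end

puts colcut(["<a>short</a>\n", "<b>" + "x" * 100 + "</b>\n"], 10)
```

As a command it would be roughly puts colcut(STDIN.each_line, ARGV[0].to_i).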
fix_permissions
Removes executable flag from files which shouldn't have it. Useful for archives that went through a Windows system, zip archive, or other system not aware of Unix executable flag.
It never turns the +x flag on; it only removes it if a file neither starts with #! nor is an executable according to the file utility.
Usage example:
fix_permissions ~/Downloads
If no parameters are passed, it fixes permissions in the current directory.
progress
Displays progress for a piped file.
Usage examples:
cat /dev/urandom | progress | gzip >/dev/null
progress -l <file.txt | upload
By default it's in bytes mode. Use -l to specify line mode.
If progress is given a file on its input and it's in byte mode, it checks the file's size and uses that to display relative progress (like 18628608/104857600 [17%]). Otherwise it only displays the number of bytes/lines piped through.
You can also specify what counts as 100% explicitly:
progress 123456
progress 128m
progress -l 42042
It will happily go over 100% on display.
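A rough sketch of the two pieces involved: turning an argument like 128m into a number and formatting the status line. Both helpers are hypothetical, and reading k/m/g as powers of 1024 is an assumption (it does agree with the 104857600 in the example above being 100m):

```ruby
# Assumption: suffixes are powers of 1024.
SIZE_SUFFIXES = { "k" => 1024, "m" => 1024**2, "g" => 1024**3 }

# Hypothetical helper: "123456" -> 123456, "128m" -> 134217728.
def parse_size(text)
  match = text.downcase.match(/\A(\d+)([kmg]?)\z/)
  raise ArgumentError, "bad size: #{text}" unless match
  match[1].to_i * SIZE_SUFFIXES.fetch(match[2], 1)
end

# Hypothetical helper: a plain count without a total, count/total [percent%] with one.
def progress_line(done, total = nil)
  return done.to_s unless total && total > 0
  "#{done}/#{total} [#{done * 100 / total}%]"   # integer division, no cap at 100
end

puts progress_line(18628608, parse_size("100m"))   # 18628608/104857600 [17%]
puts progress_line(200, 100)                       # 200/100 [200%]
```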
since_soup
Link to soup posts starting from the post before the one specified.
Usage example:
since_soup http://taw.soup.io/post/307955954/Image
sortby
Sorts input by an arbitrary Ruby expression. A lot more flexible than the Unix sort utility.
Usage example:
sortby '$_.length' <file.txt
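The core of such a tool can be sketched like this. It's a guess at the technique rather than the actual script: each line is put into $_ (the same variable ruby -n uses) and the expression is eval'd to get the sort key. The method name is only for illustration:

```ruby
# Hypothetical sketch: sort lines by the value of a Ruby expression over $_.
# It evals whatever it is given, so only feed it expressions you trust.
def sortby(lines, expression)
  lines.sort_by do |line|
    $_ = line
    eval(expression)
  end
end

puts sortby(["ccc\n", "a\n", "bb\n"], '$_.length')
```

As a command it would be roughly puts sortby(STDIN.readlines, ARGV[0]).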
5 comments:
The pre-existing 'fmt' command does what your 'colcut' does and more and is likely much faster.
Unknown: They do different things - fmt reformats, I just brutally chop excess. It's mostly usable for reading machine-generated xml and json files after putting them through auto-indentation.
cut(1) does this as well. It does both characters (cut -c1-80) and fields (cut -f2,3).
Nice series of tools though :)
Julien: I guess I reimplemented that for no good reason then ;-)
It's not really surprising, often it's easier to write a Ruby or Perl one-liner than to find options for existing command to do the same thing.