For a follow-up post with some solutions, check this.
A while ago I wrote a list of big ideas for Ruby. But there's also a lot of small things it could do.
Kill 0 octal prefix
I'm limiting this list to backwards compatible changes, but this is one exception. It technically breaks backwards compatibility, but in reality it's far more likely to quietly fix bugs than to introduce them.
Here's a quick quiz. What does this code print:
p 0123
p "0123".to_i
p Integer("0123")
Now check in actual ruby and you'll see my point.
If the output wasn't what you expected, then it proves my point. The whole damn thing is far more likely to be an accident than intentional behaviour - especially in user input.
If you actually want octal - which nobody ever uses other than for Unix file permissions - use 0o123.
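For reference, here is a small sketch of how the same digits are read in each context. The variable names are only for illustration, and the extra Integer call with an explicit base is shown as one way to force decimal parsing of user input.

```ruby
# The same digits are interpreted differently depending on how they are read.
literal  = 0123                 # a leading 0 makes this an octal literal
lenient  = "0123".to_i          # String#to_i defaults to base 10
strict   = Integer("0123")      # Kernel#Integer honours the 0 prefix as octal
explicit = Integer("0123", 10)  # an explicit base forces decimal parsing
octal    = 0o123                # the unambiguous octal spelling

p literal, lenient, strict, explicit, octal
# prints 83, 123, 83, 123, 83 (one per line)
```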
Add missing Hash methods
Ruby has a bad habit of treating parts of the standard library as second class, and nowhere is it more mind-boggling than with Hash, which might be the most commonly used object after String.
It only just got transform_values in 2.4, which was probably the most necessary one.
Some other methods which I remember needing a lot are:
- Hash#compact
- Hash#compact!
- Hash#select_values
- Hash#select_values!
- Hash#reject_values
- Hash#reject_values!
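As a rough sketch of what the value-filtering methods could look like, here is a monkey patch built on the existing select and reject. The method names follow the list above; they are hypothetical and not part of Ruby, and the bang variants are left out for brevity.

```ruby
# Hypothetical additions, not part of Ruby's standard library.
class Hash
  # Keep only the pairs whose value satisfies the block.
  def select_values
    select { |_key, value| yield(value) }
  end

  # Drop the pairs whose value satisfies the block.
  def reject_values
    reject { |_key, value| yield(value) }
  end
end

scores = {alice: 10, bob: nil, carol: 0}
present = scores.reject_values(&:nil?)       # {alice: 10, carol: 0}
positive = present.select_values(&:positive?) # {alice: 10}
```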
Hash#zip
Technically Hash#zip will call Enumerable#zip so it returns something, but that something is completely meaningless.
I needed it crazy often. With a = {x: 1, y: 2} and b = {y: 3, z: 4}, I'd like to run a.zip(b) and get {x: [1, nil], y: [2, 3], z: [nil, 4]}, which I can then map or transform_values to merge them in a meaningful way.
The current workaround of (a.keys|b.keys).map{|k| [k, [a[k], b[k]]]}.to_h works, but good luck understanding this code if you run into it, so most people would probably just loop.
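One way to make that workaround readable is to give it a name. The helper below is a sketch only: zip_hashes is a made-up name, it is a plain method rather than a change to Hash, and it assumes neither hash has a default value.

```ruby
# Hypothetical helper (not a Hash method): pairs up values from two hashes by key.
# A key missing from one side gets nil in that position.
def zip_hashes(a, b)
  (a.keys | b.keys).map { |key| [key, [a[key], b[key]]] }.to_h
end

a = {x: 1, y: 2}
b = {y: 3, z: 4}

zipped = zip_hashes(a, b)
# => {x: [1, nil], y: [2, 3], z: [nil, 4]}

# Merge the pairs in a meaningful way, here by summing and treating nil as 0.
sums = zipped.transform_values { |left, right| (left || 0) + (right || 0) }
# => {x: 1, y: 5, z: 4}
```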
Enumerable#count_by
Here's a simple SQL query:
SELECT author, COUNT(*) count FROM posts GROUP BY author;
Now let's try doing this in ruby:
posts.count_by(&:author)
Well, there's nothing like it, so let's try to do it with existing API:
posts.group_by(&:author).map{|author, posts| [author, posts.size]}.to_h
For such a common operation having to do group_by / map / to_h feels real bad - and most people would just loop and += like we're coding in some javascript and not in a civilized language.
I'm not insisting on count_by - there could be a different solution (maybe some kind of posts.map(&:author).to_counts_hash).
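A possible shape for count_by, as a sketch only: the method is hypothetical and is added to Enumerable here purely for illustration, and the Post struct and sample data are made up.

```ruby
# Hypothetical method, not part of Ruby: counts elements by the block's result.
module Enumerable
  def count_by
    each_with_object(Hash.new(0)) { |item, counts| counts[yield(item)] += 1 }
  end
end

# Made-up sample data for the example.
Post = Struct.new(:author, :title)
posts = [
  Post.new("alice", "First"),
  Post.new("bob", "Second"),
  Post.new("alice", "Third"),
]

counts = posts.count_by(&:author)
# => {"alice" => 2, "bob" => 1}
```

On Ruby 2.7 and later, posts.map(&:author).tally gives the same counts without any patching.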
URI query parameters access
Ruby is an old language, and it added a bunch of networking related APIs back when the internet was young. I don't blame anyone for these APIs not being very good, but by now they really ought to be fixed or replaced.
One mindbogglingly missing feature is access to query parameters in URI objects to extract or modify them. The library treats the whole query as an opaque string with no structure, and I guess expects people to use regular expressions and manual URI.encode / URI.decode.
There are gems like Addressable::URI that provide the necessary functionality, and URI needs to either adapt or get replaced.
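Until then, the least painful stdlib route I know of is to round-trip the query string through URI.decode_www_form and URI.encode_www_form. This is a workaround sketch, not the structured access argued for above; the example URL is made up, and collapsing to a Hash loses repeated parameters.

```ruby
require "uri"

uri = URI.parse("https://example.com/search?q=ruby&page=2")

# decode_www_form returns an array of [key, value] pairs; to_h drops duplicates.
params = URI.decode_www_form(uri.query || "").to_h
current_page = params["page"]  # "2", always a String

# Modify a parameter and write the whole query string back.
params["page"] = "3"
uri.query = URI.encode_www_form(params)

puts uri  # https://example.com/search?q=ruby&page=3
```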
Replace net/http
It's a similar story of an API added back when the internet was young and we didn't know any better. By today's needs the API feels so bad that quite a few people literally use `curl ...`, and a lot more use one of a hundred replacement gems.
Just pick one of those gems, and make it the new official net/http. I doubt you can do worse than what's there now.
Again, I'm not blaming anyone, but it's time to move on. Python had urllib, urllib2, urllib3, and by now it's probably up to urllib42 or so.
Make bundler chill out about binding.pry
For better or worse bundler became the standard dependency manager for ruby, and pry its standard debugger.
But if you try to use require "pry"; binding.pry somewhere in your bundle exec enabled app, it will fail with LoadError: cannot load such file -- pry, so you either need to add pry to every single Gemfile, or edit the Gemfile and run bundle install every time you need to debug anything, then undo that afterwards.
I don't really care how that's done - by moving pry to the standard library, by some unbundled_require "pry", or by special casing pry - the current situation is just too silly.
Actually, just make binding.pry work without any require
I have this ~/.rubyrc.rb:
begin
  require "pry"
rescue LoadError
end
which I load with the RUBYOPT=-r/home/taw/.rubyrc.rb shell option.
It's such a nice quality of life improvement to type binding.pry instead of require "pry"; binding.pry, it really ought to be the default, whichever way that's implemented.
Pathname#glob
Pathname suffers from being treated as a second class part of the stdlib.
Check out this code for finding all big text files in path = Pathname("some/directory"):
path.glob("*/*.txt").select{|file| file.size > 1000}
Sadly this API is missing.
In this case you can use:
Dir.glob("#{path}/*/*.txt").map{|subpath| Pathname(subpath)}.select{|file| file.size > 1000}
which not only looks ugly, it would also fail if path contains any funny characters.
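A sketch of a safer workaround, assuming Ruby 2.5 or later, where Dir.glob accepts a base: keyword (those versions also ship Pathname#glob itself). Because the directory is passed separately from the pattern, characters such as [ or * in the directory name are not treated as glob syntax. glob_under is a made-up helper name.

```ruby
require "pathname"

# Hypothetical helper: glob for `pattern` relative to `dir` and return Pathnames.
# Assumes Ruby 2.5+ for the base: keyword of Dir.glob.
def glob_under(dir, pattern)
  dir = Pathname(dir)
  Dir.glob(pattern, base: dir.to_s).map { |relative| dir.join(relative) }
end

path = Pathname("some/directory")
big_text_files = glob_under(path, "*/*.txt").select { |file| file.size > 1000 }
```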
system should to_s its argument
If wait_time = 5 and uri = URI.parse("https://en.wikipedia.org/wiki/Fidget_Spinner"), then this code really ought to work:
system "wget", "-w", wait_time, uri
Instead we need to do this:
system "wget", "-w", wait_time.to_s, uri.to_s
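In the meantime a thin wrapper does the job. This is a sketch: system_with_to_s is a made-up name, and it deliberately ignores the forms of system that take an environment hash first or an options hash last, since those must not be stringified.

```ruby
# Hypothetical wrapper, not a core method: calls to_s on every argument
# before handing them to Kernel#system. Does not support the env-hash
# or options-hash forms of system.
def system_with_to_s(*args)
  system(*args.map(&:to_s))
end

# Usage would look like:
#   system_with_to_s "wget", "-w", wait_time, uri
```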
This is especially annoying with Pathname objects, which naturally are used as command line arguments all the time. Oh and at least for Pathnames it used to work in Ruby 1.8 before they removed Pathname#to_str, so it's not like I'm asking for anything crazy.
Ruby Object Notation
Serializing some data structures to send over to another program or save in a text file is a really useful feature, and it's surprising ruby doesn't have such functionality yet.
So people use crazy things like:
- Marshal - binary code, no guarantees of compatibility, no security, can't use outside Ruby
- YAML - there's no compatibility between every library's idea of what counts as "YAML", really horrible idea
- JSON - probably the best solution now, but not human readable, no comments, dumb ban on line final commas, and data loss on conversion
- JSON5 - fixes some of the problems with JSON, but still data loss on conversion
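To make the "data loss on conversion" complaint concrete, here is a small sketch of a JSON round trip using the json library from the stdlib. The sample hash is made up; the point is that symbols come back as strings, so the result no longer equals the original.

```ruby
require "json"

# Made-up sample data with symbol keys and a symbol value.
original = {status: :ok, retries: 3}

encoded = JSON.generate(original)
round_tripped = JSON.parse(encoded)

p encoded        # "{\"status\":\"ok\",\"retries\":3}"
p round_tripped  # {"status"=>"ok", "retries"=>3} - symbols became strings
p round_tripped == original  # false
```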
What we really need is Ruby Object Notation. It would basically:
- have strict standard
- have implementations in different languages
- with comments allowed, mandatory trailing commas before newline when generated, and other such sanity features
- Would use the same to_rbon / RBON.parse interface.
- And have some pretty printer.
- Support all standard Ruby objects which can be supported safely - so it could include Set.new(...), Time.new(...), URI.parse(...) etc., even though it'd actually treat them as grammar and not eval them directly.
- Optionally allow apps to explicitly support own classes, and handle missing ones with exceptions.
This is an unproven concept and it should be a gem somewhere, not part of the standard library, but I'm surprised it's not done yet.
11 comments:
posts.count_by(&:author) => posts.count(&:author)
binding.pry => binding.irb since 2.4
posts.count(&:author) will return number of posts with non-nil author, which is not even close to what we need.
binding.irb is like a small step in right direction, but irb is far too limiting compared to pry.
'posts.group(:author).count' will return a hash with post ids as keys and the count of comments as values.
Ack, that's ActiveRecord stuff, not Enumerable. My bad
I'm 100% with you on the octal and count_by suggestions. I see no reason why octal literals shouldn't follow the same pattern as hex literals. I've also done the group_by-map-count dance often enough to wish there was a count_by method for this.
I disagree with regards to Hash#zip. I don't know what you want to do with your hashes, but probably you could get away with passing a block for key conflicts to merge: Hash#merge { |key, old_val, new_val| ... }
Also regarding Hash#select_values or #reject_values.. just do hash.reject { |_, v| .... } ?
Kai:
So starting with 2.4 (or using hash-polyfill gem) you can now do posts.group_by(&:author).transform_values(&:count) which is much better than previous 3-step process.
select_values is mostly convenience feature so you can do .select_values(&:present?) instead of .select{|_,v| v.present?} Not a huge thing, but especially in long expression it will make things look nicer.
.zip is useful for merge conflicts, but a lot of other things as well. For example when tests fail you can quickly display differences with a.zip(b).select_values{|x,y| x != y}
There's a lot of cases like that.
Re: net/http I think https://github.com/httprb/http has the cleanest and most matured API.
As to count_by, I agree with the idea but the name is just as confusing, as demonstrated in the comments above already. :)
For serialization, I wouldn't use Ruby Object Notation anyway, it's a little bit of improvement over Marshal in terms of compatibility, but I fail to see any improvements in terms of security. Complexity is the enemy of security, and that's why JSON survived as a universal API format IMO.
Other than that, all good. :) Especially Pathname, yeah... I think it should be a first-class citizen in Ruby.
kenn: In different codebases I used different http libraries, and any of them is far better than what ruby is doing.
If you like Pathname, you might find this gem I wrote useful https://github.com/taw/pathname-glob
taw: I've used more than a dozen myself, but anything before http.rb had rough corners here and there to be a part of standard lib. I feel the pain because I have a couple of gems where I didn't want to include external dependency, and had to work with net/http. It's a PITA for sure...
Thanks for pointing me to pathname-glob! That's the biggest thing missing in Pathname. Great work!
I rarely find myself to agree so completely with another developer. Thanks!