Avoiding Inject
Posted by Adam Wiggins on April 07, 2008 at 01:30 PM
I was rather taken with inject when I first started getting serious with Ruby. It allows you to turn ugly, imperative-style loops requiring temporary variables into functional-style expressions. Like this:
total = 0
items.each do |item|
  total += item.price * item.quantity
end
total
Becomes:
items.inject(0) do |total, item|
  total += item.price * item.quantity
end
Much better. (Jay Fields expands on this, in case you're not already an inject junkie.)
And yet, somehow, this hasn't always sat quite right with me. I had to look up the syntax quite a bit when I was first learning it (I always wanted to do |item, total| rather than the reverse), and from time to time my use of inject seems to create subtle bugs that take me a while to figure out. It feels right in theory, but in practice something is a little off.
My partner and coding buddy Orion pointed out to me recently that map is actually a simpler solution in most cases. The trick is knowing the right Enumerable methods to go with it. Like sum, perfect for the example above:
items.map { |item| item.price * item.quantity }.sum
This breaks the process into two steps: extracting the information you want, then operating on the result set. (I daresay this might be a simple version of map/reduce.) Plus, it works elegantly for hashes, something that always frustrates me about inject. For example, turning a hash into a key-value string with inject is:
hash.inject("") do |string, key_value|
  string += "#{key_value[0]}=#{key_value[1]}\n"
end
With map, this is:
hash.map { |key, value| "#{key}=#{value}" }.join("\n")
The two-step process wins out on elegance here, too.
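For anyone who wants to try the comparison, here is a minimal runnable sketch. The Item struct and the sample data are made up for illustration (the post never defines them), and it assumes Ruby 2.4 or later, where sum is built in rather than supplied by ActiveSupport.

```ruby
# Hypothetical Item type and data, for illustration only.
Item = Struct.new(:price, :quantity)
items = [Item.new(5, 2), Item.new(10, 1), Item.new(3, 3)]

# inject: the value the block returns becomes the next total.
inject_total = items.inject(0) { |total, item| total + item.price * item.quantity }

# map, then sum: extract the values first, then combine them.
map_total = items.map { |item| item.price * item.quantity }.sum

# The same two-step split for a hash: map yields each pair as |key, value|.
params = { "name" => "adam", "lang" => "ruby" }
query = params.map { |key, value| "#{key}=#{value}" }.join("\n")

puts inject_total
puts map_total
puts query
```

Both totals come out the same; the difference is only in how much bookkeeping the block has to carry.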
Comments
There are 5 comments on this post.
Hi Adam,
it's also possible to write
Granted, in your example case, the "map" wins.
Cheers
I think the key thing is that inject is not intuitive to many programmers coming to Ruby. As a pattern, it's not needed frequently enough to justify wrapping one's head around it, and as a paradigm shift, Ruby doesn't offer enough similar features to make it worthwhile.
The name itself is fairly ambiguous ("inject" could mean many things), and the fact that it's not obvious what "inject" is doing when you look at an inject loop violates the principle of least surprise, and so when you're trying to remember how to use it, you have to look it up. I bet "inject" has a higher lookup-to-usage ratio than most other Ruby builtins. It would be easier to understand if it were called something less slick and concise, like "each_with_previous_return".
Incidentally all of the examples above overlook one of the primary features of inject: that the "accumulator" placeholder variable is set to the return value of the previous iteration of the loop. Thus in an inject loop,
total += item.price * item.quantity
is the same as:
total + item.price * item.quantity
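A small sketch of that point, with a hypothetical Item struct and made-up data. It also shows one way the return-value rule can bite, which may or may not be the kind of subtle bug the post has in mind: if the last expression in the block is not the accumulator, the next iteration receives whatever that expression returned.

```ruby
# Hypothetical Item type and data, for illustration only.
Item = Struct.new(:price, :quantity)
items = [Item.new(5, 2), Item.new(10, 1), Item.new(3, 3)]

# The assignment is redundant: only the block's return value is carried forward.
with_plus_equals = items.inject(0) { |total, item| total += item.price * item.quantity }
with_plus        = items.inject(0) { |total, item| total + item.price * item.quantity }

# Pitfall: the assignment returns the new count (an Integer), not the hash,
# so the second iteration calls fetch on an Integer and raises.
broken = begin
  %w[a b a].inject({}) { |hash, word| hash[word] = hash.fetch(word, 0) + 1 }
rescue NoMethodError => e
  e.class
end

# Fix: return the accumulator explicitly as the block's last expression.
fixed = %w[a b a].inject({}) do |hash, word|
  hash[word] = hash.fetch(word, 0) + 1
  hash
end
```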
. . .
As far as I can tell, all the obvious uses for inject are better expressed with other idioms (e.g., summing values, concatenating strings, etc.), and all of the less obvious uses are rather obtuse in an inject statement and would probably be easier to read with twice as many lines and no inject. For example:
peeps_by_age = peeps.inject({}) do |hash, person|
  if hash[person.age].nil?
    hash[person.age] = person
  else
    hash[person.age] = [hash[person.age], person]
  end
  hash
end
Or:
peeps.inject(File.new("test.txt", "w")) {|file, person| file
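As a rough illustration of the "more lines, no inject" alternative for the peeps_by_age example, here is a sketch with a hypothetical Person struct and made-up data. It assumes the intent is a list of people per age, so unlike the inject version above it always stores an array, even for a single person.

```ruby
# Hypothetical Person type and data, for illustration only.
Person = Struct.new(:name, :age)
peeps = [Person.new("Ann", 30), Person.new("Bob", 25), Person.new("Cy", 30)]

# Plain loop: no accumulator to thread through the block.
peeps_by_age = {}
peeps.each do |person|
  peeps_by_age[person.age] ||= []   # start a list the first time an age appears
  peeps_by_age[person.age] << person
end

# Enumerable#group_by builds the same structure in one call.
grouped = peeps.group_by(&:age)
```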
Probably the best case I can see for inject is when you want to keep track of something to alternate your responses somehow, such as:
peeps.inject(false) {|even, person| puts "#{(even ? "+++++" : "-----")} #{person.name}"; !even}
arguably more readable than
peeps.each_with_index {|person, index| puts "#{(index % 2 == 0 ? "+++++" : "-----")} #{person.name}"}
It starts to get messier with non-boolean values, in which case it's less clear why you're using inject:
prefix_chars = ["+", "-", "="]
peeps.inject(0) do |prefix_type, person|
  prefix = prefix_chars[prefix_type] * 10
  puts "#{prefix} #{person.name}"
  (prefix_type + 1) % 3
end
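For comparison, the three-prefix case can also be driven by the index instead of an accumulator. This is only a sketch: the Person struct and the names are invented, and it collects the lines into an array before printing so the result is easy to inspect.

```ruby
# Hypothetical Person type and data, for illustration only.
Person = Struct.new(:name)
peeps = %w[Ann Bob Cy Dee].map { |name| Person.new(name) }
prefix_chars = ["+", "-", "="]

# The index picks the prefix, so no state is passed between iterations.
lines = peeps.each_with_index.map do |person, index|
  prefix = prefix_chars[index % prefix_chars.size] * 10
  "#{prefix} #{person.name}"
end

puts lines
```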
For the particular example of this blog post, I probably would have defined a method #total on class Item, and then I can use [...items...].sum(&:total) instead (in ActiveSupport, Enumerable). This is better encapsulation.
But I do agree that, while #inject is slick, sometimes it breaches the line I draw between readable code versus concise code.
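The Item#total suggestion might be sketched like this. The class body and the sample data are assumptions, since the post only shows price and quantity being read. It relies on Ruby 2.4 or later, where Enumerable#sum accepts a block; on older versions the same call comes from ActiveSupport.

```ruby
# A minimal, hypothetical Item class; only price and quantity appear in the post.
class Item
  attr_reader :price, :quantity

  def initialize(price, quantity)
    @price = price
    @quantity = quantity
  end

  # The line-total calculation lives with the data it uses.
  def total
    price * quantity
  end
end

items = [Item.new(5, 2), Item.new(10, 1), Item.new(3, 3)]

# &:total turns the method name into a block, so each item reports its own total.
order_total = items.sum(&:total)
puts order_total
```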
Overdoing inject reminds me of what Martin Fowler wrote about overusing new ideas in his intro to Rake (http://martinfowler.com/articles/rake.html).
I agree you should not use #inject where #map is appropriate, but it still has its place. I think of #inject as the "accumulate" part of Enumerate -> Filter -> Map -> Accumulate. Section 2.2.3 of SICP (http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-15.html#%sec2.2.3) talks about representing algorithms with this pattern and claims that 90% of a scientific Fortran library fits it. Ruby already has a lot of ways to accumulate enumerables: sum, join, etc. I'm not saying you should reinvent the wheel. But when you need to write your own, seeing #inject says to me that you are reducing the list.
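A small sketch of that pipeline in Ruby, using the sum of the squares of the odd numbers from 1 to 10 as a stand-in problem (the numbers are arbitrary). The last line is a case where no built-in accumulator applies, so inject carries the reduction itself.

```ruby
numbers = (1..10)                               # enumerate
odds    = numbers.select(&:odd?)                # filter
squares = odds.map { |n| n * n }                # map
sum_of_odd_squares = squares.inject(0) { |sum, n| sum + n }  # accumulate

# A custom accumulation: the greatest common divisor of a whole list.
common_divisor = [12, 18, 30].inject { |a, b| a.gcd(b) }

puts sum_of_odd_squares
puts common_divisor
```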
I struggled w/inject at first too, but I think it has its uses. Part of the problem for me was that the name "inject" has never made sense. It doesn't imply anything about a sequence or iteration. Why didn't they just call it "accumulate"?