Thursday 8 January 2015

More Ruby idioms

In this post, I discuss three Ruby idioms I wrote as equivalents for Python code snippets that I had come across.

The first example is the for loop.
To loop over the indices of a list called "fields", you have the following code in Python:
for i in range(len(fields)):

I don't like this syntax. In fact, the for loop was one of the reasons I started looking at other languages. Fifteen years ago, I was using Perl for all my administrative tasks and glue programs. But beyond a point, the concise awesomeness of Perl lost its charm for me, as the language did not add more intuitive keywords such as class.

So I came to Python. But after using it for some time, I realised that calling the built-in range function just to write a for loop was not to my taste. There were other syntactic reasons too. For a long time, I even toyed with the idea of creating my own programming language.

Then I discovered Ruby. It had both Perlisms like $_ and / /, and C++-isms like the class keyword, and I simply embraced it. I dropped the idea of creating my own programming language, and so began my Ruby journey; I have been using it ever since for my small pastime programs and administrative tools.

Ok, that was a digression. Back to the for-loop snippet. In Ruby it would be as follows, and it reads like English:
0.upto fields.length-1 do |i|
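
For reference, here is a minimal runnable sketch of that loop with a body and its closing end, next to two alternatives from Ruby's core library, each_index and each_with_index. The contents of fields and the names of the result arrays are made up for illustration.

```ruby
# Sample data; the values are hypothetical.
fields = ["name", "age", "city"]

# The form from this post: count from 0 up to the last index.
upto_pairs = []
0.upto fields.length-1 do |i|
  upto_pairs << [i, fields[i]]
end

# Array#each_index yields the same indices without the length arithmetic.
index_pairs = []
fields.each_index do |i|
  index_pairs << [i, fields[i]]
end

# Array#each_with_index yields the element and its index together.
with_index_pairs = []
fields.each_with_index do |field, i|
  with_index_pairs << [i, field]
end

puts upto_pairs.inspect
```

All three loops visit the same index and element pairs; which one reads best is largely a matter of taste.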


The second example.
self.frequencies.setdefault(item, {})

The Python dictionary method setdefault takes two arguments: a key and an initial value. If the key does not exist in the dictionary, it is added with the initial value. Either way, the method returns the value now stored under the key. It looks like a setter, and it is one, but it is also a getter. Definitely not to my taste. By functionality, setdefault is really setValueIfKeyNotPresentElseGetValue.

There are many ways to do this in Ruby. The equivalent I came up with is clear and straightforward. It reads very much like English and needs no explanation. Note that it only does the setting; reading the value back is a separate @frequencies[item]. Once again, a reason to love Ruby.
@frequencies[item] = {} if not @frequencies.has_key? item
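
As a rough sketch of the "many ways" mentioned above, here is that line in runnable form next to two other core-Ruby options, the ||= operator and a Hash created with a default block. The data and the local variable names (frequencies, inner, auto) are assumptions for illustration; the original works on the instance variable @frequencies.

```ruby
# Hypothetical data for illustration.
frequencies = {}
item = "apple"

# The line from this post, using a local variable instead of @frequencies.
frequencies[item] = {} if not frequencies.has_key? item
frequencies[item]["red"] = 1

# ||= assigns and returns in one expression, which is closer to setdefault,
# but it also overwrites a stored nil or false.
inner = (frequencies["pear"] ||= {})
inner["green"] = 2

# A Hash created with a default block adds missing entries on first read.
auto = Hash.new { |hash, key| hash[key] = {} }
auto["plum"]["purple"] = 3

puts frequencies.inspect
puts auto.inspect
```

The has_key? form is the only one of the three that leaves an existing nil or false value alone and never creates an entry on a plain read, so it is the most conservative choice.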


The third and last example.
return sum(map(lambda v1, v2: abs(v1 - v2), vector1, vector2))

Structurally, this looks like the decorator pattern used in the Java I/O classes, with each piece wrapping the next. There are three pieces here. The lambda is the innermost one. Lambdas are anonymous functions. Here, it takes two arguments, v1 and v2, and returns their absolute difference. The call wrapping the lambda is map.

What does map do? It applies a function to each item of the iterable object(s); given two iterables, it passes one item from each to every call. In this case, it applies the absolute difference, computed by the anonymous lambda function, to the corresponding members of vector1 and vector2. The outermost call is sum, which adds up the items in a sequence. Here it sums up the absolute differences.

So finally, what we get is the sum of the absolute differences of the corresponding members of two numeric sequences. If you are a follower of data mining / analytics, you would recognize this as the Manhattan distance.

Ruby has lambdas too. But I tried something different (after a lot of searching on Stack Overflow and Google) and ended up with a very Rubyish idiom:
return vector1.map.with_index{|v, i| (v - vector2[i]).abs}.inject(:+)

Called without a block, map returns an enumerator rather than an array. The next method in the chain is with_index. Given an enumerator, with_index yields each original element together with its index to the block. In this example, we use the index to fetch the corresponding value in the second vector and take the absolute difference. map collects these differences into a new array, on which we call inject.
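
To make the chain easier to follow, here is a small sketch that splits it into its steps. The vector values are made up for illustration, and the variable names enum and differences are my own.

```ruby
# Hypothetical sample vectors.
vector1 = [1, 5, 9]
vector2 = [4, 2, 7]

# map without a block returns an Enumerator rather than an Array.
enum = vector1.map
puts enum.class

# with_index passes each element and its index to the block;
# map collects the block's return values into a new Array.
differences = vector1.map.with_index { |v, i| (v - vector2[i]).abs }
puts differences.inspect
```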

What does inject do? Its alias is reduce, and it reduces a collection to a single value by carrying an accumulated result from one iteration to the next. If you pass the name of a method (or an operator) as a symbol, that method is called on the accumulated value with each element in turn, and the result becomes the new accumulated value. In short, inject(:+) is an idiom for sum.
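
Here is a minimal sketch that wraps the idiom in a method so it can be run, alongside a variant built on zip. The method names manhattan and manhattan_zip and the sample vectors are hypothetical, not part of the original snippet. One caveat worth knowing: inject(:+) with no starting value returns nil for an empty array, so the second variant seeds the sum with 0.

```ruby
# The idiom from this post, wrapped in a method (the name is hypothetical).
def manhattan(vector1, vector2)
  vector1.map.with_index { |v, i| (v - vector2[i]).abs }.inject(:+)
end

# An alternative sketch: zip pairs up corresponding elements, and the
# 0 passed to inject is the starting value of the running total.
def manhattan_zip(vector1, vector2)
  vector1.zip(vector2).inject(0) { |total, (a, b)| total + (a - b).abs }
end

puts manhattan([1, 5, 9], [4, 2, 7])
puts manhattan_zip([1, 5, 9], [4, 2, 7])

# With empty input, the first form gives nil and the seeded form gives 0.
puts manhattan([], []).inspect
puts manhattan_zip([], []).inspect
```

Both methods assume the two vectors have the same length.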

Of course, you may want me to stick to lambda and sum and achieve the same result, but then we are not Pythonistas, are we?
