Functional Dreams
I’ve always been a fan of Rich Hickey, ever since I first heard him talk about what a gift it had been to have a year of focused time to work on a new programming language. Recently, while trying to resolve a race condition at write time, I was reading about database isolation levels, and I briefly wondered, “how would Datomic solve this? Would it be possible to solve the race condition at read time instead?” Along the way, I bumped into this fantastic post on Cognitect’s blog. I enjoyed it so much that I ordered Russ Olsen’s Clojure book as soon as I had finished reading it.
Why do I bring this up? I recently had another small project: a script to transform some data. The idea was to read in a multiline file, transform each line according to a set of rules, and write out a new file. I ended up with something mostly elegant, but for each of the steps I called Ruby’s map function:
def with_first_rule_applied(line)
  line # apply the rule and return a new line
end

def with_second_rule_applied(line)
  line # apply the rule and return a new line
end
My main method contained something like this:
lines
  .map { |line| with_first_rule_applied(line) }
  .map { |line| with_second_rule_applied(line) }
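For completeness, the overall shape of the script was just read, transform, write. Here’s a rough sketch with hypothetical filenames (input.txt and output.txt stand in for the real ones):

# A rough sketch of the overall script; filenames and rule bodies are placeholders.
lines = File.readlines("input.txt", chomp: true)

transformed = lines
  .map { |line| with_first_rule_applied(line) }
  .map { |line| with_second_rule_applied(line) }

File.write("output.txt", transformed.join("\n") + "\n")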
This was fine. The script worked. I had other things to do. But I kept wondering whether there was a more elegant way to build such a pipeline. It turns out there is! I specifically remembered the function compose(funcs) from Russ Olsen’s post above.
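I won’t try to reproduce the post’s code from memory, but translated into Ruby the idea behind compose is roughly this: take a list of functions and return one function that threads a value through each of them, left to right.

# A rough sketch of the compose idea, not the code from the post:
def compose(*functions)
  ->(value) { functions.reduce(value) { |result, function| function.call(result) } }
end

add_one = ->(x) { x + 1 }
double  = ->(x) { x * 2 }

compose(add_one, double).call(3) # => 8, i.e. (3 + 1) * 2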
A more functional approach
Let’s move to a completely contained example (in Ruby). Say we have some data:
data = [1, 2, 3]
It needs to be transformed according to some rules – in this case we will multiply, divide, and add to each value.
times_two = ->(x) { x * 2 }
divided_by_two = ->(x) { x / 2 }
plus_ten = ->(x) { x + 10 }
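As an aside, these are lambdas, and Ruby gives us a few equivalent ways to call one – the bracket form shows up in the original chain below, and .call shows up later inside the pipeline:

times_two.call(2) # => 4
times_two[2]      # => 4
times_two.(2)     # => 4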
This is what we originally had, and it works great:
original = data.map { |x| times_two[x] }
               .map { |x| divided_by_two[x] }
               .map { |x| plus_ten[x] }
However, we can compose a pipeline that will apply each function in succession:
def pipeline_of_maps(*functions)
  proc do |data|
    functions.reduce(data) do |processed_data, function|
      processed_data.map { |datum| function.call(datum) }
    end
  end
end
This returns a proc, which we call my_first_pipeline:
my_first_pipeline = pipeline_of_maps(
  times_two,
  divided_by_two,
  plus_ten
)
The pipeline takes in data and returns the transformed array.
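Concretely, the reduce inside pipeline_of_maps maps one function over the result of the previous pass:

# data:                  [1, 2, 3]
# after times_two:       [2, 4, 6]
# after divided_by_two:  [1, 2, 3]
# after plus_ten:        [11, 12, 13]
my_first_pipeline.call(data) # => [11, 12, 13]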
We can see that the two approaches yield the same result:
original == my_first_pipeline.call(data)
original == [11, 12, 13]
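A nice side effect of building the pipeline this way is that it is reusable – assuming the definitions above, the same proc works on any other array:

my_first_pipeline.call([4, 5, 6]) # => [14, 15, 16]
my_first_pipeline.call([100])     # => [110]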