A few days ago I discovered a feature in the redis gem called pipeline.
In the code for this post I'll use REDIS as the Redis connection, because that's the name I've seen in most Rails projects.
Given a connection, you can perform a set of queries like this:
5000.times do
  REDIS.lpush "test_#{rand(100)}", "hello world"
end
# About 3.3 seconds in heroku with redistogo
This pushes the value "hello world" 5000 times to a random key from test_0 to test_99 in the Redis database. It could be improved by doing a single mass push for each key, but even then you would still run up to 100 queries.
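The mass-push idea could look something like this. It's only a sketch: group_by_key and mass_push are names I made up for illustration, and it relies on lpush accepting an array of values, which the redis gem supports.

```ruby
# Hypothetical helper: pick a random key for each value, as in the loop above,
# but only build the groups in memory instead of talking to Redis.
def group_by_key(count, key_space)
  groups = Hash.new { |hash, key| hash[key] = [] }
  count.times { groups["test_#{rand(key_space)}"] << "hello world" }
  groups
end

# Hypothetical helper: one LPUSH per key, carrying all the values for that key.
def mass_push(client, groups)
  groups.each { |key, values| client.lpush(key, values) }
end

# Usage, assuming REDIS is the connection:
#   mass_push(REDIS, group_by_key(5000, 100))
```

This trades 5000 queries for at most 100, one per distinct key.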
If your Redis database is remote, the client will make a network round trip to the database 5000 times (or 100 with mass push), and the time spent waiting on the network adds up.
To solve that problem, the redis gem has pipelined queries. Let's see the same code, but pipelined:
REDIS.pipelined do
  5000.times do
    REDIS.lpush "test_#{rand(100)}", "hello world"
  end
end
# About 0.1 seconds in heroku with redistogo
This looks the same, right? Well, internally REDIS is not performing the queries when you call them; it's just saving the commands to an array. When the pipelined block is closed, it sends all the queued commands to the server in one go.
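To make the queueing idea concrete, here is a toy model in plain Ruby. It is not the redis gem's implementation: ToyConnection and ToyQueue are invented for this post, and nothing goes over the network; the class only counts how many round trips it would have made. In this toy the queue is handed to the block as an argument. If I'm not mistaken, newer versions of the redis gem pass a pipeline object to the block in the same way and expect you to call the commands on it instead of on REDIS.

```ruby
# Toy stand-in for a Redis connection. NOT the redis gem's implementation.
class ToyConnection
  attr_reader :round_trips

  def initialize
    @round_trips = 0
  end

  # Pretend to send a batch of commands over the network in one round trip.
  def send_batch(commands)
    @round_trips += 1
    commands.map { |name, *args| "reply to #{name} #{args.join(' ')}" }
  end

  # Outside a pipeline, every command is its own batch of one.
  def lpush(key, value)
    send_batch([[:lpush, key, value]]).first
  end

  # Inside a pipeline, commands are only queued; the whole batch is sent
  # when the block ends, and all the replies come back together.
  def pipelined
    queue = ToyQueue.new
    yield queue
    send_batch(queue.commands)
  end
end

# Collects commands in an array instead of performing them.
class ToyQueue
  attr_reader :commands

  def initialize
    @commands = []
  end

  def lpush(key, value)
    @commands << [:lpush, key, value]
    nil # there is no reply yet at this point
  end
end

plain = ToyConnection.new
5000.times { plain.lpush("test_#{rand(100)}", "hello world") }

piped = ToyConnection.new
piped.pipelined do |pipeline|
  5000.times { pipeline.lpush("test_#{rand(100)}", "hello world") }
end

puts plain.round_trips # 5000
puts piped.round_trips # 1
```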
It sounds perfect but, like everything, it has some cons. In a pipelined block you can't use the response of any Redis command, because the request has not been performed yet when you call it.
In the following example we can't use pipelines:
5000.times do |i|
  value = REDIS.get("test_#{i}")
  REDIS.lpush("all", value)
end
The REDIS.get within a pipeline doesn't return the value, so we can't wrap in a pipeline anything that reads data and then does something based on that data.
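One possible way around this, when the reads don't depend on each other, is to split the work into two pipelines: one for all the reads, then one for all the writes. This is a rough sketch, assuming that pipelined returns the array of replies once the block closes (the redis gem does this) and that the block receives the object to queue commands on. copy_to_all is a hypothetical helper name.

```ruby
# Hypothetical helper: copy the values of test_0..test_(count - 1) into "all"
# using two pipelines instead of 2 * count separate queries.
def copy_to_all(client, count)
  # Phase 1: queue every GET; the replies come back together as an array.
  values = client.pipelined do |pipeline|
    count.times { |i| pipeline.get("test_#{i}") }
  end

  # Phase 2: the values are plain Ruby data now, so the pushes can be queued too.
  # Missing keys come back as nil, just like in the loop above.
  client.pipelined do |pipeline|
    values.each { |value| pipeline.lpush("all", value) }
  end
end

# Usage, assuming REDIS is the connection:
#   copy_to_all(REDIS, 5000)
```

This only works because no single read depends on the result of another command in the same pipeline.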