We’ve all heard about caching. Usually, we associate caching with web development (at least I did): a way to temporarily store data as you browse the internet. But temporarily storing data can be used in so many applications outside of web development. Recently, I encountered an optimization problem where I had to run the same computation and comparison thousands of times. While researching ways to optimize it, caching came to mind. It hit me that there wasn’t an infinite number of pair combinations, and these pairs could be temporarily stored instead of computed every single time. I saw an incredible improvement in performance. I am talking about running long and tedious NLP flows in a matter of seconds. As new technologies emerge, especially bulky Transformers, it is crucial that we find ways to minimize unnecessary load on our servers and applications.

Given how quickly technologies come and go (we all remember when we tried to make Julia happen), it is all the more important to stick to basics that we know work, and caching is one of those techniques.

Let’s take a step back and get a better understanding of caching.

caching

Caching is a technique to store and access data in memory, which in turn alleviates the workload on your machine’s main data stores. You can think of it as a giant JSON object. Caching comes in many different flavors, but for this particular post we will focus on Redis. I found Redis to be incredibly straightforward to set up, big thumbs up for that! Redis is fast because the data is stored in RAM, and you can even set the data to expire after a given time depending on the use case of your application. Moreover, it stores data in key-value pairs, which is not unique to Redis but very useful nonetheless. It is important to mention that you can have more complex data structures inside a Redis cache, such as lists, sets, hashes, etc., though I wouldn’t recommend this since the goal here is speed and simplicity. Keep in mind that all this data should fit in memory. Redis is great for machines that have sufficient RAM for all of your data; otherwise, you have to split your process and risk losing information.

Here is a simple example of how you could use caching in an ML flow.

In theory, we have an application (a model of some sort) that does something (a calculation, a check, whatever you fancy). If the value hasn’t been seen before, the application computes it, stores it in the cache, and feeds it to the model; if the value has been seen, the application retrieves it from the cache and feeds it to the model. Keep in mind that this is temporary storage and you must decide what the best timeframe is to delete your cache.
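To make that flow concrete, here is a minimal sketch of the check-then-compute pattern. Everything named here is my own assumption for illustration: InMemoryCache is a hypothetical stand-in that only mimics the get and set(..., ex=...) calls of a Redis client so the snippet runs without a server, and get_or_compute, expensive_similarity, the key format, and the 0.87 score are made up.

```python
import time


class InMemoryCache:
    """Hypothetical stand-in that mimics get/set(ex=...) of a Redis client."""

    def __init__(self):
        self._data = {}  # key -> (value, expiry timestamp or None)

    def get(self, key):
        value, expires_at = self._data.get(key, (None, None))
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._data[key]  # entry has expired, so drop it
            return None
        return value

    def set(self, key, value, ex=None):
        expires_at = None if ex is None else time.monotonic() + ex
        self._data[key] = (value, expires_at)


def get_or_compute(cache, key, compute, ttl_seconds=86400):
    """Return the cached value for key, computing and storing it on a miss."""
    cached = cache.get(key)
    if cached is not None:
        return cached  # cache hit: skip the expensive work
    value = compute()  # cache miss: do the work once
    cache.set(key, value, ex=ttl_seconds)
    return value


calls = []


def expensive_similarity():
    calls.append(1)  # track how often the real computation runs
    return "0.87"    # made-up score, kept as a string like a decoded Redis value


cache = InMemoryCache()
first = get_or_compute(cache, "pair:cat|dog", expensive_similarity)
second = get_or_compute(cache, "pair:cat|dog", expensive_similarity)
print(first, second, len(calls))
```

The second call never reaches expensive_similarity, which is where the speed-up comes from. Since get_or_compute only relies on get and set with an ex argument, a real Redis client object could be passed in place of the stand-in.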

setting it up

First things first, here’s the official link to Redis.

Second, you can either run Redis locally or externally. If you run it locally, make sure to download the appropriate file.

After you’ve installed Redis, go to your terminal and install the Python client:

pip install redis

Once it is installed, let’s write a quick script.

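import redis

# Connect to a Redis server; replace host and port with your own values.
r = redis.Redis(
    host='localhost',        # hostname or IP address of the Redis server
    port=6379,               # Redis listens on 6379 by default
    decode_responses=True)   # return str instead of bytes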

Let’s break this apart.

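host is the hostname or IP address of your Redis server, e.g. “localhost” when it runs on your own machine.

port is the port the Redis server listens on; 6379 is the default.

decode_responses = True tells the client to decode responses into Python strings (UTF-8 by default). Without it, values come back as bytes.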
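If you switch between a local and an external Redis, one option is to keep these settings out of the script. The sketch below is only an illustration: the function name redis_kwargs and the environment variable names REDIS_HOST and REDIS_PORT are my own assumptions, not something Redis or its Python client defines.

```python
import os


def redis_kwargs(env=os.environ):
    """Build keyword arguments for redis.Redis from environment variables."""
    return {
        "host": env.get("REDIS_HOST", "localhost"),   # hypothetical variable name
        "port": int(env.get("REDIS_PORT", "6379")),   # hypothetical variable name
        "decode_responses": True,
    }


# An external server, described by a plain dict standing in for the environment.
print(redis_kwargs({"REDIS_HOST": "cache.example.internal", "REDIS_PORT": "6380"}))

# Nothing set: fall back to a local server on the default port.
print(redis_kwargs({}))
```

The resulting dictionary can be unpacked into the constructor, i.e. redis.Redis(**redis_kwargs()).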

Now, let’s go over a few of the key methods of Redis.

Let’s say we want to add to the cache and get a value back:

r.set('some_value', 0)        # store the value 0 under the key 'some_value'
value = r.get('some_value')   # comes back as the string '0'
print(value)
0

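As mentioned earlier, we can also store other types of data inside a Redis cache. Note that set only accepts a single key and a simple value (bytes, string, int, or float), so a dictionary goes into a hash with hset instead:

r.hset('some_hash', mapping={'some_value': 0, 'some_string': 'test'})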

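Now, let’s say we want to add a time for the cache to expire (in seconds). With set, pass the ex argument; for other data types such as hashes, call r.expire(key, seconds) on the key after writing it.

r.set('some_value', 0, ex=86400)   # the key is deleted after 24 hours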
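Going back to the pair comparisons from the introduction, here is a rough sketch of how a structured result could be turned into something set accepts. The helper names pair_key, encode, and decode, the "pair:" key prefix, and the sample result are all hypothetical, and the sketch assumes the items are strings that do not contain the "|" separator.

```python
import json


def pair_key(a, b):
    """Build one key per unordered pair, so (a, b) and (b, a) share an entry."""
    left, right = sorted([a, b])
    return f"pair:{left}|{right}"


def encode(result):
    """Serialize a dict to a JSON string that can be passed to set()."""
    return json.dumps(result, sort_keys=True)


def decode(payload):
    """Turn the string returned by get() back into a dict."""
    return json.loads(payload)


key = pair_key("dog", "cat")
payload = encode({"score": 0.87, "match": True})  # made-up comparison result
print(key)
print(payload)
print(decode(payload)["score"])
```

With a connected client, this would be used as r.set(key, payload, ex=86400) when storing and decode(r.get(key)) when reading, after checking that get did not return None.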

That’s a wrap for now. Feel free to drop me a message if you have any follow-up comments!

Posted by: Aisha Pectyo

Astrophysicist turned data rockstar who speaks code and has enough yarn and mod podge to survive a zombie apocalypse.
