Categories
Other PHP Programming

Creating a Flexible Caching Module

I work for a great company, I really do.  Sure we have our problems (like just coming out of the developer stone age), but overall I work with smart, friendly people who are passionate about what they do.  One of the things that always bugged me though was the lack of any sort of caching in our CMS.  Most times it’s not needed, but when an operation takes about 100 queries or so to finish, then it’s time to start caching.

Since I’m a bit of an efficiency freak, I thought I would take a crack at writing a flexible caching module that is easy for our developers to use.  So what do “easy” and “flexible” mean?  To be “easy”, the caching module must be usable by even a novice developer and have a limited number of options.  For instance, I ended up deciding that we really only need two public methods, and one public property.

  • $cache->exists – If “$cache” is a cache object, calling exists checks to see if the cached object already exists in the database.  It also checks to see if it’s expired or not.  If it’s expired or non-existent, it returns false.   It returns true if the cached object exists and is up to date.
  • $cache->put($val) – This is how you store something in cache.  It can be any type of serializable PHP object.  So basically, resource types are off limits but objects, arrays, variables, entire web pages, etc. can be used.
  • $cache->get() – This fetches the object stored  in the cache.  It handles the re-serialization of it as well, so it really makes things pretty idiot proof.

What about flexible?  Well, by that I mean we need to be able to transparently implement several different types of caching.  Since we’re just crawling out of the dark ages, I opted to implement a fallback caching mechanism.  Here’s how it works.

  • The programmer defines a variable in our settings area to be which caching option he/she wants to use.  Options are memcached, file, mysql.
  • If the setting isn’t defined, we try memcached by default.  This is by far the best caching system to use, so it makes sense to try it first.
  • If memcached fails, we go to a database caching schema.  While not nearly as good as using memcached, it’s possible it could save you tons of queries on your database.
  • If the user chooses file caching, we do that.  It’s a pretty bad idea to use in most cases, but may still have it’s uses.

So why did it come this?  It’s not that we host terribly high-volume sites, but that our CMS is super slow.  A full-on page will take about 4 seconds to load to your screen completely, and that’s running local on the network.  One of the main problems is that we use output bufferring extensively.  The ENTIRE FRIGGIN PAGE is buffered.  This has 3 side effects:

  1. Slight performance loss due to bufferring.
  2. Apparent page load time sucks because the browser has to wait for the entire page to be generated before getting output.
  3. Development is super easy because you don’t ever have to worry about output being sent before doing a call like “header()”.

We can’t remove output buffering unfortunately.  It’s at the very core of our CMS and development practices, so it just won’t work.  To get the load time to generate the page as low as possible, I decided that caching was needed.

So what sort of problems do we run in to with this caching module?  Glad you asked!  Many of the problems aren’t specific to this caching module, but to caching in general.  The quick list:

  • If the original query wasn’t complicated, it’s not worth storing the results.  The number of queries the caching module does in MySQL mode is 3.  If your initial query was less than that, or not a super-complex-mega-join, it’s not worth using.  This caveat goes away in memcached mode.
  • Smart naming and design.  You have to be very careful out what you cache, and when.  Remember, page content and queries probably change when a user is logged in or on a different device.  Just things to keep in mind.
  • Getting developers to use it.  Not everyone likes to learn, let alone change their habits.  The biggest barrier to this is getting people to use it.  Some people don’t care about efficiency either (sad, I know), but at least our system administrator thanks me.

The caching module seems to work pretty well too.  On one particularly SQL heavy page, I reduced page load time from 14 seconds (ridiculous) to just under 6 seconds (still bad, but getting better).

That’s it for now.  Any questions or comments are welcome.

By Jack Slingerland

Founder of Kernl.us. Working and living in Raleigh, NC. I manage a team of software engineers and work in Python, Django, TypeScript, Node.js, React+Redux, Angular, and PHP. I enjoy hanging out with my wife and son, lifting weights, and advancing Kernl.us in my free time.

2 replies on “Creating a Flexible Caching Module”

You already have an efficient caching mechanism you can enable which does not require any extra effort by developers. MySQL has a query cache built in and can use memcache to store those results in. Just enable it and everyone gets query caching for free.

Caveat: anything using datetime comparisons and NOW functions does not get cached. Since NOW goes down to micro seconds, it is constantly changing. You have to change those queries to truncate the time to the least granular that makes sense[for example, if your lazy and look for everything posted in the last week as NOW – 7 days, your doing microsecond calculation. Instead, find the date 7 days ago and then look for everything posted since midnight of that day….]

This is a good point. We currently don’t have query caching enabled, which I’m going to talk to our sys admin about.

The other use for the caching system is to store bigger HTML blocks that take awhile to generate due to the sheer volume of database calls that happen to make it. The bonus to using the build in query cache would be caching the cache 🙂

Comments are closed.