I work for a great company, I really do. Sure we have our problems (like just coming out of the developer stone age), but overall I work with smart, friendly people who are passionate about what they do. One of the things that always bugged me though was the lack of any sort of caching in our CMS. Most times it’s not needed, but when an operation takes about 100 queries or so to finish, then it’s time to start caching.
Since I’m a bit of an efficiency freak, I thought I would take a crack at writing a flexible caching module that is easy for our developers to use. So what do “easy” and “flexible” mean? To be “easy”, the caching module must be usable by even a novice developer and have a limited number of options. For instance, I ended up deciding that we really only need two public methods, and one public property.
- $cache->exists – If “$cache” is a cache object, reading exists checks whether the cached object already exists in the cache and whether it has expired. It returns true if the cached object exists and is up to date, and false if it’s expired or non-existent.
- $cache->put($val) – This is how you store something in the cache. It can be any serializable PHP value. So basically, resource types are off limits, but objects, arrays, variables, entire web pages, etc. can be used.
- $cache->get() – This fetches the object stored in the cache. It handles unserializing it as well, so it really makes things pretty idiot-proof.
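To show how those three pieces fit together, here’s a rough sketch. It is not the real module: SketchCache is a made-up in-memory stand-in, and the constructor arguments (a key name and a lifetime in seconds) are my assumption, since how a cache object gets created isn’t covered above. build_menu() is just a placeholder for whatever expensive operation you’d be caching.

```php
<?php
// Hypothetical in-memory stand-in for the cache module (illustration only).
// The constructor arguments (key, lifetime in seconds) are assumptions.
class SketchCache
{
    private static $store = [];   // key => ['expires' => int, 'data' => string]
    private $key;
    private $ttl;

    public function __construct($key, $ttl = 300)
    {
        $this->key = $key;
        $this->ttl = $ttl;
    }

    // True only when an entry exists for this key and has not expired.
    private function isFresh()
    {
        return isset(self::$store[$this->key])
            && self::$store[$this->key]['expires'] > time();
    }

    // Lets callers read $cache->exists as if it were a public property.
    public function __get($name)
    {
        return $name === 'exists' ? $this->isFresh() : null;
    }

    // Serialize the value and store it with an expiry time.
    public function put($val)
    {
        self::$store[$this->key] = [
            'expires' => time() + $this->ttl,
            'data'    => serialize($val),
        ];
    }

    // Return the unserialized value, or false on a miss.
    public function get()
    {
        return $this->isFresh()
            ? unserialize(self::$store[$this->key]['data'])
            : false;
    }
}

// Placeholder for the expensive, query-heavy operation.
function build_menu()
{
    return ['Home', 'Products', 'Contact'];
}

$cache = new SketchCache('main_menu', 600);
if ($cache->exists) {
    $menu = $cache->get();      // cache hit: skip the expensive work
} else {
    $menu = build_menu();       // cache miss: do the work once...
    $cache->put($menu);         // ...and store it for next time
}
```

The check-then-get-or-put pattern at the bottom is all a developer calling the module would have to write.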
What about flexible? Well, by that I mean we need to be able to transparently implement several different types of caching. Since we’re just crawling out of the dark ages, I opted to implement a fallback caching mechanism. Here’s how it works.
- The programmer defines a variable in our settings area to choose which caching option to use. The options are memcached, file, and mysql.
- If the setting isn’t defined, we try memcached by default. This is by far the best caching system to use, so it makes sense to try it first.
- If memcached fails, we fall back to a database caching scheme. While not nearly as good as memcached, it could still save you tons of queries on your database.
- If the user chooses file caching, we do that. It’s a pretty bad idea in most cases, but it may still have its uses.
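The selection rules above boil down to something like the following. This is a sketch under assumptions: choose_cache_backend() is a hypothetical helper, not the module’s actual code, and I’m assuming an unrecognized setting gets treated the same as no setting at all.

```php
<?php
// Hypothetical helper that picks a cache backend.
// $setting     - value from the settings area, or null when it isn't defined
// $memcachedUp - whether memcached could actually be reached
function choose_cache_backend($setting, $memcachedUp)
{
    if ($setting === 'file') {
        return 'file';          // file caching only when explicitly requested
    }
    if ($setting === 'mysql') {
        return 'mysql';         // database caching on request
    }
    // 'memcached', or nothing defined: try memcached first...
    if ($memcachedUp) {
        return 'memcached';
    }
    // ...and fall back to the database cache when it isn't available.
    return 'mysql';
}
```

How you work out $memcachedUp is up to you. Checking extension_loaded('memcached') only tells you the PHP extension is installed, not that a server is reachable, so a real check would also need to try a connection.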
So why did it come to this? It’s not that we host terribly high-volume sites, but that our CMS is super slow. A full page takes about 4 seconds to load completely, and that’s running locally on our own network. One of the main problems is that we use output buffering extensively. The ENTIRE FRIGGIN PAGE is buffered. This has 3 side effects:
- Slight performance loss due to buffering.
- Apparent page load time sucks because the browser has to wait for the entire page to be generated before getting output.
- Development is super easy because you don’t ever have to worry about output being sent before doing a call like “header()”.
Unfortunately, we can’t remove output buffering. It’s at the very core of our CMS and development practices, so taking it out just won’t work. To get the time it takes to generate a page as low as possible, I decided that caching was needed.
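If you haven’t seen full-page buffering before, here’s roughly what it looks like in plain PHP. This is a generic sketch using the standard output-control functions, not our CMS code, and the X-Example header is made up for illustration.

```php
<?php
ob_start();                     // start buffering: nothing reaches the browser yet

echo "<h1>Page body</h1>";      // output produced early in the request...

// ...yet headers can still be set, because nothing has actually been sent.
header('X-Example: buffered');  // made-up header, for illustration only

$page = ob_get_clean();         // grab the whole buffered page and stop buffering
echo $page;                     // only now does the browser receive anything
```

That last line is the catch: the browser sees nothing until the whole page has been built, which is why the apparent load time is so bad.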
So what sort of problems do we run into with this caching module? Glad you asked! Many of the problems aren’t specific to this caching module, but to caching in general. The quick list:
- If the original query wasn’t complicated, it’s not worth storing the results. The caching module itself does 3 queries in MySQL mode. If your original operation took fewer queries than that, and wasn’t a super-complex-mega-join, it’s not worth caching. This caveat goes away in memcached mode.
- Smart naming and design. You have to be very careful about what you cache, and when. Remember, page content and queries probably change when a user is logged in or on a different device. Just things to keep in mind.
- Getting developers to use it. Not everyone likes to learn, let alone change their habits, and that’s the biggest barrier. Some people don’t care about efficiency either (sad, I know), but at least our system administrator thanks me.
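On the naming point, one way to avoid serving a logged-in user’s page to a guest (or a desktop page to a phone) is to bake those things into the cache key. A minimal sketch, assuming a hypothetical build_cache_key() helper and that you already know the login state and device type:

```php
<?php
// Hypothetical helper: builds a cache key that varies with anything
// that changes the output (here, login state and device type).
function build_cache_key($name, $isLoggedIn, $device)
{
    $parts = [
        $name,
        $isLoggedIn ? 'user' : 'guest',
        strtolower($device),
    ];
    return implode(':', $parts);
}

echo build_cache_key('sidebar', true, 'Mobile'), "\n";    // sidebar:user:mobile
echo build_cache_key('sidebar', false, 'Desktop'), "\n";  // sidebar:guest:desktop
```

If the content is specific to one user rather than just to logged-in users in general, the key would need the user’s ID in it as well.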
The caching module seems to work pretty well, too. On one particularly SQL-heavy page, it cut page load time from 14 seconds (ridiculous) to just under 6 seconds (still bad, but getting better).
That’s it for now. Any questions or comments are welcome.