Categories
Other

AccuWeather Search Not Working with Pi-hole

If you’ve noticed that you can’t search for locations on the AccuWeather iOS if you are on a network protected with a Pi-hole, you aren’t alone. To fix it, you need to whitelist the following domain:

api.radar.io

After that, search will start working with AccuWeather again!

Categories
Other Other Programming

Things I Learned in my First Year as a Software Engineering Manager

For the past decade I was a software engineer at a number of different companies. At each company I learned a lot of stuff, honed my craft, and eventually decided that I wanted to be a manager. I knew that it was a much different position from being a senior or lead engineer, but I wasn’t quite prepared for the number of things that I was going to learn.

Overall it was a great experience, so I thought I’d write down some of my learnings here for posterity. So here we go, in no particular order.

  • This is completely different from being an engineer – Its not a progression from the job of a senior or lead engineer. It’s a different career entirely. If you expect it to lead engineer + some HR stuff, you are very wrong.
  • You will feel overwhelmed – This is a normal feeling. Its a lot like the first real job you took after college. You know what you should be doing, but actually doing it turns out to be a lot harder.
  • Meetings Meetings Meetings – You no longer have a maker schedule. As a manager, your schedule is now interrupt driven. This is good. People are interrupting you instead of your team. Helping remove distractions so they can stay in flow state is an important part of your job.
  • Managing up is just as important as managing down – Part of your job as a manager is to set and manage expectations within the rest of your organization for the work your team is doing. Knowing what information is worth sharing up the chain is just as important as knowing what information to keep back.
  • Relationships matter – As an individual contributor your relations within your team matter a lot. Outside? Not quite as much, but still important. As a manager, your relationships outside of your team matter lots. Cultivate them. Have a drink after work with other managers, directors, product people, whatever. Having strong bonds with these people will make it easier to resolve conflicts because your relationship won’t start out as adversarial.
  • Choose words wisely – Whether you think they should or not, the words you use now carry more weight. Choose them carefully.
  • Toxic people – After managing for awhile you start to identify people can be toxic to a team. This doesn’t mean that they always are, but they possess the ability to be. They may not even realize it. Don’t be afraid to call them out (in private) if they are being a detractor.
  • Hands off – Sometimes managing people and a team is knowing when not to manage at all. Some teams are jelled together really well. All you need to do is remove blockers and let ’em rip. Other teams need a little more hand-holding. This is ok. Just know which team is which. If you hand-hold a high performing team you will make them less effective.
  • Always be honest – You are managing groups of intelligent highly sought after engineers. Don’t treat them like children. If they have questions, answer them honestly. If you can’t tell them something for some reason, make sure they understand you aren’t trying to be evasive. In general people like candidness.
  • 1 on 1 meetings – You should try to have 1:1 meetings with all of your employees, but leave the cadence up to them. Some people want more interaction, others want less. In my experience, the more senior the person the less 1:1 time they want or need.

One last note before I go: Imposter syndrome is real. I’ve always known I was a good engineer. My track record delivering, career, and salary path all back up that idea of myself. As an engineering manager I have no track record, so I feel like an imposter sometimes. You get over it, but it hits hard.

Categories
Kernl News Wordpress Development

Load Testing the WP Super Cache Plugin with Kernl

So you have a WordPress site and you are expecting a spike of traffic to it. Do you know when your site will fall over? How many users can be browsing it at once and still have a good experience? If you can’t answer these questions then you need to load test your site! Through the course of this article I’ll use a new load testing tool by Kernl to test enabling WP Super Cache on Kernl’s blog. Kernl WordPress load testing doesn’t require any coding or load testing experience and by the end you’ll know how to test any performance optimizations of your WordPress site.

What is Kernl.us?

Kernl.us is a WordPress developer tool service. It does a lot of different things to help WordPress developers be more productive including:

What we’re going to focus on is the WordPress load testing portion of Kernl. Most WordPress developers never really consider load testing for a number of different reasons. Maybe they think that their site can handle lots of load already or perhaps they don’t know know where to start with load testing. Luckily Kernl can help with both of those problems.

So lets get started! The Kernl blog runs on a $5/month Digital Ocean droplet. It has 1GB of RAM and 1 CPU. It runs your typical LAMP setup (Linux, Apache, MySQL, PHP 7). In general this setup is not known to scale well out of the box and requires quite a bit of tweaking to be performant. For this blog post though we’re only going to add a caching plugin and see how that changes performance.

Getting Started

Before starting any WordPress performance optimizations we need to know what our current WordPress performance characteristics look like. Kernl makes this a 2 step process:

  1.  Create a template – Kernl uses the WP JSON API to fetch your site’s layout. It then creates a load test template that you can tweak based on the data that it fetched. A template simply tells Kernl what routes it should test and also how frequently it should visit a given route.
  2. Start a load test – Once you have a template you can start a load test. The load test reads the template, spins up the load testing infrastructure, and then reports back the status of the load test.

For each load test we’re going to throw traffic at https://blog.kernl.us using the following parameters:

  • 200 users
  • 10 minute duration
  • 2 users per second spawn rate
  • Traffic will be produced from Digital Ocean‘s London data center. Kernl’s blog is hosted in Digital Ocean’s NYC3 (New York City) data center.

The Template

The template that we used for this blog post looks like this:

Kernl WordPress Load Test Template
Kernl WordPress Load Test Template

As you can see its pretty straight forward. Each route it accompanied by a multiplier. The multiplier simply tells the load test software how much traffic it should send to a given route relative to the other routes.  So in the example above the “/” route has a 3x multiplier. That means it will receive 3x as much traffic as any of the other 1x routes.

Baseline Load Test

Load Test Baseline - No Cache
Load Test Baseline – No Cache (Requests)

For the baseline load test we can see that within 30 seconds or so we hit our peak throughput. After that some portion of the LAMP stack starts to get overwhelmed and can only handle processing 2 requests per second. 2 requests per second is not great performance.

Load Test Baseline - No Cache
Load Test Baseline – No Cache (Failures)

You can see from the failure graph that at right around the time the requests graph slows down to 2 requests/s failures start to occur. If you start to see your failure graph climb like this you’ll know that you’ve reached your maximum capacity.

The last and most important part of the baseline load test is the request distribution graph. This graph tells us a lot about how users experience our site when it is under load.

Load Test Baseline - No Cache
Load Test Baseline – No Cache (Distribution)

This graph can be a little confusing to read at first, but isn’t bad at all once you get the hang of it. Read it like this:

  1. Pick a column. We’ll use the “90%” column.
  2. Now read the value of the column. The value is milliseconds.
  3. Combine your knowledge! 90% of requests were completed in 110000 milliseconds (110 seconds).

In the context of this load test this is a really bad user experience. If you look at the 50% column you can see that it’s hovering around 70000 milliseconds (70 seconds). This means that 50% of all traffic in the load test took at least 70 seconds to complete 🙁

Cache Enabled Load Test

Now that the baseline load test is complete we can enable a caching plugin and see how much better it makes our site perform! For this test we’re using the excellent WP Super Cache plugin. To make sure we were comparing performance in an “apples to apples” manner, we’re going to use the exact same Kernl load test configuration as we did for the first test.

Load Test With Cache Enabled (Requests)
Load Test With Cache Enabled (Requests)

WOW! WP Super Cache made a huge difference in the throughput that the Kernl blog could handle. We maxed out at around 20 requests/s sustained which is a 10x improvement over our baseline sustained requests. And what about request failures?

Load Test With Cache Enabled (Failures)
Load Test With Cache Enabled (Failures)

Once again, pretty fantastic results. Through the entire load test only a single request failed. To put that number in to perspective a bit: 12,174 requests were made during the 10 minute load test. Only one failed. Thats a failure rate of 0.008%!

High throughput and low failures are great, but what about the distribution? If the user has to wait for 20 seconds for the page to load the site may as well be down. Lets check out the graph.

Load Test With Cache Enabled (Distribution)
Load Test With Cache Enabled (Distribution)

As you can see the distribution with caching enabled tells a very different story than without caching enabled. All requests finished in under 3 seconds and 50% of requests finish in under 2 seconds. Not perfect performance but certainly usable by an end user. Its also worth remembering a few things:

  • This is a $5/month droplet on Digital Ocean.
  • No server tuning was done.
  • No WP Super Cache tuning was done.
  • 20 requests/s is about 1.7 million requests per day.

Next time you need to test performance improvements to your WordPress site, be sure to check out Kernl so that you can be confident in your site’s ability to handle traffic.

Categories
Javascript Kernl News Other Programming

0 to 1 Million: Scaling my side project to 1 million requests a day

In the Beginning

In late 2014 I decided that I needed a side project.  There were some technologies that I wanted to learn, and in my experience building an actual project was the best way to do that.  As I sat on my couch trying to figure out what to build, I remembered an idea I had back when I was still a junior dev doing WordPress development.  The idea was that people building commercial plugins and themes should be able to use the automated update system that WordPress provides.  There were a few self-managed solutions out there for this, but I thought building a SaaS product would be a good way to learn some new tech.

Getting Started

My programming history in 2014 looked something like: LAMP (PHP, MySQL, Apache) -> Ruby on Rails -> Django.  In 2014 Node.js was becoming extremely popular and MongoDB had started to become mature.  Both of these technologies interested me, so I decided to use them on this new project.  As to not get too overwhelmed with learning things, I decided to use Angular for fronted since I was already familiar with it.

A few months after getting started, I finally deployed https://kernl.us for the world to see.  To give you an idea of the expectations I had for this project, I deployed it to a $5/month Digital Ocean droplet.  That means everything (Mongo, Nginx, Node) was on a single $5 machine.  For the next month or two, this sufficed since my traffic was very low.

The First Wave

In December of 2014 things started to get interesting with Kernl.  I had moved Kernl out of a closed alpha and into beta, which led to a rise in sign ups.  Traffic steadily started to climb, but not so high that it couldn’t be handled by a single $5 droplet.

Around December 5th I had a customer with a large install base start to use Kernl.  As you can see the graph scale completely changes.  Kernl went from ~2500 requests per day, to over 2000 requests per hour.  That seems like a lot (or it did at the time), but it was still well within what a single $5 droplet could handle.  After all, thats less that 1 request per second.

Scaling Up

Through the first 3 months of 2015 Kernl experienced steady growth.  I started charging for it in February, which helped fuel further growth as it made customers feel more comfortable trusting it with something as important as updates.  Starting in March, I noticed that resource consumption on my $5 droplet was getting a bit out of hand.  Wanting to keep costs low (both in my development time and actual money) I opted to scale Kernl vertically to a $20 per month droplet.  It had 2GB of RAM and 2 cores, which seemed like plenty.  I knew that this wasn’t a permanent solution, but it was the lowest friction one at the time.

During the ‘Scaling Up’ period that Kernl went through, I also ran into issues with Apache.  I started out by using Apache as a reverse proxy because I was familiar with it, but it started to fall over on me when I would occasionally receive requests rates of about 20/s.  Instead of tweaking Apache, I switched to using Nginx and have yet to run in to any issues with it.  I’m sure Apache can handle far more that 20 requests/s, but I simply don’t know enough about tweaking it’s settings to make that happen.

SCaling Out & Increasing Availability

For the rest of 2015 Kernl saw continued steady growth.  As Kernl grew and customers started to rely on it for more than just updates (Bitbucket / Github push-to-build), I knew that it was time to make things far more reliable and resilient than they currently were.  Over the course of 6 months, I made the following changes:

  • Moved file storage to AWS S3 – One thing that occasionally brought Kernl down or resulted in dropped connections was when a large customer would push an update out.  Lots of connections would stay open while the files were being download, which made it hard for other requests to get through without timing out.  Moving uploaded files to S3 was a no-brainer, as it makes scaling file downloads stupid-simple.
  • Moved Mongo to Compose.io – One thing I learned about Mongo was that managing a cluster is a huge pain in the ass.  I tried to run my own Mongo cluster for a month, but it was just too much work to do correctly.  In the end, paying Compose.io $18/month was the best choice.  They’re also awesome at what they do and I highly recommend them.
  • Moved Nginx to it’s own server – In the very beginning, Nginx lived on the same box as the Node application.  For better scaling (and separation of concerns) I moved Nginx to it’s own $5 droplet.  Eventually I would end up with 2 Nginx servers when I implemented a floating ip address.
  • Added more Node servers – With Nginx living on it’s own server, Mongo living on Compose.io, and files being served off of S3, I was able to finally scale out the Node side of things.  Kernl currently has 3 Node app servers, which handle requests rates of up to 170/second.

Final Thoughts

Over the past year I’ve wondered if taking the time to build things right the first time through would have been worth it.  I’ve come to the conclusion that optimizing for simplicity is probably what kept me interested in Kernl long enough to make it profitable.  I deal with enough complication in my day job, so having to deal with it in a “fun” side project feels like a great way to kill passion.

Categories
PHP Programming Wordpress Development

Kernl Goes Beta!

Screen Shot 2015-11-22 at 3.02.31 PM

In May of this year I launched the Kernl alpha with hopes that WordPress developers would be interested in it.  And interested they were!  6 months later Kernl has over 65 users from all around the globe and a host of new capabilities to make WordPress plugin and theme development easier.  For instance, since launch we’ve added:

  • Continuous Integration with BitBucket
  • Continuous Integration with GitHub
  • Purchase Code Validation

But new features aren’t all that make a service great.  For people to trust in something it must be reliable, and thats what the beta phase of Kernl is all about: improving reliability.  We’ve reached a point where we feel Kernl provides enough value to the WordPress community to allow us to take some time to refactor code and add a lot more tests.

What does this mean for you?  Not much.  If we do our job right you won’t notice anything.  The beta is still free and everyone will get big “heads up” before we start charging for the service.

Thank you to all of the alpha users who have made this possible.  Without you Kernl wouldn’t be where it is today.

 

Categories
PHP Programming Wordpress Development

Continuous Deployment of WordPress Plugins Using Kernl

One of the problems I’ve always had with WordPress plugin development is doing it in a modern build pipeline. I really wanted to be able to merge a branch into master, build the zip file, and push the update out to my clients. For the longest time I wasn’t able to do this, so I built Kernl to enable a more modern development approach to WordPress plugin development.

What is Continuous Deployment?

Continuous Deployment (or Continuous Delivery) is a software development strategy where you ship code frequently. Your pipeline is fully automated, so as soon as some event on your version control repository is triggered the deploy process starts. For me, that event is when I merge a pull request into master.

What is Kernl

Kernl started out as a way to provide private plugin and theme updates for WordPress, which grew out of my frustration at having to update clients manually every time a small bug was patched. Once I had the updates working manually, the next step was automating everything. This is where “push to build” came in.

How Push To Build Works

Getting Started with Kernl

Getting push-to-build updates on your plugin or theme is pretty easy to set up with Kernl.

  1. Go to https://kernl.us and sign up. After you’ve logged in, click “Continuous Integration”.
  2. Now connect BitBucket.  This will authorize Kernl to access your BitBucket account so that it can enable push-to-build functionality.
  3. The next step is adding a WebHook to BitBucket.  This tells BitBucket to send a message to Kernl after every code push.  To do this, go to your repository settings, scroll down to “Integrations” and click “WebHooks”.  Set the new Webhook to point at https://kernl.us/api/v1/repositories/bitbucket/webhook.
  4. In order for Kernl to know when to build a new version of your plugin, it looks for a file named kernl.version in the root directory of your repository.  Go ahead and add this file now and commit it.  The kernl.version should contain a semantic version that looks like “1.0.1”.
  5. Next, you need to add a plugin.  In Kernl, click “Plugins” on the left and then click “Add Plugin” on the upper-right.  Fill out the name, slug, and description fields, then scroll to the bottom.Kernl Select Repository and Branch You should now be able to select from a list of repositories from your BitBucket account.  You can also choose what branch Kernl should make its builds from.  The default is master, but it can be anything that you want.  Select a repository now and press “Save”.
  6. Next, you need to add the first version to Kernl manually.  Click the “versions” button for the plugin you just created, and then click “Add Version”.  The most important part of the process here is to make sure that the version number in Kernl, kernl.version, and your plugin match.  If you put 1.0.0 in the kernl.version file, make sure that it matches in your plugin’s main file, as well as in Kernl when you upload the first version.  If this still isn’t clear, check out the example plugin on BitBucket.  The kernl.version should contain one line, and on that line will be your version.  Once you have the versions figured out, zip up the plugin as if you were going to distribute it and upload it to Kernl.
  7. Thats it!  Distribute this copy of the plugin to your clients and they’ll receive private updates whenever you upload a new copy or push a new version to your BitBucket repository.

Pushing a New Version

With all the boilerplate setup complete, getting a new update out to your clients is super easy.  Follow the steps below and you’ll be good to go.

  1. Make code changes.  Whatever change you want to push out, go ahead and make it.
  2. Update your plugin’s version.  This is typically in the comment document block in your functions.php file.
  3. Update the kernl.version file.  This should match your functions.php version.
  4. Commit
  5. Push to the branch you specified in your plugin setup on Kernl.  If you didn’t specify a branch, that means you’ll need to push to master.
  6. Done.  If all went well, you’ll receive an email from Kernl that lets you know about the new version that was pushed.  You can also verify that the plugin was built by visiting Kernl and looking in the version list for your plugin.

Plugin build email

If you’ve ever wanted to modernize your WordPress development pipeline, I highly suggest you check out Kernl.  Automatic updates triggered by changes in your repository will save you tons of time and get bug fixes and updates out to your clients faster.

Categories
Django Python

Using the Django Per-Site Cache with the Nginx HTTP Memcached Module

For a long time I thought that the most interesting problems in my field were in scalability. Some people may be more interested in scaling, and others might be more into slick interfaces and fast animations. But for me, scalability has continued to be my passion. For awhile though, it was a unicorn. That unattainable thing that I wanted to work on but couldn’t find anywhere to do it at. That is, until I started work at Future US.

Future is a media company. Originally they started in old media focusing heavily on gaming and tech magazines. Eventually the internet became prominent in everyday life, so more of their old media properties made the transition to the web. The one that really matters to me though is PC Gamer. I’ve been a huge fan of PC Gamer since I was about 7 years old. I still have fond memories getting demo disks in the mail with my subscription.

When I was hired at Future it was to help facilitate the move of PC Gamer from its existing platform (WordPress) to Django. Future had experienced success moving other properties to Django, so it made sense to do it with PC Gamer. When it eventually came time to implement our caching layer, we thought about a lot of different ways that it could be done. Varnish came up as an option, but we decided against it since nobody on the team had experience configuring it (and people elsewhere in the organization had experienced issues with it). Eventually we settled on having Nginx serve pages directly from Memcache. For us, this method works great because PC Gamer doesn’t have a lot of interaction (its almost completely consumption from the user end). Anything that does require back-and-forth between the server is handled via javascript, which makes full page caching super easy to do.

The high level architecture for pc gamer.
The high level architecture for pc gamer.

So how does it all work? The image above describes PC Gamer’s server architecture from a high level. Its pretty basic and works quite well for us. We end up having two types of requests: cache hits & cache misses. The flow for a cache hit is: request -> load balancer -> nginx -> memcache -> your browser. The flow for a cache miss is: request -> load balancer -> nginx -> application server (django) -> (store page in cache) -> your browser.

Since we’re basically running a static site, deciding what content to cache is easy: EVERYTHING!

Cache all the things!
Cache all the things!

Luckily for us Django already has a nice way of doing this: The per-site cache. But it is not without its issues. First of all, the cache keys it creates are insane. We needed something a little simpler for our setup so Nginx could build the cache key of the current request on the fly.

How It Works

The meat and potatoes of overriding Django’s per-site cache key comes in the `_generate_cache_key` function.

def _generate_cache_key(request, method, headerlist, key_prefix):
    if key_prefix is None:
        key_prefix = settings.CACHE_MIDDLEWARE_KEY_PREFIX
    cache_key = key_prefix + get_absolute_uri(request)
    return hashlib.md5(cache_key).hexdigest()

To make things easier for Nginx to understand we just take the url and md5 it. Simple!

On the Nginx side of things, the setup is equally simple.

        set            $combined_string "$host$request_uri";
        set_by_lua     $memcached_key "return ngx.md5(ngx.arg[1])" $combined_string;
 
        # 404 for cache miss
        # 502 for memcached down
        error_page     404 502 504 = @fallback;
 
        memcached_pass {{ cache.private_ip }}:11211;

All this setup does is take the MD5 of the host + request URI and then check to see if that cache key exists in memcache. If it does then we serve the content at that cache key, if it doesn’t we fall back to our Django application servers and they generate the page.

Thats it. Seriously. It’s simple, extremely fast, and works for us. Your mileage may vary, but if you have relatively simple caching requirements I highly suggest looking into this method before looking at something like Varnish. It could help you remove quite a bit of complexity from your setup.

Categories
Django Python Uncategorized

Getting around memory limitations with Django and multi-processing

I’ve spent the last few weeks writing a data migration for a large high traffic website and have had a lot of fun trying to squeeze every bit of processing power out of my machine. While playing around locally I can cluster the migration so it executes on fractions of the queryset. For instance.

./manage.py run_my_migration --cluster=1/10
./manage.py run_my_migration --cluster=2/10
./manage.py run_my_migration --cluster=3/10
./manage.py run_my_migration --cluster=4/10

All this does is take the queryset that is generated in the migration and chop it up into tenths. No big deal. The part that is a big deal is that the queryset contains 30,000 rows. In itself that isn’t a bad thing, but there are a lot of memory and cpu heavy operations that happen on each row. I was finding that when I tried to run the migration on our Rackspace Cloud servers the machine would exhaust its memory and terminate my processes. This was a bit frustrating because presumably the operating system should be able to make use of the swap and just deal with it. I tried to make the clusters smaller, but was still running into issues. Even more frustrating was that this happened at irregular intervals. Sometimes it took 20 minutes and sometimes it took 4 hours.

Threading & Multi-processing

My solution to the problem utilized the clustering ability I already had built into the program. If I could break the migration down into 10,000 small migrations, then I should be able to get around any memory limitations. My plan was as follows:

  1. Break down the migration into 10,000 clusters of roughly 3 rows a piece.
  2. Execute 3 clustered migrations concurrently.
  3. Start the next migration after one has finished.
  4. Log the state of the migration so we know where to start if things go poorly.

One of the issues with doing concurrency work with Python is the global interpreter lock (GIL). It makes writing code a lot easier, but doesn’t allow Python to spawn proper threads. However, its easy to skirt around if you just spawn new processes like I did.

Borrowing some thread pooling code here, I was able to get pretty sweet script running in no time at all.

import sys
import os.path
 
from util import ThreadPool
 
def launch_import(cluster_start, cluster_size, python_path, command_path):
    import subprocess
 
    command = python_path
    command += " " + command_path
    command += "{0}/{1}".format(cluster_start, cluster_size)
 
    # Open completed list.
    completed = []
    with open("clusterlog.txt") as f:
        completed = f.readlines()
 
    # Check to see if we should be running this command.
    if command+"\n" in completed:
        print "lowmem.py ==> Skipping {0}".format(command)
    else:
        print "lowmem.py ==> Executing {0}".format(command)
        proc = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        output = proc.stdout.read() # Capture the output, don't print it.
 
        # Log completed cluster
        logfile = open('clusterlog.txt', 'a+')
        logfile.write("{0}\n".format(command))
        logfile.close()
 
 
if __name__ == '__main__':
 
    # Simple command line args checking
    try:
        lowmem, clusters, pool_size, python_path, command_path = sys.argv
    except:
        print "Usage: python lowmem.py <clusters> <pool_size> <path/to/python> <path/to/manage.py>"
        sys.exit(1)
 
    # Initiate log file.
    if not os.path.isfile("clusterlog.txt"):
        logfile = open('clusterlog.txt', 'w+')
        logfile.close()
 
    # Build in some extra space.
    print "\n\n"
 
    # Initiate the thread pool
    pool = ThreadPool(int(pool_size))
 
    # Start adding tasks
    for i in range(1, int(clusters)):
        pool.add_task(launch_import, i, clusters, python_path, command_path)
 
    pool.wait_completion()

Utilizing the code above, I can now run a command like:

python lowmem.py 10000 3 /srv/www/project/bin/python "/srv/www/project/src/manage.py import --cluster=" &

Which breaks the queryset up into 10,000 parts and runs the import 3 sets at a time. This has done a great job of keeping the memory footprint of the import low, while still getting some concurrency so it doesn’t take forever.