Updating WordPress core with zero downtime - I mean zero

I have a critical website which is under version control. When I update the core WordPress files I do so locally, then commit the changes. When I do that, the live site (obviously) does not even display a maintenance notice; it simply errors out for a few minutes while the core files are deployed, presumably because they are either read part-way through the upload or are temporarily incompatible with one another.

I could update from the live admin, and then deploy, which would swap those events around. There would be a maintenance notice, and the file incompatibilities would be minimised.

So - question: How could this be accomplished with ZERO downtime? Is there a way? Dual servers with some kind of failover? How do the big companies handle this situation? How can I update the WordPress core with (literally) zero downtime?

Update: I deploy using a third-party SVN provider, Beanstalk, whose own FTP seems to be slow, incidentally. But in terms of deployment, even if I FTP the files up from local, there is going to be some downtime. Please regard this question as more of a thought experiment: what is the best approach to updating a WordPress site with ZERO downtime?



The first step is to use a load balancer. You should have one anyway for any critical site, with two "application" servers behind it even if you do not expect major traffic spikes, just to have a failover in case one of the servers fails.

When the time to upgrade comes, you take the server being upgraded out of the load balancer's rotation and upgrade it (via WP-CLI, or direct administrator access through a specific IP address or domain that bypasses the balancer). Once it is upgraded, you add it back to the balancer and repeat the process for the other application server.
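
For illustration, a minimal sketch of that rotation, assuming WP-CLI is installed on each application server and using a made-up lb-ctl command as a stand-in for whatever API or CLI your load balancer actually exposes (the hostnames and paths are assumptions too):

#!/usr/bin/env bash
# Upgrade the application servers one at a time behind the balancer.
set -euo pipefail

for host in app1.example.com app2.example.com; do
    lb-ctl drain "$host"     # hypothetical: stop routing new traffic to this server
    ssh "$host" 'wp core update --path=/var/www/wordpress'
    ssh "$host" 'wp core update-db --path=/var/www/wordpress'
    lb-ctl enable "$host"    # hypothetical: put the server back into rotation
done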

This is not a zero-effort solution, as it requires some planning around image storage, cache clearing, and keeping the DB structure compatible while there is a version mismatch between the application servers.

As someone who has done it, I know it is not "rocket science", but you need to ask yourself whether taking the site offline for an hour (probably much less) every four months is such a bad thing that it is worth complicating your site's infrastructure. Do you really need a load balancer just for that reason?

Note regarding deploying from Git or SVN: you should write a script that puts WordPress into maintenance mode before the files are swapped. Fixing a corrupted DB because some part of the code was on version X while the rest was on X+1 can cost you much more than you lose by having the site down for a minute.
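
As a rough sketch of such a wrapper, assuming WP-CLI 2.2 or later is available and that a plain svn update is how the files arrive (substitute your own deployment step and path):

#!/usr/bin/env bash
# Put WordPress into maintenance mode around the file swap.
set -euo pipefail
WP_PATH=/var/www/wordpress

wp maintenance-mode activate --path="$WP_PATH"
trap 'wp maintenance-mode deactivate --path="$WP_PATH"' EXIT   # always leave maintenance mode, even on failure

svn update "$WP_PATH"                  # or: git pull, rsync, a Beanstalk deployment, etc.
wp core update-db --path="$WP_PATH"    # run any pending database migrations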


Thomas' answer is good, but repeating it will result in strangely nested directories. I'd like to expand on it into a full recipe for the poor man's solution using SSH (and scripts):

One often-used pattern (for a Linux developer) is to create separate directories for each version:

wordpress-5.9.2/
wordpress-5.9.3/
wordpress-6.0/

and then create one symbolic link to access the "currently active" version (which your web server should use as base directory, a.k.a. document root):

ln -s -T -f wordpress-5.9.3 wordpress

This gives you the following directory listing:

wordpress-5.9.2/
wordpress-5.9.3/
wordpress-6.0/
wordpress -> wordpress-5.9.3

Switching between versions then is one command—and you can easily switch back in case of errors:

ln -s -T -f wordpress-6.0 wordpress
  • -T tells ln to always treat the link name as a normal file, which prevents it from creating the new symlink inside the directory that the existing symlink points to.
  • -f overwrites the target if it already exists (instead of throwing an error).
  • -s just says that it should be a symbolic link instead of a hard link.
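
Putting it together, a release script built on this pattern could look roughly like the sketch below. The version number, the download URL, and the assumption that wp-config.php and wp-content are carried over from the currently active release are illustrative; adapt them to your own layout:

#!/usr/bin/env bash
# Unpack a new core next to the existing releases, then switch atomically.
set -euo pipefail
NEW=wordpress-6.0

mkdir "$NEW"
curl -sSL https://wordpress.org/wordpress-6.0.tar.gz \
    | tar xz --strip-components=1 -C "$NEW"

# Carry over configuration and content from the currently active release
cp wordpress/wp-config.php "$NEW/"
rm -rf "$NEW/wp-content"
cp -a wordpress/wp-content "$NEW/"     # or keep one shared wp-content and symlink it

# Atomic switch; point the link back at the old directory to roll back
ln -s -T -f "$NEW" wordpress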

There is no general solution to this problem. Your solution will be specific to your situation and unusable for many other users, and vice versa, each with its own tradeoffs.

For example, some hosting companies spin up new instances with the new update, which then replace the old instances, but this is not possible on many hosts. It also glosses over the fact that what that sentence means can vary wildly depending on the technologies used: replacing an AWS instance, for example, is very different from replacing a container. And it's not going to work at all for people on shared hosts.

To find a truly concrete answer you will need to work with your host. If their deployment process is not atomic, that is an issue with the host that you need to work with or work around. How you do that is unique to your host and technology stack; a solution built for AWS environments, for example, only works on AWS.

Or you may choose to rely on a cache-based system, which only works for certain kinds of sites and requires additional hardware and software. This might work great for a newspaper, for example, but it won't work for a membership site, and it has other tradeoffs too.

And for some hosts, usually shared hosts accessed directly, it simply cannot be done, as there is no way to bulk-update files in one single atomic step. The best you can hope for there is a maintenance mode or holding page.

Either way, it's highly dependent on your hosting platform, tech stack, budget, the type of site you have, and commercial needs. If you constrain this question down to your specific tech stack, there may be a canonical solution, but it would only apply to users with your specific use case. E.g., there are definitely solutions for AWS/EC2, but they only apply to that particular technology stack.


Assuming you have ssh access.

The "poor man's solution" would be to create a directory with the new code, e.g. new_code and run

mv code_dir code_dir_old && ln -s new_code code_dir

This renames the code_dir directory to code_dir_old and, immediately afterwards, creates a code_dir symlink pointing to your prepared new_code directory.

If you expect database migrations that only add columns (and do not remove any), you can run them before making the switch.


On my WP host, which is a typical VPS where I have root access, I run WP using the official Docker image (wordpress:fpm) with Nginx in front of it. The WP files (wp-content) are bind-mounted from the host so I can easily manage them.

For a typical WP upgrade, I pull a new Docker image, get rid of the old container, and spin up a new one with otherwise identical configuration (networking in particular). This leads to a visible downtime of about one second. To achieve zero downtime, I would spin up the new container alongside the running one, update the Nginx config and run nginx -s reload, let Nginx finish existing connections while new connections are routed to the new PHP-FPM container, and only then destroy the old container.
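
A rough sketch of that zero-downtime variant, with made-up container, network, and volume names (the exact Nginx change depends on how your site config addresses the FPM upstream, and nginx -s reload runs wherever Nginx itself runs):

#!/usr/bin/env bash
# Start the new FPM container alongside the old one, re-point Nginx, then retire the old one.
set -euo pipefail

docker pull wordpress:fpm
docker run -d --name wp-fpm-new --network wp-net \
    -v /srv/wp-content:/var/www/html/wp-content \
    wordpress:fpm

# Edit the site config so fastcgi_pass targets wp-fpm-new:9000, then reload
# without dropping in-flight connections:
nginx -s reload

# Once existing connections to the old container have drained:
docker stop wp-fpm-old && docker rm wp-fpm-old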
