Mastering WordPress rewrite rules

Posted on 27 March 2019 by Bradley Taylor

At BrightMinded, we spend a lot of time working with WordPress. We really do know it inside out! Even so, we often find ourselves needing to go beyond the out-of-the-box capabilities that WordPress core provides. A frequent complaint we hear, both from members of our team and from the wider development community, is that WordPress lacks a proper routing system of the sort provided by modern web frameworks like Laravel, Express.js or Django. However, with a little knowledge of how the WordPress routing (rewrite) system works, there is no reason for it to limit the development possibilities available.

A brief history of WordPress

WordPress is an old project which dates back to 2003 (earlier if you include b2/cafelog, from which WordPress was forked). In those early days, WordPress was just a blog and lacked almost all the features that make WordPress the most prevalent CMS today.

If you had installed WordPress in 2003, each blog post would have had a URL that looked something like this: brightminded.com?id=3715. Notice how the URL uses a query parameter (?id=) rather than an SEO-friendly path.

In January 2004, WordPress 1.0 was released, introducing “Pretty Permalinks” for the first time. Pretty permalinks allowed the original brightminded.com?id=3715 to be mapped to something nicer – in this case brightminded.com/mastering-wordpress-rewrite-rules. However, to maintain backwards compatibility, and to support servers without Apache mod_rewrite, the original query-string routing has been kept. Even today, the WordPress installation serving the page you are currently reading supports query-string routing, which acts as the foundation for pretty path-based permalinks.

This reliance on query parameters means that WordPress reserves a large number of parameter names. These reserved names must not be used as GET parameters or as the names of form fields. A full list of reserved parameters can be viewed on the WordPress Codex.
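As a rough sanity check before choosing a parameter name, you can ask WordPress which query variables it already knows about. The sketch below assumes the global $wp object exposes its public_query_vars and private_query_vars arrays, and bm_is_reserved_query_var is a hypothetical helper name of our own. It only covers query variables, so it is not a substitute for the full Codex list.

```php
/**
 * Hypothetical helper: returns true if $name is already registered
 * as a WordPress query variable. Call it on or after the init action,
 * once the global $wp object is available.
 */
function bm_is_reserved_query_var( $name ) {
    global $wp;

    // Assumption: both arrays are public properties of the WP class.
    return in_array( $name, $wp->public_query_vars, true )
        || in_array( $name, $wp->private_query_vars, true );
}
```

For example, bm_is_reserved_query_var( 'author' ) should report a clash, because author is one of the core query variables.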

Renaming internal rewrite rules

So, armed with the knowledge that the WordPress rewrite system is based on a history of query strings, let’s see how we can put this into practice.

WordPress comes with a handful of rewrite rules out of the box. For example:
example.com?author=authorname becomes example.com/$author_base/authorname
example.com?feed becomes example.com/feed/

Why might you want to change these? One reason would be to translate the default permalinks if you are not running an English-language site. Let’s try translating the author permalink. The easiest way to do this is to replace the hardcoded strings, which exist as class properties on WP_Rewrite.

Because the rewrite rules are generated after WordPress calls init, we can add an action on init to translate the WP_Rewrite properties.

add_action( 'init', 'translate_author_rewrite_rules_to_french' );
function translate_author_rewrite_rules_to_french() {
    global $wp_rewrite;
    // Author archives will now live at example.com/auteur/authorname
    $wp_rewrite->author_base = 'auteur';
}

The following WP_Rewrite properties can be renamed in the same way:

  • author_base
  • comments_base
  • comments_pagination_base
  • feed_base
  • pagination_base
  • search_base

(It is our opinion that these translations should be handled out of the box like other translatable strings in WordPress core, but this looks unlikely to change.)

Removing internal rewrite rules

The default rewrite rules also have the unfortunate side-effect of “reserving” terms which can’t then be used as the permalink for posts or pages. If you set the permalink for a blog post as “author”, then that wouldn’t work, as the WordPress author rewrite rule would be triggered, and you would be served the author.php template file (dependent on your theme, and the WordPress template hierarchy).

In this case you can rename the default rewrite rule, as described above. However, you can also remove the default rewrite rules completely. The following lines can be used to remove the different default rewrite rules:

add_filter( 'post_rewrite_rules', '__return_empty_array' );
add_filter( 'date_rewrite_rules', '__return_empty_array' );
add_filter( 'comments_rewrite_rules', '__return_empty_array' );
add_filter( 'search_rewrite_rules', '__return_empty_array' );
add_filter( 'author_rewrite_rules', '__return_empty_array' );
add_filter( 'page_rewrite_rules', '__return_empty_array' );

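Equivalent filters exist for each taxonomy and post type, named after the taxonomy or post type itself. Replace the {post_type} or {taxonomy} placeholder with the relevant name:

add_filter( '{post_type}_rewrite_rules', '__return_empty_array' );
add_filter( '{taxonomy}_rewrite_rules', '__return_empty_array' );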
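To make the placeholder format concrete, here is a small sketch using the two built-in taxonomies (category and post_tag) and a hypothetical custom post type registered under the name book. The book post type is purely an assumption for illustration. Swap in whatever names your site actually registers.

```php
// Built-in taxonomies: removes the /category/... and /tag/... rules.
add_filter( 'category_rewrite_rules', '__return_empty_array' );
add_filter( 'post_tag_rewrite_rules', '__return_empty_array' );

// Hypothetical custom post type registered as 'book'.
add_filter( 'book_rewrite_rules', '__return_empty_array' );
```

As with any change to the rewrite rules, the stored rules need to be regenerated before the removal takes effect.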

Adding new rewrite rules

In the examples above, we’ve looked at how to change rewrite rules that have already been added by WordPress, but now let’s look at adding our own. The ideal use case for a custom rewrite rule is something that won’t work well as a post, page or custom post type. This might be because the exact URLs are dynamic, generated using data from an external data source, or because you want to make sure the URL can’t be changed by content editors.

As an example, let’s imagine we run a weather website and want a dynamically generated URL per city. The first thing to do is decide what the URL structure will be with pretty permalinks disabled. Let’s use the query parameter “city”. For example:

example.com?city=london
example.com?city=brighton

Because WordPress will map this city parameter to a pretty permalink, we can’t read it from the usual $_GET superglobal directly. Therefore, the first thing we need to do is tell WordPress about our new query parameter (query variable in WordPress speak):

function custom_rewrite_tag() {
    // Registers %city% as a rewrite tag and 'city' as a query variable.
    add_rewrite_tag( '%city%', '([^&]+)' );
}
add_action( 'init', 'custom_rewrite_tag' );

(This code, and all other code in this guide, can go into your theme’s functions.php, or into a plugin file.)

The second argument to add_rewrite_tag is a regular expression (or regex) which matches one or more characters other than &. The regular expression can be changed depending on the exact use case, but the example above is a safe default for most cases.

In our plugins or templates we can then use this by calling get_query_var, which accepts the name of your query variable as the first parameter. This function also accepts a default value as the second parameter:

get_query_var( 'city', false );

We now need to use this city query variable to cause a new template to be returned. By default WordPress will decide which template to use using the template hierarchy. However, we can extend this hierarchy with our own logic using the template_include filter.

add_filter( 'template_include', 'add_city_to_hierarchy' );
function add_city_to_hierarchy( $original_template ) {
    if ( get_query_var( 'city', false ) ) {
        /* This will look for a file called city.php in the current active theme */
        return get_query_template( 'city' );
    } else {
        return $original_template;
    }
}

Within city.php we can then add whatever logic we want, with all the power and functionality that we would have within a normal template. You can even return files other than HTML pages. Yoast uses this technique to generate sitemap.xml files for each resource type on any given site.
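As a starting point, a minimal city.php might look something like the sketch below. It assumes the city query variable holds a slug such as london, and it simply prints a heading. The markup and the way the display name is derived from the slug are our own illustration, and fetching the actual weather data is left out.

```php
<?php
/* city.php: minimal template sketch for the custom city route. */

// Normalise the requested value to a slug before using it anywhere.
$city_slug = sanitize_title( get_query_var( 'city', '' ) );

// Illustrative only: turn "new-york" into "New York" for display.
$city_name = ucwords( str_replace( '-', ' ', $city_slug ) );

get_header();
?>
<main>
    <h1>Weather for <?php echo esc_html( $city_name ); ?></h1>
</main>
<?php
get_footer();
```

Sanitising on the way in and escaping on the way out matters here, because the value comes straight from the URL.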

You could also use the template_redirect action, instead of template_include, to achieve the same outcome, but this can lead to obscure bugs and is discouraged by WordPress core developers. For adding custom routing you should default to using template_include.

So we’ve now got a page which can be accessed at example.com?city={city}. We now need to map this to a pretty permalink. For this, we need the function add_rewrite_rule.

function city_rewrite_rule() {
    add_rewrite_rule( '^city/([^/]*)/?', 'index.php?city=$matches[1]', 'top' );
}
add_action( 'init', 'city_rewrite_rule', 10, 0 );

The first parameter is another regular expression, which describes the shape of the URL we are adding. The regular expression '^city/([^/]*)/?' translates to city/{anything}/, with the trailing slash optional. The part we want to capture is wrapped in parentheses (a capture group), and its value is passed into the query-string version of the URL via $matches[1] in the second parameter. The third parameter, 'top', places the rule ahead of the existing WordPress rules so that it is checked first.
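If you want to see what the regular expression captures without touching WordPress at all, you can try it in plain PHP. The sketch below reuses the pattern from the add_rewrite_rule call. The helper name bm_match_city and the # delimiters are assumptions of this example rather than anything WordPress requires of you.

```php
<?php
// Pattern copied from the add_rewrite_rule() call in this post.
$regex = '^city/([^/]*)/?';

/**
 * Hypothetical helper: returns the captured city slug,
 * or null when the path does not match the rule.
 */
function bm_match_city( $regex, $path ) {
    // preg_match needs delimiters around the pattern; we use '#'.
    if ( preg_match( '#' . $regex . '#', $path, $matches ) ) {
        return $matches[1];
    }
    return null;
}

var_dump( bm_match_city( $regex, 'city/london/' ) );   // string(6) "london"
var_dump( bm_match_city( $regex, 'city/brighton' ) );  // string(8) "brighton"
var_dump( bm_match_city( $regex, 'weather/london' ) ); // NULL
```

Note that the pattern has no end anchor, so a longer path such as city/london/extra also matches and still captures london. Adding a $ to the end of the pattern would make the rule stricter.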

A note on duplicate content

Eagle-eyed readers will have noticed a clear SEO issue with content being available at both query-string and pretty-permalink URLs. If example.com?id=123 and example.com/my-permalink contain the same content, Google will treat these as duplicate pages and is liable either to apply penalties for duplicate content or to treat the distinct URLs as competing resources. This could lead to neither URL variant ranking highly in Google’s search results.

The solution to this problem is to mark one URL as the “canonical” page, which is what will be indexed by Google and other search engines. By default, WordPress will only apply canonical tags to posts and pages, not to custom rewrite rules (although some SEO plugins have more comprehensive canonical functionality).

Where SEO is an important requirement, developers should consider outputting a rel="canonical" link element on any pages generated with custom rewrite rules. This can be achieved with the wp_head action.
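For the city example, a canonical link could be output along the lines of the sketch below. It assumes the pretty permalink form (example.com/city/{city}/) is the version you want indexed, and bm_city_canonical_link is a hypothetical function name. If an SEO plugin is already printing canonical tags, check that the two do not conflict.

```php
add_action( 'wp_head', 'bm_city_canonical_link' );
function bm_city_canonical_link() {
    $city = get_query_var( 'city', false );

    // Only act on requests handled by the custom city route.
    if ( ! $city ) {
        return;
    }

    // Build the pretty permalink form of the URL and mark it as canonical.
    $path = user_trailingslashit( 'city/' . sanitize_title( $city ) );
    $url  = home_url( $path );

    echo '<link rel="canonical" href="' . esc_url( $url ) . '" />' . "\n";
}
```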

The rewrite cache

Despite most of the logic above running on every request using actions and filters, WordPress stores the generated rewrite rules in the database in what is known as the “rewrite cache”. This means that after any change to the rewrite rules, you need to “flush” the database cache. This is as simple as visiting “Settings” -> “Permalinks” in the WordPress admin – you don’t even need to hit save for the permalinks to be flushed.

This does mean that any computationally expensive rewrite rules can be generated “on demand”, only when the rewrite rules are saved, by using the rewrite_rules_array filter. However, this filter runs after most of the rewrite rules have been generated, so the code examples in this post would need to be adapted to edit the rewrite rules array directly rather than using the convenience function add_rewrite_rule.
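As a sketch of that adaptation, the city rule could be added through the filter like this. The function names are hypothetical, and the example assumes the code lives in a plugin file so that the rules can also be rebuilt programmatically on activation with flush_rewrite_rules, rather than by visiting the Permalinks screen. The city query variable still needs to be registered with add_rewrite_tag on init, as shown earlier.

```php
add_filter( 'rewrite_rules_array', 'bm_add_city_rules' );
function bm_add_city_rules( $rules ) {
    // Any expensive work (for example, loading a list of cities from an
    // external source) would go here; it only runs when rules are rebuilt.
    $city_rules = array(
        '^city/([^/]*)/?' => 'index.php?city=$matches[1]',
    );

    // Array union keeps our rules first, similar to passing 'top'
    // to add_rewrite_rule().
    return $city_rules + $rules;
}

// Rebuild the stored rules once, when the plugin is activated.
register_activation_hook( __FILE__, 'bm_flush_rules_on_activation' );
function bm_flush_rules_on_activation() {
    flush_rewrite_rules();
}
```

Flushing is relatively expensive, so it belongs in a one-off event such as activation rather than on every request.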

The future

I hope that this post has demonstrated that despite WordPress rewrite rules being based on a dated system, which is less intuitive than what other frameworks offer, there is no reason why it should limit development possibilities.

Armed with your new knowledge, you can also wrap the underlying functionality in helper classes to make the system easier to work with. Human Made’s hm-rewrite project wraps the various filters and function calls into a single function. The Timber theming framework includes something more akin to how other frameworks handle routing. Both of these projects provide lightweight improvements, and give inspiration for how to cleanly add rewrite rules in your own projects.

Are there any helper libraries we are missing? Is there a better way to add custom rewrite rules? Let us know by emailing us.