How to retrieve TLD domain for the current website?

WordPress has certain variables like site_url and home_url, and paths for things like plugins, themes, uploads, etc., but it seems there is no easy way to display the current TLD.

This probably requires PHP's preg_replace or something similar, but I'm not sure what the cleanest or most reliable way to write this snippet is.

For example, for a WP site installed at a URL like https://www.example.co.uk, how can you detect only the TLD (and e.g. save it as a variable for use in a PHP script)? It either needs to be aware of a large variety of different TLDs, or at least be customizable for different TLD types.


The $_SERVER['HTTP_HOST'] global will usually contain the domain or subdomain to which the current request was sent, with a few caveats. In a web request, the value is set at the discretion of the web server and any proxies in front of it: depending on configuration, Apache or Nginx may not pass the Host header along, or may replace it with another value such as the proxy's own host. The header is also not required in HTTP/1.0 requests (though HTTP/1.0 clients are becoming scarce). Finally, because it is derived from data sent by the client, it may contain undesired or arbitrary values and should be treated as untrusted input.

Some environments will also populate $_SERVER['SERVER_NAME'].
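
Given those caveats, it may be worth normalizing and sanity-checking whichever value is present before using it. The following is only a minimal sketch: example_get_request_host() is a hypothetical helper name, and it takes the server array as a parameter (rather than reading $_SERVER directly) so that it can be tried outside a web request.

```php
<?php
/**
 * Hypothetical helper: picks a host from a $_SERVER-like array.
 *
 * Prefers HTTP_HOST, falls back to SERVER_NAME, and returns null
 * if neither value looks like a plausible host.
 *
 * @param array $server A $_SERVER-like array.
 * @return string|null The normalized host, or null if none is usable.
 */
function example_get_request_host( array $server ) {
  foreach ( array( 'HTTP_HOST', 'SERVER_NAME' ) as $key ) {
    if ( empty( $server[ $key ] ) || ! is_string( $server[ $key ] ) )
      continue;

    // Lowercase the value and strip an optional trailing ":port".
    $host = strtolower( trim( $server[ $key ] ) );
    $host = preg_replace( '/:\d+$/', '', $host );

    // Accept only letters, digits, dots and hyphens, or a bracketed IPv6 literal.
    if ( preg_match( '/^(\[[0-9a-f:.]+\]|[a-z0-9.-]+)$/', $host ) )
      return $host;
  }

  return null;
}
```

Within a request this would be called as example_get_request_host( $_SERVER ). The character whitelist is deliberately conservative and is an assumption of this sketch, not something WordPress enforces.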

Within the WordPress APIs, a function to retrieve the domain exists for multisite networks in the form of get_clean_basedomain(). Outside of a multisite installation, however, this function is unavailable. We can take cues from its implementation and create something similar which parses the host out of the configured site URL (whether it's a domain name, IP address, or hostname):

/**
 * Retrieves the current site's host domain, IP address, or hostname.
 * 
 * @return string The site's host.
 **/
function wpse377481_get_site_host() {
  if( function_exists( 'get_clean_basedomain' ) )
    return get_clean_basedomain();

  return parse_url( get_site_url(), PHP_URL_HOST );
}

However you obtain the host string, the TLD can be considered to be the substring following the last dot (.):

/**
 * Retrieves the TLD for a given host string.
 * 
 * @param string|null $host A host string. Defaults to the current site's host.
 * @return string|false The TLD, or `false` if not applicable.
 **/
function wpse377481_get_tld( $host = null ) {
  if( $host === null )
    $host = wpse377481_get_site_host();

  if( rest_is_ip_address( $host ) )
    return false; // An IP address has no TLD.

  if( strpos( $host, '.' ) === false )
    return false; // An isolated host name has no TLD.

  return substr( $host, strrpos( $host, '.' ) + 1 );
}
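
To try the same last-label logic outside of WordPress, a standalone variant can be sketched as follows. Here example_get_tld() is a hypothetical name, and PHP's built-in filter_var() with FILTER_VALIDATE_IP stands in for rest_is_ip_address(), which is only available once WordPress is loaded.

```php
<?php
/**
 * Standalone sketch of the TLD logic for use outside WordPress.
 *
 * @param string $host A host string.
 * @return string|false The last label of the host, or false if not applicable.
 */
function example_get_tld( $host ) {
  if ( filter_var( $host, FILTER_VALIDATE_IP ) !== false )
    return false; // An IP address has no TLD.

  $dot = strrpos( $host, '.' );
  if ( $dot === false )
    return false; // An isolated host name has no TLD.

  return substr( $host, $dot + 1 );
}

var_dump( example_get_tld( 'www.example.co.uk' ) ); // string(2) "uk"
var_dump( example_get_tld( 'mysite.test' ) );       // string(4) "test"
var_dump( example_get_tld( '192.168.0.1' ) );       // bool(false)
var_dump( example_get_tld( 'localhost' ) );         // bool(false)
```

Note that for www.example.co.uk this yields uk rather than co.uk, since only the final label is treated as the TLD.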

While this could easily yield TLDs which are not actually part of the root zone (not to mention the last octet of an invalid IPv4 address), I think this behavior is desirable. Many people use the non-existent .test TLD for local development, for example, and corporate intranets with internal DNS servers may leverage custom TLDs.
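
If a multi-label suffix such as co.uk should be reported instead of just uk, as the question hints at, one option is to let the caller supply the suffixes they care about. This is only a rough sketch under stated assumptions: example_get_suffix() and its default suffix list are hypothetical, and a complete list would have to come from an external source such as the Public Suffix List.

```php
<?php
/**
 * Hypothetical helper: returns the longest matching suffix from a
 * caller-supplied list, falling back to the last label of the host.
 *
 * @param string $host     A host string.
 * @param array  $suffixes Multi-label suffixes to recognize (illustrative defaults).
 * @return string|false The matched suffix or last label, or false if not applicable.
 */
function example_get_suffix( $host, array $suffixes = array( 'co.uk', 'org.uk', 'com.au' ) ) {
  if ( filter_var( $host, FILTER_VALIDATE_IP ) !== false )
    return false; // An IP address has no TLD.

  // Normalize case and drop a trailing root dot, if any.
  $host = strtolower( rtrim( $host, '.' ) );

  // Check longer suffixes first so the most specific one wins.
  usort( $suffixes, function ( $a, $b ) {
    return strlen( $b ) - strlen( $a );
  } );

  foreach ( $suffixes as $suffix ) {
    $needle = '.' . $suffix;
    if ( substr( $host, -strlen( $needle ) ) === $needle )
      return $suffix;
  }

  // No configured suffix matched: fall back to the last label.
  $dot = strrpos( $host, '.' );
  return $dot === false ? false : substr( $host, $dot + 1 );
}
```

With the default list, www.example.co.uk gives co.uk while example.com still gives com, and unknown or internal TLDs continue to pass through the last-label fallback.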
