Stripping Unicode characters out of a slug
I am trying to strip the following characters out of slugs: ṁ, ṭ, ḍ, ṇ, ṅ, ñ, ḷ, ṃ.
I found this code here (note: I removed the public keyword in order to get it to work):
add_action('wp_insert_post_data', __NAMESPACE__ . 'processPermalink');
/**
 * Processes the permalink so we can remove any characters that may cause a problem when communicating
 * with the API.
 *
 * @param array $data The array of information about the post.
 * @return array $data The data without the malformed information in the post name for the URL.
 */
function processPermalink($data)
{
    if (!in_array($data['post_status'], array('draft', 'pending', 'auto-draft'))) {
        $data['post_name'] =
            preg_replace(
                '/(%ef%b8%8f|™|®|©|&trade;|&reg;|&copy;|&#8482;|&#174;|&#169;)/',
                '',
                $data['post_name']
            );
    }
    return $data;
}
I have tried replacing the preg_replace pattern in these three ways, but none of them works (the original code does what it should):
'/(&#8424;&#8424;&#8424;&#7745;|&#7789;|&#7693;|&#7751;|&#7749;|&#241;|&#7735;|&#7747;)/'
'/(&#x1E41;|&#x1E6D;|&#x1E0D;|&#x1E47;|&#x1E45;|&ntilde;|&#x1E37;|&#x1E43;)/'
'/(ṁ|ṭ|ḍ|ṇ|ṅ|ñ|ḷ|ṃ)/'
None of them successfully strip out the characters.
EDIT: I don't actually have to accomplish my goal using the above code. I'd be happy to do it some other way, including substituting the bad characters with their non-accented versions.
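To make the substitution idea concrete, here is a rough sketch of the kind of thing I have in mind. It is plain PHP, not tied to the hook above, and replaceAccentedSlugChars is a name I made up for illustration. My guess, based on the %ef%b8%8f entry in the original pattern, is that the slug may already be percent-encoded by the time the filter runs, so the sketch maps both the raw characters and their percent-encoded forms. That is an assumption I have not confirmed.

```php
<?php
/**
 * Hypothetical helper (not a WordPress function): replaces the eight
 * accented characters with plain ASCII letters.
 *
 * Assumption: the slug may contain either the raw UTF-8 characters or
 * their percent-encoded forms (upper- or lowercase hex), so all three
 * spellings are mapped to the same replacement.
 */
function replaceAccentedSlugChars(string $slug): string
{
    // Raw character => unaccented replacement.
    $ascii = array(
        'ṁ' => 'm', 'ṭ' => 't', 'ḍ' => 'd', 'ṇ' => 'n',
        'ṅ' => 'n', 'ñ' => 'n', 'ḷ' => 'l', 'ṃ' => 'm',
    );

    $map = array();
    foreach ($ascii as $char => $plain) {
        $encoded = rawurlencode($char);        // e.g. "%E1%B9%AD" for ṭ
        $map[$char] = $plain;                  // raw UTF-8 form
        $map[$encoded] = $plain;               // uppercase percent-encoding
        $map[strtolower($encoded)] = $plain;   // lowercase percent-encoding
    }

    // strtr() with an array replaces the longest matching keys first and
    // leaves everything else in the slug untouched.
    return strtr($slug, $map);
}
```

If the assumption holds, calling this on $data['post_name'] inside processPermalink, in place of the preg_replace call, should turn a slug such as pi%e1%b9%adaka into pitaka.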
Any ideas? Thanks!