Sanitize content from wp_editor

I built a custom post type that includes a standard textarea/TinyMCE editor generated by wp_editor(), and I'm having trouble with the saving part.

If I save the content with the following code:

update_post_meta( $post_id, $prefix.'content', $_POST['content'] );

Everything works fine, but there is no security (sanitization, validation, etc.).

If I save the content with the following code:

update_post_meta( $post_id, $prefix.'content', sanitize_text_field($_POST['content']) );

I solve the security issue, but I lose all the styling, media, etc. in the content.

What would be a good way to save the content with all the styling applied and the media inserted, while still sanitizing it?

I read a bit about wp_kses() but I don't know how I could apply a good filter. (Allowing common tags, which one should I block ? etc..)


In short: it depends on your context, that is, on the data inside your editor.

wp_kses() is really helpful, and you can define your own list of allowed HTML tags. Alternatively, you can use the default functions, such as wp_kses_post or wp_kses_data. These functions ensure that HTML received from the user contains only whitelisted elements. See https://codex.wordpress.org/Data_Validation#HTML.2FXML_Fragments
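As a minimal sketch of a custom allow-list, assuming a small set of tags is enough for your content: the tag and attribute choices and the sample string below are illustrative assumptions, not a recommended policy.

```php
<?php
// Hypothetical allow-list: adjust the tags and attributes to your needs.
$allowed_html = array(
    'a'      => array( 'href' => array(), 'title' => array() ),
    'img'    => array( 'src' => array(), 'alt' => array() ),
    'p'      => array(),
    'br'     => array(),
    'em'     => array(),
    'strong' => array(),
);

// Sample input, made up for illustration.
$raw = '<p onclick="evil()">Hello <strong>world</strong><script>alert(1)</script></p>';

// Tags and attributes that are not in the allow-list are stripped.
// Here the onclick attribute and the <script> tags are removed;
// the text between the <script> tags is kept as plain text.
$clean = wp_kses( $raw, $allowed_html );
```

Anything that is not listed is removed, so start with a short list and add tags only when the editor content really needs them.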

WordPress defines many more functions to sanitize input; see https://codex.wordpress.org/Validating_Sanitizing_and_Escaping_User_Data and https://codex.wordpress.org/Data_Validation. These pages are really helpful.

In your context, however, the wp_kses_post function should be the right choice.
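A minimal sketch of how that could look when saving, assuming the editor field is named 'content' as in the question. The callback name and the prefix value are hypothetical, and nonce and capability checks are left out for brevity.

```php
<?php
// Hypothetical callback name; hook it to save_post in your plugin or theme.
function myprefix_save_editor_content( $post_id ) {
    if ( ! isset( $_POST['content'] ) ) {
        return;
    }

    $prefix = 'myprefix_'; // Assumption: replace with the prefix you already use.

    // WordPress adds slashes to $_POST; wp_kses_post() expects unslashed input.
    $raw = wp_unslash( $_POST['content'] );

    // Keep the HTML that is allowed in post content, strip everything else.
    $clean = wp_kses_post( $raw );

    // update_post_meta() removes slashes again, so re-slash the value first.
    update_post_meta( $post_id, $prefix . 'content', wp_slash( $clean ) );
}
add_action( 'save_post', 'myprefix_save_editor_content' );
```

This keeps formatting and inserted media, because the 'post' allow-list covers the tags the editor normally produces.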


You could do something like this:

/**
 * Most of the 'post' HTML is accepted, except <textarea> itself.
 * @link https://codex.wordpress.org/Function_Reference/wp_kses_allowed_html
 */
$allowed_html = wp_kses_allowed_html( 'post' );

// Remove '<textarea>' tag
unset ( $allowed_html['textarea'] );

/**
 * wp_kses_allowed_html() returns boolean true for allowed attributes;
 * convert each "true" to "array()" before passing the list to wp_kses().
 */
array_walk_recursive(
    $allowed_html,
    function ( &$value ) {
        if ( is_bool( $value ) ) {
            $value = array();
        }
    }
);
// Run sanitization.
$value = wp_kses( $value, $allowed_html );

@fuxia: as OP wrote:
"I read a bit about wp_kses() but I don't know how I could apply a good filter. (Allowing common tags, which one should I block ? etc..)"

wp_kses does the following:
"This function makes sure that only the allowed HTML element names, attribute names and attribute values plus only sane HTML entities will occur in $string. You have to remove any slashes from PHP's magic quotes before you call this function."
https://codex.wordpress.org/Function_Reference/wp_kses

My code uses wp_kses with "Allowing common tags". What are the common tags? The list is available at the link below. It is a long list, so I did not paste it here.
https://codex.wordpress.org/Function_Reference/wp_kses_allowed_html

I think a textarea tag itself should not be allowed inside a textarea.

@bueltge
wp_kses_post does the same thing, except that it allows the '<textarea>' tag, which, I think, it shouldn't.
https://core.trac.wordpress.org/browser/tags/4.9.8/src/wp-includes/kses.php#L1575

function wp_kses_post( $data ) {
    return wp_kses( $data, 'post' );
}

Use wp_slash() when saving:

update_post_meta( $post_id, $prefix . 'content', wp_slash( $_POST['content'] ) );

Try

//save this in the database
$content=sanitize_text_field( htmlentities($_POST['content']) );

//to display, use
html_entity_decode($content);
  1. htmlentities() converts every character that has an HTML entity equivalent to that entity.
  2. sanitize_text_field() then checks for invalid UTF-8 and strips things such as line breaks and extra whitespace. The result can be stored in the database.
  3. html_entity_decode() converts the HTML entities back to the characters they represent, so the stored markup is output as HTML again.
