wp_kses() strips data attributes even if they're in the allowed list

I added a function that returns the array of allowed HTML tags:

if ( ! function_exists( 'allowed_html_tags' ) ) {
  /**
   * Allowed html tags for wp_kses() function
   *
   * @return array Array of allowed html tags.
   */
  function allowed_html_tags() {
    return array(
      'a' => array(
        'href'  => array(),
        'title' => array(),
        'class' => array(),
        'data'  => array(),
        'rel'   => array(),
      ),
      'br' => array(),
      'em' => array(),
      'ul' => array(
        'class' => array(),
      ),
      'ol' => array(
        'class' => array(),
      ),
      'li' => array(
        'class' => array(),
      ),
      'strong' => array(),
      'div' => array(
        'class' => array(),
        'data'  => array(),
        'style' => array(),
      ),
      'span' => array(
        'class' => array(),
        'style' => array(),
      ),
      'img' => array(
        'alt'    => array(),
        'class'  => array(),
        'height' => array(),
        'src'    => array(),
        'width'  => array(),
      ),
      'select' => array(
        'id'    => array(),
        'class' => array(),
        'name'  => array(),
      ),
      'option' => array(
        'value'    => array(),
        'selected' => array(),
      ),
    );
  }
}

But when I have HTML in a variable that is populated in a foreach loop, my data attributes get stripped out.

$my_var = '<div class="my-class" data-term="' . $term_id . '">' . $content . '</div>';

wp_kses( $my_var, allowed_html_tags() );

This will return

<div class="my-class">This is my content... no data attribute...</div>

I tried modifying my array to have data-* but that didn't work.

I hope that you don't have to modify the allowed array with the full data name (data-term) for this to work...

EDIT

Check Matt Thomason's answer about the update to the kses data.



Update for anyone coming here post-Dec 2018:

data-* is now supported in KSES filters, since this commit - https://github.com/markjaquith/WordPress/commit/a0309e80b6a4d805e4f230649be07b4bfb1a56a5#diff-a0e0d196dd71dde453474b0f791828fe

So you can now do something like this:

// Priority 10, and 2 accepted arguments so the callback also receives $context.
add_filter( 'wp_kses_allowed_html', 'kses_filter_allowed_html', 10, 2 );

function kses_filter_allowed_html( $allowed, $context )
{
    if (is_array($context))
    {
        return $allowed;
    }

    if ($context === 'post')
    {
        $allowed['a']['data-*'] = true;
        $allowed['table']['data-*'] = true;
        // ... repeat for each HTML element you want to allow data-* attributes on
    }

    return $allowed;
}
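
The same wildcard should also work on a custom allowed list that is passed straight to wp_kses(), which is the situation in the question. The following is a minimal sketch, assuming WordPress 5.0 or later; my_allow_data_attributes() is a hypothetical helper name, not a core function. Note the value: as far as I can tell, core only honours the 'data-*' key when its value is truthy, so 'data-*' => array() would likely still be stripped, and the plain 'data' key from the question never matches data-term.

```php
<?php
/**
 * Hypothetical helper (not part of WordPress core): enable data-* attributes
 * for the given tags in an allowed-HTML array meant for wp_kses().
 *
 * @param array $allowed Allowed-HTML array, e.g. the result of allowed_html_tags().
 * @param array $tags    Tag names that should accept data-* attributes.
 * @return array The modified allowed-HTML array.
 */
function my_allow_data_attributes( array $allowed, array $tags ) {
    foreach ( $tags as $tag ) {
        // Create the tag entry if the list does not know the tag yet.
        if ( ! isset( $allowed[ $tag ] ) || ! is_array( $allowed[ $tag ] ) ) {
            $allowed[ $tag ] = array();
        }
        // The wildcard key is the literal string 'data-*' with a truthy value.
        $allowed[ $tag ]['data-*'] = true;
    }
    return $allowed;
}

// Usage inside WordPress (5.0+ assumed):
// $allowed = my_allow_data_attributes( allowed_html_tags(), array( 'a', 'div' ) );
// echo wp_kses( $my_var, $allowed );
```

Keeping the helper separate from allowed_html_tags() means the base list stays unchanged for callers that should not accept data-* attributes.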

"I hope that you don't have to modify the allowed array with the full data name (data-term) for this to work..."

It appears to be that way. data-term and data aren't the same attribute after all, and from poking around in core I don't think regular expressions or wildcards of any sort can be used as attribute names in the allowed list.
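
Listing each attribute by its full name could look like the sketch below. my_allow_named_attributes() is a hypothetical helper name, not a core function; it assumes you know in advance which data attributes your markup uses.

```php
<?php
/**
 * Hypothetical helper (not part of WordPress core): allow specific attributes,
 * listed by their full names, on one tag of an allowed-HTML array.
 *
 * @param array  $allowed Allowed-HTML array for wp_kses().
 * @param string $tag     Tag name, e.g. 'div'.
 * @param array  $names   Full attribute names, e.g. array( 'data-term' ).
 * @return array The modified allowed-HTML array.
 */
function my_allow_named_attributes( array $allowed, $tag, array $names ) {
    foreach ( $names as $name ) {
        // Keys are stored in lower case, which is how wp_kses() looks them up.
        $allowed[ $tag ][ strtolower( $name ) ] = true;
    }
    return $allowed;
}

// Usage inside WordPress:
// $allowed = my_allow_named_attributes( allowed_html_tags(), 'div', array( 'data-term' ) );
// echo wp_kses( $my_var, $allowed );
```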

You shouldn't need to run wp_kses() on your own markup, though; you should already know it's safe. wp_kses() is generally meant for handling untrusted input from users. Are users going to be submitting data- attributes, and do you need to support them all?

You could do something like this instead:

$my_var = '<div class="my-class" data-term="' . esc_attr( $term_id ) . '">' . wp_kses_post( $content ) . '</div>';

That uses wp_kses_post(), which applies the default allowed HTML for post content, and it only runs on whatever $content is. The wrapper markup is left alone, and the attribute value is escaped with esc_attr().
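
Since the question builds the markup in a foreach loop, the same idea could be wrapped up as follows. This is only a sketch: my_render_term_blocks() is a hypothetical helper, and the shape of $items (term ID => untrusted HTML) is an assumption, since the question does not show the loop itself. esc_attr() and wp_kses_post() are the WordPress functions used above.

```php
<?php
/**
 * Hypothetical helper: build one wrapper <div> per item, escaping on output
 * instead of running the whole string through wp_kses().
 *
 * @param array $items Map of term ID => untrusted HTML content (assumed shape).
 * @return string Concatenated markup.
 */
function my_render_term_blocks( array $items ) {
    $html = '';
    foreach ( $items as $term_id => $content ) {
        // The wrapper is our own trusted markup; only the dynamic parts are escaped.
        $html .= '<div class="my-class" data-term="' . esc_attr( $term_id ) . '">';
        $html .= wp_kses_post( $content );
        $html .= '</div>';
    }
    return $html;
}
```

Because the data-term attribute never passes through wp_kses() here, it does not need to appear in any allowed list.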
