PHP's strip_tags in JavaScript

Here’s what our current JavaScript equivalent to PHP's strip_tags looks like.

module.exports = function strip_tags (input, allowed) { // eslint-disable-line camelcase
  // discuss at: https://locutus.io/php/strip_tags/
  // original by: Kevin van Zonneveld (https://kvz.io)
  // improved by: Luke Godfrey
  // improved by: Kevin van Zonneveld (https://kvz.io)
  // input by: Pul
  // input by: Alex
  // input by: Marc Palau
  // input by: Brett Zamir (https://brett-zamir.me)
  // input by: Bobby Drake
  // input by: Evertjan Garretsen
  // bugfixed by: Kevin van Zonneveld (https://kvz.io)
  // bugfixed by: Onno Marsman (https://twitter.com/onnomarsman)
  // bugfixed by: Kevin van Zonneveld (https://kvz.io)
  // bugfixed by: Kevin van Zonneveld (https://kvz.io)
  // bugfixed by: Eric Nagel
  // bugfixed by: Kevin van Zonneveld (https://kvz.io)
  // bugfixed by: Tomasz Wesolowski
  // bugfixed by: Tymon Sturgeon (https://scryptonite.com)
  // bugfixed by: Tim de Koning (https://www.kingsquare.nl)
  // revised by: Rafał Kukawski (https://blog.kukawski.pl)
  // example 1: strip_tags('<p>Kevin</p> <br /><b>van</b> <i>Zonneveld</i>', '<i><b>')
  // returns 1: 'Kevin <b>van</b> <i>Zonneveld</i>'
  // example 2: strip_tags('<p>Kevin <img src="someimage.png" onmouseover="someFunction()">van <i>Zonneveld</i></p>', '<p>')
  // returns 2: '<p>Kevin van Zonneveld</p>'
  // example 3: strip_tags("<a href='https://kvz.io'>Kevin van Zonneveld</a>", "<a>")
  // returns 3: "<a href='https://kvz.io'>Kevin van Zonneveld</a>"
  // example 4: strip_tags('1 < 5 5 > 1')
  // returns 4: '1 < 5 5 > 1'
  // example 5: strip_tags('1 <br/> 1')
  // returns 5: '1  1'
  // example 6: strip_tags('1 <br/> 1', '<br>')
  // returns 6: '1 <br/> 1'
  // example 7: strip_tags('1 <br/> 1', '<br><br/>')
  // returns 7: '1 <br/> 1'
  // example 8: strip_tags('<i>hello</i> <<foo>script>world<</foo>/script>')
  // returns 8: 'hello world'
  // example 9: strip_tags(4)
  // returns 9: '4'

  const _phpCastString = require('../_helpers/_phpCastString')

  // making sure the allowed arg is a string containing only tags in lowercase (<a><b><c>)
  allowed = (((allowed || '') + '').toLowerCase().match(/<[a-z][a-z0-9]*>/g) || []).join('')

  const tags = /<\/?([a-z0-9]*)\b[^>]*>?/gi
  const commentsAndPhpTags = /<!--[\s\S]*?-->|<\?(?:php)?[\s\S]*?\?>/gi

  let after = _phpCastString(input)
  // removes the '<' char at the end of the string to replicate PHP's behaviour
  after = (after.substring(after.length - 1) === '<') ? after.substring(0, after.length - 1) : after

  // recursively remove tags to ensure that the returned string doesn't contain forbidden tags after previous passes (e.g. '<<bait/>switch/>')
  while (true) {
    const before = after
    after = before.replace(commentsAndPhpTags, '').replace(tags, function ($0, $1) {
      return allowed.indexOf('<' + $1.toLowerCase() + '>') > -1 ? $0 : ''
    })

    // return once no more tags are removed
    if (before === after) {
      return after
    }
  }
}
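
The repeated passes in the loop are easier to follow when each intermediate string is recorded. The following is a minimal standalone sketch, not part of Locutus: `stripTagsPasses` is a hypothetical name, it reuses the two regular expressions from the function above, and it deliberately leaves out the `allowed` list, the `_phpCastString` helper, and the trailing `<` handling.

```javascript
// Hypothetical helper (not a Locutus API): returns the input followed by
// the string produced by each pass that changed something.
function stripTagsPasses (input) {
  const tags = /<\/?([a-z0-9]*)\b[^>]*>?/gi
  const commentsAndPhpTags = /<!--[\s\S]*?-->|<\?(?:php)?[\s\S]*?\?>/gi
  const passes = [String(input)]

  while (true) {
    const before = passes[passes.length - 1]
    // No allowed list in this sketch: every matched tag is removed
    const after = before.replace(commentsAndPhpTags, '').replace(tags, '')
    if (after === before) {
      return passes
    }
    passes.push(after)
  }
}

console.log(stripTagsPasses('<i>hello</i> <<foo>script>world<</foo>/script>'))
// [ '<i>hello</i> <<foo>script>world<</foo>/script>',
//   'hello <script>world</script>',
//   'hello world' ]
```

The first pass removes `<foo>` and `</foo>`, which leaves a fresh `<script>` pair behind. A single pass would return that string, so the loop keeps going until a pass changes nothing.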

How to use

You can install it via npm install locutus and require it via require('locutus/php/strings/strip_tags'). You could also require the strings module in full so that you can access strings.strip_tags instead.
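
As a rough sketch of both options in a Node.js script, assuming `npm install locutus` has already been run in the project: `tryRequire` is a hypothetical helper added here only so the snippet does not throw when the package is missing, and the `'locutus/php/strings'` path for the full module is an assumption based on the single-function path above.

```javascript
// Hypothetical helper: returns the module, or null if it cannot be loaded
function tryRequire (id) {
  try {
    return require(id)
  } catch (err) {
    return null
  }
}

// Option 1: require only the one function
const stripTags = tryRequire('locutus/php/strings/strip_tags')

// Option 2: require the whole strings module (path is an assumption)
const strings = tryRequire('locutus/php/strings')

if (stripTags) {
  console.log(stripTags('<p>Kevin <b>van</b> Zonneveld</p>', '<b>'))
}
if (strings) {
  console.log(strings.strip_tags('1 <br/> 1', '<br>'))
}
```

In a real project a plain `require` call is enough. The guard only keeps this example from failing where Locutus is not installed.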

If you intend to target the browser, you can then use a module bundler such as Parcel, webpack, Browserify, or rollup.js. This can be important because Locutus allows modern JavaScript in the source files, meaning it may not work in all browsers without a build/transpile step. Locutus does transpile all functions to ES5 before publishing to npm.

A community effort

Not unlike Wikipedia, Locutus is an ongoing community effort. Our philosophy follows The McDonald’s Theory. This means that we don't consider it to be a bad thing that many of our functions are first iterations, which may still have their fair share of issues. We hope that these flaws will inspire others to come up with better ideas.

This way of working also means that we don't offer any production guarantees, and recommend using Locutus for inspiration and learning purposes only.

Examples

Please note that these examples are distilled from test cases that automatically verify our functions still work correctly. This could explain some quirky ones.

# | code | expected result
1 | strip_tags('<p>Kevin</p> <br /><b>van</b> <i>Zonneveld</i>', '<i><b>') | 'Kevin <b>van</b> <i>Zonneveld</i>'
2 | strip_tags('<p>Kevin <img src="someimage.png" onmouseover="someFunction()">van <i>Zonneveld</i></p>', '<p>') | '<p>Kevin van Zonneveld</p>'
3 | strip_tags("<a href='https://kvz.io'>Kevin van Zonneveld</a>", "<a>") | "<a href='https://kvz.io'>Kevin van Zonneveld</a>"
4 | strip_tags('1 < 5 5 > 1') | '1 < 5 5 > 1'
5 | strip_tags('1 <br/> 1') | '1  1'
6 | strip_tags('1 <br/> 1', '<br>') | '1 <br/> 1'
7 | strip_tags('1 <br/> 1', '<br><br/>') | '1 <br/> 1'
8 | strip_tags('<i>hello</i> <<foo>script>world<</foo>/script>') | 'hello world'
9 | strip_tags(4) | '4'
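
Examples 6 and 7 may look quirky: allowing `<br>` also keeps `<br/>`, and the `<br/>` entry in the allowed list has no effect. The sketch below isolates the two steps responsible. `normalizeAllowed` and `isAllowed` are hypothetical names introduced for illustration; the expressions inside them are taken from the function source.

```javascript
// Hypothetical helper: the same normalization strip_tags applies to `allowed`
function normalizeAllowed (allowed) {
  return (((allowed || '') + '').toLowerCase().match(/<[a-z][a-z0-9]*>/g) || []).join('')
}

// Hypothetical helper: is this single tag kept under the given allowed list?
function isAllowed (tag, allowed) {
  // Only the tag name is captured; a leading or trailing slash and any attributes are ignored
  const found = /<\/?([a-z0-9]*)\b/i.exec(tag)
  return found !== null &&
    normalizeAllowed(allowed).indexOf('<' + found[1].toLowerCase() + '>') > -1
}

console.log(normalizeAllowed('<br><br/>')) // '<br>'
console.log(normalizeAllowed('<I><B>')) // '<i><b>'
console.log(isAllowed('<br/>', '<br>')) // true
console.log(isAllowed('</B>', '<b>')) // true
console.log(isAllowed('<img src="x">', '<p>')) // false
```

Because only the bare tag name is compared, the allowed list never needs self-closing or closing variants.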
