PHP's utf8_decode in TypeScript

✓ Verified: PHP 8.3
Examples tested against actual runtime. CI re-verifies continuously. Only documented examples are tested.

How to use

Install with yarn add locutus, then import the function: import { utf8_decode } from 'locutus/php/xml/utf8_decode'.

Or with CommonJS: const { utf8_decode } = require('locutus/php/xml/utf8_decode')

Use a bundler that supports tree-shaking so you only ship the functions you actually use. Vite, webpack, Rollup, and Parcel all handle this. For server-side use, this is less of a concern.

Examples

These examples are extracted from test cases that automatically verify our functions against their native counterparts.

#   code                                  expected result
1   utf8_decode('Kevin van Zonneveld')    'Kevin van Zonneveld'
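The documented example is pure ASCII, which is why the result equals the input: in the implementation shown on this page, any byte from 0x00 to 0x7f is treated as a one-byte sequence and masked with 0x7f, which leaves it unchanged. A minimal sketch of that observation follows. The helper name isAsciiOnly is hypothetical and not part of locutus.

```typescript
// Hypothetical helper, not part of locutus: returns true when every UTF-16
// code unit is in the ASCII range (0x00 to 0x7f). For such input the decoder
// on this page copies each character through unchanged.
function isAsciiOnly(text: string): boolean {
  for (let i = 0; i < text.length; i++) {
    if (text.charCodeAt(i) > 0x7f) {
      return false
    }
  }
  return true
}

const plainName = isAsciiOnly('Kevin van Zonneveld') // true
const twoByteSequence = isAsciiOnly('\xc3\xa9') // false: 0xc3 and 0xa9 are above 0x7f
```

Input that fails this check goes through the multi-byte branches of the decoder instead of being passed through as is.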

Here's what our current TypeScript equivalent to PHP's utf8_decode looks like.

export function utf8_decode(strData: string | number | boolean | null | undefined): string {
  // discuss at: https://locutus.io/php/utf8_decode/
  // parity verified: PHP 8.3
  // original by: Webtoolkit.info (https://www.webtoolkit.info/)
  // input by: Aman Gupta
  // input by: Brett Zamir (https://brett-zamir.me)
  // improved by: Kevin van Zonneveld (https://kvz.io)
  // improved by: Norman "zEh" Fuchs
  // bugfixed by: hitwork
  // bugfixed by: Onno Marsman (https://twitter.com/onnomarsman)
  // bugfixed by: Kevin van Zonneveld (https://kvz.io)
  // bugfixed by: kirilloid
  // bugfixed by: w35l3y (https://www.wesley.eti.br)
  // example 1: utf8_decode('Kevin van Zonneveld')
  // returns 1: 'Kevin van Zonneveld'

  const tmpArr: string[] = []
  let i = 0
  let c1 = 0
  let seqlen = 0
  const source = String(strData)

  while (i < source.length) {
    c1 = source.charCodeAt(i) & 0xff
    seqlen = 0

    // https://en.wikipedia.org/wiki/UTF-8#Codepage_layout
    if (c1 <= 0xbf) {
      c1 = c1 & 0x7f
      seqlen = 1
    } else if (c1 <= 0xdf) {
      c1 = c1 & 0x1f
      seqlen = 2
    } else if (c1 <= 0xef) {
      c1 = c1 & 0x0f
      seqlen = 3
    } else {
      c1 = c1 & 0x07
      seqlen = 4
    }

    for (let ai = 1; ai < seqlen; ++ai) {
      c1 = (c1 << 0x06) | (source.charCodeAt(ai + i) & 0x3f)
    }

    if (seqlen === 4) {
      c1 -= 0x10000
      tmpArr.push(String.fromCharCode(0xd800 | ((c1 >> 10) & 0x3ff)))
      tmpArr.push(String.fromCharCode(0xdc00 | (c1 & 0x3ff)))
    } else {
      tmpArr.push(String.fromCharCode(c1))
    }

    i += seqlen
  }

  return tmpArr.join('')
}
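To make the bit arithmetic easier to follow, here is a sketch that applies the same steps to a single sequence: pick the length from the lead byte, mask off the length marker, fold in six bits from each continuation byte, and split code points above 0xffff into a surrogate pair. The helper decodeSequence is hypothetical and only mirrors the loop body above. It assumes the input holds exactly one sequence, with one byte value per character.

```typescript
// Hypothetical helper, not part of locutus: decodes ONE UTF-8 sequence from a
// string whose characters each carry a single byte value.
function decodeSequence(bytes: string): string {
  let c1 = bytes.charCodeAt(0) & 0xff
  let seqlen: number

  // The lead byte determines how many bytes belong to the sequence.
  if (c1 <= 0xbf) {
    c1 &= 0x7f // single byte: keep the low 7 bits
    seqlen = 1
  } else if (c1 <= 0xdf) {
    c1 &= 0x1f // 110xxxxx: 5 payload bits
    seqlen = 2
  } else if (c1 <= 0xef) {
    c1 &= 0x0f // 1110xxxx: 4 payload bits
    seqlen = 3
  } else {
    c1 &= 0x07 // 11110xxx: 3 payload bits
    seqlen = 4
  }

  // Each continuation byte (10xxxxxx) contributes its low 6 bits.
  for (let ai = 1; ai < seqlen; ++ai) {
    c1 = (c1 << 6) | (bytes.charCodeAt(ai) & 0x3f)
  }

  if (seqlen === 4) {
    // Code points above 0xffff need a UTF-16 surrogate pair.
    c1 -= 0x10000
    return String.fromCharCode(0xd800 | ((c1 >> 10) & 0x3ff), 0xdc00 | (c1 & 0x3ff))
  }
  return String.fromCharCode(c1)
}

const eAcute = decodeSequence('\xc3\xa9') // 'é' (U+00E9)
const euro = decodeSequence('\xe2\x82\xac') // '€' (U+20AC)
const grin = decodeSequence('\xf0\x9f\x98\x80') // '😀' (U+1F600, as '\ud83d\ude00')
```

For the two-byte case, 0xc3 & 0x1f gives 0x03, and (0x03 << 6) | (0xa9 & 0x3f) gives 0xe9. Like the function above, this sketch does not validate continuation bytes or reject malformed input.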

Improve this function

Locutus is a community effort following The McDonald's Theory: we ship first iterations, hoping others will improve them. If you see something that could be better, we'd love your contribution.
