unicode - Why is it that "using anything but a UTF-8 decoder ... might be insecure" in a URL percent-decoding algorithm?
I am implementing a URL parser and have a question about the W3C URL spec (at http://www.w3.org/TR/2014/WD-url-1-20141209/). The section "2. Percent-encoded bytes" has the following algorithm (emphasis added):
To percent decode a byte sequence input, run these steps:

*Using anything but a UTF-8 decoder when input contains bytes outside the range 0x00 to 0x7F might be insecure and is not recommended.*

1. Let output be an empty byte sequence.
2. For each byte byte in input, run these steps:
   1. If byte is not '%', append byte to output.
   2. Otherwise, if byte is '%' and the next two bytes after byte in input are not in the ranges 0x30 to 0x39, 0x41 to 0x46, and 0x61 to 0x66, append byte to output.
   3. Otherwise, run these substeps:
      1. Let bytePoint be the two bytes after byte in input, **decoded**, and then interpreted as a hexadecimal number.
      2. Append a byte whose value is bytePoint to output.
      3. Skip the next two bytes in input.
3. Return output.
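For reference, here is a minimal Python translation of the steps above, operating purely on bytes (the name percent_decode is my own, not from the spec):

```python
def percent_decode(input_bytes: bytes) -> bytes:
    # Byte values allowed after '%': 0x30-0x39, 0x41-0x46, 0x61-0x66.
    HEX_DIGITS = b"0123456789ABCDEFabcdef"
    output = bytearray()
    i = 0
    while i < len(input_bytes):
        byte = input_bytes[i]
        next_two = input_bytes[i + 1:i + 3]
        if byte != 0x25:  # substep 1: byte is not '%', copy as-is
            output.append(byte)
        elif len(next_two) < 2 or not all(b in HEX_DIGITS for b in next_two):
            # substep 2: '%' not followed by two ASCII hex digits, copy as-is
            output.append(byte)
        else:
            # substep 3: the two bytes are already known to be ASCII hex
            # digits, so interpreting them needs no general character decoder
            output.append(int(next_two, 16))
            i += 2  # skip the two hex-digit bytes
        i += 1
    return bytes(output)

print(percent_decode(b"a%2Fb"))   # b'a/b'
print(percent_decode(b"100%25"))  # b'100%'
print(percent_decode(b"%GG"))     # b'%GG' (not hex digits, copied as-is)
```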
In the original spec, the word "decoded" (in bold above) is a link to the UTF-8 decode algorithm. I assume that is the "UTF-8 decoder" referred to in the second sentence (italicized) above.
I understand that invalid sequences of UTF-8 bytes can cause security problems. However, in the step that uses the decoder (sub-substep 1), the bytes have already been verified to be valid ASCII hex digits by the preceding substep 2, so using a UTF-8 decoder here seems like security overkill.
Can anyone explain how using a decoder other than a UTF-8 decoder in this algorithm could possibly be insecure, when the decoder is only ever applied to byte values in the ranges 0x30 to 0x39, 0x41 to 0x46, and 0x61 to 0x66? Or am I interpreting the spec incorrectly?
It seems to me that bytes outside the range 0x00 to 0x7F are copied to output as-is (either by substep 1 because they are not '%', or by substep 2 because they are not ASCII hex digits), and never end up in the decoder in this algorithm.
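As an illustration, Python's byte-level percent decoder (urllib.parse.unquote_to_bytes from the standard library) behaves exactly this way: non-ASCII bytes and malformed escapes pass through untouched, and percent escapes decode to raw bytes with no UTF-8 validation at all.

```python
from urllib.parse import unquote_to_bytes

# Non-ASCII input bytes are copied through untouched...
print(unquote_to_bytes(b"caf\xc3\xa9"))      # b'caf\xc3\xa9'
# ...and percent escapes yield raw bytes: %FF produces the lone
# byte 0xFF, which is not valid UTF-8 on its own.
print(unquote_to_bytes(b"caf%C3%A9%FF"))     # b'caf\xc3\xa9\xff'
```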