    htmLawed: Fix corruption to UTF8 encoded text · 959661f9
    Jared Hancock authored
    On some combinations of operating system, PHP, and libpcre versions, `\s`
    also matches the ISO-8859-x non-breaking space, byte 0xA0. The affected
    regular expression then munges the UTF-8 encoding of that character,
    0xC2 0xA0, into 0xC2 0x20, which is not a valid UTF-8 sequence.
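
The byte-level effect can be reproduced outside PHP. The following is a minimal Python sketch, not the htmLawed code: the pattern `buggy_ws` is a hypothetical stand-in that explicitly adds 0xA0 to the whitespace class to emulate the libpcre behaviour described above, and `safe_ws` is one possible ASCII-only alternative, not necessarily the fix applied in this commit.

```python
import re

# UTF-8 bytes for "a", U+00A0 (no-break space), "b": 0x61 0xC2 0xA0 0x62
text = "a\u00a0b".encode("utf-8")

# Assumption: emulate a byte-oriented \s that also matches 0xA0
buggy_ws = re.compile(rb"[\s\xa0]")
# Replacing only the trailing byte leaves a dangling 0xC2 lead byte
munged = buggy_ws.sub(b" ", text)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# An explicit ASCII-only whitespace class never touches multibyte sequences
safe_ws = re.compile(rb"[ \t\r\n\f\v]")
preserved = safe_ws.sub(b" ", text)

print(munged, is_valid_utf8(munged))
print(preserved, is_valid_utf8(preserved))
```

The sketch works on raw bytes on purpose, since the corruption only appears when the pattern is applied byte by byte rather than to decoded characters.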
    
    When the result is inserted into a UTF-8 column in MySQL, the text is
    truncated at the first invalid character.
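
To illustrate where such a truncation would fall, here is a small Python sketch that returns the longest leading run of bytes that is still valid UTF-8. The helper `valid_utf8_prefix` and the sample string are hypothetical, and Python's strict decoder only approximates what MySQL does, so treat the result as an indication rather than the server's exact behaviour.

```python
def valid_utf8_prefix(data: bytes) -> bytes:
    """Return the bytes that precede the first invalid UTF-8 sequence."""
    try:
        data.decode("utf-8")
        return data
    except UnicodeDecodeError as err:
        # err.start is the offset of the first undecodable byte
        return data[:err.start]


# Hypothetical sample: the no-break space after "Price:" was munged to 0xC2 0x20
corrupted = b"Price:\xc2 10 EUR"
print(valid_utf8_prefix(corrupted))
```

Everything from the stray 0xC2 byte onward is dropped, which matches the symptom described above.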