Commit 4da0324b authored by JediKev

issue: ISO-8859-8-i Charset Issues

This addresses an issue where emails with the `ISO-8859-8-i` character set
appear as "(empty)" in the system. The cause is that `iconv()` does not
accept `ISO-8859-8-i` as a valid character set. When you pass `ISO-8859-8-i`
to `iconv()` you receive an error similar to `iconv(): Wrong charset,
conversion from 'ISO-8859-8-i' to 'UTF-8//IGNORE' is not allowed`. I don't
know why `iconv()` rejects it; the trailing `-i` only says "keep the text in
logical order instead of visual order". Logical order means the characters
are stored in the order they are read, leaving right-to-left layout to the
display layer, whereas visual order stores them pre-arranged left to right as
they appear on screen.

This updates the ISO case in the `Charset::normalize()` switch statement so
that it also matches `ISO-XXXX-X-i`. If a character set matches, we remove
the trailing `-i` and set the charset to `ISO-XXXX-X`. The suffix describes
text direction only, not the byte-to-character mapping, so `iconv()` accepts
the shortened name and returns the correctly decoded email instead of
"(empty)".
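
To illustrate the matching rule described above, here is a minimal standalone sketch. The function name `normalize_iso_charset()` is hypothetical and exists only for this example; it reuses the pattern from the diff but is not the project's actual `Charset::normalize()` method, which handles many other charsets.

```php
<?php
// Hypothetical helper for illustration only; it mirrors just the ISO case.
function normalize_iso_charset(string $charset): string {
    // Capture everything after "iso" or "iso-". The capture must end in a
    // character other than "i", which forces an optional trailing "-i"
    // (matched case-insensitively) to fall outside the capture group.
    if (preg_match('`^iso-?(\S+[^i])(-i)?$`i', $charset, $match)) {
        return 'ISO-' . $match[1];
    }
    // Anything that is not an ISO charset is returned unchanged here.
    return $charset;
}

echo normalize_iso_charset('ISO-8859-8-i'), "\n"; // ISO-8859-8
echo normalize_iso_charset('iso8859-8-I'), "\n";  // ISO-8859-8
echo normalize_iso_charset('ISO-8859-1'), "\n";   // ISO-8859-1
```

Names without the suffix, such as `ISO-8859-1`, pass through with only the `ISO-` prefix normalized, so existing behavior for those charsets should be unaffected.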
parent f4d5adde
@@ -29,7 +29,8 @@ class Charset {
 // ks_c_5601-1987: Korean alias for cp949
 case preg_match('`^ks_c_5601-1987`i', $charset):
 return 'cp949';
-case preg_match('`^iso-?(\S+)$`i', $charset, $match):
+// Remove trailing junk from ISO charset
+case preg_match('`^iso-?(\S+[^i])(-i)?$`i', $charset, $match):
 return "ISO-".$match[1];
 // GBK superceded gb2312 and is backward compatible
 case preg_match('`^gb2312`i', $charset):