Sunday, October 14, 2007

phpBB3 & UTF-8 BOM/Signature

Update: (12-6-2006) Another post about UTF-8 BOM/Signature

When phpBB.com says language files need to be saved as UTF-8 with no BOM, they have a good reason. The same goes for other PHP files.

Apparently PHP treats the BOM at the start of a UTF-8 encoded PHP file as data which should be sent immediately, instead of throwing it away. The three BOM bytes sit before the opening <?php tag, and PHP sends anything outside its tags as output.
This prevents PHP from sending headers via the header() function, and I'm guessing it can cause some havoc with cookies as well.
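
To check whether a file is affected, it is enough to look at its first three bytes: the UTF-8 BOM is always EF BB BF. Here is a minimal Python sketch of that check. The function names are my own, made up for illustration, and are not part of phpBB or PHP.

```python
import codecs


def starts_with_utf8_bom(data: bytes) -> bool:
    """Return True if the byte string begins with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def file_has_utf8_bom(path) -> bool:
    """Read only the first three bytes of a file and test them for the BOM."""
    with open(path, "rb") as handle:  # binary mode, so no decoding happens
        return starts_with_utf8_bom(handle.read(3))
```

Calling file_has_utf8_bom() on each .php file in a board's directory should point out the files that will trigger the "headers already sent" message.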

A telltale sign of this is PHP error messages that look similar to the following.

With phpBB3, the error message will look something like this:
[phpBB Debug] PHP Notice: in file /includes/functions.php on line 3372: Cannot modify header information - headers already sent by (output started at /geoip_install.php:1)


PHP files in general will have a similar error message with slightly different wording. Note that the line number after the filename is "1": the output started on the very first line of the file, which is where the BOM sits.
Warning: Cannot modify header information - headers already sent by (output started at /var/www-partition/phpBB3/geoip_install.php:1) in /var/www-partition/phpBB3/includes/functions.php on line 3368


Now, in case the wording of your editor's encoding options confuses you, with choices along the lines of Unicode BOM and UTF-8 Signature: it is UTF-8 Signature that should be considered UTF-8 BOM.
I'm not sure how many editors call it a Signature instead of a BOM, but I know Notepad2 does.
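
If your editor offers no way to save without the BOM, the three bytes can be stripped from the file directly. Below is a rough Python sketch of the idea, assuming the file is small enough to read into memory. The function names are hypothetical, and you should back up your files before running anything like this.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 BOM; other input is unchanged."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_in_place(path) -> bool:
    """Rewrite the file without its BOM. Returns True if the file was changed."""
    with open(path, "rb") as handle:
        original = handle.read()
    cleaned = strip_utf8_bom(original)
    if cleaned == original:
        return False  # no BOM found, so leave the file untouched
    with open(path, "wb") as handle:
        handle.write(cleaned)
    return True
```

Only a BOM at the very start of the file is removed, so the rest of the UTF-8 content stays byte for byte the same.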

1 comment:

Joe said...

There's a basic Perl script for removing the UTF-8 BOM or UTF-8 Signature in this forum thread.