Some applications insert a signature or Byte Order Mark (BOM) at the beginning of UTF-8 text. For example, Notepad always adds a BOM when saving as UTF-8.

Older text editors or browsers will display the BOM as a blank line on-screen; others will display unexpected characters, such as ï»¿. This may also occur in the latest browsers if a file that starts with a BOM is included in another file by PHP.
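
The unexpected characters appear because the UTF-8 BOM is the three bytes EF BB BF, and software that reads them as a single-byte encoding shows one character per byte. The following is a minimal Perl sketch of that effect, assuming the core Encode module and Windows-1252 as the wrongly assumed encoding (other single-byte encodings may show different characters).

```perl
use strict;
use warnings;
use Encode qw(decode);

# The UTF-8 BOM is the three bytes EF BB BF.
my $bom_bytes = "\xEF\xBB\xBF";

# Decoded as UTF-8, the three bytes form the single character U+FEFF.
my $as_utf8 = decode('UTF-8', $bom_bytes);

# Decoded as Windows-1252 (an assumption for illustration), each byte
# becomes its own visible character.
my $as_cp1252 = decode('cp1252', $bom_bytes);

binmode STDOUT, ':encoding(UTF-8)';
printf "As UTF-8: U+%04X (%d character)\n", ord($as_utf8), length($as_utf8);
printf "As Windows-1252: %s (%d characters)\n", $as_cp1252, length($as_cp1252);
```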

For more information, see the article "Unexpected characters or blank lines" and the test pages and results on the W3C site.

If you have problems that you think might be related to this, the following may help.

Checking for the BOM

I created a small utility that checks for a BOM at the beginning of a file. Just type in the URI for the file and it will take a look. (Note: if you think a file included by PHP is causing the problem, type in the URI of the included file.)
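
If the file is on your own machine, the same check can be done locally. This is only a rough sketch, not the utility described above: it assumes local file paths rather than URIs, and has_utf8_bom and checkbom.pl are hypothetical names chosen for illustration. It reads the first three bytes in raw mode and compares them with EF BB BF.

```perl
use strict;
use warnings;

# Return true if the given file starts with the UTF-8 BOM bytes EF BB BF.
# has_utf8_bom is a hypothetical helper name, not part of any library.
sub has_utf8_bom {
    my ($path) = @_;
    # ':raw' reads bytes as-is, with no newline or encoding translation.
    open(my $fh, '<:raw', $path) or die "Could not open $path: $!";
    my $bytes_read = read($fh, my $head, 3);
    close($fh);
    return defined $bytes_read && $bytes_read == 3 && $head eq "\xEF\xBB\xBF";
}

# Report on every file named on the command line.
for my $path (@ARGV) {
    print "$path: ", (has_utf8_bom($path) ? "BOM found" : "no BOM found"), "\n";
}
```

Saved as, say, checkbom.pl, it could be run as: perl checkbom.pl header.php footer.php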

Removing the BOM

If there is a BOM, you will probably want to remove it. One way would be to save the file using a BOM-aware editor that allows you to specify that you don’t want a BOM at the start of the file. For example, if Dreamweaver detects a BOM, the Save As dialogue box will have a check mark alongside the text “Include Unicode Signature (BOM)”. Just uncheck the box and save.

Another way would be to run a script on your file. Here is a simple Perl script that checks for a BOM and removes it if it exists (developed by Martin Dürst and tweaked a little by me).

# program to remove a leading UTF-8 BOM from a file
# works both STDIN -> STDOUT and in place (with filename as argument)

if ($#ARGV > 0) {   # more than one argument given
    print STDERR "Too many arguments!\n";
    exit 1;         # non-zero status signals the error to the caller
    }

my @file;   # file content
my $lineno = 0;

my $filename = $ARGV[0];   # scalar element, so $ARGV[0] rather than @ARGV[0]
if (defined $filename) {
    open( BOMFILE, '<', $filename ) || die "Could not open source file for reading: $!";
    while (<BOMFILE>) {
        if ($lineno++ == 0) {
            # s/// returns true only if the three BOM bytes (EF BB BF) were stripped
            if ( s/^\xEF\xBB\xBF// ) {
                print "BOM found and removed.\n";
                }
            else { print "No BOM found.\n"; }
            }
        push @file, $_ ;
        }
    close (BOMFILE)  || die "Can't close source file after reading.";

    open (NOBOMFILE, '>', $filename) || die "Could not open source file for writing: $!";
    foreach $line (@file) {
        print NOBOMFILE $line;
        }
    close (NOBOMFILE)  || die "Can't close source file after writing.";
    }
else {  # STDIN -> STDOUT
    while (<>) {
        if (!$lineno++) {
            s/^\xEF\xBB\xBF//;
            }
        push @file, $_ ;
        }

    foreach $line (@file) {
        print $line;
        }
    }