Perl Unicode Cookbook: Specify a File's Encoding

℞ 19: Open file with specific encoding

While setting a default Unicode encoding for IO is sensible, sometimes the default is not the right encoding for a particular file. In that case, specify the encoding for the filehandle explicitly, either in the mode argument to open or with the binmode operator. Perl’s IO layers will then handle encoding and decoding for you. This is the normal way to deal with encoded text; calling low-level functions such as those in the Encode module by hand is rarely necessary.

To specify the encoding of a filehandle opened for input:

     open(my $in_file, "< :encoding(UTF-16)", "wintext")
         or die "Cannot open wintext: $!";
     # OR
     open(my $in_file, "<", "wintext")
         or die "Cannot open wintext: $!";
     binmode($in_file, ":encoding(UTF-16)");

     # ...
     my $line = <$in_file>;

To specify the encoding of a filehandle opened for output:

     open(my $out_file, "> :encoding(cp1252)", "wintext")
         or die "Cannot open wintext: $!";
     # OR
     open(my $out_file, ">", "wintext")
         or die "Cannot open wintext: $!";
     binmode($out_file, ":encoding(cp1252)");

     # ...
     print $out_file "some text\n";
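
To see what the encoding layer actually does, here is a minimal round-trip sketch. It assumes a scratch file created with File::Temp (the path and the sample string are illustrative, not part of the recipe). It writes characters through a cp1252 layer, inspects the raw bytes on disk, and reads them back through the same layer.

```perl
use strict;
use warnings;
use utf8;                          # this source contains literal non-ASCII characters
use File::Temp qw(tempfile);

# Hypothetical scratch file; any writable path would do.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close($tmp_fh);

my $text = "café €5";              # 7 characters, no newline

# Write through the encoding layer: characters in, cp1252 bytes out.
open(my $out, ">:encoding(cp1252)", $path) or die "Cannot write $path: $!";
print {$out} $text;
close($out) or die "Cannot close $path: $!";

# Read without any decoding to see what landed on disk.
open(my $raw, "<:raw", $path) or die "Cannot read $path: $!";
my $bytes = do { local $/; <$raw> };
close($raw);

# Read again through the same layer: cp1252 bytes in, characters out.
open(my $in, "<:encoding(cp1252)", $path) or die "Cannot read $path: $!";
my $decoded = do { local $/; <$in> };
close($in);

# Show each byte on disk as two hex digits.
print join(" ", map { sprintf "%02X", ord($_) } split //, $bytes), "\n";
# prints: 63 61 66 E9 20 80 35
print "round trip ok\n" if $decoded eq $text;
```

In cp1252 both é (0xE9) and € (0x80) occupy a single byte, so the file holds seven bytes for seven characters. The program never calls an encoding function itself; the layers on the filehandles do the work.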

More layers than just the encoding can be specified here. For example, the incantation ":raw :encoding(UTF-16LE) :crlf" handles UTF-16LE text and adds implicit CRLF handling on top of the encoding. See the PerlIO documentation for more details.
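
As a sketch of how such a layer stack behaves, the following writes one line through ":raw:encoding(UTF-16LE):crlf" and then dumps the resulting bytes. The scratch file from File::Temp and the sample text are assumptions made for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file; any writable path would do.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close($tmp_fh);

# :raw clears any platform default layers, :encoding converts characters
# to UTF-16LE, and :crlf on top translates "\n" to "\r\n" before encoding.
my $layers = ":raw:encoding(UTF-16LE):crlf";

open(my $out, ">$layers", $path) or die "Cannot write $path: $!";
print {$out} "hi\n";
close($out) or die "Cannot close $path: $!";

# Inspect the bytes on disk without any decoding.
open(my $raw, "<:raw", $path) or die "Cannot read $path: $!";
my $bytes = do { local $/; <$raw> };
close($raw);

print join(" ", map { sprintf "%02X", ord($_) } split //, $bytes), "\n";
# prints: 68 00 69 00 0D 00 0A 00

# Reading through the same stack turns CRLF back into a single "\n".
open(my $in, "<$layers", $path) or die "Cannot read $path: $!";
my $line = <$in>;
close($in);
```

The order matters: because :crlf sits above the encoding layer, the carriage return is added as a character and then encoded as two bytes like everything else, instead of being injected as a lone byte into the UTF-16 stream.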

