
######
Encode
######


****
NAME
****


Kernel::System::Encode - character encodings


***********
DESCRIPTION
***********


This module uses Perl's core Encode module (Perl 5.8.0 or higher is required).


****************
PUBLIC INTERFACE
****************


new()
=====


Don't use the constructor directly; use the ObjectManager instead:


.. code-block:: perl

     my $EncodeObject = $Kernel::OM->Get('Kernel::System::Encode');



Convert()
=========


Convert a string from one charset to another charset.


.. code-block:: perl

     my $utf8 = $EncodeObject->Convert(
         Text => $iso_8859_1_string,
         From => 'iso-8859-1',
         To   => 'utf-8',
     );
 
     my $iso_8859_1 = $EncodeObject->Convert(
         Text => $utf8_string,
         From => 'utf-8',
         To   => 'iso-8859-1',
     );


Pass ``Force => 1`` to force conversion of a string that has already been
converted, and ``Check => 1`` to have the result checked for validity
(e.g. that it is a valid utf-8 string).
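
To illustrate what a charset conversion involves, the following minimal sketch
uses only Perl's core Encode module. The helper ``convert_bytes`` is
hypothetical and is not the implementation of ``Convert()``; it merely shows the
decode-then-encode round trip under the assumption that the input is a byte
string in the ``From`` charset.

.. code-block:: perl

     use strict;
     use warnings;
     use Encode ();

     # Hypothetical helper, not part of Kernel::System::Encode.
     sub convert_bytes {
         my (%Param) = @_;

         # Decode the source bytes into Perl characters, then encode to the target.
         my $Characters = Encode::decode( $Param{From}, $Param{Text} );
         return Encode::encode( $Param{To}, $Characters );
     }

     # "e acute" is the single byte 0xE9 in iso-8859-1 and 0xC3 0xA9 in utf-8.
     my $Latin1 = "caf\xE9";
     my $UTF8   = convert_bytes( Text => $Latin1, From => 'iso-8859-1', To => 'utf-8' );

     printf "%d bytes in, %d bytes out\n", length($Latin1), length($UTF8);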


Convert2CharsetInternal()
=========================


Converts text from the given charset into the internally used charset (utf-8).
Should be used on all I/O interfaces.


.. code-block:: perl

     my $String = $EncodeObject->Convert2CharsetInternal(
         Text => $String,
         From => $SourceCharset,
     );



EncodeInput()
=============


By default, this function assumes incoming bytes to be well-formed UTF-8 and
sets the utf8 flag to make Perl treat them as such. It should be used on
all I/O interfaces if and only if the data is already utf-8 (see the warning
for ``_utf8_on`` in the Encode documentation). It modifies the scalar or array
referenced by its ``$What`` parameter in place!


.. code-block:: perl

     $EncodeObject->EncodeInput( \$String );
     $EncodeObject->EncodeInput( \@Array );


If there is a possibility that strings are not UTF-8, simply setting the
UTF-8 flag will probably lead to crashes down the road. In this case, set the
``$Safe`` argument to a true value to make the function use ``Encode::decode``.
This is a bit slower and will produce mojibake if the input is already
*decoded* UTF-8, but it will always yield *safe* results.

There are four possible values for \ ``$Safe``\ :


- ``undef``: Backwards-compatible behavior. No safety measures are applied; the UTF-8 flag is simply turned on.



- ``1``: Decodes UTF-8 and replaces malformed sequences with an escape code containing the hex byte values, e.g. ``\x{0d}``.



- A coderef: passed to ``Encode::decode`` so that you can format your own replacement codes.



- Anything else: interpreted as the name of an alternative charset that is tried in case UTF-8 decoding fails, falling back to the
  ``\x{XX}`` escapes as a last resort.
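
The fallback chain described above can be sketched with the core Encode module
alone. The helper ``decode_safely`` is hypothetical and only approximates the
documented behaviour; in particular, the exact escape format produced by
``EncodeInput()`` may differ from what Encode's ``FB_PERLQQ`` emits.

.. code-block:: perl

     use strict;
     use warnings;
     use Encode ();

     # Hypothetical helper illustrating the "safe" strategy; not the module's code.
     sub decode_safely {
         my ( $Bytes, $FallbackCharset ) = @_;

         # First try strict UTF-8; LEAVE_SRC keeps $Bytes intact if decoding dies.
         my $Text = eval {
             Encode::decode( 'UTF-8', $Bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC() );
         };
         return $Text if defined $Text;

         # Next, try the alternative charset if one was given.
         if ( defined $FallbackCharset ) {
             $Text = eval {
                 Encode::decode( $FallbackCharset, $Bytes, Encode::FB_CROAK() | Encode::LEAVE_SRC() );
             };
             return $Text if defined $Text;
         }

         # Last resort: keep the valid parts and escape malformed bytes.
         return Encode::decode( 'UTF-8', $Bytes, Encode::FB_PERLQQ() );
     }

     my $Valid   = decode_safely("caf\xC3\xA9");                 # well-formed UTF-8
     my $Latin1  = decode_safely( "caf\xE9!", 'iso-8859-1' );    # uses the fallback charset
     my $Escaped = decode_safely("caf\xE9!");                    # malformed byte is escaped

Unlike merely switching on the UTF-8 flag, every path in this sketch returns a
string that Perl can process safely, at the cost of an actual decoding pass.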




EncodeOutput()
==============


Converts utf-8 characters to a sequence of bytes. All possible characters have
a UTF-8 representation, so this function cannot fail.

This should be used for the output of utf-8 characters.


.. code-block:: perl

     $EncodeObject->EncodeOutput( \$String );
 
     $EncodeObject->EncodeOutput( \@Array );
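
As an illustration of what "characters to bytes" means here, the sketch below
uses the core ``Encode::encode`` function directly. It does not call
``EncodeOutput()`` and is not its implementation; it only shows how one
character can become several bytes.

.. code-block:: perl

     use strict;
     use warnings;
     use Encode ();

     # One character: U+20AC EURO SIGN.
     my $Characters = "\x{20AC}";

     # Encoding yields the three bytes 0xE2 0x82 0xAC.
     my $Bytes = Encode::encode( 'UTF-8', $Characters );

     printf "%d character, %d bytes\n", length($Characters), length($Bytes);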



ConfigureOutputFileHandle()
===========================


Switches the output file handle to utf-8 output.


.. code-block:: perl

     $EncodeObject->ConfigureOutputFileHandle( FileHandle => \*STDOUT );
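
A comparable effect can be demonstrated in plain Perl with ``binmode``. This is
a sketch under the assumption that an encoding layer on the handle is what
"utf-8 output" amounts to; the in-memory handle and the variable names are
illustrative only and stand in for ``STDOUT``.

.. code-block:: perl

     use strict;
     use warnings;

     # An in-memory file handle stands in for STDOUT so the result can be inspected.
     my $Buffer = '';
     open my $FileHandle, '>', \$Buffer or die "Cannot open in-memory handle: $!";

     # Assumption: an encoding layer like this makes the handle emit utf-8 bytes.
     binmode $FileHandle, ':encoding(UTF-8)';

     # With the layer in place, printing a wide character raises no warning.
     print {$FileHandle} "\x{20AC}\n";
     close $FileHandle;

     printf "%d bytes written\n", length($Buffer);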



EncodingIsAsciiSuperset()
=========================


Checks if an encoding is a super-set of ASCII, that is, encodes the
codepoints from 0 to 127 the same way as ASCII.


.. code-block:: perl

     my $IsSuperset = $EncodeObject->EncodingIsAsciiSuperset(
         Encoding    => 'UTF-8',
     );
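
One plausible way to perform such a check is to encode all 128 ASCII code
points and compare the result with plain ASCII. The helper
``is_ascii_superset`` below is hypothetical and is offered as a sketch, not as
the module's actual implementation.

.. code-block:: perl

     use strict;
     use warnings;
     use Encode ();

     # Hypothetical helper; one plausible way to implement such a check.
     sub is_ascii_superset {
         my ($Encoding) = @_;

         # All 128 ASCII code points as one string.
         my $Test = join '', map { chr } 0 .. 127;

         # Unknown encoding names make Encode::encode die; treat that as "no".
         my $Encoded = eval { Encode::encode( $Encoding, $Test ) };
         return 0 if !defined $Encoded;

         return $Encoded eq Encode::encode( 'ASCII', $Test ) ? 1 : 0;
     }

     for my $Encoding ( 'UTF-8', 'iso-8859-1', 'UTF-16LE' ) {
         printf "%-10s %s\n", $Encoding, is_ascii_superset($Encoding) ? 'yes' : 'no';
     }

UTF-16LE fails the comparison because it encodes every ASCII character as two
bytes.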



FindAsciiSupersetEncoding()
===========================


From a list of character encodings, returns the first that
is a super-set of ASCII. If none matches, \ ``ASCII``\  is returned.


.. code-block:: perl

     my $Encoding = $EncodeObject->FindAsciiSupersetEncoding(
         Encodings   => [ 'UTF-16LE', 'UTF-8' ],
     );
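
The selection logic, including the ``ASCII`` default, might look like the
following sketch. ``find_ascii_superset`` is a hypothetical stand-alone function
written against the core Encode module; it mirrors the documented behaviour but
is not taken from the module's source.

.. code-block:: perl

     use strict;
     use warnings;
     use Encode ();

     # Hypothetical helper mirroring the documented behaviour.
     sub find_ascii_superset {
         my (@Encodings) = @_;

         my $Test  = join '', map { chr } 0 .. 127;
         my $ASCII = Encode::encode( 'ASCII', $Test );

         for my $Encoding (@Encodings) {

             # Skip names Encode does not know instead of dying.
             my $Encoded = eval { Encode::encode( $Encoding, $Test ) };
             return $Encoding if defined $Encoded && $Encoded eq $ASCII;
         }

         # No candidate qualified, so fall back to the default.
         return 'ASCII';
     }

     print find_ascii_superset( 'UTF-16LE', 'UTF-8' ), "\n";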



RemoveUTF8BOM()
===============


Removes the UTF-8 BOM (if present) from the start of the given string.


.. code-block:: perl

     my $StringWithoutBOM = $EncodeObject->RemoveUTF8BOM(
         String => '<BOM>....',
     );
 
Returns the given string without the BOM.
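
For reference, the UTF-8 BOM is the byte sequence ``0xEF 0xBB 0xBF``. The sketch
below strips it with a regular expression; ``strip_utf8_bom`` is a hypothetical
helper that assumes the string still holds undecoded bytes, and it is not the
module's implementation.

.. code-block:: perl

     use strict;
     use warnings;

     # Hypothetical helper; assumes the string holds undecoded UTF-8 bytes.
     sub strip_utf8_bom {
         my ($String) = @_;

         # Remove the three BOM bytes only when they open the string.
         $String =~ s/\A\xEF\xBB\xBF//;
         return $String;
     }

     my $WithBOM    = "\xEF\xBB\xBFHello";
     my $WithoutBOM = strip_utf8_bom($WithBOM);

     printf "%d -> %d bytes\n", length($WithBOM), length($WithoutBOM);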





