鸟语天空

Encoding and Decoding Classes in C#

Posted by: 追风剑情, 2025-1-10 20:04

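The Encoding Class

The Encoding class is in the System.Text namespace and is mainly used to convert between different encodings and Unicode. The table below lists its commonly used properties and methods.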

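Common properties and methods of the Encoding class
Category	Name	Description
Property	Default	Gets the encoding for the system's current ANSI code page (on .NET Core and later this property always returns UTF-8)
	Unicode	Gets the encoding for the UTF-16 format with little-endian byte order
	UTF8	Gets the encoding for the UTF-8 format
	UTF32	Gets the encoding for the UTF-32 format with little-endian byte order
	ASCII	Gets the encoding for the ASCII (7-bit) character set
Method	Convert	Converts a byte array from one encoding to another
	GetBytes	Encodes a set of characters into a sequence of bytes
	GetString	Decodes a sequence of bytes into a string
	GetEncoder	Gets an encoder that converts a sequence of Unicode characters into an encoded sequence of bytes
	GetDecoder	Gets a decoder that converts an encoded sequence of bytes into a sequence of characters
	GetEncoding	Returns the encoding for the specified code page or name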
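
As a small illustration of the members in the table, the following sketch encodes the same two-character string with several of the built-in encodings and prints the resulting bytes. The class name EncodingBasics and the helper ToHex are made up for this example; only the Encoding members come from the table.

```csharp
using System;
using System.Text;

public static class EncodingBasics
{
    // Hypothetical helper: encodes text with the given encoding
    // and returns the bytes as hex pairs, e.g. "41-42".
    public static string ToHex(string text, Encoding encoding)
    {
        return BitConverter.ToString(encoding.GetBytes(text));
    }

    public static void Main()
    {
        string text = "A\u03a0"; // 'A' followed by the Greek capital letter pi

        Console.WriteLine(ToHex(text, Encoding.UTF8));    // 41-CE-A0
        Console.WriteLine(ToHex(text, Encoding.Unicode)); // 41-00-A0-03
        Console.WriteLine(ToHex(text, Encoding.UTF32));   // 41-00-00-00-A0-03-00-00
        Console.WriteLine(ToHex(text, Encoding.ASCII));   // 41-3F ('?' replaces pi)

        // GetEncoding looks an encoding up by name.
        Encoding byName = Encoding.GetEncoding("utf-8");
        Console.WriteLine(byName.GetString(byName.GetBytes(text)) == text); // True
    }
}
```

The ASCII line shows why the choice of encoding matters: a character outside the 7-bit range cannot be represented and is replaced with a question mark.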

The static Convert method of the Encoding class converts a byte array from one encoding to another. Its prototype is:

public static byte[] Convert(Encoding srcEncoding, Encoding dstEncoding, byte[] bytes)

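The parameters have the following meanings.
srcEncoding: the encoding of the source bytes.
dstEncoding: the target encoding.
bytes: the byte array to convert.
The return value is a Byte array containing the result of the conversion.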

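To convert a Unicode (UTF-16) string to UTF-8, you can follow the steps below. Note that a .NET string is always stored as UTF-16 internally, so only the byte arrays differ; the string obtained in step (4) is equal to the original string.

(1) Use the UTF8 and Unicode properties of Encoding to obtain a UTF-8 encoding instance utf8 and a UTF-16 encoding instance unicode, for example: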

string unicodeString = "unicode字符串pi(\u03a0)";
Encoding unicode = Encoding.Unicode;
Encoding utf8 = Encoding.UTF8;

(2) Use the GetBytes method of the unicode instance to encode the string into a UTF-16 byte array:

byte[] unicodeBytes = unicode.GetBytes(unicodeString);

(3) Use the Convert method of Encoding to convert the UTF-16 byte array into a UTF-8 byte array:

byte[] utf8Bytes = Encoding.Convert(Encoding.Unicode, Encoding.UTF8, unicodeBytes);

(4) Finally, use the GetString method of the utf8 instance to decode the UTF-8 byte array back into a string:

string utf8String = utf8.GetString(utf8Bytes);
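
The four steps can be combined into one small program. This is only a sketch: the class name ConvertDemo and the helper Utf16ToUtf8 are invented for illustration, and a shorter test string is used so the byte counts are easy to check by hand.

```csharp
using System;
using System.Text;

public static class ConvertDemo
{
    // Hypothetical helper: re-encodes UTF-16 (little-endian) bytes as UTF-8 bytes.
    public static byte[] Utf16ToUtf8(byte[] utf16Bytes)
    {
        return Encoding.Convert(Encoding.Unicode, Encoding.UTF8, utf16Bytes);
    }

    public static void Main()
    {
        string unicodeString = "pi(\u03a0)"; // five characters: p, i, (, pi, )
        byte[] unicodeBytes = Encoding.Unicode.GetBytes(unicodeString);
        byte[] utf8Bytes = Utf16ToUtf8(unicodeBytes);
        string utf8String = Encoding.UTF8.GetString(utf8Bytes);

        Console.WriteLine(unicodeBytes.Length);         // 10: two bytes per character
        Console.WriteLine(utf8Bytes.Length);            // 6: four ASCII bytes plus two for pi
        Console.WriteLine(utf8String == unicodeString); // True
    }
}
```

The last line confirms that the conversion changes the byte representation only; the decoded string is identical to the original.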

The Encoder and Decoder Classes

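In network transfers and file operations, large data usually has to be split into smaller blocks. When a character straddles a block boundary, calling the GetBytes method of the Encoding class on each block separately is awkward to get right, because that method keeps no state between calls. Encoder and Decoder, in contrast, remember the trailing data of each block, so they can correctly encode and decode character sequences that span blocks. This makes them very useful for network transfers and file operations.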

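The Encoder and Decoder classes are in the System.Text namespace. An Encoder converts a set of characters into a sequence of bytes, and a Decoder decodes an encoded sequence of bytes into a sequence of characters. Encoding with an Encoder involves the following steps: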

(1) Obtain an Encoder instance. Because the constructor of Encoder is protected, you cannot create an instance directly; you must obtain one through the GetEncoder method of an Encoding, for example:

// Get an Encoder instance for the ASCII encoding
Encoder asciiEncoder = Encoding.ASCII.GetEncoder();
// Get an Encoder instance for the Unicode (UTF-16) encoding
Encoder unicodeEncoder = Encoding.Unicode.GetEncoder();

(2) The GetBytes method. Once you have an Encoder instance, use its GetBytes method to encode a set of characters into a sequence of bytes.
Method prototype:

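public abstract int GetBytes(
   char[] chars,   // character array containing the characters to encode
   int charIndex,  // index of the first character to encode
   int charCount,  // number of characters to encode
   byte[] bytes,   // byte array that receives the encoded bytes
   int byteIndex,  // index at which to start writing the resulting bytes
   bool flush      // whether to clear the encoder's internal state after the conversion
)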

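The method stores the encoded bytes in the bytes parameter and returns the number of bytes actually written to it. If flush is false, the encoder keeps any trailing data that cannot be encoded yet (for example, a high surrogate at the end of the block) in its internal state so that the next call can complete it. Pass true for the last block.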
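
The effect of the flush parameter can be seen with a character that needs two UTF-16 code units. The sketch below, with the made-up names EncoderFlushDemo and EncodeSplitPair, feeds the two halves of a surrogate pair to the same UTF-8 Encoder in separate calls, as would happen if a block boundary fell between them.

```csharp
using System;
using System.Text;

public static class EncoderFlushDemo
{
    // Hypothetical helper: encodes U+1F600 one UTF-16 code unit at a time.
    public static byte[] EncodeSplitPair()
    {
        char[] chars = "\uD83D\uDE00".ToCharArray(); // high surrogate, low surrogate
        Encoder encoder = Encoding.UTF8.GetEncoder();
        byte[] bytes = new byte[4];

        // First block ends with a lone high surrogate; flush is false,
        // so the encoder remembers it instead of emitting bytes.
        int first = encoder.GetBytes(chars, 0, 1, bytes, 0, false);

        // Second block supplies the low surrogate; flush is true
        // because this is the last block.
        int second = encoder.GetBytes(chars, 1, 1, bytes, first, true);

        Console.WriteLine("{0} then {1} bytes written", first, second);
        return bytes;
    }

    public static void Main()
    {
        Console.WriteLine(BitConverter.ToString(EncodeSplitPair())); // F0-9F-98-80
    }
}
```

A fresh encoder for each block would see only an unpaired surrogate each time and could not produce the correct four-byte sequence.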

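(3) The GetByteCount method. It calculates the exact number of bytes produced by encoding a sequence of characters, so that the byte array passed to GetBytes can be sized correctly.
Method prototype: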

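public abstract int GetByteCount(
   char[] chars,   // character array containing the characters to encode
   int index,      // index of the first character to encode
   int count,      // number of characters to encode
   bool flush      // whether to simulate clearing the encoder's internal state after the calculation
)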
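
As a rough illustration of sizing a buffer, the sketch below compares the exact count from Encoder.GetByteCount with the worst-case bound from Encoding.GetMaxByteCount. The names BufferSizing and ExactByteCount are invented for this example.

```csharp
using System;
using System.Text;

public static class BufferSizing
{
    // Hypothetical helper: exact number of UTF-8 bytes needed for text,
    // treating it as the final block (flush = true).
    public static int ExactByteCount(string text)
    {
        char[] chars = text.ToCharArray();
        Encoder encoder = Encoding.UTF8.GetEncoder();
        return encoder.GetByteCount(chars, 0, chars.Length, true);
    }

    public static void Main()
    {
        string text = "A\u6d4b"; // 'A' (1 byte in UTF-8) and U+6D4B (3 bytes in UTF-8)
        int exact = ExactByteCount(text);
        int upperBound = Encoding.UTF8.GetMaxByteCount(text.Length);

        Console.WriteLine(exact);               // 4
        Console.WriteLine(upperBound >= exact); // True
    }
}
```

GetByteCount gives a tight allocation at the cost of scanning the characters; GetMaxByteCount is cheaper but may over-allocate, which can be the better trade-off for a reusable buffer.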

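Decoding with the Decoder class works as follows: first create a Decoder instance through the GetDecoder method of an Encoding, then use the instance's GetChars method to decode a sequence of bytes into a set of characters.

The GetChars method decodes a sequence of bytes into a set of characters and stores them starting at the specified index.
Method prototype: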

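public abstract int GetChars(
   byte[] bytes,   // byte array containing the bytes to decode
   int byteIndex,  // index of the first byte to decode
   int byteCount,  // number of bytes to decode
   char[] chars,   // character array that receives the decoded characters
   int charIndex   // index in chars at which to start writing the decoded characters
)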

The method returns the number of characters actually written to chars.
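
To show why a stateful Decoder helps with block-wise data, the sketch below decodes the same UTF-8 bytes twice: once with a single Decoder that carries incomplete byte sequences over to the next block, and once by calling Encoding.GetString on each block independently. The class and method names (ChunkedDecoding, DecodeInChunks, DecodeNaively) and the block size of 4 are assumptions made for this example.

```csharp
using System;
using System.Text;

public static class ChunkedDecoding
{
    // Decodes data in blocks of chunkSize bytes using one stateful Decoder.
    public static string DecodeInChunks(byte[] data, int chunkSize)
    {
        Decoder decoder = Encoding.UTF8.GetDecoder();
        var sb = new StringBuilder();
        char[] buffer = new char[Encoding.UTF8.GetMaxCharCount(chunkSize)];
        for (int offset = 0; offset < data.Length; offset += chunkSize)
        {
            int count = Math.Min(chunkSize, data.Length - offset);
            int written = decoder.GetChars(data, offset, count, buffer, 0);
            sb.Append(buffer, 0, written);
        }
        return sb.ToString();
    }

    // Decodes each block on its own; no state survives between blocks.
    public static string DecodeNaively(byte[] data, int chunkSize)
    {
        var sb = new StringBuilder();
        for (int offset = 0; offset < data.Length; offset += chunkSize)
        {
            int count = Math.Min(chunkSize, data.Length - offset);
            sb.Append(Encoding.UTF8.GetString(data, offset, count));
        }
        return sb.ToString();
    }

    public static void Main()
    {
        string text = "\u6d4b\u8bd5"; // two characters, three UTF-8 bytes each
        byte[] data = Encoding.UTF8.GetBytes(text);

        // With 4-byte blocks the second character is split across two blocks.
        Console.WriteLine(DecodeInChunks(data, 4) == text); // True
        Console.WriteLine(DecodeNaively(data, 4) == text);  // False
    }
}
```

The naive version produces replacement characters where the split occurred, because neither block contains a complete byte sequence for the second character.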

[Example] Encoding and decoding with the Encoder and Decoder classes.

using System;
using System.Text;

public class Program
{
    static void Main(string[] args)
    {
        // Encode
        string test = "ABCED1234测试";
        Console.WriteLine("The test string is {0}", test);
        Encoding encoding = Encoding.UTF8;
        char[] source = test.ToCharArray();
        // Reuse one Encoder for both the size calculation and the encoding.
        // flush is true because this is the only (and therefore last) block.
        Encoder encoder = encoding.GetEncoder();
        int len = encoder.GetByteCount(source, 0, source.Length, true);
        byte[] result = new byte[len];
        encoder.GetBytes(source, 0, source.Length, result, 0, true);
        Console.WriteLine("After encoding, the bytes of the test string are:");
        foreach (byte b in result)
        {
            Console.Write("{0:X}-", b);
        }
        Console.WriteLine();

        // Decode
        Console.Write("After decoding, the string is ");
        // Reuse one Decoder for both the size calculation and the decoding.
        Decoder decoder = encoding.GetDecoder();
        int deslen = decoder.GetCharCount(result, 0, result.Length);
        char[] des = new char[deslen];
        decoder.GetChars(result, 0, result.Length, des, 0);
        foreach (char c in des)
        {
            Console.Write("{0}", c);
        }
        Console.WriteLine("\n");
        Console.ReadKey();
    }
}

Program output (screenshot not preserved)
