c++ - How to convert between wide-character and multibyte strings on Windows?

coder 2024-06-07 (original post)

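I have a Windows application in which the string type is WCHAR*. I need to convert these strings to char* to pass them to a C API. I am using the MultiByteToWideChar and WideCharToMultiByte functions to perform the conversion.

For some reason, though, the conversion is not correct: I see a lot of garbage in the output. The following code is a modified version of what I found in this Stack Overflow answer.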

WCHAR* convert_to_wstring(const char* str)
{
    int size_needed = MultiByteToWideChar(CP_UTF8, 0, str, (int)strlen(str), NULL, 0);
    WCHAR* wstrTo = (WCHAR*)malloc(size_needed);
    MultiByteToWideChar(CP_UTF8, 0, &str[0], (int)strlen(str), wstrTo, size_needed);
    return wstrTo;
}

char* convert_from_wstring(const WCHAR* wstr)
{
    int size_needed = WideCharToMultiByte(CP_UTF8, 0, wstr, (int)wcslen(wstr), NULL, 0, NULL, NULL);
    char* strTo = (char*)malloc(size_needed);
    WideCharToMultiByte(CP_UTF8, 0, wstr, (int)wcslen(wstr), strTo, size_needed, NULL, NULL);
    return strTo;
}

int main()
{
    const WCHAR* wText = L"Wide string";
    const char* text = convert_from_wstring(wText);
    std::cout << text << "\n";
    std::cout << convert_to_wstring("Multibyte string") << "\n";
    return 0;
}

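Best answer

Your conversion functions are buggy.

The return value of MultiByteToWideChar() is a number of wide characters, not a number of bytes as you are currently treating it. You need to multiply the value by sizeof(WCHAR) when calling malloc().

You are also not taking into account that the return value does not include room for a null terminator, because you are not passing -1 in the cbMultiByte parameter. Read the MultiByteToWideChar() documentation: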

cbMultiByte [in]
Size, in bytes, of the string indicated by the lpMultiByteStr parameter. Alternatively, this parameter can be set to -1 if the string is null-terminated. Note that, if cbMultiByte is 0, the function fails.

If this parameter is -1, the function processes the entire input string, including the terminating null character. Therefore, the resulting Unicode string has a terminating null character, and the length returned by the function includes this character.

If this parameter is set to a positive integer, the function processes exactly the specified number of bytes. If the provided size does not include a terminating null character, the resulting Unicode string is not null-terminated, and the returned length does not include this character.

...

Return value

Returns the number of characters written to the buffer indicated by lpWideCharStr if successful. If the function succeeds and cchWideChar is 0, the return value is the required size, in characters, for the buffer indicated by lpWideCharStr.

You are not null-terminating the output string.
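The two sizing points above can be sketched as follows. This is a minimal illustration, not code from the answer: the helper names `wide_buffer_bytes` and `alloc_wide` are hypothetical. It only shows how a count of wide characters, such as the one MultiByteToWideChar() returns, maps to the byte count that malloc()-style allocators expect, with one extra slot for the null terminator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: bytes needed for num_chars wide characters
// plus one extra slot for the null terminator.
std::size_t wide_buffer_bytes(int num_chars)
{
    assert(num_chars >= 0);
    return (static_cast<std::size_t>(num_chars) + 1) * sizeof(wchar_t);
}

// Hypothetical helper: allocates a zero-filled wide buffer with room for
// num_chars characters and a terminator. The caller releases it with free().
wchar_t* alloc_wide(int num_chars)
{
    return static_cast<wchar_t*>(std::calloc(1, wide_buffer_bytes(num_chars)));
}
```

Because the buffer is zero-filled, it stays null-terminated even if a later conversion writes fewer characters than expected.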

The same is true of your convert_from_wstring() function. Read the WideCharToMultiByte() documentation:

cchWideChar [in]
Size, in characters, of the string indicated by lpWideCharStr. Alternatively, this parameter can be set to -1 if the string is null-terminated. If cchWideChar is set to 0, the function fails.

If this parameter is -1, the function processes the entire input string, including the terminating null character. Therefore, the resulting character string has a terminating null character, and the length returned by the function includes this character.

If this parameter is set to a positive integer, the function processes exactly the specified number of characters. If the provided size does not include a terminating null character, the resulting character string is not null-terminated, and the returned length does not include this character.

...

Return value

Returns the number of bytes written to the buffer pointed to by lpMultiByteStr if successful. If the function succeeds and cbMultiByte is 0, the return value is the required size, in bytes, for the buffer indicated by lpMultiByteStr.
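As the documentation excerpts describe, passing -1 makes the function process the terminator too, so the returned length already includes it. A minimal sketch of that variant follows, assuming a null-terminated UTF-8 input. The helper name `utf8_to_wide` is hypothetical, and the `_WIN32` guard is only there so the file still compiles on other platforms.

```cpp
#ifdef _WIN32
#include <windows.h>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: converts a null-terminated UTF-8 string to UTF-16.
std::wstring utf8_to_wide(const char* str)
{
    assert(str != nullptr);

    // -1 tells the API that the input is null-terminated, so the
    // returned count includes the terminating null character.
    int num_chars = MultiByteToWideChar(CP_UTF8, 0, str, -1, NULL, 0);
    if (num_chars <= 0)
        return std::wstring(); // conversion failed

    std::wstring result(static_cast<std::size_t>(num_chars), L'\0');
    int written = MultiByteToWideChar(CP_UTF8, 0, str, -1, &result[0], num_chars);
    if (written <= 0)
        return std::wstring(); // conversion failed

    // Drop the terminator that was counted; std::wstring keeps its own.
    result.resize(static_cast<std::size_t>(written) - 1);
    return result;
}
#endif // _WIN32
```

The same pattern should apply to WideCharToMultiByte() with -1 in cchWideChar, where the returned count is in bytes.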

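That being said, your main() code is leaking the allocated strings. Since they are allocated with malloc(), you need to deallocate them with free() when you are done using them.

Also, you cannot usefully pass a WCHAR* string to std::cout. Well, you can, but it has no operator<< for wide-string input. It does have an operator<< for void* input, so it ends up printing only the memory address that the WCHAR* points to, not the actual characters. If you want to output a wide string, use std::wcout instead.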
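If remembering to call free() by hand is error-prone, one possible alternative (not something the answer itself proposes) is to hand the malloc()-allocated pointer to a std::unique_ptr whose deleter calls free(). The sketch below is a minimal illustration: `FreeDeleter`, `MallocedString`, `own`, and `make_malloced_copy` are all hypothetical names, and `make_malloced_copy` is only a stand-in for a converter that returns malloc()-allocated memory.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <memory>

// Deleter that releases malloc()-allocated memory with free().
struct FreeDeleter
{
    void operator()(void* p) const { std::free(p); }
};

using MallocedString = std::unique_ptr<char, FreeDeleter>;

// Hypothetical stand-in for a converter that returns a malloc()-allocated,
// null-terminated string; here it simply copies its input.
char* make_malloced_copy(const char* src)
{
    assert(src != nullptr);
    std::size_t len = std::strlen(src);
    char* copy = static_cast<char*>(std::malloc(len + 1));
    if (copy)
        std::memcpy(copy, src, len + 1); // copies the terminator as well
    return copy;
}

// Takes ownership of the raw pointer; free() runs when the result is destroyed.
MallocedString own(char* raw)
{
    return MallocedString(raw);
}
```

A caller could then write `MallocedString text = own(make_malloced_copy("Wide string"));` and read the characters through `text.get()`, with no explicit free() call.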

Try something more like this:

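#include <windows.h>
#include <cstdlib>
#include <cstring>
#include <cwchar>
#include <iostream>

// Converts a null-terminated UTF-8 string to a malloc()-allocated wide string.
// The caller must free() the result.
WCHAR* convert_to_wstring(const char* str)
{
    int str_len = (int) strlen(str);
    // The returned count is in wide characters, not bytes, and excludes the null terminator.
    int num_chars = MultiByteToWideChar(CP_UTF8, 0, str, str_len, NULL, 0);
    WCHAR* wstrTo = (WCHAR*) malloc((num_chars + 1) * sizeof(WCHAR));
    if (wstrTo)
    {
        MultiByteToWideChar(CP_UTF8, 0, str, str_len, wstrTo, num_chars);
        wstrTo[num_chars] = L'\0';
    }
    return wstrTo;
}

// Converts a null-terminated wide string to a malloc()-allocated UTF-8 string.
// The caller must free() the result.
CHAR* convert_from_wstring(const WCHAR* wstr)
{
    int wstr_len = (int) wcslen(wstr);
    // The returned count is in bytes and excludes the null terminator.
    int num_chars = WideCharToMultiByte(CP_UTF8, 0, wstr, wstr_len, NULL, 0, NULL, NULL);
    CHAR* strTo = (CHAR*) malloc((num_chars + 1) * sizeof(CHAR));
    if (strTo)
    {
        WideCharToMultiByte(CP_UTF8, 0, wstr, wstr_len, strTo, num_chars, NULL, NULL);
        strTo[num_chars] = '\0';
    }
    return strTo;
}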

int main()
{
    const WCHAR* wText = L"Wide string";
    // Non-const pointers: free() takes void*, so a const char* would not compile in C++.
    char* text = convert_from_wstring(wText);
    if (text)
    {
        std::cout << text << "\n";
        free(text);
    }

    WCHAR* wtext = convert_to_wstring("Multibyte string");
    if (wtext)
    {
        std::wcout << wtext << "\n";
        free(wtext);
    }

    return 0;
}

That being said, you really should use std::string and std::wstring instead of char* and wchar_t* for better memory management:

std::wstring convert_to_wstring(const std::string &str)
{
    // The API takes int lengths, so convert from size_t explicitly.
    int str_len = static_cast<int>(str.length());
    int num_chars = MultiByteToWideChar(CP_UTF8, 0, str.c_str(), str_len, NULL, 0);
    std::wstring wstrTo;
    if (num_chars > 0)
    {
        wstrTo.resize(num_chars);
        MultiByteToWideChar(CP_UTF8, 0, str.c_str(), str_len, &wstrTo[0], num_chars);
    }
    return wstrTo;
}

std::string convert_from_wstring(const std::wstring &wstr)
{
    // The API takes int lengths, so convert from size_t explicitly.
    int wstr_len = static_cast<int>(wstr.length());
    int num_chars = WideCharToMultiByte(CP_UTF8, 0, wstr.c_str(), wstr_len, NULL, 0, NULL, NULL);
    std::string strTo;
    if (num_chars > 0)
    {
        strTo.resize(num_chars);
        WideCharToMultiByte(CP_UTF8, 0, wstr.c_str(), wstr_len, &strTo[0], num_chars, NULL, NULL);
    }
    return strTo;
}

int main()
{
    const WCHAR* wText = L"Wide string";
    const std::string text = convert_from_wstring(wText);
    std::cout << text << "\n";

    const std::wstring wtext = convert_to_wstring("Multibyte string");
    std::wcout << wtext << "\n";

    return 0;
}

If you are using C++11 or later, have a look at the std::wstring_convert class for converting between UTF strings (note that it has been deprecated since C++17), for example:

#include <codecvt>
#include <locale>
#include <string>

std::wstring convert_to_wstring(const std::string &str)
{
    std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>, wchar_t> conv;
    return conv.from_bytes(str);
}

std::string convert_from_wstring(const std::wstring &wstr)
{
    std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>, wchar_t> conv;
    return conv.to_bytes(wstr);
}

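If you need to interact with other code that is based on char*/wchar_t*, std::string has a constructor that accepts char* input and a c_str() method that can be used for char* output. The same goes for std::wstring and wchar_t*.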
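A minimal sketch of that interop follows. The functions `c_api_name`, `c_api_length`, and `wide_api_length` are hypothetical stand-ins for pointer-based code; only the std::string and std::wstring constructors and c_str() calls are the point.

```cpp
#include <cassert>
#include <cstring>
#include <cwchar>
#include <string>

// Hypothetical C-style API that returns a null-terminated char* string.
const char* c_api_name()
{
    return "Multibyte string";
}

// Hypothetical C-style API that only accepts const char*.
std::size_t c_api_length(const char* text)
{
    assert(text != nullptr);
    return std::strlen(text);
}

// Hypothetical wide-character API that only accepts const wchar_t*.
std::size_t wide_api_length(const wchar_t* text)
{
    assert(text != nullptr);
    return std::wcslen(text);
}

// Builds a std::string from a char*, then hands it back via c_str().
std::size_t round_trip_length()
{
    std::string name(c_api_name());     // constructor accepts const char*
    return c_api_length(name.c_str());  // c_str() yields a null-terminated const char*
}
```

The wide case looks the same: `wide_api_length(std::wstring(L"Wide string").c_str())` passes a null-terminated `const wchar_t*` to the pointer-based function.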

Regarding "c++ - How to convert between wide-character and multibyte strings on Windows?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/42793735/
