
go - Getting the Unicode script name in Go

coder 2024-07-08 Original post

In Go, I want to get the RangeTable for the script of a given language.

import (
    "golang.org/x/text/language"
    "unicode"
)

...

script, _ := language.French.Script() // the confidence value is not needed here
scriptAsString := script.String() // here scriptAsString = "Latn"
rangeTable, ok := unicode.Scripts[scriptAsString]
// here ok = false, because the Scripts map has key "Latin" and not "Latn"

The problem is that the following function returns the script code:

func (s Script) String() string 

whereas all keys of the unicode.Scripts map use the script name.
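
The mismatch can be reproduced with the standard library alone. The following is a minimal sketch; `hasScriptKey` is a hypothetical helper introduced only for illustration, and the script code is passed as a plain string rather than obtained from golang.org/x/text/language.

```go
package main

import (
	"fmt"
	"unicode"
)

// hasScriptKey reports whether unicode.Scripts has an entry under key.
// It is a hypothetical helper for illustration only.
func hasScriptKey(key string) bool {
	_, ok := unicode.Scripts[key]
	return ok
}

func main() {
	fmt.Println(hasScriptKey("Latn"))  // four-letter script code: not a key
	fmt.Println(hasScriptKey("Latin")) // script name: a key
}
```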

Do you know whether the native Go libraries offer a way to get the script name from the script code?

Edit:

Issue opened here: github.com/golang/go/issues/31862

Best answer

This is the kind of mapping I was looking for:

var scriptCodeToScriptNames = map[string][]string{
    "Adlm": {"Adlam"},
    "Afak": {"Afaka"},
    "Aghb": {"Caucasian_Albanian"},
    "Ahom": {"Ahom"},
    "Arab": {"Arabic"},
    "Aran": {"Arabic"},
    "Armi": {"Imperial_Aramaic"},
    "Armn": {"Armenian"},
    "Avst": {"Avestan"},
    "Bali": {"Balinese"},
    "Bamu": {"Bamum"},
    "Bass": {"Bassa_Vah"},
    "Batk": {"Batak"},
    "Beng": {"Bengali"},
    "Bhks": {"Bhaiksuki"},
    "Bopo": {"Bopomofo"},
    "Brah": {"Brahmi"},
    "Brai": {"Braille"},
    "Bugi": {"Buginese"},
    "Buhd": {"Buhid"},
    "Cakm": {"Chakma"},
    "Cans": {"Canadian_Aboriginal"},
    "Cari": {"Carian"},
    "Cham": {"Cham"},
    "Cher": {"Cherokee"},
    "Copt": {"Coptic"},
    "Cpmn": {"Cypro-Minoan"},
    "Cprt": {"Cypriot"},
    "Cyrl": {"Cyrillic"},
    "Cyrs": {"Cyrillic"},
    "Deva": {"Devanagari"},
    "Dogr": {"Dogra"},
    "Dsrt": {"Deseret"},
    "Dupl": {"Duployan"},
    "Egyd": {"Egyptian_Demotic"},
    "Egyh": {"Egyptian_Hieratic"},
    "Egyp": {"Egyptian_Hieroglyphs"},
    "Elba": {"Elbasan"},
    "Ethi": {"Ethiopic"},
    "Geok": {"Khutsuri"},
    "Geor": {"Georgian"},
    "Glag": {"Glagolitic"},
    "Gong": {"Gunjala_Gondi"},
    "Gonm": {"Masaram_Gondi"},
    "Goth": {"Gothic"},
    "Gran": {"Grantha"},
    "Grek": {"Greek"},
    "Gujr": {"Gujarati"},
    "Guru": {"Gurmukhi"},
    "Hanb": {"Han", "Bopomofo"},
    "Hang": {"Hangul"},
    "Hani": {"Han"},
    "Hano": {"Hanunoo"},
    "Hans": {"Han"},
    "Hant": {"Han"},
    "Hatr": {"Hatran"},
    "Hebr": {"Hebrew"},
    "Hira": {"Hiragana"},
    "Hluw": {"Anatolian_Hieroglyphs"},
    "Hmng": {"Pahawh_Hmong"},
    "Hmnp": {"Nyiakeng_Puachue_Hmong"},
    "Hrkt": {"Hiragana", "Katakana"},
    "Hung": {"Old_Hungarian"},
    "Inds": {"Indus_(Harappan)"},
    "Ital": {"Old_Italic"},
    "Jamo": {"Hangul"},
    "Java": {"Javanese"},
    "Jpan": {"Han", "Hiragana", "Katakana"},
    "Jurc": {"Jurchen"},
    "Kali": {"Kayah_Li"},
    "Kana": {"Katakana"},
    "Khar": {"Kharoshthi"},
    "Khmr": {"Khmer"},
    "Khoj": {"Khojki"},
    "Kitl": {"Khitan_large_script"},
    "Kits": {"Khitan_small_script"},
    "Knda": {"Kannada"},
    "Kore": {"Hangul", "Han"},
    "Kpel": {"Kpelle"},
    "Kthi": {"Kaithi"},
    "Lana": {"Tai_Tham"},
    "Laoo": {"Lao"},
    "Latf": {"Latin"},
    "Latg": {"Latin"},
    "Latn": {"Latin"},
    "Leke": {"Leke"},
    "Lepc": {"Lepcha"},
    "Limb": {"Limbu"},
    "Lina": {"Linear_A"},
    "Linb": {"Linear_B"},
    "Lisu": {"Lisu"},
    "Loma": {"Loma"},
    "Lyci": {"Lycian"},
    "Lydi": {"Lydian"},
    "Mahj": {"Mahajani"},
    "Maka": {"Makasar"},
    "Mand": {"Mandaic"},
    "Mani": {"Manichaean"},
    "Marc": {"Marchen"},
    "Maya": {"Mayan_hieroglyphs"},
    "Mend": {"Mende_Kikakui"},
    "Merc": {"Meroitic_Cursive"},
    "Mero": {"Meroitic_Hieroglyphs"},
    "Mlym": {"Malayalam"},
    "Modi": {"Modi"},
    "Mong": {"Mongolian"},
    "Mroo": {"Mro"},
    "Mtei": {"Meetei_Mayek"},
    "Mult": {"Multani"},
    "Mymr": {"Myanmar"},
    "Narb": {"Old_North_Arabian"},
    "Nbat": {"Nabataean"},
    "Newa": {"Newa"},
    "Nkoo": {"Nko"},
    "Nshu": {"Nushu"},
    "Ogam": {"Ogham"},
    "Olck": {"Ol_Chiki"},
    "Orkh": {"Old_Turkic"},
    "Orya": {"Oriya"},
    "Osge": {"Osage"},
    "Osma": {"Osmanya"},
    "Palm": {"Palmyrene"},
    "Pauc": {"Pau_Cin_Hau"},
    "Perm": {"Old_Permic"},
    "Phag": {"Phags_Pa"},
    "Phli": {"Inscriptional_Pahlavi"},
    "Phlp": {"Psalter_Pahlavi"},
    "Phlv": {"Book_Pahlavi"},
    "Phnx": {"Phoenician"},
    "Plrd": {"Miao"},
    "Prti": {"Inscriptional_Parthian"},
    "Rjng": {"Rejang"},
    "Rohg": {"Hanifi_Rohingya"},
    "Roro": {"Rongorongo"},
    "Runr": {"Runic"},
    "Samr": {"Samaritan"},
    "Sara": {"Sarati"},
    "Sarb": {"Old_South_Arabian"},
    "Saur": {"Saurashtra"},
    "Sgnw": {"SignWriting"},
    "Shaw": {"Shavian"},
    "Shrd": {"Sharada"},
    "Shui": {"Shuishu"},
    "Sidd": {"Siddham"},
    "Sind": {"Khudawadi"},
    "Sinh": {"Sinhala"},
    "Sogd": {"Sogdian"},
    "Sogo": {"Old_Sogdian"},
    "Sora": {"Sora_Sompeng"},
    "Soyo": {"Soyombo"},
    "Sund": {"Sundanese"},
    "Sylo": {"Syloti_Nagri"},
    "Syrc": {"Syriac"},
    "Syre": {"Syriac"},
    "Syrj": {"Syriac"},
    "Syrn": {"Syriac"},
    "Tagb": {"Tagbanwa"},
    "Takr": {"Takri"},
    "Tale": {"Tai_Le"},
    "Talu": {"New_Tai_Lue"},
    "Taml": {"Tamil"},
    "Tang": {"Tangut"},
    "Tavt": {"Tai_Viet"},
    "Telu": {"Telugu"},
    "Teng": {"Tengwar"},
    "Tfng": {"Tifinagh"},
    "Tglg": {"Tagalog"},
    "Thaa": {"Thaana"},
    "Thai": {"Thai"},
    "Tibt": {"Tibetan"},
    "Tirh": {"Tirhuta"},
    "Ugar": {"Ugaritic"},
    "Vaii": {"Vai"},
    "Visp": {"Visible_Speech"},
    "Wara": {"Warang_Citi"},
    "Wcho": {"Wancho"},
    "Wole": {"Woleai"},
    "Xpeo": {"Old_Persian"},
    "Xsux": {"Cuneiform"},
    "Yiii": {"Yi"},
    "Zanb": {"Zanabazar_Square"},
    "Zinh": {"Inherited"},
    "Zyyy": {"Common"},
}

Regarding "go - Getting the Unicode script name in Go", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/56003838/
