In Go, I want to get the RangeTable for the script of a given language.
import (
"golang.org/x/text/language"
"unicode"
)
...
script, _ := language.French.Script() // the confidence value is not needed here
scriptAsString := script.String() // here scriptAsString = "Latn"
rangeTable, ok := unicode.Scripts[scriptAsString]
// here ok = false, because the Scripts map has key "Latin" and not "Latn"
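The problem is that the following function returns the script code (the four-letter ISO 15924 code, such as "Latn"):
func (s Script) String() string
whereas every key of the unicode.Scripts map is a script name (such as "Latin").
Do you know whether the native Go libraries offer a way to go from a script code to a script name?
Edit:
I opened an issue here: github.com/golang/go/issues/31862
Best answer
This is the kind of mapping I was looking for: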
var scriptCodeToScriptNames = map[string][]string{
"Adlm": {"Adlam"},
"Afak": {"Afaka"},
"Aghb": {"Caucasian_Albanian"},
"Ahom": {"Ahom"},
"Arab": {"Arabic"},
"Aran": {"Arabic"},
"Armi": {"Imperial_Aramaic"},
"Armn": {"Armenian"},
"Avst": {"Avestan"},
"Bali": {"Balinese"},
"Bamu": {"Bamum"},
"Bass": {"Bassa_Vah"},
"Batk": {"Batak"},
"Beng": {"Bengali"},
"Bhks": {"Bhaiksuki"},
"Bopo": {"Bopomofo"},
"Brah": {"Brahmi"},
"Brai": {"Braille"},
"Bugi": {"Buginese"},
"Buhd": {"Buhid"},
"Cakm": {"Chakma"},
"Cans": {"Canadian_Aboriginal"},
"Cari": {"Carian"},
"Cham": {"Cham"},
"Cher": {"Cherokee"},
"Copt": {"Coptic"},
"Cpmn": {"Cypro-Minoan"},
"Cprt": {"Cypriot"},
"Cyrl": {"Cyrillic"},
"Cyrs": {"Cyrillic"},
"Deva": {"Devanagari"},
"Dogr": {"Dogra"},
"Dsrt": {"Deseret"},
"Dupl": {"Duployan"},
"Egyd": {"Egyptian_Demotic"},
"Egyh": {"Egyptian_Hieratic"},
"Egyp": {"Egyptian_Hieroglyphs"},
"Elba": {"Elbasan"},
"Ethi": {"Ethiopic"},
"Geok": {"Khutsuri"},
"Geor": {"Georgian"},
"Glag": {"Glagolitic"},
"Gong": {"Gunjala_Gondi"},
"Gonm": {"Masaram_Gondi"},
"Goth": {"Gothic"},
"Gran": {"Grantha"},
"Grek": {"Greek"},
"Gujr": {"Gujarati"},
"Guru": {"Gurmukhi"},
"Hanb": {"Han", "Bopomofo"},
"Hang": {"Hangul"},
"Hani": {"Han"},
"Hano": {"Hanunoo"},
"Hans": {"Han"},
"Hant": {"Han"},
"Hatr": {"Hatran"},
"Hebr": {"Hebrew"},
"Hira": {"Hiragana"},
"Hluw": {"Anatolian_Hieroglyphs"},
"Hmng": {"Pahawh_Hmong"},
"Hmnp": {"Nyiakeng_Puachue_Hmong"},
"Hrkt": {"Hiragana", "Katakana"},
"Hung": {"Old_Hungarian"},
"Inds": {"Indus_(Harappan)"},
"Ital": {"Old_Italic"},
"Jamo": {"Hangul"},
"Java": {"Javanese"},
"Jpan": {"Han", "Hiragana", "Katakana"},
"Jurc": {"Jurchen"},
"Kali": {"Kayah_Li"},
"Kana": {"Katakana"},
"Khar": {"Kharoshthi"},
"Khmr": {"Khmer"},
"Khoj": {"Khojki"},
"Kitl": {"Khitan_large_script"},
"Kits": {"Khitan_small_script"},
"Knda": {"Kannada"},
"Kore": {"Hangul", "Han"},
"Kpel": {"Kpelle"},
"Kthi": {"Kaithi"},
"Lana": {"Tai_Tham"},
"Laoo": {"Lao"},
"Latf": {"Latin"},
"Latg": {"Latin"},
"Latn": {"Latin"},
"Leke": {"Leke"},
"Lepc": {"Lepcha"},
"Limb": {"Limbu"},
"Lina": {"Linear_A"},
"Linb": {"Linear_B"},
"Lisu": {"Lisu"},
"Loma": {"Loma"},
"Lyci": {"Lycian"},
"Lydi": {"Lydian"},
"Mahj": {"Mahajani"},
"Maka": {"Makasar"},
"Mand": {"Mandaic"},
"Mani": {"Manichaean"},
"Marc": {"Marchen"},
"Maya": {"Mayan_hieroglyphs"},
"Mend": {"Mende_Kikakui"},
"Merc": {"Meroitic_Cursive"},
"Mero": {"Meroitic_Hieroglyphs"},
"Mlym": {"Malayalam"},
"Modi": {"Modi"},
"Mong": {"Mongolian"},
"Mroo": {"Mro"},
"Mtei": {"Meetei_Mayek"},
"Mult": {"Multani"},
"Mymr": {"Myanmar"},
"Narb": {"Old_North_Arabian"},
"Nbat": {"Nabataean"},
"Newa": {"Newa"},
"Nkoo": {"Nko"},
"Nshu": {"Nushu"},
"Ogam": {"Ogham"},
"Olck": {"Ol_Chiki"},
"Orkh": {"Old_Turkic"},
"Orya": {"Oriya"},
"Osge": {"Osage"},
"Osma": {"Osmanya"},
"Palm": {"Palmyrene"},
"Pauc": {"Pau_Cin_Hau"},
"Perm": {"Old_Permic"},
"Phag": {"Phags_Pa"},
"Phli": {"Inscriptional_Pahlavi"},
"Phlp": {"Psalter_Pahlavi"},
"Phlv": {"Book_Pahlavi"},
"Phnx": {"Phoenician"},
"Plrd": {"Miao"},
"Prti": {"Inscriptional_Parthian"},
"Rjng": {"Rejang"},
"Rohg": {"Hanifi_Rohingya"},
"Roro": {"Rongorongo"},
"Runr": {"Runic"},
"Samr": {"Samaritan"},
"Sara": {"Sarati"},
"Sarb": {"Old_South_Arabian"},
"Saur": {"Saurashtra"},
"Sgnw": {"SignWriting"},
"Shaw": {"Shavian"},
"Shrd": {"Sharada"},
"Shui": {"Shuishu"},
"Sidd": {"Siddham"},
"Sind": {"Khudawadi"},
"Sinh": {"Sinhala"},
"Sogd": {"Sogdian"},
"Sogo": {"Old_Sogdian"},
"Sora": {"Sora_Sompeng"},
"Soyo": {"Soyombo"},
"Sund": {"Sundanese"},
"Sylo": {"Syloti_Nagri"},
"Syrc": {"Syriac"},
"Syre": {"Syriac"},
"Syrj": {"Syriac"},
"Syrn": {"Syriac"},
"Tagb": {"Tagbanwa"},
"Takr": {"Takri"},
"Tale": {"Tai_Le"},
"Talu": {"New_Tai_Lue"},
"Taml": {"Tamil"},
"Tang": {"Tangut"},
"Tavt": {"Tai_Viet"},
"Telu": {"Telugu"},
"Teng": {"Tengwar"},
"Tfng": {"Tifinagh"},
"Tglg": {"Tagalog"},
"Thaa": {"Thaana"},
"Thai": {"Thai"},
"Tibt": {"Tibetan"},
"Tirh": {"Tirhuta"},
"Ugar": {"Ugaritic"},
"Vaii": {"Vai"},
"Visp": {"Visible_Speech"},
"Wara": {"Warang_Citi"},
"Wcho": {"Wancho"},
"Wole": {"Woleai"},
"Xpeo": {"Old_Persian"},
"Xsux": {"Cuneiform"},
"Yiii": {"Yi"},
"Zanb": {"Zanabazar_Square"},
"Zinh": {"Inherited"},
"Zyyy": {"Common"},
}
Regarding go - getting Unicode script names in Go, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56003838/