Word_草庐IT

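python - How to get all hyponyms of a word/synset in Python NLTK and WordNet?

I have a list of all the nouns in WordNet, and now I want to keep only the words that are vehicles and remove the rest. How can I do that? Below is pseudocode for what I want to do, but I don't know how to make it work:

    for word in wordlist:
        if not "vehicle" in wn.synsets(word):
            wordlist.remove(word)

Best answer:

    from nltk.corpus import wordnet as wn
    vehicle = wn.synset('vehicle.n.01')
    typesOfVehicles = list(set([w for s in vehicle.closure(lambda s: ...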
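The answer is cut off above, but the idea it starts from, collecting every direct and indirect hyponym of a synset and then filtering the word list against that set, can be sketched without NLTK. The graph `HYPONYMS`, the helper `all_hyponyms`, and the sample `wordlist` below are hypothetical stand-ins; with WordNet the edges would come from each synset's hyponyms.

```python
from collections import deque

# Hypothetical toy hyponym graph; real edges would come from WordNet.
HYPONYMS = {
    "vehicle": ["wheeled_vehicle", "craft"],
    "wheeled_vehicle": ["car", "bicycle"],
    "craft": ["boat"],
    "car": ["taxi"],
}


def all_hyponyms(root, graph):
    """Collect every direct and indirect hyponym of root, breadth-first."""
    seen = set()
    queue = deque(graph.get(root, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(graph.get(node, []))
    return seen


vehicle_words = all_hyponyms("vehicle", HYPONYMS)
wordlist = ["car", "apple", "boat", "taxi", "river"]

# Build a new list rather than calling remove() on the list being iterated,
# which would skip elements.
vehicles_only = [w for w in wordlist if w in vehicle_words]
print(vehicles_only)
```

Filtering into a new list also avoids the problem in the pseudocode, where removing items from `wordlist` while looping over it skips entries.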

hyponym python section vehicle synonym nltk wordnet

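python - Shuffling a word

How can I randomly shuffle the letters of a word in Python? For example, the word "cat" might become "act", "tac", or "tca". I would like to do this without using built-in functions. Best answer:

    import random
    word = "cat"
    shuffled = list(word)
    random.shuffle(shuffled)
    shuffled = ''.join(shuffled)
    print(shuffled)

...or done a different way, inspired by Dominic's answer...

    import random
    shuffled = ''.join(random.sample(word, len(w...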
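Both variants in the answer rely on a library shuffle. If "without built-in functions" is read as "without `random.shuffle` or `random.sample`", a Fisher-Yates shuffle written by hand is one possible sketch. The function name `shuffle_word` is an assumption, and the code still uses `random.randrange` as its source of randomness.

```python
import random


def shuffle_word(word):
    """Return the letters of word in random order (Fisher-Yates shuffle)."""
    letters = list(word)
    # Walk from the last position down to 1, swapping each position with a
    # randomly chosen position at or before it.
    for i in range(len(letters) - 1, 0, -1):
        j = random.randrange(i + 1)
        letters[i], letters[j] = letters[j], letters[i]
    return "".join(letters)


print(shuffle_word("cat"))
```

Note that the result can equal the input, since the identity ordering is one of the possible permutations.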

python shuffle section shuffled stackoverflow

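python - Why does Python (IronPython) report "Illegal characters in path" when the word bin is used?

I get an "Illegal characters in path" error when running a chdir command in IronPython. It happens when my code runs, but the problem also appears in the IronPython console. I am using the nt module because the os module does not work in my code (apparently a known issue). After some digging, I found that the "illegal characters" are actually the word bin. Below is text from the console showing that I only get the error when I navigate to the bin directory. Here is the example:

    >>> nt.chdir('c:\Users\xxxxx\Documents\Visual Studio 2010\Projects\xxx')
    >>> nt.chdir('c:\Users\xxxxx\Documents\Visual Studio 20...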
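The excerpt ends before any answer, so the following is only one plausible explanation, not a confirmed diagnosis: in an ordinary Python string literal, `\b` is the escape sequence for the backspace character, so a path segment written as `\bin` no longer contains a backslash followed by "bin". The sketch below uses a shortened, hypothetical path fragment to show the difference between a plain literal, a raw string, and an escaped backslash.

```python
# In a normal string literal, "\b" is the backspace character (0x08).
broken = "Projects\bin"

# Three common ways to keep the backslash literal.
raw = r"Projects\bin"        # raw string: backslashes are not escapes
escaped = "Projects\\bin"    # doubled backslash
forward = "Projects/bin"     # forward slashes are generally accepted on Windows

print(repr(broken))          # shows the \x08 character instead of "\b" + "in"
print("\x08" in broken)      # True: the path contains a control character
print(raw == escaped)        # True: both spell the same 12-character string
```

A control character such as backspace is not valid in a Windows path, which would be consistent with the error message appearing only for the bin directory.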

error IronPython code section Documents python illegal-characters

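python - How to find similar words with FastText?

I am playing around with FastText, https://pypi.python.org/pypi/fasttext, which is very similar to Word2Vec. Since it seems to be a fairly new library without many built-in functions yet, I would like to know how to extract morphologically similar words. For example: model.similar_word("dog") -> dogs. But there is no such built-in function. If I type model["dog"], I only get the vector, which can be used to compare cosine similarity: model.cosine_similarity(model["dog"], model["dogs"]). Do I have to write some kind of loop and run cosine_similarity on every possible pair in the text? That would take time...!!!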
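The loop the question describes does not need to cover every pair: for a single query word, one pass over the vocabulary is enough. The sketch below assumes the vectors have already been pulled out of the model into a plain dictionary. The three-dimensional vectors, `cosine_similarity`, and `most_similar` are hypothetical illustrations, not part of the fasttext package.

```python
import math

# Hypothetical toy vectors standing in for model[word].
vectors = {
    "dog": [0.9, 0.1, 0.0],
    "dogs": [0.85, 0.15, 0.05],
    "cat": [0.1, 0.9, 0.0],
    "car": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, vectors, top_n=2):
    """Rank every other word by cosine similarity to word."""
    query = vectors[word]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


for neighbour, score in most_similar("dog", vectors):
    print(neighbour, round(score, 3))
```

The cost is linear in the vocabulary size per query. For large vocabularies, stacking the vectors into a matrix and using a single matrix-vector product would be the usual next step.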

FastText python code section https nlp word2vec

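python - Replacing pronouns with their antecedents using python2.7 and nltk

As the title says, I am trying to find pronouns in a string and replace them with their antecedents, for example:

    [in]: "the princess looked from the palace, she was happy".
    [out]: "the princess looked from the palace, the princess was happy".

I use POS tags to return the pronouns and nouns. I need to know how to do the replacement without knowing the sentence in advance, that is, how to identify the subject of the sentence so that I can replace the pronoun with it. Any suggestions? Best answer: I don't know the nltk package (never used it), but it seems it can give you your answer right away. If you look at the parse tree examples on nltk.org, it shows...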
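As a rough illustration of the "find the subject, then substitute" idea, here is a deliberately naive sketch. It assumes the sentence has already been POS-tagged with Penn Treebank tags (the pairs below are written by hand, not produced by a tagger) and treats the first noun phrase as the antecedent of every personal pronoun. The names `first_noun_phrase` and `replace_pronouns` are hypothetical. This is not real coreference resolution and will fail on many sentences.

```python
# Hand-written (token, tag) pairs; assumed to resemble POS tagger output.
tagged = [
    ("the", "DT"), ("princess", "NN"), ("looked", "VBD"), ("from", "IN"),
    ("the", "DT"), ("palace", "NN"), (",", ","), ("she", "PRP"),
    ("was", "VBD"), ("happy", "JJ"),
]


def first_noun_phrase(tagged):
    """Return the tokens of the first determiner/adjective/noun run."""
    phrase = []
    for token, tag in tagged:
        if tag in ("DT", "JJ") or tag.startswith("NN"):
            phrase.append((token, tag))
        elif phrase:
            break
    # Drop trailing words that are not nouns, so the phrase ends on a noun.
    while phrase and not phrase[-1][1].startswith("NN"):
        phrase.pop()
    return [token for token, _ in phrase]


def replace_pronouns(tagged):
    """Replace each personal pronoun (PRP) with the first noun phrase."""
    antecedent = first_noun_phrase(tagged)
    result = []
    for token, tag in tagged:
        if tag == "PRP" and antecedent:
            result.extend(antecedent)
        else:
            result.append(token)
    return result


print(" ".join(replace_pronouns(tagged)))
```

The heuristic ignores gender, number, and sentence boundaries. A parse tree, as the answer suggests, gives a more reliable way to locate the subject.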

python pronoun sentence the princess python-2.7 nlp nltk

python - Does the nltk words corpus not contain "okay"?

Does the NLTK words corpus really not contain the word "okay" or "Okay"?

    > from nltk.corpus import words
    > words.words().__contains__("check")
    > True
    > words.words().__contains__("okay")
    > False
    > len(words.words())
    > 236736

Any ideas? Best answer: In short:

    from nltk.corpus import words
    from nltk.corpus import wordnet
    manywords = words.words() + wordn...

corpus words code section python dictionary nltk

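python - Training only some word embeddings (Keras)

In my model, I use pre-trained GloVe embeddings. I want to keep them non-trainable to reduce the number of model parameters and avoid overfitting. However, I have one special symbol whose embedding I do want to train. With the provided Embedding layer, I can only use the "trainable" parameter to set the trainability of all embeddings at once, like this:

    embedding_layer = Embedding(voc_size, emb_dim, weights=[embedding_matrix], input_length=MAX_LEN, trainable=False)

Is there a Keras-level solution for training only a subset of the embeddings? Please note: there is not enough data to generate new embeddings for all words. These answers only relate to native TensorFlow.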

training python special embedding input nlp keras word-embedding

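python - How to get word vectors from a Keras embedding layer

I am currently working with a Keras model whose first layer is an embedding layer. To visualize the relationships and similarities between words, I need a function that returns a mapping from each word in the vocabulary to its vector (for example, 'love' - [0.21, 0.56, ..., 0.65, 0.10]). Is there a way to do this? Best answer: You can get the word embeddings with the embedding layer's get_weights() method (that is, the weights of the embedding layer are essentially the embedding vectors):

    # if you have access to the embedding layer explicitly
    embeddings = embedding_layer.get_weights()[0]
    # or access the...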
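Once the weight matrix is available, the word-to-vector mapping the question asks for is a dictionary lookup by row index. The sketch below uses a plain nested list in place of the array returned by `get_weights()[0]`, and a hypothetical `word_index` dictionary of the kind a tokenizer typically provides. All values are made up for illustration.

```python
# Stand-in for embedding_layer.get_weights()[0]: one row per vocabulary index.
# The numbers are invented; a real matrix would be a NumPy array.
embeddings = [
    [0.00, 0.00, 0.00],  # index 0, assumed here to be reserved for padding
    [0.21, 0.56, 0.65],
    [0.10, 0.33, 0.48],
]

# Hypothetical word -> row index mapping (for example a tokenizer's word_index).
word_index = {"love": 1, "hate": 2}

# Map every word to the row of the embedding matrix at its index.
word_to_vector = {word: embeddings[idx] for word, idx in word_index.items()}

print(word_to_vector["love"])
```

The mapping is only meaningful if `word_index` is the same one that was used to encode the training data, since the layer itself stores no words, only rows.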

python Keras section embeddings embedding dictionary keras-layer word-embedding

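Python: Censoring a word in text, but the last word is not censored

I am doing Python on Codecademy and trying to censor a word in a text. The code works, but if the last word in the text is the target word, it does not get censored. I think the for statement needs to change, for example for x in (text+1), but that of course causes an error. We are not supposed to use built-in functions such as replace(). Any ideas?

    def censor(text, word):
        text = text.split()
        for x in text:
            if x == word:
                text[text.index(x)] = "*" * len(word)
        return " ".join(text)

    print(censor("You dirty guy and dirty boy dirty.", "dirty"...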
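In the sample call, the last token after splitting is "dirty." with a trailing period, so it never equals "dirty". That, rather than the loop bounds, is a likely reason it is skipped. The sketch below is one possible fix under that assumption: it separates trailing punctuation from each token before comparing and still avoids replace(). The set of punctuation characters is an arbitrary choice for illustration.

```python
def censor(text, word):
    """Mask every occurrence of word, ignoring trailing punctuation."""
    mask = "*" * len(word)
    censored = []
    for token in text.split(" "):
        # Split the token into its core and any trailing punctuation,
        # so that "dirty." is compared as "dirty".
        core = token.rstrip(".,!?;:")
        tail = token[len(core):]
        censored.append((mask if core == word else core) + tail)
    return " ".join(censored)


print(censor("You dirty guy and dirty boy dirty.", "dirty"))
```

Iterating over the tokens directly also removes the need for `text.index(x)`, which searches the list again from the start on every match.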

Python censor code section dirty

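python - Estimating phoneme similarity between two words

I am using Carnegie Mellon University's pronouncing dictionary to detect rhymes in Python, and I would like to know: how can I estimate the phoneme similarity between two words? In other words, is there an algorithm that can recognize the fact that "hand" and "plan" are closer to rhyming than "hand" and "fries"? Some context: at first, I was willing to say that two words rhyme if their primary stressed syllable and all subsequent syllables are identical (c06d if you want to replicate this in Python):

    def create_cmu_sound_dict():
        final_sound_dict = {}
        with open('resources/c06d/c06d') as cmu_dict:
            cmu_dict = cmu_dict.read().split("\n")
            for i in cmu_dict:
                i...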
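One simple way to turn "closer to rhyming" into a number is to compute the edit distance between the two phoneme sequences and normalize it by the longer length. This is only a sketch of that idea, not the approach from the original thread. The ARPAbet transcriptions below are typed in by hand as assumptions and should be checked against the dictionary file, and `edit_distance` and `phoneme_similarity` are hypothetical names.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of phonemes."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        current = [i]
        for j, y in enumerate(b, 1):
            cost = 0 if x == y else 1
            current.append(min(
                previous[j] + 1,         # delete x
                current[j - 1] + 1,      # insert y
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def phoneme_similarity(a, b):
    """Similarity in [0, 1]: 1.0 means identical phoneme sequences."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


# Assumed transcriptions, written by hand for illustration.
hand = ["HH", "AE1", "N", "D"]
plan = ["P", "L", "AE1", "N"]
fries = ["F", "R", "AY1", "Z"]

print(phoneme_similarity(hand, plan))
print(phoneme_similarity(hand, fries))
```

Every substitution costs the same here, so swapping one vowel for a near-identical vowel is penalized as heavily as swapping it for a consonant. Weighting substitutions by phonetic features, or comparing only the part from the stressed vowel onward, would be natural refinements.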

phoneme python sound final algorithm nlp linguistics