KMeans2

python - OpenCV: kmeans example not working

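I am working through the OpenCV samples, but sometimes they do not run. In many cases a small change is enough to make them work. In this case I have not found a solution so far. When I run the following code, I get an error on the kmeans line. I checked the data types and everything seems fine. Does anyone know what is wrong? Thanks! Code sample from https://github.com/Itseez/opencv:
    '''
    Keyboard shortcuts:
       ESC   - exit
       space - generate new distribution
    '''
    import numpy as np
    import cv2
    from gaussian_mix import make_gaussians
    ...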

python - Clustering geolocation coordinates (lat, long pairs) using the KMeans algorithm in Python

Clustering geolocation coordinates with the following code produces 3 clusters:
    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.cluster.vq import kmeans2, whiten
    coordinates = np.array([[lat, long], [lat, long], ... [lat, long]])
    x, y = kmeans2(whiten(coordinates), 3, iter=20)
    plt.scatter(coordinates[:, 0], coordinates[:, 1], c=y)
    plt.show()
Is it correct to use Kmeans for location clustering, given that it uses Eu...

geo python coordinates numpy geolocation scipy k-means
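For the geolocation question above, one commonly used workaround (not taken from the original thread) is to map each (lat, long) pair onto the unit sphere before clustering. k-means minimizes squared Euclidean distance, and on the unit sphere the straight-line (chord) distance grows monotonically with the great-circle distance, so nearby points stay nearby even across the date line or near the poles. The sketch below assumes a spherical Earth, uses made-up sample points around three cities, and skips `whiten` because rescaling each axis separately would distort the sphere. The helper name `latlon_to_unit_xyz` is hypothetical.

```python
import numpy as np
from scipy.cluster.vq import kmeans2


def latlon_to_unit_xyz(latlon_deg):
    """Convert an (n, 2) array of [lat, long] in degrees to (n, 3) unit vectors."""
    lat = np.radians(latlon_deg[:, 0])
    lon = np.radians(latlon_deg[:, 1])
    return np.column_stack(
        (np.cos(lat) * np.cos(lon), np.cos(lat) * np.sin(lon), np.sin(lat))
    )


# Hypothetical sample data: 20 jittered points around each of three cities.
rng = np.random.default_rng(0)
city_centers = np.array([[48.85, 2.35], [40.71, -74.01], [35.68, 139.69]])
coordinates = np.vstack(
    [c + rng.normal(scale=0.05, size=(20, 2)) for c in city_centers]
)

xyz = latlon_to_unit_xyz(coordinates)

# minit="++" uses k-means++ seeding, which makes empty clusters unlikely.
centroids, labels = kmeans2(xyz, 3, iter=20, minit="++")
```

The returned centroids lie slightly inside the sphere; normalizing them to unit length and converting back to degrees gives a representative (lat, long) per cluster.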

python - What is the difference between kmeans and kmeans2 in scipy?

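I am new to machine learning and would like to know the difference between kmeans and kmeans2 in scipy. According to the documentation, both use the 'k-means' algorithm, but how do I choose between them?
Best answer: According to the documentation, kmeans2 appears to be the standard k-means algorithm: it runs until it converges to a local optimum, and it lets you change the seed initialization. The kmeans function terminates early when there is too little change, so it may not even reach a local optimum. In addition, its goal is to generate a codebook for mapping feature vectors. The codebook is not necessarily generated from the stopping point; instead, the iteration with the lowest "distortion" is used to generate it. This method also runs kmeans multiple times. The documentation has more details. If you just want k-means as an algorithm...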

kmeans kmeans2 python machine-learning scipy k-means
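The practical difference between the two SciPy functions is easiest to see from their return values. The sketch below, on made-up 2-D blobs, is only an illustration: `kmeans` returns a codebook and a distortion value, so labels have to be obtained separately with `vq`, while `kmeans2` returns centroids and labels directly. Per the SciPy documentation, `iter` also means different things: for `kmeans` it is the number of runs (the codebook with the lowest distortion is kept), for `kmeans2` it is the number of iterations of a single run.

```python
import numpy as np
from scipy.cluster.vq import kmeans, kmeans2, vq, whiten

# Hypothetical data: three Gaussian blobs of 50 points each.
rng = np.random.default_rng(0)
blobs = [
    rng.normal(loc=center, scale=0.3, size=(50, 2))
    for center in ([0, 0], [5, 5], [0, 5])
]
obs = whiten(np.vstack(blobs))  # scale each feature to unit variance

# kmeans: returns (codebook, distortion); it does not return labels.
codebook, distortion = kmeans(obs, 3, iter=20)
labels_from_vq, _ = vq(obs, codebook)  # assign each point to its nearest code

# kmeans2: returns (centroids, labels) from a single run.
centroids, labels = kmeans2(obs, 3, iter=20, minit="++")
```

If the goal is simply a cluster label per point, `kmeans2` (or `kmeans` followed by `vq`) both work; `kmeans` may return fewer than the requested number of codes if a code ends up with no observations.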

python - Plotting the output of kmeans (PyCluster impl)

How can I plot the output of kmeans clustering in Python? I am using the PyCluster package. allUserVector is an n x m dimensional vector, essentially n users with m features.
    import Pycluster as pc
    import numpy as np
    clusterid, error, nfound = pc.kcluster(allUserVector, nclusters=3, transpose=0, npass=1, method='a', dist='e')
    clustermap, _, _ = pc.kcluster(allUserVector, nclusters=3, transpose=0, npass=1, method...

PyCluster python users centroids cluster-analysis k-means

python - Usage of cv2.kmeans in Python

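I am considering OpenCV's Kmeans implementation because it is said to be faster... I am now using the cv2 package and its kmeans function, and I cannot understand the parameter descriptions in the reference:
    Python: cv2.kmeans(data, K, criteria, attempts, flags[, bestLabels[, centers]]) → retval, bestLabels, centers
    samples – Floating-point matrix of input samples, one row per sample.
    clusterCount – Number of clusters to split the set by.
    labels – Input/output i...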

usage python centers cv2 random opencv

python - How do I perform clustering with weights/densities in Python? Something like kmeans with weights?

My data looks like this:
    powerplantname, latitude, longitude, powergenerated
    A, -92.3232, 100.99, 50
    B, , , 10
    C, , , 20
    D, , , 40
    E, , , 5
I would like to cluster the data into N clusters (say 3). Normally I would use kmeans:
    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.cluster.vq import kmeans2, whiten
    coordinates = np.array([[lat, long], [lat, long], ... [lat, long]])
    x, y = kmeans2(whiten(...

python lat algorithm scipy scikit-learn cluster-analysis
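One way to read "kmeans with weights" is to let each point pull its centroid in proportion to its weight, so the centroid update becomes a weighted mean instead of a plain mean. The following is a minimal NumPy sketch of that idea, not the answer from the original thread. The function name `weighted_kmeans`, the sample coordinates, and the weights are all hypothetical; it assumes strictly positive weights and treats the coordinates as planar. scikit-learn's `KMeans.fit` also accepts a `sample_weight` argument, which may be a more convenient route.

```python
import numpy as np


def weighted_kmeans(points, weights, k, n_iter=50, seed=0):
    """Lloyd-style k-means where each centroid is the weighted mean of its points."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct randomly chosen points.
    centroids = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid: shape (n, k).
        d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            mask = labels == j
            if mask.any():  # keep the old centroid if a cluster is empty
                new_centroids[j] = np.average(
                    points[mask], axis=0, weights=weights[mask]
                )
        converged = np.allclose(new_centroids, centroids)
        centroids = new_centroids
        if converged:
            break
    # Final assignment so that labels match the returned centroids.
    d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, d2.argmin(axis=1)


# Hypothetical power plants: [latitude, longitude] and power generated as weight.
points = np.array(
    [[10.0, 20.0], [10.5, 20.5], [40.0, -5.0], [41.0, -4.0], [-30.0, 60.0], [-29.0, 61.0]]
)
weights = np.array([50.0, 10.0, 20.0, 40.0, 5.0, 25.0])

centroids, labels = weighted_kmeans(points, weights, k=3)
```

With `k=1` the result reduces to the weighted mean of all points, which is a quick sanity check for the update rule.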

python - How do I visualize the data points of tf-idf vectors for kmeans clustering?

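I have a list of documents and the tf-idf score of every unique word in the whole corpus. How can I visualize this on a 2D plot to gauge how many clusters I need to run k-means with? Here is my code:
    sentence_list = ["Hi how are you", "Good morning" ...]
    vectorizer = TfidfVectorizer(min_df=1, stop_words='english', decode_error='ignore')
    vectorized = vectorizer.fit_transform(sentence_list)
    num_samples, num_features = vectorized.shape
    print "nu...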

python kmeans num scipy scikit-learn k-means tf-idf

python - Can Spark KMeans not handle big data?

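KMeans training takes several parameters, and the initialization mode defaults to kmeans||. The problem is that it progresses quickly (in under 10 minutes) through the first 13 stages, but then hangs completely without producing an error! A minimal example that reproduces the problem (it succeeds if I use 1000 points or random initialization):
    from pyspark.context import SparkContext
    from pyspark.mllib.clustering import KMeans
    from pyspark.mllib.random import RandomRDDs
    if __name__ == "__main__":
        sc = SparkContext(appName='kmeansMinimalExample')
        #...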

python apache-spark k-means apache-spark-mllib bigdata

python - kmeans scatter plot: plot different colors per cluster

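I am trying to draw a scatter plot of kmeans output that groups sentences on the same topic together. The problem I am facing is plotting the points that belong to each cluster in a specific color.
    sentence_list = ["Hi how are you", "Good morning" ...]  # i have 10 sentences
    km = KMeans(n_clusters=5, init='k-means++', n_init=10, verbose=1)  # with 5 clusters, i want 5 different colors
    km.fit(vectorized)
    km.labels_  # [0,1,2,3,3,4,4,5,2,5]
    pipeline = Pipeline([('tfidf', T...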

different cluster labels python numpy matplotlib scipy k-means
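For the per-cluster coloring question above, a simple approach is to loop over the distinct labels and issue one `scatter` call per cluster, picking each color from a qualitative colormap. The sketch below assumes the points have already been reduced to two dimensions; `points_2d` and `labels` are made-up stand-ins for the projected tf-idf vectors and `km.labels_`, and the `Agg` backend is selected only so the script runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line to show a window
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical inputs: 10 points in 2-D and one cluster label (0..4) per point.
rng = np.random.default_rng(0)
points_2d = rng.normal(size=(10, 2))
labels = np.array([0, 1, 2, 3, 3, 4, 4, 0, 2, 1])

fig, ax = plt.subplots()
cmap = plt.get_cmap("tab10")  # 10 visually distinct colors

for cluster_id in np.unique(labels):
    mask = labels == cluster_id
    ax.scatter(
        points_2d[mask, 0],
        points_2d[mask, 1],
        color=cmap(int(cluster_id)),  # integer index selects one palette entry
        label=f"cluster {cluster_id}",
    )

ax.legend()
```

Passing `c=labels` to a single `scatter` call also colors by cluster, but the loop makes it easy to add a legend entry per cluster.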

python - Understanding the "score" returned by scikit-learn KMeans

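I applied clustering to a set of text documents (about 100). I converted them to Tfidf vectors using TfIdfVectorizer and fed the vectors as input to scikitlearn.cluster.KMeans(n_clusters=2, init='k-means++', max_iter=100, n_init=10). Now when I run model.fit() and print model.score() on my vectors, I get a very small value if all the text documents are very similar, and a very large negative value if the documents are very different. My basic aim is to find which set of documents is similar, but can someone help me understand what exactly this model.score() value means for the fit? How can I use this value to...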

scikit-learn python k-means
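The behavior described in the question is consistent with how scikit-learn documents `KMeans.score`: it returns the opposite of the k-means objective, that is, the negative sum of squared distances from each sample to its nearest cluster center. Tight clusters therefore give a value near zero and spread-out data gives a large negative value. The sketch below checks this on made-up dense data (not tf-idf vectors) by recomputing the sum by hand.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical data: two tight groups of 50 points in 5 dimensions.
rng = np.random.default_rng(0)
X = np.vstack(
    [
        rng.normal(loc=0.0, scale=0.1, size=(50, 5)),
        rng.normal(loc=1.0, scale=0.1, size=(50, 5)),
    ]
)

model = KMeans(
    n_clusters=2, init="k-means++", max_iter=100, n_init=10, random_state=0
).fit(X)

score = model.score(X)

# Manual check: sum of squared distances to each sample's assigned center.
assigned_centers = model.cluster_centers_[model.predict(X)]
sq_dist = ((X - assigned_centers) ** 2).sum()
```

Because the value scales with the number of samples and the feature magnitudes, it is mainly useful for comparing models fitted on the same data, for example across different values of `n_clusters`.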
