SciKit-Learn

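python - Using a TensorFlow input pipeline with skflow/tf learn

I followed the TensorFlow Reading Data guide to get my application's data into TFRecord form, and I use a TFRecordReader in my input pipeline to read it. I am now reading the guides on building a simple regressor with skflow/tf.learn, but I can't see how to use my input data with these tools. In the following code, the application fails on the call to regressor.fit(..) with ValueError: setting an array element with a sequence. Error: Traceback (most recent call last): File ".../tf.py", line 138, in run() File ...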

python - Extracting genomic features from a gff3 file with scikit-bio

Is it possible in scikit-bio to extract, from a genome fasta file, the genomic features stored in a gff3-format file? Example: genome.fasta: >sequence1 ATGGAGAGAGAGAGAGAGAGGGGGCAGCATACGCATCGACATACGACATACATCAGATACGACATACTACTACTATGA annotation.gff3: #gff-version 3 sequence1 source gene 1 78 . + . ID=gene1 sequence1 source mRNA 1 78 . + . ID=transcript1;parent=gene1 sequence1 source CDS 1 6 . + 0 ID=CDS1;parent=tran...
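The excerpt above is cut off before any answer, so the following is only a minimal sketch in plain Python rather than a scikit-bio solution. It relies on two facts about GFF3: feature lines have nine tab-separated columns, and coordinates are 1-based and inclusive. The function names `parse_gff3_line` and `extract_feature`, and the toy genome, are hypothetical and chosen for illustration.

```python
# Sketch only: plain Python, not the scikit-bio API.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def parse_gff3_line(line):
    """Split one tab-separated GFF3 feature line into a small dict."""
    seqid, source, ftype, start, end, score, strand, phase, attrs = (
        line.rstrip("\n").split("\t")
    )
    # Column 9 holds semicolon-separated key=value pairs.
    attributes = dict(
        pair.split("=", 1) for pair in attrs.split(";") if "=" in pair
    )
    return {
        "seqid": seqid,
        "type": ftype,
        "start": int(start),
        "end": int(end),
        "strand": strand,
        "attributes": attributes,
    }


def extract_feature(sequence, start, end, strand="+"):
    """Return the subsequence for 1-based inclusive GFF3 coordinates."""
    sub = sequence[start - 1:end]
    if strand == "-":
        # Features on the minus strand are reverse-complemented.
        sub = sub.translate(COMPLEMENT)[::-1]
    return sub


# Hypothetical toy data, shorter than the sequence in the question.
genome = {"sequence1": "ATGGCCTAA"}
gff_line = "sequence1\tsource\tCDS\t1\t6\t.\t+\t0\tID=CDS1;parent=transcript1"

feature = parse_gff3_line(gff_line)
cds = extract_feature(
    genome[feature["seqid"]], feature["start"], feature["end"], feature["strand"]
)
print(feature["attributes"]["ID"], cds)
```

Lines starting with `#` (such as the version pragma) would need to be skipped before calling `parse_gff3_line`.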

gene scikit-bio python python-3.x bioinformatics skbio

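python - How to plot a ROC curve for the multiclass case with scikit-learn?

I want to plot ROC curves for the multiclass case on my own dataset. From the documentation I read that the labels must be binary (I have 5 labels, from 1 to 5), so I followed the example given in the documentation: print(__doc__) import numpy as np import matplotlib.pyplot as plt from sklearn import svm, datasets from sklearn.metrics import roc_curve, auc from sklearn.cross_validation import train_test_split from sklearn.preprocessing import label_bin...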
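The question's code is truncated, so what follows is a hedged sketch of the one-vs-rest approach rather than the asker's or the documentation's exact code. It assumes a recent scikit-learn, where `train_test_split` lives in `sklearn.model_selection` (the `sklearn.cross_validation` module used above was removed in later releases). The synthetic dataset and the choice of `LogisticRegression` are assumptions made only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import label_binarize

classes = [1, 2, 3, 4, 5]

# Synthetic stand-in for the asker's data: 5 classes labelled 1..5.
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=10, n_classes=5, random_state=0
)
y = y + 1

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0
)

# One binary classifier per class; decision_function gives one score column per class.
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
y_score = clf.fit(X_train, y_train).decision_function(X_test)

# Binarize only the true test labels: one indicator column per class.
y_test_bin = label_binarize(y_test, classes=classes)

fpr, tpr, roc_auc = {}, {}, {}
for i, label in enumerate(classes):
    fpr[label], tpr[label], _ = roc_curve(y_test_bin[:, i], y_score[:, i])
    roc_auc[label] = auc(fpr[label], tpr[label])
```

Each curve can then be drawn with matplotlib, for example `plt.plot(fpr[label], tpr[label])` inside a loop over `classes`.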

python python-2.7 matplotlib machine-learning scikit-learn

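python - Post-processing classifier output in a scikit-learn Pipeline

I am using a Pipeline in scikit-learn to combine some preprocessing with a OneClassSVM as the final classifier. To compute sensible metrics, I need a post-processing step that converts the OneClassSVM's -1, 1 output to 0 and 1. Is there a structured way to add such post-processing to the Pipeline? Transformers cannot be used after the final estimator. Best answer: You can use the class sklearn.compose.TransformedTargetRegressor with your SVM classifier as the regressor, and use the inverse_func parameter to transform your labels after classification. However, since TransformedTargetRegre...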
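The answer above is truncated, so here is a sketch of a different, simpler technique than the one it names: wrapping OneClassSVM in a small estimator whose `predict` already returns 0 and 1. The class name `ZeroOneOneClassSVM`, its parameters, and the random data are hypothetical, and mapping inliers to 1 and outliers to 0 is an assumption about which convention the metrics need.

```python
import numpy as np
from sklearn.base import BaseEstimator
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM


class ZeroOneOneClassSVM(BaseEstimator):
    """Hypothetical wrapper: a OneClassSVM whose predict() returns 0/1."""

    def __init__(self, nu=0.5, gamma="scale"):
        self.nu = nu
        self.gamma = gamma

    def fit(self, X, y=None):
        self.svm_ = OneClassSVM(nu=self.nu, gamma=self.gamma).fit(X)
        return self

    def predict(self, X):
        # OneClassSVM returns +1 for inliers and -1 for outliers;
        # map +1 -> 1 and -1 -> 0.
        return (self.svm_.predict(X) == 1).astype(int)


rng = np.random.RandomState(0)
X = rng.normal(size=(200, 2))

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("svm", ZeroOneOneClassSVM(nu=0.1)),
])
pred = pipe.fit(X).predict(X)
```

Because the conversion happens inside the final estimator, the Pipeline needs no step after it.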

python TransformedTargetRegressor scikit-learn pipeline post-processing

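python - scikit KernelPCA results are unstable

I am trying to use KernelPCA to reduce a dataset to two dimensions (both for visualization and for further data analysis). I tried computing KernelPCA with an RBF kernel at various Gamma values, but the results are unstable: (each frame uses a slightly different Gamma value, with Gamma varying continuously from 0 to 1) It looks like it is not deterministic. Is there a way to stabilize it or make it deterministic? Code used to generate the transformed data: def pca(X, gamma1): kpca = KernelPCA(kernel="rbf", fit_inverse_transform=True, gamma=gamma1) X_kpca = kpca.fit_transform(X) # X_back = kpca.inverse...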
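The excerpt does not say what causes the instability. One possible contributor (an assumption, not confirmed by the question) is that eigenvectors are only defined up to sign, so a projected component can flip between fits with nearby Gamma values. The sketch below shows a sign normalization that could be applied to each projection; the helper name `fix_component_signs` is hypothetical.

```python
import numpy as np


def fix_component_signs(Z):
    """Flip each column so its largest-magnitude entry is positive."""
    Z = np.asarray(Z, dtype=float)
    # Row index of the largest absolute value in each column.
    idx = np.argmax(np.abs(Z), axis=0)
    signs = np.sign(Z[idx, np.arange(Z.shape[1])])
    signs[signs == 0] = 1.0  # leave all-zero columns unchanged
    return Z * signs


# Toy projection and a copy with its first component mirrored.
rng = np.random.RandomState(0)
Z = rng.normal(size=(10, 2))
flipped = Z * np.array([-1.0, 1.0])

same = np.allclose(fix_component_signs(Z), fix_component_signs(flipped))
print(same)
```

In the question's setting this would be called as `fix_component_signs(kpca.fit_transform(X))`. It does not address components swapping order when their eigenvalues cross.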

KernelPCA python scikit-learn pca dimensionality-reduction

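python - Scikit-learn feature selection for regression data

I am trying to use the Python module scikit-learn to apply a univariate feature selection method to a regression (i.e. continuous-valued response) dataset in svmlight format. I am using scikit-learn version 0.11. I tried two approaches: the first failed, and the second worked on my toy dataset but, I believe, would produce meaningless results on a real dataset. I would like advice on an appropriate univariate feature selection method for selecting the top N features of a regression dataset. I would either like to (a) work out how to make the f_regression function work, or (b) hear other suggestions. The two approaches mentioned above: I tried sklearn.feature_selection.f_regression(X, Y). It failed with the following error message: "TypeEr...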
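As a hedged illustration of selecting the top N features with `f_regression`, the sketch below uses `SelectKBest` on synthetic data. It is written against a recent scikit-learn, not version 0.11 from the question, so it does not reproduce or diagnose the truncated error. The dataset and the value N = 5 are assumptions for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import SelectKBest, f_regression

# Synthetic regression data standing in for the svmlight file.
X, y = make_regression(
    n_samples=100, n_features=20, n_informative=5, noise=1.0, random_state=0
)

# Keep the 5 features with the highest univariate F-scores.
selector = SelectKBest(score_func=f_regression, k=5)
X_top = selector.fit_transform(X, y)

# Column indices of the retained features in the original matrix.
selected = selector.get_support(indices=True)
print(selected)
```

For real data in svmlight format, `sklearn.datasets.load_svmlight_file` returns a sparse `X` and a dense `y` that can be passed to the selector in the same way.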

Scikit-learn python svmlight

python - Is it possible to run Python's scikit-learn algorithms on Hadoop?

(Closed on Stack Overflow 8 years ago as off-topic, because it asks for tool or library recommendations; it is not accepting answers.) I know that Python can be used on Hadoop. But can scikit-learn's machine learning algorithms be used on Hadoop? If not, are there machine learning libraries for Python and Hadoop? Thanks for your help.

scikit-learn python hadoop machine-learning bigdata

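python - scikit - Random forest regression - AttributeError: 'Thread' object has no attribute '_children'

I get the following error when setting the n_jobs parameter of a random forest regressor to a value greater than 1. With n_jobs=1 everything works. AttributeError: 'Thread' object has no attribute '_children' I am running this code inside a flask service. Interestingly, it does not happen when run outside the flask service. I have only reproduced it on a freshly installed Ubuntu machine; on my Mac it works fine. Here is a thread discussing the problem, though it does not seem to resolve anything: 'Thread' object has no attribute '_children' - django + scikit-learn. Any ideas on this? Here is my test code: @test.route...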

python flask scikit-learn

python - AttributeError: lower not found; using a Pipeline with a CountVectorizer in scikit-learn

I have a corpus like this: X_train = [['this is an dummy example'] ['in reality this line is very long'] ... ['here is a last text in the training set']] and some labels: y_train = [1, 5, ..., 3] I want to use Pipeline and GridSearch as follows: pipeline = Pipeline([('vect', CountVectorizer()), ('tfidf', TfidfTransformer()), ('reg', SGDRegressor())]) parameters = {'vect__max_df': (0. ...
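The excerpt ends before any answer. CountVectorizer expects an iterable of strings, while the corpus above wraps every document in a one-element list, so one plausible fix (an assumption about the cause, not a confirmed diagnosis) is to flatten the corpus before fitting. The three documents and labels below are a toy stand-in for the elided data.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import SGDRegressor
from sklearn.pipeline import Pipeline

# Toy corpus in the same nested shape as the question's X_train.
X_train = [
    ["this is an dummy example"],
    ["in reality this line is very long"],
    ["here is a last text in the training set"],
]
y_train = [1, 5, 3]

# Unwrap each one-element list so every document is a plain string.
X_flat = [doc[0] for doc in X_train]

pipeline = Pipeline([
    ("vect", CountVectorizer()),
    ("tfidf", TfidfTransformer()),
    ("reg", SGDRegressor(random_state=0)),
])
pipeline.fit(X_flat, y_train)
pred = pipeline.predict(X_flat)
```

The same flattened list can be passed to a grid search object built around `pipeline`.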

CountVectorizer scikit-learn python pipeline

python - Unable to download and install scikit-learn

I am new to Python. I want to use KMeans code, so I want to install scikit-learn or sklearn. I tried to install these packages with: pip install -U sklearn pip install -U scikit-learn but I got this error: Command /usr/bin/python -c "import setuptools, tokenize;__file__='/tmp/pip_build_reihaneh/sklearn/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', ...

scikit-learn python install pip installation
