草庐IT

stopWords


python - Hadoop and NLTK: Fails with stopwords

I am trying to run a Python program on Hadoop. The program uses the NLTK library and the Hadoop Streaming API, as described here.

mapper.py:

    #!/usr/bin/env python
    import sys
    import nltk
    from nltk.corpus import stopwords
    #print stopwords.words('english')
    for line in sys.stdin:
        print line,

reducer.py:

    #!/usr/bin/env python
    import sys
    for line in sys.stdin:
        print line,

Console command:

    bin/hadoop jar co...
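For illustration, here is a minimal Python 3 sketch of a mapper that guards the stopword load, so that a missing NLTK install or missing corpus data shows up as a message in the task logs instead of stopping the job. It is not the asker's code: the original mapper only echoes its input, so the filtering step, the `FALLBACK_STOPWORDS` set, and the helper names `load_stopwords` and `strip_stopwords` are all assumptions made for this example. The only NLTK call it relies on is `stopwords.words('english')`, which raises `LookupError` when the corpus data cannot be found.

```python
#!/usr/bin/env python3
"""Hypothetical Python 3 variant of mapper.py for Hadoop Streaming."""
import sys

# Hypothetical fallback list, used only when NLTK's corpus cannot be loaded.
FALLBACK_STOPWORDS = frozenset({"a", "an", "and", "the", "of", "to", "in", "is"})


def load_stopwords(language="english"):
    """Return NLTK's stopword list, or the fallback if NLTK or its data is missing."""
    try:
        from nltk.corpus import stopwords
        return frozenset(stopwords.words(language))
    except (ImportError, LookupError) as exc:
        # stderr goes to the task logs; stdout is reserved for mapper output.
        print("stopwords unavailable: %s" % exc, file=sys.stderr)
        return FALLBACK_STOPWORDS


def strip_stopwords(line, stop):
    """Drop whitespace-separated tokens whose lowercase form is in `stop`."""
    return " ".join(w for w in line.split() if w.lower() not in stop)


def main():
    stop = load_stopwords()
    for line in sys.stdin:
        print(strip_stopwords(line, stop))


if __name__ == "__main__":
    main()
```

Running the script locally with a line piped to its standard input is a quick way to check it before submitting it to the cluster, since Hadoop Streaming feeds the mapper through stdin in the same way.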