LinkExtractor_草庐IT

python - Difference between LinkExtractor and SgmlLinkExtractor

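I am new to the Scrapy framework. I have seen some tutorials that use LinkExtractor and others that use SgmlLinkExtractor. I tried to find the differences and trade-offs between the two, but the results were not satisfying. Can anyone tell me how they differ, and when each extractor should be used? Thanks!

Best answer: The reason you cannot find references to SgmlLinkExtractor is that it is now deprecated (see the related changeset). Its definition can still be found in the Scrapy 0.24 documentation. You should no longer use SgmlLinkExtractor - Scrapy now keeps only one link extractor - L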

scrapy - Understanding CrawlSpider and LinkExtractor

So, I am trying to use CrawlSpider and to understand the following example from the Scrapy documentation:

    import scrapy
    from scrapy.spiders import CrawlSpider, Rule
    from scrapy.linkextractors import LinkExtractor

    class MySpider(CrawlSpider):
        name = 'example.com'
        allowed_domains = ['example.com']
        start_urls = ['http://www.example.com']

        rules = (
            # Extract links matching 'category.php' (but not matching 'subsecti