orbital-mechanics

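ruby - Does anyone know of a caching plugin for Ruby Mechanize?

I have a Mechanize-based Ruby script that scrapes a website. I would like to speed it up by caching the downloaded HTML pages locally, so that the whole "tweak output -> run -> tweak output" cycle is faster. I would rather not install an external cache on the machine just for this script. The ideal solution would plug into Mechanize and transparently cache fetched pages, images, and so on. Does anyone know of a library that does this? Or another way to achieve the same result (the script running faster the second time)?

Best answer: A good way to do this kind of thing is to use the (AWESOME) VCR gem. Here is an example of how you would do it:

    require 'vcr'
    require 'mechanize'
    # Setup VCR's conf…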
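The VCR example in the answer is cut off, so it is not reconstructed here. As a different, dependency-free illustration of the same goal (a faster second run), the following is a minimal sketch of a plain on-disk cache keyed by URL. The helper name `cached_body` and the `.page_cache` directory are hypothetical, and the block passed in is assumed to do the actual download (with Mechanize, something like `agent.get(url).body`).

```ruby
require 'digest'
require 'fileutils'

# Hypothetical helper (not part of Mechanize or VCR): returns the body
# cached on disk for `url` if there is one; otherwise calls the block to
# fetch it, stores the result, and returns it.
def cached_body(url, cache_dir: '.page_cache')
  FileUtils.mkdir_p(cache_dir)
  # Hash the URL so that any URL maps to a safe file name.
  path = File.join(cache_dir, Digest::SHA256.hexdigest(url))
  return File.binread(path) if File.exist?(path)

  body = yield(url) # assumption: the block performs the real request
  File.binwrite(path, body)
  body
end
```

Unlike VCR, this sketch never expires entries and ignores headers, cookies, and request methods; deleting the cache directory forces a fresh download.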

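ruby - Clicking a link selected by XPath with Mechanize

I want to click a Mechanize link that I selected with XPath (Nokogiri). How is this possible?

    next_page = page.search "//div[@class='grid-dataset-pager']/span[@class='currentPage']/following-sibling::a[starts-with(@class,'page')][1]"
    next_page.click

The problem is that Nokogiri elements have no click method. I cannot read the href (URL) and send a GET request, because the link defines an onclick function (it has no href attribute). If that is not possible, what are the alternatives?

Best answer: …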


ruby - How do I set the Referer header before loading a page with Ruby Mechanize?

Is there a straightforward way to set a custom header with Mechanize 2.3? I tried a former solution but got:

    $agent = Mechanize.new
    $agent.pre_connect_hooks
    ': undefined method `pre_connect_hooks' for nil:NilClass (NoMethodError)

Best answer: The docs say:

    get(uri, parameters = [], referer = nil, headers = {}) { |page| ... }

For example:

    agent.get 'http://www.google…
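Because the answer's own example is truncated, here is a minimal sketch of how the quoted `get(uri, parameters = [], referer = nil, headers = {})` signature could be wrapped. The helper name `get_with_referer` is hypothetical; `agent` is assumed to be a Mechanize instance, though any object whose `get` takes the same arguments would work.

```ruby
# Hypothetical helper built on the signature quoted from the docs:
#   get(uri, parameters = [], referer = nil, headers = {})
# `agent` is assumed to be a Mechanize instance.
def get_with_referer(agent, url, referer, extra_headers = {})
  # No query parameters are sent, so the second argument stays an empty array.
  agent.get(url, [], referer, extra_headers)
end
```

With a real agent this might be called as `get_with_referer(Mechanize.new, 'http://example.com/page', 'http://example.com/')`, where both URLs are placeholders.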


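ruby - Can Mechanize make JavaScript calls?

Can Mechanize make JavaScript calls? That would be handy for negotiating AJAX when screen scraping...

Best answer: No, it can't. If you need to interact with JavaScript, you should look at other solutions, such as Watir.

Regarding "ruby - Can Mechanize make JavaScript calls?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/2946323/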


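ruby - Submitting multiple values for the same key in a POST request with Ruby Mechanize

How do I submit a POST request with Ruby's Mechanize gem in which the same key has multiple values? For example, I want to send foo=1 and foo=2. I tried

    parameters = {'foo' => ['1', '2']}
    Mechanize.new.post('http://somewebsite.com', parameters)

but using requestb.in, I only get '12' for 'foo', rather than '1' as one value of 'foo' and '2' as another value of 'foo'.

Also: the reason I am doing this is that I want to select multiple values in a multi-select list, but calling select_all on the select list and submitting the form does not seem to work, so I am trying to submit the POST data manually.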
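No answer is included for this question. As an illustration of what the desired request body looks like, the following sketch uses Ruby's standard `URI.encode_www_form`, which repeats a key once per element when the value is an array. It only builds the body; sending it through Mechanize is not shown, and whether a given Mechanize version accepts a pre-encoded string should be checked against its documentation.

```ruby
require 'uri'

# Build an application/x-www-form-urlencoded body in which `foo` repeats.
params = { 'foo' => ['1', '2'] }
body = URI.encode_www_form(params)

puts body # foo=1&foo=2
```

An array of pairs such as `[['foo', '1'], ['foo', '2']]` encodes to the same string and makes the ordering explicit.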


ruby Mechanize 404 => Net::HTTPNotFound

I have a URL that I cannot access with Mechanize, and I don't know why:

    # Use ruby 2.1.6
    require 'mechanize'
    require 'axlsx' # 2.0.1
    require 'roo' # 1.13.2
    mechanize = Mechanize.new
    mechanize.request_headers = {"Accept-Encoding" => ""}
    mechanize.ignore_bad_chunking = true
    mechanize.follow_meta_refresh = true
    xlsx = Roo::Excelx.new("./base_list.xlsx")
    xlsx.each_with…


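ruby - How do I make Mechanize automatically convert the body to UTF-8?

I found some solutions that use post_connect_hook and pre_connect_hook, but they don't seem to work. I am using the latest Mechanize version (2.1). The new version has no [:response] field, and I don't know where to find it in the new version.

https://gist.github.com/search?q=pre_connect_hooks
https://gist.github.com/search?q=post_connect_hooks

Is it possible to have Mechanize return a UTF-8 encoded version without having to convert it manually with iconv?

Best answer: Since Mechani…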
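The answer is truncated, so the Mechanize-specific approach it describes is not reproduced here. As a separate illustration of converting a body without iconv, here is a minimal sketch using Ruby's built-in `String#encode`. The helper name `to_utf8` is hypothetical, and the source charset is assumed to be known already (for example from the Content-Type header).

```ruby
# Hypothetical helper: re-encode a response body to UTF-8 with
# String#encode instead of iconv. `source_charset` is assumed to be known.
def to_utf8(body, source_charset)
  body.dup.force_encoding(source_charset)
      .encode('UTF-8', invalid: :replace, undef: :replace)
end

latin1 = "caf\xE9".b # the bytes of "café" in ISO-8859-1
puts to_utf8(latin1, 'ISO-8859-1') # café
```

The `invalid: :replace` and `undef: :replace` options substitute a replacement character for bytes that cannot be converted rather than raising an error, which is usually more convenient when scraping.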


ruby - Using a login form with Mechanize

I know there are posts on Stack Overflow very similar to this one, but I still can't figure out what is wrong with my attempt.

    # log in to the site
    mech.get(base_URL) do |page|
      l = page.form_with(:action => "/site/login/") do |f|
        username_field = f.field_with(:name => "LoginForm[username]")
        username_field.value = userName
        password_field = f.field_with(:name => "LoginForm[password]")
        password_field.va…


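ruby-on-rails - Finding a string in an HTML page with Mechanize

I am trying to find out whether a given string, say "Hello", exists in a given page. So far I have the following:

    agent = Mechanize.new
    page = agent.get('http://www.google.de/')

What should I do now? I have seen the search method, but it only accepts XPath/CSS expressions. I could try searching with XPath, but is there a better way?

Best answer: You can simply do a general text search:

    page.body.include?('Hello')

However, when searching for a specific HTML node, I would suggest using a CSS selector like this:

    page.parser.css('#my_con…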


ruby - Mechanize: Select link by classname and other questions

I am currently looking at Mechanize. I am new to Ruby, so please bear with me. I wrote a small test script:

    require 'rubygems'
    require 'mechanize'
    agent = WWW::Mechanize.new
    page = agent.get('http://www.google.de')
    pp page.title
    google_form = page.form_with(:name => 'f')
    google_form.q = 'test'
    page = agent.submit(google_form)
    pp page.title
    page_links = Array.new
    page.links.each do |ll|…
