Focus on scRUBYt! v0.4.11 the powerful web scraping tool

23 March 2010

scRUBYt! is a simple but powerful web scraping toolkit written in Ruby. Its purpose is to free you from the drudgery of web page crawling and of looking up HTML tags, attributes, XPaths, form names and other typical low-level web scraping details. It figures these out from examples that you copy and paste from the web page or straight from Firebug.
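
To make the "learn from an example" idea concrete, here is a minimal sketch in plain Ruby. It is not scRUBYt!'s code or API: it uses REXML from the Ruby distribution instead of hpricot, and the sample page and the helper names (`find_by_example`, `generalized_xpath`, `extract`) are hypothetical. It only illustrates the general technique of locating an element by a pasted example string, deriving an XPath from it, and reusing that XPath to pull out the similar elements.

```ruby
require 'rexml/document'

# Hypothetical sample page, kept well-formed so REXML can parse it.
PAGE = <<~HTML
  <html>
    <body>
      <table>
        <tr><td>Ruby</td><td>1995</td></tr>
        <tr><td>Python</td><td>1991</td></tr>
        <tr><td>Perl</td><td>1987</td></tr>
      </table>
    </body>
  </html>
HTML

# Locate the first element whose leading text equals the pasted example.
def find_by_example(doc, example)
  REXML::XPath.match(doc, '//*').find { |el| el.text.to_s.strip == example }
end

# Build an XPath from the element's ancestor names. Only the leaf keeps a
# position (its index among same-named siblings), so the path also matches
# the corresponding element under every sibling parent.
def generalized_xpath(el)
  siblings = el.parent.elements.select { |s| s.name == el.name }
  position = siblings.index { |s| s.equal?(el) } + 1

  names = []
  node = el
  until node.nil? || node.is_a?(REXML::Document)
    names.unshift(node.name)
    node = node.parent
  end
  "/#{names.join('/')}[#{position}]"
end

# Evaluate the derived XPath and return the text of every match.
def extract(doc, xpath)
  REXML::XPath.match(doc, xpath).map { |el| el.text.to_s.strip }
end

doc  = REXML::Document.new(PAGE)
path = generalized_xpath(find_by_example(doc, 'Ruby'))
puts path
puts extract(doc, path).inspect
```

Giving the sketch the single example "Ruby" yields a path that addresses the first cell of every table row, so the other entries in that column come along without being named. A real toolkit has to cope with malformed HTML, attributes and ambiguous examples, all of which this sketch ignores.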

scRUBYt! has only two required dependencies, hpricot and mechanize. FireWatir is an optional third, needed only for AJAX scraping.

Changes:

[NEW] possibility to use FireWatir as the agent for scraping (credit: Glenn Gillen, Glen Gillen and... did I mention Glenn already?)
[FIX] navigation doesn’t crash if a 404/500 is returned (credit: Glenn Gillen)
[NEW] navigation action: click_by_xpath to click arbitrary elements
[MOD] dropped dependencies: RubyInline, ParseTree, Ruby2Ruby (hooray for win32 users)
[NEW] scraping through frames (e.g. Google Analytics)
[MOD] exporting is temporarily broken; for now, generated XPaths are printed to the screen
[MOD] possibility to wait after clicking link/filling textfield (to be able to scrape inserted AJAX stuff)
[NEW] possibility to fetch from a string, by specifying nil as the url and the html string with the :html option
[FIX] FireWatir slowness (credit: jak4)
[FIX] lots of bug fixes and stability fixes

scRUBYt! is free, open source software licensed under the GNU General Public License, version 2. It is developed by Peter Szinek, Glenn Gillen and a team of core contributors.

Compliance Mandates

Application Scanner:
PCI/DSS 6.3, SOX A12.4, GLBA 16 CFR 314.4(b) and (2), HIPAA 164.308(a)(1)(i), FISMA RA-5, SA-11, SI-2, ISO 27001/27002 12.6, 15.2.2
