Web Scraping in Ruby

May 12, 2011

Hpricot is an HTML parser: a fantastic Ruby library that is easy to install and easy to use.

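To install:

sudo gem install hpricot

open-uri does not need installing. It is part of Ruby's standard library (there is no open-uri gem to fetch), and a plain require 'open-uri' lets you open a URL as a network stream, much like a file.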
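As a rough sketch of what open-uri gives you: the helper name fetch_page below is made up for illustration and is not part of the script in this post. It assumes Ruby 2.5 or later, where URI.open is the public entry point.

```ruby
require 'open-uri'

# Hypothetical helper (not from the original script):
# opens the URL as a stream and returns the page body as a String.
def fetch_page(url)
  URI.open(url) { |stream| stream.read }
end
```

Calling fetch_page with a page address hands back the raw HTML, which can then be passed to the parser.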

Here is a simple web scraping script.

It fetches the exam results for a group of students from the Anna University website.

# Fetch my class students' exam results from the Anna University site
# Program name: scrabing_exam_results.rb
# Author : Rajkumar.S
# version : 0.01
# License: GNU GPL 3

require 'rubygems'
require 'open-uri'
require 'hpricot'

url = ""
# exam_no is a range
exam_no = "52108621001".."52108621039"

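exam_no.each do |each_number|
  # fetch the result page (assumption: the exam number is appended to url;
  # on Ruby 3.0+ call URI.open instead of open)
  data = open(url + each_number).read
  # write a file in html format to easily view all results in one page
  File.open("result.html", "a") { |f| f.puts(data) }
  # find the inside content of the table tag
  table = Hpricot(data).search('table').inner_html
  # remove the html tags (assumption: the original pattern was lost,
  # this one drops anything between angle brackets)
  text = table.gsub(/<[^>]*>/, "")
  # separate into an array where \n is placed, dropping blank lines
  b = text.split("\n").map(&:strip).reject(&:empty?).join("\n")
  puts b + "\n" + "======================="
  File.open("result.txt", "a") { |f| f.puts(b + "\n\n" + "=================") }
end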
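The script relies on a range of strings for the exam numbers. A small sketch of how that range behaves, using the same two endpoints as above (the variable name numbers is only for illustration):

```ruby
# The same range as in the script; each step moves to the next exam number
exam_no = "52108621001".."52108621039"

# Expand the range into an array to inspect it
numbers = exam_no.to_a

puts numbers.size   # 39 exam numbers in total
puts numbers.first  # 52108621001
puts numbers.last   # 52108621039
```

Because the endpoints are strings, every element stays a string and can be joined onto a URL without calling to_s.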

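The tag-removing and splitting steps can be tried without any network access. This is only a sketch: the sample markup is made up, the helper name table_text is hypothetical, and the regular expression is a crude one that simply drops anything between angle brackets.

```ruby
# Made-up sample standing in for the inner HTML of a result table
sample = "<tr><td>Subject</td>\n<td>Grade</td></tr>\n"

# Hypothetical helper: strip the tags, split on newlines,
# then trim each line and drop the empty ones
def table_text(html)
  html.gsub(/<[^>]*>/, "").split("\n").map(&:strip).reject(&:empty?)
end

lines = table_text(sample)
puts lines.join("\n")
```

A pattern like this does not decode HTML entities or cope with angle brackets inside attribute values, so for anything beyond a simple results table the parser's own text methods are probably the safer route.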

Comments
  1. sanmugam k
    May 13, 2011 9:11 am

    super v good job..

  2. February 7, 2012 12:06 pm

    cool code…
    i tried to rewrite code to scrap results from madras university.
    but i can’t install the open-uri
    it tells:
    ERROR: Could not find a valid gem ‘open-uri’ (>= 0) in any repository

  3. manimaran
    February 8, 2012 10:19 am

    now working…

  4. February 9, 2012 10:54 pm


  5. June 7, 2012 7:41 pm

    After I start your Feed it appears to be a ton of nonsense, is the problem on my part?

