Statalist The Stata Listserver


Re: st: collecting raw data from the web via browser automation

From   "Neil Shephard" <>
Subject   Re: st: collecting raw data from the web via browser automation
Date   Tue, 23 May 2006 13:57:23 +0800

On 5/23/06, Michael Blasnik <> wrote:
> I think what the original post asked for (and what I would be interested in
> as well) is a way to access web pages that are only created when an action
> is taken or selection is made on a different web page, so there is no
> specific web address that holds the data you want.  I have thought about
> trying to use auto-it or another scripting language to launch a browser,
> make selections on a web page and then capture the data that's spawned
> typically in a new window.
>
> Do any of the tools mentioned by Kit or Phil actually do this?

I've recently started looking into using Perl to achieve exactly this:
passing some parameters to a web page that has a form on it and
retrieving the information that the page returns.

The module that achieves this in Perl is LWP (Library for WWW in
Perl).  It allows you to write a script which posts form data.

There is even an O'Reilly book on it, and an accompanying web page
with (presumably) excerpted samples can be viewed at (see page 2 on how to
write a script which submits a query to AltaVista).

So yes, it is possible, in Perl at least, and I'd imagine that Python
has similar capabilities (although I'm not aware of them).
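
For what it's worth, the same idea (post form data, read back whatever
the server returns) can be sketched in Python with nothing but the
standard library.  This is only a minimal sketch, not a tested recipe
for any particular site: the function names build_form_request and
submit_form are made up for illustration, and the URL and field names
in the usage note are hypothetical placeholders.  It also assumes the
form submits over plain HTTP; pages that build their results with
JavaScript would still need a real browser driven by a script.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_form_request(url, fields):
    """Build a POST request carrying `fields` as URL-encoded form data.

    `url` and `fields` are supplied by the caller; nothing here is
    specific to any real site.
    """
    body = urlencode(fields).encode("ascii")
    # Supplying `data` makes urllib send the request as a POST.
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def submit_form(url, fields, timeout=30):
    """Submit the form and return the response body as text."""
    with urlopen(build_form_request(url, fields), timeout=timeout) as response:
        # Fall back to UTF-8 if the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Usage would look something like
submit_form("http://example.com/search", {"q": "stata", "year": "2006"}),
with the address and the field names replaced by whatever the form on
the target page actually uses (view the page source to find them).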

"Laziness is nothing more than the habit of resting before you get
tired."  - Jules Renard

()  ascii ribbon campaign - against html mail
/\                        - against microsoft attachments

