Statalist The Stata Listserver



Re: st: collecting raw data from the web via browser automation


From   "Neil Shephard" <[email protected]>
To   [email protected]
Subject   Re: st: collecting raw data from the web via browser automation
Date   Tue, 23 May 2006 13:57:23 +0800

On 5/23/06, Michael Blasnik <[email protected]> wrote:
I think what the original post asked for (and what I would be interested in
as well) is a way to access web pages that are only created when an action
is taken or selection is made on a different web page, so there is no
specific web address that holds the data you want.  I have thought about
trying to use auto-it or another scripting language to launch a browser,
make selections on a web page and then capture the data that's spawned
typically in a new window.

Do any of the tools mentioned by Kit or Phil actually do this?

I've recently started looking into using Perl to achieve exactly this:
pass some parameters to a web page that has a form on it, and retrieve
the information that the page returns.

The Perl module that does this is LWP (Library for WWW in Perl).  It
lets you write a script that posts form data and reads the response.

There is even an O'Reilly book on it, and an accompanying web page
with (presumably) excerpted samples can be viewed at
http://perl.com/pub/a/2002/08/20/perlandlwp.html (see page 2 for how to
write a script that submits a query to AltaVista).

So yes it is possible, in Perl at least, and I'd imagine that Python
has similar capabilities (although I'm not aware of them).
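
For anyone who wants to try the Python route, here is a minimal sketch of
the same idea using only Python's standard library (urllib).  It is not
taken from the LWP article.  The function names build_form_request and
fetch_form_result are my own, and any URL or field names you pass in are
assumptions that have to be read off the real page's <form> element.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_form_request(url, fields):
    """Return a POST request carrying `fields` as URL-encoded form data."""
    # urlencode produces "name=value&name2=value2" with percent-escapes,
    # which is what a browser sends for an ordinary HTML form.
    body = urlencode(fields).encode("ascii")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def fetch_form_result(url, fields, timeout=30):
    """Submit the form and return the page that comes back, as text."""
    request = build_form_request(url, fields)
    with urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 if the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling fetch_form_result("http://www.example.com/cgi-bin/query",
{"state": "MA", "year": "2005"}) (a made-up address and made-up fields)
would return the generated page as a string, which could then be saved
to a file and read into Stata.  This only works when the data come back
from a plain form submission; pages that depend on JavaScript running in
the browser would still need something like the browser scripting
Michael describes.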

Neil
--
"Laziness is nothing more than the habit of resting before you get
tired."  - Jules Renard

()  ascii ribbon campaign - against html mail
/\                        - against microsoft attachments
(www.gnu.org/philosophy/no-word-attachments.html)

Email - [email protected] / [email protected]
Website - http://slack.ser.man.ac.uk/



