Statalist The Stata Listserver



Re: st: collecting raw data from the web via browser automation


From   "Michael Blasnik" <[email protected]>
To   <[email protected]>
Subject   Re: st: collecting raw data from the web via browser automation
Date   Tue, 23 May 2006 08:16:31 -0400

Thanks for the info. If I do start to learn Perl (I guess it's about time), will I be able to run it seamlessly from within Stata through the shell command, including passing parameters? Will it also be able to launch JavaScript?

Michael Blasnik
[email protected]

----- Original Message ----- From: "Neil Shephard" <[email protected]>
To: <[email protected]>
Sent: Tuesday, May 23, 2006 1:57 AM
Subject: Re: st: collecting raw data from the web via browser automation



On 5/23/06, Michael Blasnik <[email protected]> wrote:

I think what the original post asked for (and what I would be interested in
as well) is a way to access web pages that are only created when an action
is taken or selection is made on a different web page, so there is no
specific web address that holds the data you want. I have thought about
trying to use AutoIt or another scripting language to launch a browser,
make selections on a web page, and then capture the data that is
generated, typically in a new window.

Do any of the tools mentioned by Kit or Phil actually do this?

I've recently started looking into using Perl to achieve exactly this:
passing some parameters to a web page that has a form on it and
retrieving the information that the page returns.

The module that achieves this in Perl is LWP (the Library for WWW in
Perl).  It allows you to write a script that posts form data.

There is even an O'Reilly book on it, and an accompanying web page
with (presumably) excerpted samples can be viewed at
http://perl.com/pub/a/2002/08/20/perlandlwp.html (see page 2 on how to
write a script that submits a query to AltaVista).

So yes, it is possible, in Perl at least, and I'd imagine that Python
has similar capabilities (although I'm not familiar with them).
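
For the Python side, a minimal sketch of the same idea (send form
fields to a page and read back what it returns), using only the
standard library, might look like the following.  The URL and the
field names are made up for illustration, and the helper names
build_form_request and submit_form are my own, not part of any
library.  It assumes the form accepts an ordinary URL-encoded POST.

```python
import urllib.parse
import urllib.request


def build_form_request(url, fields):
    """Return a POST request carrying `fields` as URL-encoded form data."""
    # urlencode percent-escapes anything non-ASCII, so the result is pure ASCII.
    body = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def submit_form(url, fields, timeout=30):
    """Send the form and return the response body as text."""
    request = build_form_request(url, fields)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


if __name__ == "__main__":
    # Hypothetical form URL and field names, for illustration only.
    req = build_form_request(
        "http://example.com/search", {"q": "stata", "year": "2006"}
    )
    # Show what would be sent without actually contacting the server.
    print(req.get_method(), req.full_url)
    print(req.data.decode("ascii"))
```

Calling submit_form(url, fields) would perform the actual request.
Note that this talks to the server directly over HTTP and does not run
any JavaScript on the page, so it only helps when the data can be
obtained by submitting the form fields themselves.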

Neil


