Statalist The Stata Listserver


Re: st: collecting raw data from the web via browser automation

From   "Austin Nichols" <>
Subject   Re: st: collecting raw data from the web via browser automation
Date   Wed, 24 May 2006 18:14:39 -0400

Thanks to everyone who responded, esp. Dave Armstrong (the first response,
off-list) and Neil Shephard, who points out:
<excerpted> The module that achieves this in Perl is the LWP (Library
for WWW in Perl).  This allows you to write a script which posts form
data. There is even an O'Reilly book on it, and an accompanying
web-page with (presumably) excerpted samples can be viewed at
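
For readers who have not scripted a form submission before, here is a
minimal sketch of the idea in Python rather than Perl/LWP (a swapped-in
language, not the module Neil describes). It only builds the POST request
and prints it; nothing is sent. The URL and the field names "q" and "num"
are hypothetical placeholders, not a real search form.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_form_post(url, fields):
    """Return a Request that would submit `fields` as an HTML form POST."""
    # Form fields are sent as a URL-encoded byte string in the request body.
    body = urlencode(fields).encode("ascii")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical search form: the URL and field names are assumptions.
req = build_form_post(
    "http://www.example.com/search",
    {"q": "meta-analysis", "num": 50},
)
print(req.get_method(), req.full_url)
print(req.data.decode("ascii"))
```

Passing the request to urllib.request.urlopen() would actually submit it;
the returned page could then be saved as text and read into Stata.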

so now I have my Google API key and am running perl scripts in a
Cygwin shell on a Windows box, and eagerly awaiting some books to come
from Amazon.  How many more non-Stata tradenames can I fit in one
Statalist post?  The reason for the query is partly loyalty to Stata,
but partly convenience: I know how to program in Stata, and the
meta-analysis will be run in Stata, so it would be helpful to have the
database of studies (perhaps with codes for study characteristics) in

The smaller point is this: running a Search interactively, getting 650
results and then clicking to get copies of some and bibliographic info
is just too painful.

I was *hoping* someone would have programmed a script to scrape Google
in C that could be adapted to become a Stata plugin.... ah well, can't
always get what you want...
but I did find, I got what I need.
