Re: st: collecting raw data from the web via browser automation
Thanks to everyone who responded, esp. Dave Armstrong (the first response,
off-list) and Neil Shephard, who points out:
<excerpted> The module that achieves this in Perl is the LWP (Library
for WWW in Perl). This allows you to write a script which posts form
data. There is even an O'Reilly book on it, and an accompanying
web-page with (presumably) excerpted samples can be viewed at
So now I have my Google API key and am running Perl scripts in a
Cygwin shell on a Windows box, and eagerly awaiting some books to come
from Amazon. How many more non-Stata tradenames can I fit in one
Statalist post? The reason for the query is partly loyalty to Stata,
but partly convenience: I know how to program in Stata, and the
meta-analysis will be run in Stata, so it would be helpful to have the
database of studies (perhaps with codes for study characteristics) in
The smaller point is this: running a Search interactively, getting 650
results and then clicking to get copies of some and bibliographic info
is just too painful.
I was *hoping* someone would have programmed a script to scrape Google
in C that could be adapted to become a Stata plugin.... ah well, can't
always get what you want...
but I did find, I got what I need.