Statalist The Stata Listserver



Re: st: collecting raw data from the web via browser automation


From   "Austin Nichols" <austinnichols@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: collecting raw data from the web via browser automation
Date   Wed, 24 May 2006 18:14:39 -0400

Thanks to everyone who responded, especially Dave Armstrong (the first
response, off-list) and Neil Shephard, who points out:
<excerpted> The module that achieves this in Perl is the LWP (Library
for WWW in Perl).  This allows you to write a script which posts form
data. There is even an O'Reilly book on it, and an accompanying
web-page with (presumably) excerpted samples can be viewed at
http://perl.com/pub/a/2002/08/20/perlandlwp.html
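
For anyone who has not seen this kind of script, here is a minimal sketch of what "posting form data" amounts to. It uses Python's standard library rather than Perl's LWP, purely for illustration. The URL and the field names (q, num) are hypothetical placeholders, not any real search form, and the request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_form_post(url, fields):
    """Build (but do not send) an HTTP POST carrying URL-encoded form fields."""
    # Encode the fields the way a browser encodes a submitted form.
    body = urlencode(fields).encode("ascii")
    return Request(
        url,
        data=body,  # supplying a body makes urllib treat this as a POST
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


if __name__ == "__main__":
    # Hypothetical endpoint and field names, chosen only for this example.
    req = build_form_post("http://example.com/search",
                          {"q": "meta-analysis", "num": "10"})
    print(req.get_method())
    print(req.data)
    # urllib.request.urlopen(req) would actually submit the form.
```

The LWP approach described above follows the same pattern: encode the fields, send them to the form's target URL, and read back the response.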

so now I have my Google API key and am running Perl scripts in a
Cygwin shell on a Windows box, and eagerly awaiting some books to come
from Amazon.  How many more non-Stata tradenames can I fit in one
Statalist post?  The reason for the query is partly loyalty to Stata,
but partly convenience: I know how to program in Stata, and the
meta-analysis will be run in Stata, so it would be helpful to have the
database of studies (perhaps with codes for study characteristics) in
Stata.
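
One way to get scraped results into Stata is to have the script write a comma-separated file. The following is only a sketch under assumptions: the variable names (author, year, randomized) and the sample record are made up for illustration and do not come from any actual study database.

```python
import csv
import io


def studies_to_csv(studies, fieldnames):
    """Return the study records as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(studies)
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical record; "randomized" stands in for a study-characteristic code.
    studies = [{"author": "Smith", "year": 2001, "randomized": 1}]
    text = studies_to_csv(studies, ["author", "year", "randomized"])
    # "studies.csv" is an arbitrary file name for this example.
    with open("studies.csv", "w", newline="") as f:
        f.write(text)
```

In Stata, a file like this could then be read with something like "insheet using studies.csv, comma clear" (or "import delimited" in later versions).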

The smaller point is this: running a search interactively, getting 650
results, and then clicking through to get copies of some along with
their bibliographic info is just too painful.

I was *hoping* someone would have programmed a script to scrape Google
in C that could be adapted to become a Stata plugin.... ah well, can't
always get what you want...
but I did find, I got what I need.

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


