



Re: st: Using the -copy- command to download google ngram data


From   Muhammad Anees <anees@aneconomist.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: Using the -copy- command to download google ngram data
Date   Thu, 15 Dec 2011 10:38:34 +0500

Although it is neither simple nor efficient, I would suggest downloading
the zip manually, extracting the csv files, and then importing them into
Stata with -insheet-. It worked for me, at least.
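
A minimal sketch of that workflow, assuming the zip has already been saved
by hand (for example through a web browser) as download.zip in Stata's
current working directory. The name of the csv file inside the archive, the
tab delimiter, and the five-column layout are assumptions about the
2009-07-15 ngram files and should be checked against the extracted file:

```stata
* Assumption: download.zip was downloaded manually into the current directory.
* Extract the archive (unzipfile is available in Stata 11 and later).
unzipfile download.zip, replace

* Assumption: the archive holds one file with this name, with no header row
* and tab-separated values despite the .csv extension.
insheet using "googlebooks-eng-us-all-1gram-20090715-0.csv", tab nonames clear

* Assumed column order; verify against the dataset documentation.
rename v1 ngram
rename v2 year
rename v3 match_count
rename v4 page_count
rename v5 volume_count

describe
```

If the file turns out to be comma-separated, replace the tab option with
comma. The files are large, so the memory setting may need to be raised
before -insheet- will load them.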

On Wed, Dec 14, 2011 at 8:53 PM, Madsen,Paul
<paul.madsen@warrington.ufl.edu> wrote:
> Dear Statalist,
>
> I would like to download google's ngram data using stata's -copy- command. The data are located here: http://books.google.com/ngrams/datasets.
>
> I'm running Stata/SE 11.2 for windows 64 bit.
>
> Here's the relevant line of Stata code, which is intended to copy the zip file to a local directory and name it download.zip:
>
> copy http://commondatastorage.googleapis.com/books/ngrams/books/googlebooks-eng-us-all-1gram-20090715-0.csv.zip download.zip
>
> The web address in the code was taken from the google ngram website (by right clicking the link to the file and pasting it in stata).
>
> When I run this code, I get the error:
>
> file http://commondatastorage.googleapis.com/books/ngrams/books/googlebooks-eng-us-all-1gram-20090715-0.csv.zip not found
> server says file temporarily redirected to http://v5.lscache6.c.bigcache.googleapis.com/books/ngrams/books/googlebooks-eng-us-all-1gram-20090715-0.csv.zip
>
> This looks like an issue on google's end. If I copy the new file location from the error text and run the stata code:
>
> copy http://v5.lscache6.c.bigcache.googleapis.com/books/ngrams/books/googlebooks-eng-us-all-1gram-20090715-0.csv.zip download.zip
>
> I get the error message "unexpected end of file." This problem is not isolated to the specific google ngram file in the example code. I've tried it on several of them with the same problem. I have also tested the code on a different zip file from a different website and the code works well when it is used on another dataset.
>
> It is hard for me to believe that google's files would have some fundamental flaw that makes download directly to Stata impossible. Can something be done in Stata to deal with such a problem (maybe using the shell command)?
>
> Thanks!
>
> Paul E. Madsen
> University of Florida
>
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/



-- 

Regards
---------------------------
Muhammad Anees
Assistant Professor
COMSATS Institute of Information Technology
Attock 43600, Pakistan
www.aneconomist.com


