
From: Ulrich Kohler <kohler@wzb.eu>
To: statalist@hsphsun2.harvard.edu
Subject: Re: st: Insheet with dictionary
Date: Wed, 15 Jul 2009 14:37:35 +0200

Ekaterina Hertog wrote
> Dear Statalist,
>
> I need to import 12 large datasets from Excel to Stata and I was
> wondering whether anyone could suggest a more efficient way than the
> one I came up with.
>
> The variables in my datasets are named in Japanese so if I simply
> insheet the datasets the resulting Stata files have v1, v2 etc instead
> of variable names and contain question marks instead of variable
> labels. My datasets all have similar structure, so I thought that the
> most efficient way would be writing a dictionary for one of them and
> use this dictionary to insheet the relevant Excel file and then
> modifying the dictionary for each subsequent dataset (they are
> similar, but not identical).
>
> The problem is that writing even one dictionary is proving extremely
> repetitive work and I was wondering whether there might be a way of
> somehow automating it with something like foreach?
>
> The beginning of my dictionary looks as follows:
>
> dictionary using general data Tokyo ku merged.csv {
>
> totalall1519 labmarkall1519 wrkall1519 paywrkall1519 unempall1519
> notlabmarkall1519 empstunknall1519 unemprateall1519 unmarrall1519
> marrall1519 widall1519 divall1519 marstunkn1519
>
> }
>
> The numbers in the name of each variable mean age => these are various
> demographic characteristics of 15-19 year olds, the dataset then goes
> on and contains the same demographic measurements for older age groups
> until 70-74 year olds. I of course can simply copy paste this part of
> the dictionary for each age group and only replace the numbers
> representing age groups, but I was wondering if anyone could suggest a
> better way?
>
> I would be most grateful for any advice,

and I answered:

> Perhaps something along the following lines:
>
> -----------------------------------------------------------------------
> forv lage=15(5)70 {
>   foreach name in total labmark wrk paywer unem notlab empunkn ///
>     unemprate unmarr widall divall marstunkn {
>     local varnames `"`varnames' `name'`lage'`=`lage'+4'"'
>   }
> }
>
> di "`varnames'"
>
> file open dict using mydict.dct, replace write
> file write dict `"dictionary using general data Tokyo ku merged.csv
> { "' ///
>   _n `" `varnames' "' ///
>   _n `" } "'
> file close dict
> exit
>
> -----------------------------------------------------------------------
>
> This produces the dictionary file mydict.dct. Of course it makes
> assumptions about your data and will fail if these are not met.

Here is a version that is more robust against wrapping of long lines by
the E-mail system:

----------------------------------------------------------------
forv lage=15(5)70 {
  foreach name in total labmark wrk paywer unem notlab ///
    empunkn unemprate unmarr widall divall marstunkn {
    local varnames `"`varnames' `name'`lage'`=`lage'+4'"'
  }
}
file open dict using mydict.dct, replace write
file write dict ///
  `"dictionary using general data Tokyo ku merged.csv { "' ///
  _n `" `varnames' "' ///
  _n `" } "'
file close dict
exit
-----------------------------------------------------------------

--
kohler@wzb.eu
030 25491-361
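The loop in the reply builds names such as total1519, whereas the names listed in the question look like totalall1519. A variant that takes its stubs directly from the question's variable list might look like the sketch below. It is untested and not part of the original reply. The stub list is copied from the question; the local names stubs and uage, the quotes around the file name (added because it contains spaces), and the one-name-per-line layout are assumptions made for illustration.

```stata
// Stubs copied from the 15-19 variable names in the question;
// the age-group suffix (1519, 2024, ..., 7074) is appended in the loop.
local stubs totalall labmarkall wrkall paywrkall unempall notlabmarkall ///
    empstunknall unemprateall unmarrall marrall widall divall marstunkn

file open dict using mydict.dct, write text replace

// Assumption: the file name needs quotes because it contains spaces.
file write dict `"dictionary using "general data Tokyo ku merged.csv" {"' _n

forvalues lage = 15(5)70 {
    local uage = `lage' + 4          // upper bound of the age group
    foreach stub of local stubs {
        // one variable name per dictionary line
        file write dict "    `stub'`lage'`uage'" _n
    }
}

file write dict "}" _n
file close dict
```

Writing each name as it is generated, rather than collecting all names in one macro first, keeps every line of mydict.dct short. Like the code above, this assumes that every age group from 15-19 to 70-74 carries the same set of variables in the same order; the dictionary will not match the data if that does not hold.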

