Statalist



Re: st: Insheet with dictionary


From   Ulrich Kohler <[email protected]>
To   [email protected]
Subject   Re: st: Insheet with dictionary
Date   Wed, 15 Jul 2009 14:37:35 +0200

Ekaterina Hertog wrote:
> > Dear Statalist,
> > I need to import 12 large datasets from Excel into Stata, and I was wondering whether anyone could suggest a more efficient way than the one I came up with.
> > The variables in my datasets are named in Japanese, so if I simply insheet the datasets, the resulting Stata files have v1, v2, etc. instead of variable names and contain question marks instead of variable labels. My datasets all have a similar structure, so I thought the most efficient way would be to write a dictionary for one of them, use it to insheet the relevant Excel file, and then modify the dictionary for each subsequent dataset (they are similar, but not identical).
> > The problem is that writing even one dictionary is proving to be extremely repetitive work, and I was wondering whether there might be a way of automating it with something like foreach?
> > 
> > 
> > 
> > The beginning of my dictionary looks as follows: 
> > 
> > dictionary using general data Tokyo ku merged.csv {
> > totalall1519 labmarkall1519 wrkall1519 paywrkall1519 unempall1519 notlabmarkall1519 empstunknall1519 unemprateall1519 unmarrall1519 marrall1519 widall1519 divall1519 marstunkn1519
> > }  
> > 
> > The numbers in each variable name give the age group: these are various demographic characteristics of 15-19 year olds. The dataset then goes on to contain the same demographic measurements for older age groups, up to 70-74 year olds. I can of course simply copy and paste this part of the dictionary for each age group and replace only the numbers representing the age groups, but I was wondering if anyone could suggest a better way?
> > I would be most grateful for any advice,

and I answered:

> Perhaps something along the following lines:
> 
> -----------------------------------------------------------------------
> forv lage=15(5)70 {
> 	foreach name in totalall labmarkall wrkall paywrkall unempall ///
> 	  notlabmarkall empstunknall unemprateall unmarrall marrall ///
> 	  widall divall marstunkn {
> 		local varnames `"`varnames' `name'`lage'`=`lage'+4'"'
> 	}
> }
> 
> di "`varnames'"
> 
> file open dict using mydict.dct, replace write
> file write dict `"dictionary using general data Tokyo ku merged.csv
> { "' ///
> 	_n `" `varnames' "' ///
> 	_n `" } "'
> file close dict
> exit
> 
> -----------------------------------------------------------------------
> 
> This produces the dictionary file mydict.dct. Of course it makes
> assumptions about your data (for example, that every age group holds
> the same variables in the same order) and will fail if these are not met.

Here is a version that is more robust against wrapping of long lines by
the E-mail system:


----------------------------------------------------------------
* Build the list of variable names: one per stub and age group
forv lage=15(5)70 {
  foreach name in totalall labmarkall wrkall paywrkall ///
    unempall notlabmarkall empstunknall unemprateall ///
    unmarrall marrall widall divall marstunkn {
       local varnames `"`varnames' `name'`lage'`=`lage'+4'"'
  }
}

* Write the dictionary; the file name contains blanks, so it is quoted
file open dict using mydict.dct, replace write
file write dict ///
   `"dictionary using "general data Tokyo ku merged.csv" { "' ///
   _n `" `varnames' "' ///
   _n `" } "'
file close dict
exit
-----------------------------------------------------------------
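
Before trusting the generated dictionary, it may be worth checking the name list itself. The following is only a sketch: it rebuilds the list under the assumption that every age group has the same 13 variables, with stubs copied from the 15-19 block in the question, and then counts and inspects the result. The macro name n is introduced here for illustration only.

```stata
* Rebuild the name list from scratch (assumption: the same 13 stubs,
* copied from the question, apply to every age group in this order)
local varnames
forvalues lage = 15(5)70 {
    foreach name in totalall labmarkall wrkall paywrkall unempall ///
        notlabmarkall empstunknall unemprateall unmarrall marrall ///
        widall divall marstunkn {
        local varnames `"`varnames' `name'`lage'`=`lage'+4'"'
    }
}

* 12 age groups (15-19 up to 70-74) times 13 stubs should give 156 names
local n : word count `varnames'
display "number of names: `n'"
display "first: `: word 1 of `varnames''"
display "last:  `: word `n' of `varnames''"

* Stata variable names may be at most 32 characters long
foreach v of local varnames {
    assert length("`v'") <= 32
}
```

If the count differs from the number of columns in the CSV file, the stub list does not match the data and the dictionary will assign names to the wrong columns. Once mydict.dct has been written, typing -type mydict.dct- shows the file for a check by eye.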



-- 
[email protected]
030 25491-361



