Statalist


Re: st: Insheet with dictionary


From   Ulrich Kohler <[email protected]>
To   [email protected]
Subject   Re: st: Insheet with dictionary
Date   Thu, 16 Jul 2009 16:18:12 +0200

On Thursday, 16.07.2009, at 14:59 +0100, Ekaterina Hertog wrote:
> Dear Dr. Kohler,
> Thank you so much for the advice. This worked beautifully for me, with only two small hitches.
> One was that Stata seemed to complain that my dictionary name contained spaces, but it worked fine once I changed the name to something without spaces. Secondly, the dictionary generated with forv for some reason did not add a line break after the last }, so when I embedded the dictionary-generating lines into the rest of my do-file, it refused to run continuously: I always have to stop after generating the dictionary, open it, and press Enter manually so that Stata sees that the last } is on its own separate line. I wonder if there might be a fix for it?
> But generally it is a really minor inconvenience; your advice is saving me a lot of time!
> thank you very much again,

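This one should fix both problems: the data file name is now enclosed in double quotes, and a trailing _n writes a line break after the closing }.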

----------------------------------------------------------------
forv lage=15(5)70 {
   foreach name in total labmark wrk paywer unem notlab ///
     empunkn unemprate unmarr  widall divall marstunkn {
        local varnames `"`varnames' `name'`lage'`=`lage'+4'"'
   }
}
  
file open dict using mydict.dct, replace write
file write dict ///
   `"dictionary using "general data Tokyo ku merged.csv" { "' ///
	_n `" `varnames' "' ///
	_n `" } "' _n
file close dict
exit
-----------------------------------------------------------------
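
Once the do-file above has written mydict.dct, the dictionary could be checked and used roughly as follows. This is only a sketch: it assumes that mydict.dct and the CSV file sit in the current working directory, and that the raw data can be read with -infile- through a dictionary (the command that accepts dictionaries; -insheet- itself does not take one). Whether the variables line up correctly depends on the layout of your data file.

```stata
* Display the generated dictionary to confirm that the closing brace
* sits on its own line (assumes mydict.dct is in the working directory)
type mydict.dct

* Read the raw data through the dictionary, replacing any data in memory
infile using mydict.dct, clear

* Inspect the variable names that were created
describe
```

If -type- shows the last } on its own line, the manual editing step should no longer be needed.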
 




> warm regards,
> Ekaterina
> 
> In message <1247661455.17101.23.camel@kohler> [email protected] writes:
> > Ekaterina Hertog wrote
> > > > Dear Statalist,
> > > > I need to import 12 large datasets from Excel to Stata and I was wondering whether anyone could suggest a more efficient way than the one I came up with.
> > > > The variables in my datasets are named in Japanese so if I simply insheet the datasets the resulting Stata files have v1, v2 etc instead of variable names and contain question marks instead of variable labels. My datasets all have similar structure, so I thought that the most efficient way would be writing a dictionary for one of them and use this dictionary to insheet the relevant Excel file and then modifying the dictionary for each subsequent dataset (they are similar, but not identical).
> > > > The problem is that writing even one dictionary is proving extremely repetitive work and I was wondering whether there might be a way of somehow automating it with something like foreach?
> > > > 
> > > > 
> > > > 
> > > > The beginning of my dictionary looks as follows: 
> > > > 
> > > > dictionary using general data Tokyo ku merged.csv {
> > > > totalall1519 labmarkall1519 wrkall1519 paywrkall1519 unempall1519 notlabmarkall1519 empstunknall1519 unemprateall1519 unmarrall1519 marrall1519 widall1519 divall1519 marstunkn1519
> > > > }  
> > > > 
> > > > The numbers in the name of each variable mean age => these are various demographic characteristics of 15-19 year olds, the dataset then goes on and contains the same demographic measurements for older age groups until 70-74 year olds. I of course can simply copy paste this part of the dictionary for each age group and only replace the numbers representing age groups, but I was wondering if anyone could suggest a better way?
> > > > I would be most grateful for any advice,
> > 
> > and I answered:
> > 
> > > Perhaps something along the following lines:
> > > 
> > > -----------------------------------------------------------------------
> > > forv lage=15(5)70 {
> > > 	foreach name in total labmark wrk paywer unem notlab empunkn ///
> > > 	  unemprate unmarr  widall divall marstunkn {
> > > 		local varnames `"`varnames' `name'`lage'`=`lage'+4'"'
> > > 	}
> > > }
> > > 
> > > di "`varnames'"
> > > 
> > > file open dict using mydict.dct, replace write
> > > file write dict `"dictionary using general data Tokyo ku merged.csv
> > > { "' ///
> > > 	_n `" `varnames' "' ///
> > > 	_n `" } "'
> > > file close dict
> > > exit
> > > 
> > > -----------------------------------------------------------------------
> > > 
> > > This produces the dictionary file mydict.dct. Of course it makes
> > > assumptions about your data and will fail if these are not met. 
> > 
> > Here is a version that is more robust against wrapping of long lines by
> > the E-mail system:
> > 
> > 
> > ----------------------------------------------------------------
> > forv lage=15(5)70 {
> >   foreach name in total labmark wrk paywer unem notlab ///
> >     empunkn unemprate unmarr  widall divall marstunkn {
> >        local varnames `"`varnames' `name'`lage'`=`lage'+4'"'
> >   }
> > }
> >  
> > file open dict using mydict.dct, replace write
> > file write dict ///
> >    `"dictionary using general data Tokyo ku merged.csv { "' ///
> > 	_n `" `varnames' "' ///
> > 	_n `" } "'
> > file close dict
> > exit
> > -----------------------------------------------------------------
> > 
> > 
> > 
> > -- 
> > [email protected]
> > 030 25491-361
> > 
> > *
> > *   For searches and help try:
> > *   http://www.stata.com/help.cgi?search
> > *   http://www.stata.com/support/statalist/faq
> > *   http://www.ats.ucla.edu/stat/stata/
> 
-- 
[email protected]
030 25491-361



