
st: RE: RE: mapping a value from 2 variables


From   [email protected]
To   [email protected]
Subject   st: RE: RE: mapping a value from 2 variables
Date   Wed, 24 Jul 2002 15:04:14 -0400

[email protected] wrote
> 
> I don't know of anything quite like this, but
> for once looping over observations would seem
> to solve the problem:
> 
> local N = _N
> forval i = 1/`N' {
>     local val = naics[`i']
>     local label = labelnaics[`i']
>     label def naicslab `val' "`label'" , modify
> }
> 
> Nick
> [email protected]

This will work, no doubt.  The reason I hadn't considered -forvalues- for
this purpose was that I wanted to avoid looping over observations.  Such
looping is not so bad with my current data, which contain approx. 2000
observations, but it may be computationally intensive and slow if I try to
extend the procedure to situations where labels may take up to 65,536
different coding values -- the Stata limit for value labels.  I tested the
loop on a dataset of 30,000 observations and it took 2 minutes to complete,
which is not the end of the world for the use I'll make of it.

But what escaped me in my own proposed solution below is that step 2 (where
I would use -file- to substitute a space-character for the first comma)
would itself require looping over observations (!).  I will probably go with
Nick's solution as I don't see anything else for now.  
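
For completeness, here is a minimal sketch of the loop-based route with the
remaining step of attaching the label to the variable.  The names naics,
labelnaics and naicslab come from this thread; the -label values- and
-label list- lines are additions not found in Nick's code, and they assume
the goal is to display the labels on naics itself.

```stata
* Define one label mapping per observation (Nick's loop, same logic)
local N = _N
forvalues i = 1/`N' {
    local val = naics[`i']
    local label = labelnaics[`i']
    label define naicslab `val' "`label'", modify
}

* Added step (assumption): attach the value label to the numeric variable
label values naics naicslab

* Inspect the resulting mapping
label list naicslab
```

The -modify- option lets the first pass create the label and later passes
add to it.  A label text that itself contains double quotes would probably
need compound double quotes around `label' instead.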


My initial posting was
> [email protected]
> >
> > Has anyone written a routine to define a value label mapping based on
> > two variables (one numeric, one string)?  Several searches using findit
> > did not turn up anything.  Normally, I wouldn't bother asking but I would
> > find it surprising if no one had tried to automate this type of task
> > before.
> >
> > For instance, I have many data files comprising industrial classification
> > codes which I routinely merge to data for various projects.  One such
> > example is my file for the NAICS (North American Industry Classification
> > System) Codes, which contains the numerical variable _naics_ and string
> > variable _labelnaics_
> >
> >  . list
> >
> >               naics                                   labelnaics
> >     1.           11   Agriculture, Forestry, Fishing and Hunting
> >     2.          111                              Crop Production
> >     3.         1111                    Oilseed and Grain Farming
> >     4.        11111                              Soybean Farming
> >     5.       111110                              Soybean Farming
> >     6.        11112             Oilseed (except Soybean) Farming
> >     7.       111120             Oilseed (except Soybean) Farming
> >     8.        11113                     Dry Pea and Bean Farming
> >    ...
> >
> > I would like to define a value label such that each value of _naics_
> > would map to the corresponding value of _labelnaics_.  Encode will not
> > do the trick here since it would be equivalent to:
> >
> >   label define naicslab 1 "Agriculture, Forestry, Fishing and Hunting"
> >   label define naicslab 2 "Crop Production", add
> >   label define naicslab 3 "Oilseed and Grain Farming", add
> >   ...
> >
> >
> > I could write my own .ado to create a do file similar to those generated
> > by -label save ...-.  This could be done by:
> > - sending the data to a comma-delimited file via -outsheet-
> > - using the -file- command, replacing the first comma by a space character
> > - prefixing each line by -label define-
> > - suffixing by -, modify-
> >
> > But before I do that, has anyone seen such an .ado?
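
The do-file route described in the original posting might be sketched as
follows.  This is a sketch under assumptions, not a tested .ado: it writes
the -label define- lines directly with -file write- rather than
post-processing an -outsheet- file, the filename naicslab.do is
hypothetical, and it still loops over observations.

```stata
* Write one -label define- line per observation to a do-file, then run it.
* naicslab.do is a hypothetical filename.
tempname fh
file open `fh' using naicslab.do, write text replace

local N = _N
forvalues i = 1/`N' {
    local val = naics[`i']
    local label = labelnaics[`i']
    * Compound double quotes let the written line contain its own quotes
    file write `fh' `"label define naicslab `val' "`label'", modify"' _n
}

file close `fh'

* Running the generated file defines the value label
do naicslab.do
```

The resulting file has the same shape as one produced by -label save-, so it
can be kept and rerun whenever the classification file is merged into a new
project.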


