
Re: st: RE: creating a newvar with sequential values from a bunchof string variables


From   Suzy <[email protected]>
To   [email protected]
Subject   Re: st: RE: creating a newvar with sequential values from a bunchof string variables
Date   Sun, 26 Jun 2005 13:45:42 -0400

Thanks, Nick. I actually did the entire thing manually last night and this morning. What a nightmare, and carpal tunnel syndrome to boot. But I am going to try your code to reproduce my output and for future reference. I appreciate your help. I have all the Stata 9 documentation. Is there a reference or information on -forval i = ...-? I have never come across it before and am not familiar with it.
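On the documentation question: -forvalues- (which may be abbreviated -forval-) is described in the [P] Programming manual under forvalues, and -help forvalues- shows the same material on screen. As a minimal sketch of how the loop expands, the following only displays the four variable names that the local macro i produces. The names var1 to var4 are taken from this thread, and nothing here touches any data.

```stata
* Loop the local macro i over 1, 2, 3, 4.
* The opening brace ends the -forvalues- line; the closing brace sits on its own line.
forvalues i = 1/4 {
    display "var`i'"
}
```

Inside the braces, `i' is replaced by the current value before each command runs, so a single -replace- written with var`i' is issued once per variable.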

Nick Cox wrote:


Thanks for adding extra detail.
My comments still stand.
What you tried is legal, but will not do what you hope it does.
As I pointed out in another posting, Richard's suggestion is just illegal.
"71600"/"71699" is not a legal string expression, so violates -replace-'s syntax.
Your guess that there should be a way of specifying a range of string values was correct. In addition to my earlier suggestion, consider

forval i = 1/4 {
    replace newvar = 1 if inrange(var`i',"71600","71699")
}

Nick [email protected]
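To see how the string form of -inrange()- behaves, here is a small sketch on made-up data. The six values borrowed from the thread are mixed with hypothetical ones ("7165", "71700", "9999"), and only two variables are used to keep it short. String comparison is character by character, so a shorter string such as "7165" also falls between "71600" and "71699"; that is an assumption worth checking against the real data.

```stata
* Toy data: hypothetical values, not the original dataset
clear
input str6 var1 str6 var2
"71611" "v8123"
"3401-" "71699"
"7165"  "e78345"
"71700" "9999"
end

generate newvar = .

* Flag an observation if either string variable sorts between the two bounds
forvalues i = 1/2 {
    replace newvar = 1 if inrange(var`i', "71600", "71699")
}

list
```

With these values the first three observations are flagged and the fourth is not. If short codes like "7165" should be excluded, an extra condition on string length, or the numeric comparison via -real()-, would tighten the match.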
Suzy wrote:



I do believe the variables are string variables: the variable font is in red, and many other values within the variables look like strings (e.g. v8123, 3401-, e78345).

Each of these string variables may or may not have the same value for particular observations. For example, stringvar1 may have "71611" for observation #10, stringvar2 might have "71611" for observation #2, stringvar3 might have "71611" for observation #3,000, and stringvar4 might not have "71611" for any observation.

I'm creating a newvar which can account for all the "71611" values across all the string variables. In other words, the newvar must include all values from "71600" to "71699" across all the stringvars (of which any stringvar may or may not include these values).

The reason I thought it was impossible to have no changes is that, for example, the value "71623" across all string variables resulted in 10 observations for my newvar. Newvar=44 is just a category I created to account for the value "71623" across all the string variables. [Output, e.g.: replace newvar=44 if var1 =="71623" | var2=="71623" | var33=="71623" | var4=="71623" (10 real changes made)]

So for example, newvar=1 could have represented all the "71600" values across all string variables, whereas newvar=2 could represent all "71601" values across all string variables and newvar=3 could represent all "71602" values across all string variables, etc, etc...
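A one-category-per-code scheme like this could be sketched with two nested loops rather than a hundred hand-typed -replace- commands. The numbering rule used here (category = code - 71599, so "71600" becomes 1 and "71601" becomes 2) is an assumption read off the examples in this paragraph, not the categories actually used, and the data are made up.

```stata
* Toy data: hypothetical values, not the original dataset
clear
input str6 var1 str6 var2
"71600"  "v8123"
"3401-"  "71601"
"71602"  "71602"
"e78345" "9999"
end

generate newvar = .

* Outer loop: each code from 71600 to 71699
* Inner loop: each string variable
forvalues code = 71600/71699 {
    local k = `code' - 71599    // assumed numbering: 71600 -> 1, 71601 -> 2, ...
    forvalues i = 1/2 {
        quietly replace newvar = `k' if var`i' == "`code'"
    }
}

list
```

If one observation holds different codes in different variables, the later -replace- overwrites the earlier one, so this sketch only suits data where that cannot happen or does not matter.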

These are just examples of what's been done - not actual categories per se. Thus, newvar is a categorical variable that is accounting for all the values "71600 to 71699" across all the string variables. The code I had used to account for all of it in one shot didn't work,

replace newvar=1 if var1 =="71600/71699" | var2=="71600/71699" | var3=="71600/71699" | var4=="71600/71699"
(0 changes made)

but Richard's idea of enclosing each value in quotation marks might work... I haven't tried it yet.

replace newvar=1 if var1 =="71600"/"71699" | var2=="71600"/"71699" | var3=="71600"/"71699" | var4=="71600"/"71699"
Hope this e-mail explanation is easier to understand. I am using Stata 9 (windows xp).

Thank you.





Nick Cox wrote:



It's not impossible at all. Stata takes literally
what you include between double quotes " ". Evidently you had no strings with values like "71600/71699".
However, it may be that what you want is accomplishable by something like

forval i = 1/4 {
    replace newvar = 1 if inrange(real(var`i'),71600,71699)
}

but like Richard Williams I am not completely clear from your examples what exactly you do want.
Nick [email protected]
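The numeric form can be tried on a few made-up values before running it on real data. In this sketch the data are hypothetical and only two variables are used. -real()- returns missing for strings that are not numbers, and -inrange()- returns 0 when its first argument is missing, so entries like "v8123" or "3401-" are simply skipped.

```stata
* Toy data: hypothetical values, not the original dataset
clear
input str6 var1 str6 var2
"71611" "v8123"
"3401-" "71699"
"7165"  "e78345"
"71700" "9999"
end

generate newvar = .

* Convert each string to a number on the fly and test the numeric range
forvalues i = 1/2 {
    replace newvar = 1 if inrange(real(var`i'), 71600, 71699)
}

list
```

Here only the first two observations are flagged. Unlike a string comparison, the numeric test does not treat "7165" as lying between 71600 and 71699.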
Suzy wrote:





I created a new variable (newvar) that is supposed to be a collection of sequential values from different string variables. The values that are included in the new variable go from 71600 to 71699. I've been doing it manually for hours, replacing value after value.
see example

for value # 71623:

replace newvar=44 if var1 =="71623" | var2=="71623" | var33=="71623" | var4=="71623"
(10 real changes made)


I tried to do it this way - see example:

replace newvar=1 if var1 =="71600/71699" | var2=="71600/71699" | var3=="71600/71699" | var4=="71600/71699"
(0 changes made)

But no changes occurred, which is impossible.

I'm hoping there is a simple and quick syntax for what I'd like to do. Ideally, newvar would equal 1 for every observation in which any of the existing variables (1-4) holds a value from 71600 to 71699; that would be less painful than what I've been doing.

Any simple solution?? explicit code is appreciated!!

*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/



