Statalist The Stata Listserver



Re: st: RE: strings hnadling


From   Philip Ryan <[email protected]>
To   [email protected]
Subject   Re: st: RE: strings hnadling
Date   Wed, 08 Feb 2006 16:22:58 +1030

Another possible solution makes use of the -noccur()- function written by Nick Winter for the -egenmore- set of user-written add-ons to -egen- (see -findit egenmore-).

Here I assume you want the number of elements in each variable, not the number of separators (semi-colons). In general the number of elements will be one more than the number of separators:

forvalues i = 1/4 {
    * count the semi-colons in grp`i'
    egen k`i' = noccur(grp`i'), string(";")
    * number of elements = number of separators + 1
    replace k`i' = k`i' + 1
}

You would need to be careful that the separators are indeed all semi-colons (and not a mixture of semi-colons and commas, such as you show in your message) and that there are no additional semi-colons, for example doubled-up or trailing ones.
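If the separators do turn out to be mixed, doubled up or trailing, one way around it is to turn every separator into a blank and let -wordcount()- count the pieces. This is only a minimal sketch and not the -noccur()- approach: it assumes the elements themselves contain no blanks, and the variable grp and its four values are made up for illustration.

```stata
clear
input str20 grp
"2;3;4"
"11,2,25,2,3"
"10;;99;2;"
"01"
end

* replace semi-colons and commas by blanks, then count blank-delimited words;
* runs of blanks and trailing blanks do not add to the count
gen k = wordcount(subinstr(subinstr(grp, ";", " ", .), ",", " ", .))

list, noobs
```

With these four values k should come out as 3, 5, 3 and 1.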

Phil




At 10:27 PM 7/02/2006 -0600, you wrote:

In your example, isn't the number of semicolons: 2, 2, 0, 0 ?

Or, do you mean something like this?

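* one result variable per grp variable
forv i = 1/4 {
   qui gen gr`i' = .
}
levelsof id, local(levels)
* assumes id identifies observations uniquely
foreach l of local levels {
   local i = 1
   foreach v of varlist grp* {
     * split only this id's value; r(nvars) is the number of pieces created
     qui split `v' if id == `l', p(;) gen(_split)
     qui replace gr`i' = `=r(nvars)' if id == `l'
     drop _split*
     local ++i
   }
}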


For example:


. l, noobs

  +----------------------------------------------+
  | id    grp1         grp2   grp3          grp4 |
  |----------------------------------------------|
  |  1   2;3;4      10;99;2     01   11;2;25;2;3 |
  |  2     2;3   10;99;2;44     01     11;2;25;2 |
  +----------------------------------------------+

. forv i = 1/4 {
  2. qui gen gr`i' = .
  3. }

. levelsof id, local(levels)
1 2

. foreach l of loca levels {
  2. local i = 1
  3. foreach v of varlist grp* {
  4.         qui split `v' if id == `l', p(;) gen(_split)
  5.         qui replace gr`i' = `=r(nvars)' if id == `l'
  6.         drop _split*
  7.         local ++i
  8.         }
  9.         }

. l,noobs

  +----------------------------------------------------------------------+
  | id    grp1         grp2   grp3          grp4   gr1   gr2   gr3   gr4 |
  |----------------------------------------------------------------------|
  |  1   2;3;4      10;99;2     01   11;2;25;2;3     3     3     1     5 |
  |  2     2;3   10;99;2;44     01     11;2;25;2     2     4     1     4 |
  +----------------------------------------------------------------------+
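
The same counts can also be obtained in one pass over all observations, without looping over id, by comparing the length of each string with its length once the semi-colons are removed. A minimal sketch using only built-in string functions, assuming the separators are all single semi-colons and no value is empty; the -input- block simply re-enters the two observations listed above.

```stata
clear
input id str5 grp1 str10 grp2 str2 grp3 str11 grp4
1 "2;3;4" "10;99;2"    "01" "11;2;25;2;3"
2 "2;3"   "10;99;2;44" "01" "11;2;25;2"
end

forvalues i = 1/4 {
    * separators = characters lost when every ";" is deleted;
    * elements   = separators + 1
    gen gr`i' = length(grp`i') - length(subinstr(grp`i', ";", "", .)) + 1
}

list, noobs
```

This should give the same gr1-gr4 values as in the listing above (3 3 1 5 for id 1 and 2 4 1 4 for id 2).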


Hope this helps,
Scott


> -----Original Message-----
> From: [email protected] [mailto:owner-
> [email protected]] On Behalf Of Alexander Nervedi
> Sent: Tuesday, February 07, 2006 7:04 PM
> To: [email protected]
> Subject: st: strings hnadling
>
> Hi List users,
>
> I have data which has been entered awkwardly.
>
> Instead of using a separate variable for each item, all items of a
> category are entered together in one variable.
>
> ID      Grp1     Grp2       Grp3   Grp4
> 001   2;3;4    10;99;2    01     11,2,25,2,3
>
>
> I'd like to convert this to a dataset that looks like
>
> ID      Grp1     Grp2       Grp3   Grp4
> 001    3          3            1        5
>
> i.e. the count of the number of semi-colons within each variable.  I am
> sure there is a neat way of doing this but I am missing it. So I thought I'd
> write in and ask for your help.
>
> thanks
>
> Alnerdy
>


*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
Philip Ryan
Associate Professor,
Department of Public Health
Associate Dean (Information Technology)
Head, Data Management & Analysis Centre
Faculty of Health Sciences

postal address:
Department of Public Health
Mail Drop 511
University of Adelaide 5005
South Australia

location:
Level 6
Bice Building
Royal Adelaide Hospital
North Terrace
Adelaide

tel 61 8 8303 3570
fax 61 8 8223 4075
http://www.public-health.adelaide.edu.au/
CRICOS Provider Number 00123M



