# Re: st: RE: Substr() problem.

 From: adiallo5@worldbank.org | To: statalist@hsphsun2.harvard.edu | Subject: Re: st: RE: Substr() problem. | Date: Wed, 31 Aug 2005 12:15:47 -0400

```
Oops!!!,
My mistake. It works perfectly.
Many thanks.

"Nick Cox" <n.j.cox@durham.ac.uk>
Sent by:  owner-statalist@hsphsun2.harvard.edu
To:       <statalist@hsphsun2.harvard.edu>
cc:
Subject:  st: RE: Substr() problem.

08/31/2005 12:06 PM
statalist

The one thing that looks unwise is

g str9 hhid = substr(c, 1, 12)

Stata will have difficulties in putting
character strings of length 12 in a -str9-
variable.

In (British) English, we say that you
can't get a quart in a pint pot. This
is a similar case. (Note for those
with rational units: 1 quart = 2 pints.)

g hhid = substr(c,1,12)

should be sufficient in Stata 8 and up.

Nick
n.j.cox@durham.ac.uk

> Hi, I want to get the household ID from a DHS women file.
> But somehow, substr() is not working properly.
> My identification variables are in string format (sometimes with
> spaces between them).
>
>
> I go:
>
> . use hhid using amhr41rt
>
> . so hhid
>
> . ta hhid in 1/10
>
>         Case |
> Identificati |
>           on |      Freq.     Percent        Cum.
> -------------+-----------------------------------
>         1  2 |          1       10.00       10.00
>         1  3 |          1       10.00       20.00
>         1  5 |          1       10.00       30.00
>         1  6 |          1       10.00       40.00
>         1  7 |          1       10.00       50.00
>         1  8 |          1       10.00       60.00
>         1  9 |          1       10.00       70.00
>         1 10 |          1       10.00       80.00
>         1 11 |          1       10.00       90.00
>         1 12 |          1       10.00      100.00
> -------------+-----------------------------------
>        Total |         10      100.00
>
> . ta hhid in 5970/5980
>
>         Case |
> Identificati |
>           on |      Freq.     Percent        Cum.
> -------------+-----------------------------------
>       260 19 |          1        9.09        9.09
>       260 20 |          1        9.09       18.18
>       260 21 |          1        9.09       27.27
>       260 22 |          1        9.09       36.36
>       260 23 |          1        9.09       45.45
>       260 24 |          1        9.09       54.55
>       260 25 |          1        9.09       63.64
>       260 26 |          1        9.09       72.73
>       260 27 |          1        9.09       81.82
>       260 28 |          1        9.09       90.91
>       260 29 |          1        9.09      100.00
> -------------+-----------------------------------
>        Total |         11      100.00
>
> . g le=length(hhid)
>
> . ta le
>
>          le |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>          12 |      5,980      100.00      100.00
> ------------+-----------------------------------
>       Total |      5,980      100.00
>
> . ta hh
> (output omitted)
> Max observation: 260 29
>
>
> . use caseid using amir41rt, clear
>
> . so ca
>
> . ta c in 1/10
>
>            Case |
>  Identification |      Freq.     Percent        Cum.
> ----------------+-----------------------------------
>         1  3  1 |          1       10.00       10.00
>         1  5  3 |          1       10.00       20.00
>         1  7  2 |          1       10.00       30.00
>         1  8  2 |          1       10.00       40.00
>         1  9  4 |          1       10.00       50.00
>         1 10  2 |          1       10.00       60.00
>         1 10  3 |          1       10.00       70.00
>         1 11  2 |          1       10.00       80.00
>         1 12  2 |          1       10.00       90.00
>         1 13  2 |          1       10.00      100.00
> ----------------+-----------------------------------
>           Total |         10      100.00
>
> . ta c in 6420/6430
>
>            Case |
>  Identification |      Freq.     Percent        Cum.
> ----------------+-----------------------------------
>       260 18  7 |          1        9.09        9.09
>       260 21  2 |          1        9.09       18.18
>       260 22  2 |          1        9.09       27.27
>       260 22  4 |          1        9.09       36.36
>       260 24  4 |          1        9.09       45.45
>       260 25  2 |          1        9.09       54.55
>       260 25  3 |          1        9.09       63.64
>       260 26  2 |          1        9.09       72.73
>       260 26  3 |          1        9.09       81.82
>       260 27  3 |          1        9.09       90.91
>       260 29  4 |          1        9.09      100.00
> ----------------+-----------------------------------
>           Total |         11      100.00
>
> . ta caseid
> (output omitted)
> max obs:  260 29  4
>
>
> . g le=length(c)
>
> . ta le
>
>          le |      Freq.     Percent        Cum.
> ------------+-----------------------------------
>          15 |      6,430      100.00      100.00
> ------------+-----------------------------------
>       Total |      6,430      100.00
>
> . g str9 hhid = substr(c, 1, 12)
>
> What is the problem?
> Is there a way to get the correct hhid?

*
*   For searches and help try:
*   http://www.stata.com/support/faqs/res/findit.html
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

```
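The fix suggested in the reply can be tried on a toy dataset. This is only a sketch: the single observation below is a hypothetical stand-in for the DHS file, built to match the 15-character `caseid` layout shown in the tabulations above, and the expectation that Stata picks `str12` on its own rests on the reply's statement about Stata 8 and later.

```stata
* Hypothetical one-observation dataset standing in for amir41rt
clear
set obs 1
generate str15 caseid = "      260 29  4"

* The original command asked for str9, which is too narrow for a
* 12-character result (the quart in a pint pot):
*   generate str9 hhid = substr(caseid, 1, 12)

* Omitting the storage type lets Stata choose a string type wide enough
generate hhid = substr(caseid, 1, 12)

* Inspect the storage type Stata chose and confirm nothing was cut off
describe hhid
assert length(hhid) == 12
```

If a storage type is declared explicitly, it needs to be at least as wide as the substring being extracted, for example `str12` here.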