Statalist The Stata Listserver



[no subject]



	. gen long code = real(rest) if first=="NUMB"

	. assert code!=. if first=="NUMB"        // check assumption

        . replace code = code[_n-1] if code==.   // carry down

Now look at the data.  I assume that code is filled in from the very 
first observation, which requires that the file begin with a NUMB line.  
Assuming that, 

	. sort code recnum

We have our code variable.
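
That assumption can also be checked directly rather than by eye.  A 
minimal sketch, in the same spirit as the -assert- used above: 

	. assert code!=.                         // no observation left without a code

Because -sort- does not change any values, it makes no difference whether 
this is typed before or after the sort.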


Step 6:  Get the rest
---------------------

I suggest we code 

	. replace first = strlower(first)

With that, we may be done.  Do we have the data in long form?  If so, 
we can now set about converting it wide, and then changing the numeric 
variables from string to numeric.  If not, we have more to do, so we do it.
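
What the conversion looks like depends on the layout of the file.  Purely 
as a sketch, suppose that within each code every value of first appears 
at most once and can serve as part of a variable name, so that first 
names the variable and rest holds its value.  Those are assumptions to 
be checked, not facts about this file.  Then something along these lines 
might work:

	. isid code first                        // check assumption: one line per code and name
	. keep code first rest                   // recnum varies within code, so leave it out
	. reshape wide rest, i(code) j(first) string
	. rename rest* *                         // strip the rest prefix from the new names
	. destring, replace                      // convert string variables holding only numbers

-destring- leaves alone any variable that contains nonnumeric characters, 
so genuine string variables stay strings.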


Final comment
-------------

Note how I proceeded:  I just worked interactively, solving one little 
problem at a time.  I did not know what the ultimate solution was, but I 
did know how to get closer, and I kept doing that until I was done.

Working interactively, however, is dangerous.  It is too easy to make a 
mistake and not detect it.  So what I do is start a do-file.  It started 
like this:

	------------------------------------------------- input.do ---
	clear
        infix str line 1-80 using <filename>
	compress
	------------------------------------------------- input.do ---

I ran that, then I looked around, tried a few things, and added to my 
do-file:

	------------------------------------------------- input.do ---
	clear
        infix str line 1-80 using <filename>
	compress
	assert strlen(line)<80              // <- notice this line 

	gen long recnum = _n

	gen blank = strpos(line, " ")

	gen str first = strtrim(substr(line, 1, blank)) if blank 
	replace first = line if blank==0

	gen str rest = strtrim(substr(line, blank, .)) if blank
	------------------------------------------------- input.do ---

Then I rerun the do-file, and repeat the process.  Thus, I build a do-file
as I go.

Note the line I flagged:

	assert strlen(line)<80              // <- notice this line 

After each group of lines, I add -assert-s verifying what I found and 
the assumptions I am making.  (Here, the -assert- checks that no line 
fills all 80 columns that -infix- read, which would suggest that a longer 
line had been truncated.)  This way, I can rerun the do-file later on an 
updated version of the dataset and, if it completes, be certain my 
original assumptions still hold and thus reasonably certain that I have 
correctly created an updated Stata dataset.
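
One way to make rerunning on an updated file convenient is to let the 
do-file take the filename as an argument.  This is only a sketch; the 
macro name filename and the file name updated.raw are made up for 
illustration:

	------------------------------------------------- input.do ---
	args filename                       // first word typed after -do input-
	clear
	infix str line 1-80 using "`filename'"
	compress
	assert strlen(line)<80              // stop here if a line may be truncated
	------------------------------------------------- input.do ---

	. do input updated.raw

If any -assert- is false, the do-file stops with an error at that line 
rather than quietly producing a dataset.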


-- Bill
wgould@stata.com


