



st: drop variables in panel data with loop


From   Lisa Wang <lhwang0925@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   st: drop variables in panel data with loop
Date   Sun, 22 Jul 2012 20:51:19 +1000

I am having trouble with Stata and would like some guidance on what I
am doing incorrectly. I am new to Stata (only one month in), so I am
still learning and sometimes still think as I would in Excel.

I will try to be as detailed as possible, so you can understand my question.

To describe my data set: I have some panel data and a variable i,
which holds the names (e.g. Mary, Tom, ...) encoded as numeric values
via -encode symbol1, generate(i)-. There are 59732 rows and i takes
30 distinct values.

What I would like to achieve is to drop the observations of anyone who
has a missing value for a variable within a specific period (variable
window). E.g. if there is no data for "Mary" for day 102, then drop all
the rows pertaining to "Mary" from day 1...T, not only the observation
for Mary on day 102.

This is my code to try to achieve this:

version 12.1 	
clear all
set more off

cd "C:\Users\Admin\Desktop"

use window_students, clear

xtset i t    // check panel structure is correct

summ i    // the max of i is 30, which is correct as I have 30 people to analyse

// r is another variable, stored in another column, which holds their rates.
// There are 8 people who don't have any rates within my window.
// I would like to remove all the observations pertaining to these people.
tabulate i t if window==1 & r==.

// tried to store the people with missing rates in a local macro;
// these are i = 2 4 6 7 9 14 21 25
levelsof i if window==1 & r==., local(entities)



Then I tried this:

*Method 1 - but then results window has return code 198 and invalid '4' in red text

foreach i of local entities{
drop if i==`entities'
}


*Method 2 - but then results window has return code 111 and variable i not found

foreach i of local entities{
drop i
}

*Method 3 - but it deleted all of my observations

foreach i of local entities{
drop i
}

*Method 4 - after Stata told me that it was persons 2, 4, 6, 7, 9 etc.
*that were missing observations, I wrote out each line

drop if i==2
drop if i==4   //etc.....

summ i
// I still get 30 in the summary, but Stata told me that it deleted
// observations for each -drop if- line that I used. Shouldn't it be 22
// now that I have removed the 8 people?



I am stuck now. I need i to be correct because I will be running some
regressions by i later, which is why I have to drop the people who
don't have observations in my dataset before I do further analysis.

e.g.
summarize i
local m = r(max)    // local macro storing the max number of distinct entities, from an r-scalar

generate ar = .

forvalues x = 1/`m' {    // run regression for every entity in data set
    regress r ind if i==`x' & twindow
    predict res if i==`x', residuals    // predict residuals both in-sample and out-of-sample
    replace ar=res if i==`x' & holidaywindow    // replace ar=. with the estimated residuals
    drop res
}
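
Once some people are dropped, the codes of i have gaps, so a loop over 1/`m' would also visit codes that no longer exist, and -regress- would be expected to stop with a "no observations" error on those. A minimal sketch of looping over only the codes still present, assuming ar has already been generated as above and that ind, twindow and holidaywindow are as in the loop above; the macro name people is hypothetical.

```stata
* Assumes ar already exists (generate ar = .) and that
* r, ind, twindow and holidaywindow are as in the loop above.
* "people" is a hypothetical macro name.

levelsof i, local(people)    // codes of i still present in the data

foreach x of local people {
    regress r ind if i == `x' & twindow
    predict res if i == `x', residuals
    replace ar = res if i == `x' & holidaywindow
    drop res
}
```

An alternative would be to renumber the remaining people consecutively with -egen id = group(i)- and keep the -forvalues- loop, looping over id instead of i.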



Sorry for the long email. This is my first post, so I wanted everyone
to be clear about what I have done so far and what I want to do next.



Many thanks for your consideration,
Lisa