Re: st: drop variables in panel data with loop


From   Nick Cox <njcoxstata@gmail.com>
To   statalist@hsphsun2.harvard.edu
Subject   Re: st: drop variables in panel data with loop
Date   Sun, 22 Jul 2012 12:26:14 +0100

It seems that you want to drop incomplete panels. But you should be
trying to summarize information on -r- not -i- or -t-. Here is one
approach:

bysort i : egen present = count(r)
su present
drop if present < #

where you fill in #.
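
If the condition should apply only inside your window, the same idea can be restricted to it. What follows is a sketch only, not tested on your data. It assumes that -window- is 1 inside the period of interest, that a day without a rate is present in the data as an observation with missing -r- (rather than absent altogether), and that a panel should go if it has any such day, which is what your -levelsof- call selects. The name nmiss is made up for illustration.

        * number of missing values of r inside the window, by panel
        bysort i : egen nmiss = total(window == 1 & missing(r))
        * drop every observation of any panel with at least one
        drop if nmiss > 0
        drop nmiss

On 30 versus 22: -summarize i- reports the largest code, and the codes produced by -encode- are not renumbered when observations are dropped, so a maximum of 30 is to be expected if person 30 survives. -codebook i- will show how many distinct values remain. A loop over 1 to r(max) would then visit codes that no longer exist, so one possibility is to loop over the values actually present:

        levelsof i, local(ids)
        foreach x of local ids {
                regress r ind if i == `x' & twindow
        }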

Nick

On Sun, Jul 22, 2012 at 11:51 AM, Lisa Wang <lhwang0925@gmail.com> wrote:
> I am having trouble with Stata and would like some guidance on what I
> am doing incorrectly. I am new to Stata (only 1 month into it), so I
> am still trying to learn and sometimes still thinking like in Excel.
>
> I will try to be as detailed as possible, so you can understand my question.
>
> To describe my data set, I have some panel data and a variable i,
> which is the names (eg. Mary, Tom...) but encoded into a numeric as
> such: - encode symbol1, generate (i) -. There are 59732 rows and the
> count of i is 30.
>
> What I would like to achieve is to tell the program to drop the
> observations that have missing values for a variable for a specific
> period (variable window). E.g. If there is no data for "Mary" for day
> 102 then drop all the rows pertaining to "Mary" from day 1...T, not
> only the observation for Mary on day 102.
>
> This is my code to try to achieve this:
>
> version 12.1
> clear all
> set more off
>
> cd "C:\Users\Admin\Desktop"
>
> use window_students, clear
>
> xtset i t   // check panel structure is correct
>
>
> summ i   // this tells me that the max of variable i is 30, which is
>          // correct as I have 30 people I need to analyse
>
> tabulate i t if window==1 & r==.
>   // r is another variable stored in another column, which represents
>   // their rates. There are 8 people that don't have any rates within my
>   // window.
>   // I would like to remove all the observations pertaining to these people
>
> levelsof i if window==1 & r==., local(entities)
>   // tried to store the people that were missing into a local macro -
>   // these are i = 2 4 6 7 9 14 21 25
>
>
>
> Then I tried this:
>
> *Method 1 - but then results window has return code 198 and invalid '4' in red text
>
> foreach i of local entities{
> drop if i==`entities'
> }
>
>
> *Method 2 - but then results window has return code 111 and variable i not found
>
> foreach i of local entities{
> drop i
> }
>
> *Method 3 - but it deleted all of my observations
>
> foreach i of local entities{
> drop i
> }
>
> *Method 4 - after Stata told me that it was person 2, 4, 6, 7, 9 etc. that were missing observations I wrote out each line
>
> drop if i==2
> drop if i==4   //etc.....
>
> summ i   // I still get 30 in the summary but it has told me
>          // that it has deleted observations for each drop if line that I
>          // used... shouldn't it be 22 now after I removed the 8 people?
>
>
>
> I am stuck now...as I need the i to be correct as I will be doing some
> regressions with the i later, that's why I have to drop the people
> that don't have observations in my dataset before I do further
> analysis.
>
> eg.
> summarize i
> local m = r(max)   // create a local macro storing the max
>                    // number of distinct entities from an r-scalar
>
> generate ar = .
>
>
> forvalues x = 1/`m' {   // run regression for every entity in data set
>         regress r ind if i==`x' & twindow
>         // predict residuals both in-sample and out-of-sample
>         predict res if i==`x', residuals
>         // replace ar=. with these estimated residuals
>         replace ar=res if i==`x' & holidaywindow
>         drop res
> }
>
>
>
> Sorry for the long email. This is my first post, so wanted everyone to
> be clear of what I have done so far and what I want to do next.
>
>
>
> Many thanks for your considerations,
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

