Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: tabstatmat question
Nick Cox <firstname.lastname@example.org>
Fri, 2 Sep 2011 18:35:47 +0100
In addition to Austin's replies, do please note that he has already
replied to your main specific question here.
You can create matrices in Stata and you have already created one. But
the way you use it in a -generate- statement implies that it has a
completely different structure from what it has. Your idea that Stata
does, or could do, a kind of intelligent lookup to give you what you
want just won't work.
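A minimal sketch of the point, using made-up toy data (2 zones, 3 waves) rather than the poster's dataset. The loop builds a one-column matrix that stands in for what -tabstatmat- returned, so the matrix TABLE here and the variables mby, byv, wrong and meancatch are illustrative assumptions. It shows that TABLE[zone,wave] asks for columns that do not exist, while -egen- gives the group mean directly and TABLE[byv,1] picks the right row of the single column.

```stata
* Toy data: 2 zones x 3 waves, two trips per cell (values are made up)
clear
input zone wave num_typ3
1 1 2
1 1 4
1 2 1
1 2 3
1 3 5
1 3 7
2 1 0
2 1 2
2 2 6
2 2 8
2 3 3
2 3 5
end

* Austin's suggestion: put the group mean straight into every observation
egen mby = mean(num_typ3), by(zone wave)

* Stand-in for the -tabstatmat- result: one row per zone-wave group, one column
egen byv = group(zone wave)
summarize byv, meanonly
local G = r(max)
matrix TABLE = J(`G', 1, .)
forvalues g = 1/`G' {
    quietly summarize num_typ3 if byv == `g', meanonly
    matrix TABLE[`g', 1] = r(mean)
}

* Column wave does not exist when wave > 1, so the result is missing there;
* for wave == 1 this returns row number zone, which need not be cell (zone, 1)
generate wrong = TABLE[zone, wave]

* Row = group number, column = 1 matches the matrix's actual structure
generate meancatch = TABLE[byv, 1]

assert missing(wrong) if wave > 1
assert reldif(meancatch, mby) < 1e-6
```

If the only goal is a variable holding the mean by zone and wave, the -egen- line does that without any matrix.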
On Fri, Sep 2, 2011 at 6:14 PM, Alvarez,Sergio <email@example.com> wrote:
> Hi Austin,
> Thanks for your response. So what I'm doing is a site choice model of
> recreational fishing. From the dataset I can tell what city/town people
> come from and what city/town they went fishing in. I can also tell how many
> fish people caught. For the site choice model, I need to create a series of
> alternatives or just other places where the person could have fished at but
> decided not to. But I need some indication of the quality of the site that
> was not visited. Since alternative fishing trips did not take place, I have
> no indication of how many fish the person could have caught if they had gone
> to place B, rather than to place A, which is where they actually went.
> So as an indication of quality I was going to use the mean number of fish
> caught in the site (zone) at that particular time of the year (wave) in
> years past. That is why I wanted to create a matrix that would have the
> mean catch by zone and wave, something like this:
> ZONE 1 2 ...
> 1 mean(1,1) mean(1,2)
> 2 mean(2,1) mean(2,2)
> Which was my original question. I would use that matrix to input the mean
> catch for the alternatives that did not happen after I created them. Now if
> the matrix looked like the example above, I thought I could use:
> gen meancatch = matrix[zone,wave]
> I was hoping that this line of code would look up the wave and zone of each
> observation and input the value from the matrix that corresponded to each
> observation. So I looked around and found -tabstatmat- from SSC, and tried
> it, using the code you gave yesterday:
> egen byv=group(zone wave), lab
> tabstat num_typ3, stat(mean) by(byv) save
> tabstatmat TABLE
> And this created the matrix with the values, and looks like this:
> 1 1:mean 1.9822335
> 1 2:mean 2.6614173
> 1 3:mean 2.7150396
> 1 4:mean 3.3340782
> 1 5:mean 2.8161094
> 1 6:mean 1.1767857
> 2 1:mean 1.5857143
> 2 2:mean 2.1863208
> 2 3:mean 2.542777
> 2 4:mean 1.8849432
> 2 5:mean 1.7281553
> 2 6:mean 1.4927536
> 3 1:mean 1.875
> There are 85 sites with 6 waves apiece.
> The original dataset has about 70,000 observations, so after creating 84
> alternatives for each I get about 6,000,000 observations. I already know
> how to do this using -reshape- and the distance to the alternative sites,
> which I already put in the dataset. And what I need is to have the
> indicator of quality, or mean catch for each alternative site during the
> time period that the person actually went fishing. Then I will be able to
> run -clogit- or a similar procedure.
> I hope this makes sense. I'm new both to stata and to choice models, so
> this has been a pretty confusing and slow process for me.
> I really appreciate the help.
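One way to attach a mean catch to alternatives that were never visited is to store the means as a small dataset and merge them on, rather than looking them up in a matrix. The sketch below uses made-up toy data with 3 sites and 2 waves; the variables id, altzone and chosen are hypothetical names, and -expand- stands in for the poster's own -reshape- step. It also averages over all trips in the toy data, whereas the poster wants means from earlier years only, so the sample fed to -collapse- would need to be restricted accordingly.

```stata
* Toy trip data (made-up values): zone is the site actually visited
clear
input id zone wave num_typ3
1 1 1 2
2 1 2 4
3 2 1 0
4 2 2 6
5 3 1 3
6 3 2 5
end

* Step 1: one record per zone-wave cell holding the mean catch
preserve
collapse (mean) meancatch = num_typ3, by(zone wave)
rename zone altzone
tempfile means
save `means'
restore

* Step 2: one record per trip and candidate site (3 sites in this toy example)
expand 3
bysort id: generate altzone = _n
generate byte chosen = (altzone == zone)

* Step 3: attach each candidate site's mean catch for the wave of the trip
merge m:1 altzone wave using `means', nogenerate
assert !missing(meancatch)
```

After the merge, every trip-alternative record carries meancatch for its candidate site and wave, which is the form a -clogit- setup would need.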
> On Fri, 2 Sep 2011 12:45:05 -0400, Austin Nichols wrote:
>> Sergio <firstname.lastname@example.org>:
>> Did you read my response?
>> Look at the matrix; there is one column, so your references to row and
>> column make no sense.
>> You could make another matrix with values of byv corresponding to zone
>> and wave, noting that you must have these be integers counting from 1
>> up for row and column numbers to correspond to what you seem to want.
>> But why? What would be the point of this?
>> On Fri, Sep 2, 2011 at 12:35 PM, Alvarez,Sergio <email@example.com> wrote:
>>> Sorry about ambiguity.
>>> So I used the mean by group code to create the matrix that would store
>>> mean values for each group, using:
>>> egen byv=group(zone wave), lab
>>> tabstat num_typ3, stat(mean) by(byv) save
>>> tabstatmat TABLE
>>> which gives me a matrix, or rather a vector, with all the values I need.
>>> The first few lines of the matrix in the output screen look like this:
>>> 1 1:mean 1.9822335
>>> 1 2:mean 2.6614173
>>> 1 3:mean 2.7150396
>>> 1 4:mean 3.3340782
>>> 1 5:mean 2.8161094
>>> 1 6:mean 1.1767857
>>> 2 1:mean 1.5857143
>>> 2 2:mean 2.1863208
>>> 2 3:mean 2.542777
>>> 2 4:mean 1.8849432
>>> Now what I want to do is use -gen- or -egen- to create a variable that
>>> looks up the zone and wave of the corresponding observation in the matrix
>>> and inserts the correct value in there. So I tried:
>>> gen meancatch = TABLE[zone,wave]
>>> and this gives the correct values for all observations with wave = 1, but
>>> creates missing values on the rest of the observations. I also tried:
>>> gen meancatch = TABLE[byv,num_typ3]
>>> and this gives me the correct value in some of the observations, but
>>> missing values in the others.
>>> So I must be doing something wrong, but can't figure out what. I guess my
>>> question is how to call the row and column numbers from the TABLE matrix?
>>> Thanks again,
>>> On Fri, 2 Sep 2011 12:08:31 -0400, Austin Nichols wrote:
>>>> Sergio <firstname.lastname@example.org> :
>>>> Now I have no idea what you are trying to do. For the mean by group,
>>>> egen mby=mean(num_typ3), by(zone wave)
>>>> but you are referring to (probably) nonexistent row and column numbers
>>>> of a matrix in your example.
>>>> On Fri, Sep 2, 2011 at 10:42 AM, Alvarez,Sergio <email@example.com> wrote:
>>>>> Thanks Austin and Nick for your help. I used what Austin suggested (which
>>>>> is what Nick also suggested) and it worked. However, when I try to create
>>>>> the variable that contains the mean by group it works for some observations
>>>>> but missing values are created for most of them. I tried both:
>>>>> gen meancatch = TABLE[zone,wave]
>>>>> gen meancatch = TABLE[byv,num_typ3]
>>>>> For the first line of code, it creates the correct value for all
>>>>> observations where wave = 1, but not for any others. The second line
>>>>> creates missing values at random (as far as I can tell).
>>>>> I'd appreciate any tips.
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/statalist/faq
>> * http://www.ats.ucla.edu/stat/stata/
> Sergio Alvarez
> Food and Resource Economics
> University of Florida