Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

# Re: st: Doing something an observation-specific number of times

| From | To | Subject | Date |
|---|---|---|---|
| robert hartman | statalist@hsphsun2.harvard.edu | Re: st: Doing something an observation-specific number of times | Tue, 28 Aug 2012 14:52:07 -0400 |

```
Austin,
Maarten's "value."

Many thanks.

Sincerely,
Rob

On Tue, Aug 28, 2012 at 2:12 PM, Austin Nichols <austinnichols@gmail.com> wrote:
> robert hartman <rohartman@gmail.com>:
> Except I notice now you are dividing the ones by 2 as well, so
> g v3=v2/2+v1*(1-v1^v2)/(1-v1)/2
>
> On Tue, Aug 28, 2012 at 2:11 PM, Austin Nichols <austinnichols@gmail.com> wrote:
>> robert hartman <rohartman@gmail.com>:
>> v3=((1+(.41^1))/2) + ((1+(.41^2))/2) ...((1+(.41^77))/2) + ((1+(.41^78))/2)
>> for v1=.41 and v2=78
>> the sum is v2 (all the ones) plus a geometric series
>> that sums to .5*.41*(1-.41^78)/(1-.41), right?
>> I.e.
>> g v3=v2+v1*(1-v1^v2)/(1-v1)/2
>>
>> On Tue, Aug 28, 2012 at 2:03 PM, robert hartman <rohartman@gmail.com> wrote:
>>> Thanks for the pointers, Maarten and Austin.
>>>
>>> I don't believe this is a geometric series, since the ratio of
>>> consecutive terms is not constant. But I may just be missing it.
>>>
>>> Maarten, the data sets can get well into the tens and perhaps hundreds
>>> of thousands. Code like what you've provided looks promising, though
>>> you are probably right that there is no computational free lunch.
>>>
>>> On Tue, Aug 28, 2012 at 1:39 PM, Maarten Buis <maartenlbuis@gmail.com> wrote:
>>>> On Tue, Aug 28, 2012 at 6:45 PM, robert hartman wrote:
>>>>> Imagine that observation 1 has v1 and v2 values of .41 and 78,
>>>>> respectively.  <snip>  For example, for observation 1, the new obs 1 v3
>>>>> value=((1+(.41^1))/2) + ((1+(.41^2))/2) ...((1+(.41^77))/2) +
>>>>> ((1+(.41^78))/2).
>>>>>
>>>>> I have begun to think of some klugy ways of doing this via looping or
>>>>> even the expand command.
>>>>
>>>> Depending on the number of observations in your original dataset the
>>>> -expand- route may be the easiest. If the number of observations is
>>>> large, then this strategy may be infeasible due to memory limitations.
>>>> When it comes to efficiency, you need to make the tradeoff between the
>>>> amount of time you need to write the more fancy code (and the effort
>>>> you will need to understand it again after some time...) against the
>>>> time you save because it runs quicker. Often the balance will be
>>>> against the more fancy solutions(*).
>>>>
>>>> *---------------- begin example ---------------
>>>> // create some example data
>>>> clear
>>>> input v1 v2
>>>> .41 78
>>>> .23 50
>>>> end
>>>>
>>>> // we need to keep track of who is who before
>>>> // expanding
>>>> gen id = _n
>>>>
>>>> // create v2 rows per observation
>>>> expand v2
>>>>
>>>> // create the appropriate exponent
>>>> bys id : gen expo = _n
>>>>
>>>> // create the basic component of the computation
>>>> gen double value = (1+v1^expo)/2
>>>>
>>>> // sum() returns a running sum
>>>> by id : replace value = sum(value)
>>>>
>>>> // the final sum is the last of the running sum
>>>> bys id (expo) : replace value = value[_N]
>>>>
>>>> // get rid of things that are no longer needed
>>>> drop expo
>>>> by id : keep if _n == 1
>>>> drop id
>>>>
>>>> // see the result
>>>> list
>>>> *----------------- end example ----------------
>>>> (For more on examples I sent to the Statalist see:
>>>>  http://www.maartenbuis.nl/example_faq )
>>>>
>>>> Hope this helps,
>>>> Maarten
>>>>
>>>> (*) This of course ignores the pure joy you will get from figuring out
>>>> the fancy solution, but we are not paid to enjoy ourselves!
>>>>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
```
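
The closed form discussed in the thread can be checked by splitting each term into a constant part and a geometric part. The following is a short derivation sketch, writing r for v1 and n for v2 (symbols introduced here only for illustration) and assuming r is not equal to 1 and n is a positive integer:

```latex
\begin{align*}
v_3 &= \sum_{k=1}^{n} \frac{1 + r^{k}}{2}
     = \frac{n}{2} + \frac{1}{2}\sum_{k=1}^{n} r^{k} \\
    &= \frac{n}{2} + \frac{1}{2}\cdot\frac{r\,(1 - r^{n})}{1 - r}
\end{align*}
```

The individual terms (1 + r^k)/2 do not have a constant ratio, but the r^k part on its own does, which is why the geometric-series formula applies once the ones are separated out. The result agrees with the expression `v2/2+v1*(1-v1^v2)/(1-v1)/2`; for v1=.41 and v2=78 it comes to roughly 39.35.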