Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: st: Equivalent of Excel's COUNTIF

From	Richard Herron <[email protected]>
To	[email protected]
Subject	Re: st: Equivalent of Excel's COUNTIF
Date	Wed, 16 Nov 2011 11:22:54 -0500

To be clear, my "total" variable includes the person that just checked
in. -sum(`change') - 1- would omit them.

@Nick -- Thanks, that's perfectly logical. I just so badly want to
exclude some variables from the -reshape-, but I guess I should handle
that with a -merge-.

On Wed, Nov 16, 2011 at 11:15, Nick Cox <[email protected]> wrote:
> You could
>
> reshape wide date total, i(id) j(inout) string
>
> but your final data structure looks more useful to me.
>
> Nick
> [email protected]
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Richard Herron
> Sent: 16 November 2011 16:05
> To: [email protected]
> Subject: Re: st: Equivalent of Excel's COUNTIF
>
> Here is an alternative solution with -reshape-, -cond-, and -sum-. The
> last two functions should be fast at any scale, but I don't have
> enough experience with Stata to know if -reshape- is faster than a
> loop.
>
> (Thanks to Stas for the data-generating code)
>
> * begin code
> clear
> set obs 10
> set seed 10101
> generate int id = _n
> generate int datein = ceil(runiform()*50)
> generate int dateout = datein + ceil(runiform()*15)
>
> reshape long date, i(id) j(inout) string
> sort date
> tempvar change
> generate int `change' = cond(inout == "in", 1, -1)
> generate int total = sum(`change')
> * end code
>
> I am still a little confused by Stata's -reshape- and would be
> appreciative if someone could get this back to a logical wide format
> (I couldn't). Thanks!
>
> On Tue, Nov 15, 2011 at 16:19, Stas Kolenikov <[email protected]> wrote:
>> This is grossly inefficient, as any code with explicit cycling over
>> observations is, but it will probably do the work for you:
>>
>> clear
>> set obs 10
>> set seed 10101
>> generate int arrival = ceil(runiform()*50)
>> generate int discharge = arrival + ceil(runiform()*15)
>> gen long npatients = .
>> gen long _num_discharged = .
>> sort arrival
>> isid arrival
>> forvalues k=1/`=_N' {
>>  replace _num_discharged = sum( discharge <= arrival[`k'] )
>>  replace npatients = (_n-1) - _num_discharged[`k'] in `k'
>> }
>> list
>>
>> It is not important that the -arrival- and -discharge- variables are
>> generated as integers, but it is important that _num_discharged is as
>> accurate as possible. The (_n-1) here is the number of patients who
>> have arrived by the time arrival[_n], and we know this since we have
>> sorted by -arrival- and checked that all values are unique.
>>
>> There may be alternative algorithms that would not require the
>> observation-by-observation comparisons, but you would need to create
>> some sort of stack of the patients... and that's not really
>> straightforward, at least within Stata (but should be doable in Mata).
>>
>> On Tue, Nov 15, 2011 at 3:47 PM, Teubner, David (Health)
>> <[email protected]> wrote:
>>> Apologies for the slightly long-winded first post.
>>>
>>> I have a data stet of emergency department presentations for the last 18 years.  The dataset includes numerous variables about individual patient presentations to the department, including time of arrival and time of leaving.  What it does not have is a variable for the number of patients already in the department when each new patient arrives.
>>>
>>> It seems to me that I could construct this by simply subtracting the number of patients who have left (prior to the time of arrival of a new patient) from the number who have arrived.  I understand how to calculate the number who have arrived (sort by arrival time and use the _n function) but I don't understand how to do the left number.  In Excel I would use COUNTIF(-all arrival times-,<individual patients discharge time).  Is there a way to do this in Stata?
>>
>> --
>> Stas Kolenikov, also found at http://stas.kolenikov.name
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
>

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/

References:
- st: Equivalent of Excel's COUNTIF
  - From: "Teubner, David (Health)" <[email protected]>
- Re: st: Equivalent of Excel's COUNTIF
  - From: Stas Kolenikov <[email protected]>
- Re: st: Equivalent of Excel's COUNTIF
  - From: Richard Herron <[email protected]>
- RE: st: Equivalent of Excel's COUNTIF
  - From: Nick Cox <[email protected]>

Prev by Date: RE: st: Equivalent of Excel's COUNTIF
Next by Date: Re: st: Creating dummy variables
Previous by thread: RE: st: Equivalent of Excel's COUNTIF
Next by thread: Re: st: Equivalent of Excel's COUNTIF
Index(es):
- Date
- Thread