FAQ: Data management

Last updated: 24 February 2016
How do I connect to a database by using a Stata plugin?
How do I export tables from Stata?
Why do I get rows of missing data when I use infile?
How can I convert other packages' files to Stata format data files?
How do I set up an ODBC Data Source Name for Stata in Windows?
How do I set up an ODBC Data Source Name for Stata on Mac or Linux/Unix?
Stata is reading in my variables as string instead of numeric. What should I do?
How do I get information from Excel into Stata?
How can I save a Stata dataset so that it can be read in a previous version of Stata?
How can I sample clusters, not individuals?
How can I identify first and last occurrences systematically in panel data?
How do I deal with a report of repeated time values within panel?
How can I create variables containing percent summaries?
How can I generate a variable containing the last of several dates?
How do I split a string variable into parts?
What is true and false in Stata?
How do I calculate measures such as percent improved minus percent deteriorated?
How do I create individual identifiers numbered from 1 upwards?
How many significant digits are there in a float?
Why does the mod(x,y) function sometimes give puzzling results? Why is mod(0.3, 0.1) not equal to 0?
Why can’t I compare two values that I know are equal?
Why is x>1000 true when x contains missing values?
Why does my do-file or ado-file produce different results every time I run it?
How do I convert my ICD9 codes from a string type to a numeric type?
How do I check a variable for a range of diagnosis or procedure codes?
How do I label my diagnosis or procedure codes with their descriptions?
How do I calculate the maximum or minimum seen so far in a sequence?
What are regular expressions and how can I use them in Stata?
How do I remove leading or trailing zeros from string variables?
How do I go through the groups of a variable in order of their first occurrence in the dataset?
How can I drop spells of missing values at the beginning and end of panel data?
How do I deal with multiple responses?
How can I collapse my dataset and keep the same variable labels?
How do I identify runs of consecutive observations in panel data?
How do I select a subset of observations using a complicated criterion?
How can I save one or more parts of a large dataset?
How do you efficiently define group characteristics in your data in order to create subsets?
How do I perform multiple operations on data records if a condition is met?
I am having problems with the reshape command. Can you give further guidance?
How do I produce a dataset based on all possible pairs of identifiers within each group?
Why doesn't the destring command in Stata include an encode option?
How can I create a dataset (matrix) of means (or other statistics) of variables from the current dataset?
How do I calculate the number of distinct values seen so far?
How do I count the number of distinct strings across a set of variables?
How do I compute the number of distinct observations?
How do I tabulate cumulative frequencies?
How do I list observations in a group that differ on a variable?
How do I identify neighbors of points or areas on a rectangular grid in Stata?
How do I identify leap years in Stata?
Why am I getting an error message that there is "insufficient disk space"?
How can I put the current date and time in my log files?
How do I accumulate the results of immediate commands?
Why does the command complain that there are no observations?
Why do I get the error “wrong number of values” when I use insheet to read data from Excel?
Can I use ODBC to write to an existing Excel file?
Why do I get the error message “no room to add more observations”?
How can I use a dataset that is larger than the available RAM?
How can I apply the original value and variable labels after using the reshape command?
Why does my merge produce a dataset with too many observations?
How do I identify duplicate observations in my data?
What do I do if the command I need cannot be used with by?
How do I create a variable that contains a repeating sequence of numbers?
I am having trouble converting a SAS dataset to a Stata dataset using Stat/Transfer. What is wrong?
What is the new reshape command?
Why does reshape give a "too many variables" error?
How do I create a lag variable?
Why am I getting a message that there is no room on my hard drive?