My dataset contains many observations with missing values.
My question is: should I drop all of these observations
in order to obtain consistent and efficient
estimates?
If you drop the observations, you not only reduce statistical power, you
also assume that the data are missing completely at random (MCAR). This is a
rather strong assumption. It can usually be relaxed by using maximum
likelihood estimation, which only requires the data to be missing at random
(MAR). Despite its name, MAR does not mean that the missingness is purely
random; it may be systematic. However, the probability of a missing value
may depend on the covariates in the model, but not on the dependent variable.
See Schafer & Graham (2002) for an introduction.
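
To make the difference between the two assumptions concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration and not taken from your data: the sample size, the coefficients, and the logistic rule that makes y more likely to be missing when x is large (so the data are MAR but not MCAR). The regression-based estimate is only a simple stand-in for a full maximum likelihood approach.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data: x is always observed, y is sometimes missing.
x = rng.normal(size=n)
y = 2.0 + 1.5 * x + rng.normal(size=n)  # true mean of y is 2.0

# MAR mechanism (assumed): the chance that y is missing depends
# only on the observed x, not on y itself.
p_missing = 1.0 / (1.0 + np.exp(-2.0 * x))
observed = rng.random(n) > p_missing

# Dropping incomplete observations: mean of y over complete cases only.
complete_case_mean = y[observed].mean()

# Model-based alternative: fit y on x using the complete cases, then
# average the fitted values over ALL x, including rows where y is missing.
slope, intercept = np.polyfit(x[observed], y[observed], 1)
model_based_mean = (intercept + slope * x).mean()

print(f"true mean of y:      2.000")
print(f"complete-case mean:  {complete_case_mean:.3f}")
print(f"model-based mean:    {model_based_mean:.3f}")
```

In this setup the complete-case mean should come out clearly below 2, because the rows that survive deletion have systematically smaller x, while the model-based estimate uses the observed x for every row and should land close to the true value.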