Submit your question

recursive standard deviatin calculation-'d better using a moving estimate

Question:

can someone point me to a recursive algorithm for calculation of a standard deviation? I am processing a huge bunch of data, and need to calculate the standard deviation of the time series, but I cannot hold the data in memory. Therefore I must "update" the estimate of standard deviation with each point. This data does not have a zero (or known) mean, so I will be estimating that as well. It would be simple to update the standard dev. with each point, without going back and updating the mean for each calculated point, but is there some better way to do this?

Answer:

For a stable implementation use integers and avoid overflow: When calculating with integers you never will loose precision using the moving calculations in contrast using floating point numbers. The original data Y may be direct readouts from AD converters or you can scale them to integer. Some ideas: If your data y[i] are 16 bit, than m * Sum (y[i]^2) = Syy < m * 2^32 you sum may exceed 32 bit. But if you will subtract an estimate yEst and the estimate correctet values (indicated by an underscore '_') are y_[i] = y[i]-yEst Sy_ = Sum(y_[k]) ; Syy_ = Sum(y_[k]*y_[k]) y_[i] = y[i]-yEst with fluctuations of only 11 bit (+-1024) then m * Sum (y_[i]^2) = Syy_ < m * 2^22 the sum Syy_ will be less than 32 bit for 1024 points. I prefer a moving estimate, which is continuously updated. The best estimate is the integer part of the actual average yAvg. yAvg = Sy/m yEst = Int(yAvg) = Int(Sy/m) With the estimate correctet values the sum Sy_ = Sum (y[k] - yEst)
= Sum (y[k]) - m*yEst
= m*yAvg - m*yEst
= m* (yAvg-Int(yAvg)) never will exceed m as Abs(yAvg-Int(yAvg) < 1 Therefore after having updated Sy_ (j+1 .. j+m) = yDif = - Y_[j] + Y_[j+m] Syy_(j+1 .. j+m) = Sy_(j .. j+m-1) + yDif * (Y_[j]+Y_[j+m]) the test and update is simple IF m < Abs(Sy_) THEN BEGIN yDiff := Syy_ DIV m; yEst := yEst - yDiff; Sy_ := Sy_ + yDiff * m; Syy_ := Syy_ + yDiff * ( 2*Sy_ - m*yDiff) END; This runs extremely fast especially without floating point processors. Please look at "Moving Standard deviation" in borland.public.delphi.objectpascal

Submit Your Comments and Answers


Submit your comment or answer


Privacy Policy