| kzs {kzs} | R Documentation |
The Kolmogorov-Zurbenko Spline function utilizes the moving average to construct a piece-wise estimator of the underlying signal of the given input data.
kzs(x, delta, h, k = 1, show.edges = FALSE)
x |
a data frame of paired values X and Y. The data frame should consist of two columns of data representing pairs (Xi, Yi), i = 1,..., n and X, Y are real values; the first column of data represents X values and the second column represents the corresponding Y values. |
delta |
the physical range of smoothing in terms of unit values of X. Restriction: delta << Xn-X1
|
h |
a scale reading of all outcomes of the algorithm. More specifically, h is the interval
width of a uniform scale covering the interval (X1 - delta/2, Xn + delta/2). Restriction: h > 0 and h < min(X[i+1] - X[i])
|
k |
the number of iterations the function will execute; k may also be interpreted as
the order of smoothness (as a polynomial of degree k-1). By default, k is set to perform
a single iteration.
|
show.edges |
a logical indicating whether or not to display the resulting data beyond the range of X
values of the user-supplied data. If FALSE, then the extended edges are suppressed. By
default, this parameter is set to FALSE.
|
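The restrictions on delta and h can be checked before calling the function. The following is a minimal sketch in base R; `check_kzs_params` is a hypothetical helper written for illustration and is not part of the kzs package. Because "delta << Xn - X1" has no exact threshold, the sketch only reports the ratio and leaves the judgment to the user.

```r
# Hypothetical helper (NOT part of the kzs package): checks the documented
# restrictions on delta and h for a two-column data frame of (X, Y) pairs.
check_kzs_params <- function(x, delta, h) {
  xs <- sort(x[[1]])                  # X values in increasing order
  x_range <- xs[length(xs)] - xs[1]   # Xn - X1
  min_gap <- min(diff(xs))            # min(X[i+1] - X[i])
  list(
    delta_ok    = delta > 0 && delta < x_range,  # necessary only; "<<" is a judgment call
    delta_ratio = delta / x_range,               # smaller ratio = delta further below the range
    h_ok        = h > 0 && h < min_gap,
    min_gap     = min_gap
  )
}

# Toy data on a regular grid with spacing 0.25
pts <- data.frame(t = seq(0, 100, by = 0.25), y = 0)
res <- check_kzs_params(pts, delta = 10, h = 0.2)
```

Here `res$h_ok` is TRUE because 0.2 is below the spacing of 0.25; a value such as h = 0.3 would violate the restriction.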
The relation between variables Y and X, expressed as a function Y(x) of the current value
X = x, is often the goal of practical research. Usually we search for some simple
function Y(x) given a data set of pairs (Xi, Yi). These pairs frequently resemble a
noisy plot, so Y(x) should be a smooth outcome of the original data that captures
important patterns while leaving out the noise. The KZS function estimates a
solution to this problem through the use of splines, which are nonparametric estimators of a
function. Given a data set of pairs (Xi, Yi), splines estimate smooth values of Y from the
X values. The KZS function Y(x) averages all values of Yi for all Xi within the range delta around
each scale reading hi along the variable X. The KZS algorithm is designed to smooth all fast
fluctuations in Y within the delta-range in X, while leaving scales longer than delta untouched.
The separation of short scales (less than delta) from long scales (more than delta) becomes more
effective as k increases, while the effective range of separation becomes delta*sqrt(k).
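The averaging step described above can be sketched in base R. This is an illustration under stated assumptions, not the kzs package source: the window is taken to be delta/2 on each side of a scale reading, the grid is limited to the range of the data (comparable to show.edges = FALSE), and the names `window_mean_pass` and `kzs_sketch` are hypothetical.

```r
# Illustrative sketch only; NOT the kzs package implementation.
# One pass: for every grid point g, average all y whose x lies within
# delta/2 of g (assumed window convention).
window_mean_pass <- function(x, y, grid, delta) {
  vapply(grid, function(g) {
    inside <- abs(x - g) <= delta / 2
    if (any(inside)) mean(y[inside]) else NA_real_
  }, numeric(1))
}

# k passes: the first pass maps the raw data onto the uniform grid with
# spacing h; each later pass re-smooths the previous output on that grid.
kzs_sketch <- function(x, y, delta, h, k = 1) {
  grid <- seq(min(x), max(x), by = h)
  est <- window_mean_pass(x, y, grid, delta)
  if (k > 1) {
    for (i in seq_len(k - 1)) {
      keep <- !is.na(est)
      est <- window_mean_pass(grid[keep], est[keep], grid, delta)
    }
  }
  data.frame(x = grid, y = est)
}

set.seed(1)
x <- seq(0, 200, by = 0.5)
signal <- sin(2 * pi * x / 200)            # slow component, scale much longer than delta
y <- signal + rnorm(length(x), sd = 0.5)   # fast noise to be smoothed out
fit1 <- kzs_sketch(x, y, delta = 20, h = 0.25, k = 1)
fit3 <- kzs_sketch(x, y, delta = 20, h = 0.25, k = 3)
```

Plotting `fit1` and `fit3` against `signal` gives a rough feel for how additional iterations change the estimate; the package's own edge handling and output may differ from this sketch.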
a two-column data frame containing:
Xk |
X values resulting from execution of algorithm |
Y(Xk) |
Y values resulting from execution of algorithm |
The KZS function is designed for the general situation, including time series data. In many
applications where the variable X is time, KZS addresses the problem of missing values in
time series or of irregularly observed values in longitudinal data analysis.
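To illustrate why averaging onto a uniform scale copes with gaps, the sketch below applies a single window-average pass to irregularly observed data with a block of observations removed. It uses base R only and is not the kzs implementation; the data, the delta/2 half-width, and all variable names are assumptions made for the example.

```r
# Illustrative sketch, not the kzs implementation: one window-average pass
# on a uniform grid, applied to irregularly observed data with a gap.
set.seed(42)
t_obs <- sort(runif(150, min = 0, max = 100))   # irregular observation times
t_obs <- t_obs[t_obs < 40 | t_obs > 45]         # remove a block to mimic missing values
y_obs <- 0.05 * t_obs + rnorm(length(t_obs), sd = 0.2)

delta <- 20                                     # smoothing range, wider than the gap
grid <- seq(0, 100, by = 0.5)                   # uniform scale of output points

# Average every observation whose time lies within delta/2 of the grid point
est <- sapply(grid, function(g) mean(y_obs[abs(t_obs - g) <= delta / 2]))
filled <- data.frame(t = grid, y = est)
```

Because delta is wider than the gap, grid points between 40 and 45 still receive an estimate from neighbouring observations, even though no data were observed there.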
Derek Cyr dc896148@albany.edu and Igor Zurbenko igorg.zurbenko@gmail.com
"Spline Smoothing." http://economics.about.com/od/economicsglossary/g/splines.htm
# This example was created with the intent to push the limits of KZS. The
# function has a wide peak and a sharp peak; a wide peak permits stronger
# smoothing, whereas a sharp peak does not (it would be oversmoothed). The
# key is to find suitable values for the parameters.
# EXAMPLE 1
t <- seq(from=-round(400*pi),to=round(400*pi),by=.25) #Total time t
tp <- seq(from=0,to=round(400*pi),by=.25) #Positive t (includes t=0)
tn <- seq(from=-round(400*pi),to=-.25,by=.25) #Negative t
nobs <- 1:length(t) #Sequence of obs
# True signal
signalp <- 0.5*sin(sqrt((2*pi*abs(tp))/200)) #Positive side of signal
signaln <- 0.5*sin(-sqrt((2*pi*abs(tn))/200)) #Negative side of signal
signal <- append(signaln,signalp,after=length(tn)) #Appending into one signal
# Randomly generate noise from the standard normal distribution
et <- rnorm(length(t),mean=0,sd=1)
# Add the noise to the true signal
yt <- et + signal
# Data frame of (t,yt)
pts <- data.frame(t,yt)
# Plot of the true signal
plot(signal~t,xlab='t',ylab='Signal',main='True Signal',type="l")
# Plot of yt (signal + noise)
plot(yt~t,ylab=expression(paste(Y[t])),main='Signal buried in noise',type="p")
# Apply KZS function - 3 iterations
kzs(pts,delta=80,h=.2,k=3,show.edges=FALSE)
lines(signal~t,col="red")
title(main="KZS(delta=80, h=0.20, k=3, show.edges=false)")
legend("topright", c("True signal","KZS estimate"), cex=0.8, col=c("red","black"),
lty=1:1, lwd=2, bty="n")
# EXAMPLE 2 - Rerun KZS on the same function after removing 20% of the data
# points. This provides an opportunity to create a random scale
# along the variable X.
# Generate and remove a random 20% of t
t20 <- sample(nobs,size=floor(length(nobs)/5)) #Random 20% of (t,yt), whole number of rows
pts20 <- pts[-t20,] #Remove the 20%
# Plot of (t,yt) with 20% removal
plot(pts20$yt~pts20$t,xlab='t',ylab=expression(paste(Y[t])),
main='Signal buried in noise - 20% removal',type="p")
# Apply KZS function - 3 iterations
kzs(pts20,delta=80,h=.20,k=3,show.edges=FALSE)
lines(signal~t,col="red")
title(main="KZS(delta=80, h=0.20, k=3, show.edges=false) - 20% removal")
legend("topright", c("True signal","KZS estimate"), cex=0.8, col=c("red","black"),
lty=1:1, lwd=2, bty="n")