\name{DCluster}

\alias{DCluster}

\title{A package for the detection of spatial clusters of diseases
for count data}


\description{
DCluster is a collection of several methods related to the detection
of spatial clusters of diseases. Many widely used methods, such as Openshaw's
GAM, Besag and Newell, Kulldorff and Nagarwalla, and others
have been implemented.

Besides the calculation of these statistics, bootstrap can be used
to test their departure from the null hypothesis, which is that there is
no clustering in the study area. Four sampling methods can be used to perform the
simulations: permutation, Multinomial, Poisson and Poisson-Gamma.

Minor modifications have been made to the methods so that they use the standardized
expected number of cases instead of the population, since this provides a better
approximation to the expected number of cases.
}%description


\section{Introduction}{

We always suppose that we are working on a study region which is divided
into \emph{n} non-overlapping smaller areas where data are measured. The data
measured are usually counts of people suffering from a disease or of deaths. This is
referred to as the \emph{Observed number of cases}. For a given area, its observed
number of cases is denoted by \eqn{O_i}{O_i} and the sum of these
quantities over the whole study region is \eqn{O_+}{O_+}.

\emph{Population} and \emph{Standardized Expected number of cases} are defined
in the same way and are denoted by \eqn{P_i}{P_i} and
\eqn{E_i}{E_i}, respectively. The sums of these quantities over the study region
are represented by \eqn{P_+}{P_+} and \eqn{E_+}{E_+}.

The basic assumption for the data is that they are independent
observations from a Poisson distribution whose mean is 
\eqn{\theta_iE_i}{theta_i E_i}, where \eqn{\theta_i}{theta_i}
is the relative risk. That is,

\deqn{O_i \sim Po(\theta_i E_i); \ i=1, \ldots , n}{O_i ~ Po(theta_i E_i); i=1, ..., n}

}%section{Introduction}


\section{Null hypotheses}{

The null hypothesis is usually that all relative risks are equal, that is

\deqn{H_0: \theta_1= \ldots = \theta_n = \lambda}{H_0: theta_1= ... = theta_n = lambda}

\eqn{\lambda}{lambda} may be considered to be known (equal to one, which means standard
risk) or unknown. In the latter case, \eqn{E_i}{E_i} must be corrected
by multiplying it by the overall relative risk \eqn{\frac{O_+}{E_+}}{O_+/E_+}.
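
As a minimal sketch of this correction in base R (the vectors \emph{O} and
\emph{E} below are made-up illustrative values, not package data):

\preformatted{
# Hypothetical observed and expected counts for five areas
O <- c(4, 7, 2, 9, 3)
E <- c(5.0, 6.5, 3.0, 6.0, 2.5)

# Overall relative risk O_+/E_+, used when lambda is unknown
lambda.hat <- sum(O) / sum(E)

# Corrected expected counts; they now add up to O_+
E.corrected <- E * lambda.hat
sum(E.corrected)
}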

}%section{Null hypotheses}


\section{Code structure}{
Function names follow a common format, which is as follows:

\describe{
\item{\emph{method name}.stat}{Calculate the statistic itself.}
\item{\emph{method name}.boot}{Perform a non-parametric bootstrap.}
\item{\emph{method name}.pboot}{Perform a parametric bootstrap.}
}

Openshaw's G.A.M. has been implemented in a general way in a function called
\emph{opgam}, which some other methods (Kulldorff and Nagarwalla, Besag and Newell) also
use, since they are based on a window scan of the whole region. At every point
of the grid, a function is called to determine whether that point is a cluster
or not. The name of this function is \emph{shortened method name}.iscluster.

This function calculates the local value of the statistic involved and
its significance by means of bootstrap. The interface provided through
function \emph{opgam} is quite straightforward to use, and it can handle the 
three methods mentioned as well as others supplied by the user.

}


\section{Bootstrap procedures}{

Four bootstrap models have been provided in order to estimate the
sampling distributions of the statistics. The first one is a
non-parametric bootstrap, which performs permutations over the observed number
of cases, while the other three are parametric bootstraps based on the
Multinomial, Poisson and Poisson-Gamma distributions.

The permutation method takes the observed numbers of cases and permutes them among
all regions, in order to assess whether the risk is uniform across the whole study area.
It should be used with care, since a sparsely populated area may be assigned
more observed cases than its population.

Multinomial sampling is based on conditioning the Poisson framework
on \eqn{O_+}{O_+}. This way \eqn{(O_1, \ldots, O_n)}{(O_1, ..., O_n)}
follows a multinomial distribution of size \eqn{O_+}{O_+} and 
probabilities \eqn{(\frac{E_1}{E_+}, \ldots, \frac{E_n}{E_+})}{(E_1/E_+, ..., E_n/E_+)}.

Poisson sampling generates the observed number of cases from a Poisson
distribution whose mean is \eqn{E_i}{E_i}.

Poisson-Gamma sampling is based on the Poisson-Gamma model proposed
by \emph{Clayton and Kaldor} (1987):

\deqn{O_i|\theta_i \sim Po(\theta_i E_i)}{O_i | theta_i ~ Po(theta_i E_i)}

\deqn{\theta_i \sim Ga(\nu, \alpha)}{theta_i ~ Ga(nu, alpha)}


The marginal distribution of \eqn{O_i}{O_i}, unconditional on \eqn{\theta_i}{theta_i}, is
Negative Binomial with size \eqn{\nu}{nu} and probability
\eqn{\frac{\alpha}{\alpha+E_i}}{alpha/(alpha+E_i)}. The two parameters can be
estimated using an Empirical Bayes approach from the Expected and Observed
number of cases. Function \emph{empbaysmooth} is provided for this purpose.
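
The four sampling schemes can be sketched in base R as follows. This is only
an illustration with made-up counts and assumed values for \emph{nu} and
\emph{alpha}; it is not the code used inside the package.

\preformatted{
set.seed(1)
O <- c(4, 7, 2, 9, 3)            # hypothetical observed cases
E <- c(5.0, 6.5, 3.0, 6.0, 4.5)  # hypothetical expected cases, same total as O
n <- length(O)

# Permutation: shuffle the observed counts among the areas
o.perm <- sample(O)

# Multinomial: condition on O_+ with probabilities E_i/E_+
o.multi <- as.vector(rmultinom(1, size = sum(O), prob = E / sum(E)))

# Poisson: independent counts with mean E_i
o.pois <- rpois(n, lambda = E)

# Poisson-Gamma: Negative Binomial with size nu and probability
# alpha/(alpha+E_i); nu and alpha are assumed values here
nu <- 2
alpha <- 2
o.negbin <- rnbinom(n, size = nu, prob = alpha / (alpha + E))
}

Note that only the permutation and Multinomial schemes keep the total number
of cases fixed at \eqn{O_+}{O_+}.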


}%section{Bootstrap procedures}


\section{Data}{

One of the arguments passed to many of the functions in this package, usually
called \emph{data}, is a dataframe which contains the data for each
of the regions used in the analysis. Its columns must be labeled as follows:


\describe{

\item{\bold{Observed}}{Observed number of cases.}

\item{\bold{Expected}}{Standardized expected number of cases.}

\item{\bold{Population}}{Population at risk.}

\item{\bold{x}}{Easting coordinate of the region centroid.}

\item{\bold{y}}{Northing coordinate of the region centroid.}

}
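
For illustration only, a dataframe with this layout could be built as follows.
All values are made up, and the expected counts are obtained here by internal
standardization, that is, proportionally to the population:

\preformatted{
d <- data.frame(
    Observed   = c(4, 7, 2),
    Population = c(1000, 1500, 800),
    x          = c(0.5, 1.5, 2.5),
    y          = c(1.0, 1.0, 2.0)
)

# Expected cases proportional to population, so that sum(Expected) == sum(Observed)
d$Expected <- d$Population * sum(d$Observed) / sum(d$Population)
}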

}%section{Data}



%\seealso{
%}

\references{
Clayton, David and Kaldor, John (1987). Empirical Bayes Estimates of Age-standardized Relative Risks for Use in Disease Mapping. Biometrics 43, 671-681.

Lawson et al (eds.) (1999). Disease Mapping and Risk Assessment for Public
Health. John Wiley and Sons, Inc.

Lawson, A. B. (2001). Statistical Methods in Spatial Epidemiology. John Wiley and Sons, Inc.
}



\keyword{spatial}

\eof
\name{achisq}

\alias{achisq}

\title{Another implementation of Pearson's Chi-square statistic}


\description{
Another implementation of Pearson's Chi-square has been written
to fit the needs of package \emph{DCluster}.

\emph{achisq.stat} is the function that calculates the value of the statistic
for the data.

\emph{achisq.boot} is used when performing a non-parametric bootstrap.

\emph{achisq.pboot} is used when performing a parametric bootstrap.
}

\details{
This statistic can be used to detect whether the observed data
depart significantly (above or below) from the expected number of cases.
The null hypothesis tested is that the relative risks of all areas
are equal to a common constant \eqn{\lambda}{lambda}, while
the alternative hypothesis is that not all relative risks are equal.

The actual value of the statistic depends on the null hypothesis.
If we consider that all the relative risks are equal to 1, the 
value is

\deqn{T=\sum_i\frac{(O_i-E_i)^2}{E_i}}{T = sum_i ( (O_i-E_i)^2/E_i )}

and the number of degrees of freedom is equal to the number of regions.


On the other hand, if we just consider relative risks to be equal, without
specifying their value (i.e., \eqn{\lambda}{lambda} is unknown),
\eqn{E_i}{E_i} must be substituted by \eqn{E_i\frac{O_+}{E_+}}{E_i*(O_+/E_+)}
and the number of degrees of freedom is the number of regions minus one.

When internal standardization is used, the null hypothesis must 
be that all relative risks are equal to 1, and the number of degrees
of freedom is the number of regions minus one. This is due to the
fact that, in this case, \eqn{O_+=E_+}{O_+=E_+}.
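
A minimal base R sketch of both versions of the statistic is given next. The
counts are made up and this is not the package implementation, only a direct
transcription of the formulae above:

\preformatted{
O <- c(4, 7, 2, 9, 3)             # hypothetical observed cases
E <- c(5.0, 6.5, 3.0, 6.0, 2.5)   # hypothetical expected cases
n <- length(O)

# lambda known and equal to 1: n degrees of freedom
T1 <- sum((O - E)^2 / E)
p1 <- pchisq(T1, df = n, lower.tail = FALSE)

# lambda unknown: rescale E_i by O_+/E_+, n - 1 degrees of freedom
E2 <- E * sum(O) / sum(E)
T2 <- sum((O - E2)^2 / E2)
p2 <- pchisq(T2, df = n - 1, lower.tail = FALSE)
}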
}

\seealso{
DCluster, achisq.stat, achisq.boot, achisq.pboot
}

\references{
Potthoff, R. F. and Whittinghill, M. (1966). Testing for Homogeneity: I. The Binomial and Multinomial Distributions. Biometrika 53, 167-182.

Potthoff, R. F. and Whittinghill, M. (1966). Testing for Homogeneity: II. The Poisson Distribution. Biometrika 53, 183-190.
}

\keyword{htest}

\eof
\name{achisq.boot}

\alias{achisq.boot}
\alias{achisq.pboot}

\title{Bootstrap replicates of Pearson's Chi-square statistic}


\description{

Generate bootstrap replicates of Pearson's Chi-square statistic (function
\emph{achisq.stat}), by means of function \emph{boot} from the \emph{boot}
package. Notice that these functions should not be called directly but passed as
argument \emph{statistic} when calling function \emph{boot}.


\emph{achisq.boot} is used when performing a non-parametric bootstrap.

\emph{achisq.pboot} is used when performing a parametric bootstrap.
}


\usage{
achisq.boot(data, i, ...)
achisq.pboot(...)
}

\arguments{
\item{data}{A dataframe containing the data, as specified in
\emph{DCluster} manpage.}
\item{i}{Permutation generated by the non-parametric bootstrap procedure.}
\item{...}{Additional arguments passed when performing a bootstrap.}
}

\value{
Both functions return the value of the statistic.
}

\seealso{
DCluster, boot, achisq, achisq.stat
}


\examples{
library(boot)
library(spdep)

data(nc.sids)

sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74))

niter<-100

#Permutation  model
chq.perboot<-boot(sids, statistic=achisq.boot, R=niter)
plot(chq.perboot)#Display results

#Multinomial model
chq.mboot<-boot(sids, statistic=achisq.pboot, sim="parametric", ran.gen=multinom.sim,  R=niter)
plot(chq.mboot)#Display results

#Poisson model
chq.pboot<-boot(sids, statistic=achisq.pboot, sim="parametric", ran.gen=poisson.sim,  R=niter)
plot(chq.pboot)#Display results

#Poisson-Gamma model
chq.pgboot<-boot(sids, statistic=achisq.pboot, sim="parametric", ran.gen=negbin.sim, R=niter)
plot(chq.pgboot)#Display results
}

\references{
Potthoff, R. F. and Whittinghill, M. (1966). Testing for Homogeneity: I. The Binomial and Multinomial Distributions. Biometrika 53, 167-182.

Potthoff, R. F. and Whittinghill, M. (1966). Testing for Homogeneity: II. The Poisson Distribution. Biometrika 53, 183-190.
}

\keyword{htest}

\eof
\name{achisq.stat}

\alias{achisq.stat}

\title{Another implementation of Pearson's Chi-square statistic}


\description{
Compute Pearson's Chi-square statistic. See \emph{achisq} manual page
for more details.
}


\usage{
achisq.stat(data, lambda=NULL)
}

\arguments{
\item{data}{A dataframe containing the data, as specified in the 
\bold{DCluster} manpage.}
\item{lambda}{The value of the relative risks under the null hypothesis.
If it is NULL, the relative risks are assumed to be equal to an unknown
constant (see the \emph{achisq} manual page)
and the expected number of cases will automatically be corrected.}
}

\value{
A list with three components:
\item{T}{The value of the statistic.}
\item{df}{Degrees of freedom of the asymptotic Chi-square distribution.}
\item{pvalue}{Associated p-value.}
}


\seealso{
DCluster, achisq, achisq.boot, achisq.pboot
}


\examples{
library(spdep)

data(nc.sids)

sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74))

#Compute the statistic under the assumption that lambda = 1.
achisq.stat(sids, lambda=1)
}

\references{
Potthoff, R. F. and Whittinghill, M. (1966). Testing for Homogeneity: I. The Binomial and Multinomial Distributions. Biometrika 53, 167-182.

Potthoff, R. F. and Whittinghill, M. (1966). Testing for Homogeneity: II. The Poisson Distribution. Biometrika 53, 183-190.
}

\keyword{htest}

\eof
\name{besagnewell}

\alias{besagnewell}

\title{Besag and Newell's statistic for spatial clustering}

\description{
Besag and Newell's statistic looks for clusters of size \emph{k}, i.e., where
the number of observed cases is \emph{k}. At every area where a case has
appeared, the number of neighbouring regions needed to reach \emph{k} cases is
calculated. If this number is too small, that is, there are too many observed cases in
just a few regions with low expected cases, then the area is marked as a cluster.
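
As a rough sketch of this idea under a Poisson model, the test around one area
could be written in base R as shown below. The data are made up and assumed to
be already sorted by distance to the area under study; this is an illustration
and not the package code.

\preformatted{
# Hypothetical counts, ordered by increasing distance to the current area
O <- c(3, 1, 4, 0, 2, 5)
E <- c(1.0, 1.5, 2.0, 1.0, 2.5, 3.0)
k <- 6

# Number of areas needed to accumulate at least k observed cases
l <- which(cumsum(O) >= k)[1]

# Probability of k or more cases in those l areas if counts are Poisson
pvalue <- ppois(k - 1, lambda = sum(E[1:l]), lower.tail = FALSE)
}

A small value of \emph{pvalue} means that \emph{k} cases were reached in fewer
areas than would be expected from their expected counts.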
}


\seealso{
DCluster, besagnewell.stat, besagnewell.boot, besagnewell.pboot, bn.iscluster
}

\references{
Besag, J. and Newell, J.(1991). The detection of clusters in rare diseases. 
Journal of the Royal Statistical Society A  154, 143-155.
}

\examples{
#B&N must use the centroids as grid.
#The size of the cluster is 20.
#100 bootstrap simulations are performed.
#Poisson is the model used in the bootstrap simulations to generate the
#observations.
#Significance level is 0.05/100.
library(boot)
library(spdep)

data(nc.sids)

sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74))
sids<-cbind(sids, x=nc.sids$x, y=nc.sids$y)

bnresults<-opgam(sids, thegrid=sids[,c("x","y")], alpha=.05/100, 
	iscluster=bn.iscluster, set.idxorder=TRUE, k=20, model="poisson", 
	R=100, mle=NULL )

#Plot all the centroids
plot(sids$x, sids$y)

#Plot significant centroids in red
points(bnresults$x, bnresults$y, col="red", pch=19)
}

\keyword{spatial}

\eof
\name{besagnewell.boot}

\alias{besagnewell.boot}
\alias{besagnewell.pboot}

\title{Generate bootstrap replicates of Besag and Newell's statistic}

\description{
Generate bootstrap replicates of Besag and Newell's statistic, by means of
function \emph{boot} from the \emph{boot} package. Notice that these functions
should not be called directly but passed as argument \emph{statistic} when calling
function \emph{boot}.


\emph{besagnewell.boot} is used when performing a non-parametric bootstrap.

\emph{besagnewell.pboot} is used when performing a parametric bootstrap.


When the sampling model is \emph{Multinomial} or \emph{Poisson} it is quite
straightforward to obtain the actual p-value as shown in the examples. When
\emph{Permutation} or \emph{Negative Binomial} are used, simulation must be
used to estimate significance.  }


\usage{
besagnewell.boot(data, i, ...)
besagnewell.pboot(...)
}


\arguments{
\item{data}{A dataframe with the data, as explained in \emph{DCluster}.}
\item{i}{Permutation generated by the non-parametric bootstrap.}
\item{...}{Additional arguments needed.}
}


\value{
Both functions return the value of the statistic.
}

\seealso{
DCluster, boot, besagnewell, besagnewell.stat, bn.iscluster
}

\references{
Besag, J. and Newell, J.(1991). The detection of clusters in rare diseases. 
Journal of the Royal Statistical Society A  154, 143-155.
}

\examples{
library(boot)
library(spdep)

data(nc.sids)

sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74))
sids<-cbind(sids, x=nc.sids$x, y=nc.sids$y)

niter<-100

#Permutation  model
besn.perboot<-boot(sids, statistic=besagnewell.boot, R=niter, k=20)
plot(besn.perboot)#Display results
}

\keyword{spatial}

\eof
\name{besagnewell.stat}

\alias{besagnewell.stat}

\title{Besag and Newell's statistic for spatial clustering}

\description{
\emph{besagnewell.stat} computes the statistic around a single location.
The data passed must be sorted by distance to the central region,
which is assumed to be the first row of the dataframe. Notice that the
size of the cluster is \emph{k+1}.

}


\usage{
besagnewell.stat(data, k)
}


\arguments{
\item{data}{A dataframe with the data, as explained in \emph{DCluster}.}
\item{k}{Cluster size.}
}


\value{
A vector of two elements: the value of the statistic and the size of the
cluster (which is equal to the value of the statistic).
}

\seealso{
DCluster, besagnewell, besagnewell.boot, besagnewell.pboot
}

\references{
Besag, J. and Newell, J.(1991). The detection of clusters in rare diseases. 
Journal of the Royal Statistical Society A  154, 143-155.
}

\examples{
library(spdep)

data(nc.sids)

sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74))
sids<-cbind(sids, x=nc.sids$x, y=nc.sids$y)

besagnewell.stat(sids, k=20)
}

\keyword{spatial}

\eof
\name{bn.iscluster}

\alias{bn.iscluster}

\title{Clustering function for Besag and Newell's method}

\description{
This function is used to calculate the significance of the aggregation
of cases around the current area when scanning the whole
area by means of function \emph{opgam}.

When the data sampling distribution is \emph{multinomial} or \emph{poisson} 
the exact p-value is computed. In the other cases (i.e.,
permutation and negative binomial) it is approximated by bootstrap.

This function must be passed to function \emph{opgam} as argument
\emph{iscluster}.
}


\usage{
bn.iscluster(data, idx, idxorder, alpha, k, model="poisson", R=999, mle)
}


\arguments{
\item{data}{A dataframe with the data as explained in \emph{DCluster}.}
\item{idx}{A logical vector indicating the areas in the current circle.}
\item{idxorder}{A permutation of the rows of data to order the regions
according to their distance to the current centre.}
\item{alpha}{Significance level of the test.}
\item{k}{Size of the cluster.}
\item{model}{The model used to generate random observations. It can be
'permutation', 'multinomial', 'poisson' or 'negbin'.}
\item{R}{Number of bootstrap replicates used to compute the p-value of
the local test.}
\item{mle}{Parameters needed to compute the Negative Binomial distribution (if used). See \emph{negbin.sim} manual page for details.}
}

\value{
A vector of four elements, as described in \emph{iscluster} manual page.
}



\references{
Besag, J. and Newell, J.(1991). The detection of clusters in rare diseases.
Journal of the Royal Statistical Society A  154, 143-155.
}


\examples{
library(boot)
library(spdep)

data(nc.sids)

sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74))
sids<-cbind(sids, x=nc.sids$x, y=nc.sids$y)

#B&N's method
bnresults<-opgam(data=sids, thegrid=sids[,c("x","y")], alpha=.05/100, 
	iscluster=bn.iscluster, k=20, R=100, model="poisson", 
	mle=calculate.mle(sids))

#Plot all centroids and significant ones in red
plot(sids$x, sids$y, main="Besag & Newell's method")
points(bnresults$x, bnresults$y, col="red", pch=19)
}

\seealso{
DCluster, besagnewell, besagnewell.boot, besagnewell.pboot
}

\keyword{spatial}

\eof
\name{calculate.mle}

\alias{calculate.mle}

\title{Calculate parameters involved in sampling procedures}

\description{
When bootstrap is used to sample values of the statistic under study,
it is possible to use argument \emph{mle} to pass the values of
the parameters involved in the sampling procedure.
}

\usage{
calculate.mle(d, model="poisson")
}

\arguments{
\item{d}{A dataframe as described in the \emph{DCluster} manual page.}
\item{model}{Model used to sample data. It can be either "multinomial",
"poisson" or "negbin".}
}


\value{
A list with the estimates of the parameters involved in the model:

\describe{
\item{Multinomial}{Total observed cases (\emph{n}) and vector of probabilities
(\emph{p}).}
\item{Poisson}{Total number of regions (\emph{n}) and vector of means 
(\emph{lambda}).}
\item{Negative Binomial (Poisson-Gamma)}{Total number of regions (\emph{n}), 
size and probabilities, calculated after estimating parameters 
\emph{nu} and \emph{alpha} of the Gamma distribution following the equations 
proposed by Clayton and Kaldor (1987).}
}%describe
}

\seealso{
DCluster, observed.sim
}

\examples{
library(spdep)

data(nc.sids)

sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74))
sids<-cbind(sids, x=nc.sids$x, y=nc.sids$y)


#Estimators for Multinomial distribution
datasim<-multinom.sim(sids, mle=calculate.mle(sids, model="multinomial") )

#Estimators for Poisson distribution
datasim<-poisson.sim(sids, mle=calculate.mle(sids, model="poisson") )

#Estimators for Negative Binomial distribution
datasim<-negbin.sim(sids, mle=calculate.mle(sids, model="negbin") )

}


\keyword{distribution}

\eof
\name{empbaysmooth}

\alias{empbaysmooth}


\title{Empirical Bayes Smoothing}

\description{
Smooth relative risks from a set of expected and observed number of cases
using a Poisson-Gamma model as proposed by \emph{Clayton and Kaldor} (1987).

If \eqn{\nu}{nu} and \eqn{\alpha}{alpha} are the two parameters of the 
prior Gamma distribution, smoothed relative risks are
\eqn{\frac{O_i+\nu}{E_i+\alpha}}{(O_i+nu)/(E_i+alpha)}.

\eqn{\nu}{nu} and \eqn{\alpha}{alpha} are estimated via Empirical Bayes,
by using mean and variance, as described by \emph{Clayton and Kaldor} (1987).

Size and probabilities for a Negative Binomial model are also calculated (see
below).

See \emph{Details} for more information.
}

\usage{
empbaysmooth(Observed, Expected, maxiter=20, tol=1e-5)
}

\arguments{
\item{Observed}{Vector of observed cases.}
\item{Expected}{Vector of expected cases.}
\item{maxiter}{Maximum number of iterations allowed.}
\item{tol}{Tolerance used to stop the iterative procedure.}
}


\details{
The Poisson-Gamma model, as described by \emph{Clayton and Kaldor},
is a two-layer Bayesian hierarchical model:

\deqn{O_i|\theta_i \sim Po(\theta_i E_i)}{O_i|theta_i ~ Po(theta_i E_i)}

\deqn{\theta_i \sim Ga(\nu, \alpha)}{theta_i ~ Ga(nu, alpha)}

The marginal distribution of \eqn{O_i}{O_i}, unconditional on
\eqn{\theta_i}{theta_i}, is Negative Binomial with size \eqn{\nu}{nu} and
probability \eqn{\alpha/(\alpha+E_i)}{alpha/(alpha+E_i)}.
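
This Negative Binomial form can be checked numerically in base R by averaging
the Poisson probability over the Gamma distribution of
\eqn{\theta_i}{theta_i}. The values of \emph{nu}, \emph{alpha}, \emph{E} and
\emph{k} below are arbitrary and only serve as an illustration, and the Gamma
distribution is assumed to be parameterised by shape \emph{nu} and rate
\emph{alpha}:

\preformatted{
nu <- 2.5      # assumed Gamma shape
alpha <- 1.5   # assumed Gamma rate
E <- 4         # assumed expected count for one area
k <- 3         # number of cases at which both probabilities are compared

# Poisson probability of k cases averaged over the Gamma distribution of theta
mixture <- integrate(function(theta)
    dpois(k, theta * E) * dgamma(theta, shape = nu, rate = alpha),
    lower = 0, upper = Inf)$value

# Negative Binomial probability with size nu and prob alpha/(alpha+E)
negbin <- dnbinom(k, size = nu, prob = alpha / (alpha + E))

# Both values agree up to the numerical integration error
abs(mixture - negbin)
}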

The estimators of relative risks are
\eqn{\widehat{\theta}_i=\frac{O_i+\nu}{E_i+\alpha}}{thetahat_i=(O_i+nu)/(E_i+alpha)}.
Estimators of \eqn{\nu}{nu} and \eqn{\alpha}{alpha}
(\eqn{\widehat{\nu}}{nuhat} and \eqn{\widehat{\alpha}}{alphahat}, respectively)
are calculated by means of an iterative procedure using these two equations
(based on mean and variance estimations):

\deqn{\frac{\widehat{\nu}}{\widehat{\alpha}}=\frac{1}{n}\sum_{i=1}^n
\widehat{\theta}_i}{nuhat/alphahat=(1/n)*sum_i(thetahat_i)}

\deqn{\frac{\widehat{\nu}}{\widehat{\alpha}^2}=\frac{1}{n-1}\sum_{i=1}^n(1+\frac{\widehat{\alpha}}{E_i})(\widehat{\theta}_i-\frac{\widehat{\nu}}{\widehat{\alpha}})^2}{nuhat/alphahat^2 = (1/(n-1))*sum_i[(1+alphahat/E_i)*(thetahat_i-nuhat/alphahat)^2]}



}%\details

\value{
A list of six elements:
\item{n}{Number of regions.}
\item{nu}{Estimation of parameter \eqn{\nu}{nu}.}
\item{alpha}{Estimation of parameter \eqn{\alpha}{alpha}.}
\item{smthrr}{Vector of smoothed relative risks.}
\item{size}{Size parameter of the Negative Binomial. It is equal to 
\eqn{\widehat{\nu}}{nuhat}.}
\item{prob}{Vector of probabilities of the Negative Binomial,
calculated as
\deqn{\frac{\widehat{\alpha}}{\widehat{\alpha}+E_i}}{alphahat/(alphahat+E_i)}.}
}

\examples{
library(spdep)

data(nc.sids)

sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74))

smth<-empbaysmooth(sids$Observed, sids$Expected)
}

\references{
Clayton, David and Kaldor, John (1987). Empirical Bayes Estimates of Age-standardized Relative Risks for Use in Disease Mapping. Biometrics 43, 671-681.
}

\keyword{models}

\eof
\name{gearyc}

\alias{gearyc}

\title{Geary's c autocorrelation statistic}


\description{
Geary's c statistic is used to measure autocorrelation between areas within
a region, as follows:

\deqn{
c=\frac{(n-1)\sum_i \sum_j W_{ij}(Z_i-Z_j)^2}{2(\sum_i\sum_jW_{ij})\sum_k (Z_k-\overline{Z})^2}
}{
c = (n-1) [sum_i sum_j W_ij (Z_i-Z_j)^2]/[2(sum_i sum_j W_ij) sum_k (Z_k-mean(Z))^2]
}

\eqn{W}{W} is a square matrix which represents the relationship between each
pair of regions. A usual approach is to set \eqn{W_{ij}}{W_ij} to 1 if regions
\eqn{i}{i} and \eqn{j}{j} have a common boundary and 0 otherwise, or it may
represent the inverse distance between the centroids of the two regions.

Small values of this statistic may indicate the presence of highly 
correlated areas, which may be a cluster.
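
The formula can be evaluated directly in base R. The following sketch uses a
made-up layout of four areas on a line and made-up values \emph{Z}; it only
illustrates the formula and does not use the package functions:

\preformatted{
# Four areas on a line; consecutive areas share a boundary (assumed layout)
W <- matrix(c(0, 1, 0, 0,
              1, 0, 1, 0,
              0, 1, 0, 1,
              0, 0, 1, 0), nrow = 4, byrow = TRUE)
Z <- c(1, 2, 3, 4)   # made-up values with a smooth spatial trend
n <- length(Z)

# Squared differences (Z_i - Z_j)^2 for every pair of areas
D2 <- outer(Z, Z, "-")^2

geary.c <- (n - 1) * sum(W * D2) / (2 * sum(W) * sum((Z - mean(Z))^2))
geary.c   # 0.3: neighbouring values are similar, so c is small
}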
}


\seealso{
DCluster, gearyc.stat, gearyc.boot, gearyc.pboot
}

\references{
Geary, R. C. (1954). The contiguity ratio and statistical mapping. The Incorporated Statistician 5, 115-145.
}

\keyword{spatial}

\eof
\name{gearyc.boot}

\alias{gearyc.boot}
\alias{gearyc.pboot}

\title{Generate bootstrap replicates of Geary's c autocorrelation statistic}


\description{
Generate bootstrap replicates of Geary's c autocorrelation statistic, by means
of function \emph{boot} from the \emph{boot} package. Notice that these functions
should not be called directly but passed as argument \emph{statistic} when calling
function \emph{boot}.

\emph{gearyc.boot} is used when performing a non-parametric bootstrap.

\emph{gearyc.pboot} is used when performing a parametric bootstrap.
}


\usage{
gearyc.boot(data, i, ...)
gearyc.pboot(...)
}

\arguments{
\item{data}{A dataframe containing the data, as specified in the 
\bold{DCluster} manpage.}
\item{i}{Permutation generated by the bootstrap procedure.}
\item{...}{Additional arguments passed when performing a bootstrap.}
}

\value{
Both functions return the value of the statistic.
}

\seealso{
DCluster, boot, gearyc, gearyc.stat
}


\examples{
library(boot)
library(spdep)

data(nc.sids)
col.W <- nb2listw(ncCR85.nb, zero.policy=TRUE)

sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74))


niter<-100

#Permutation model
gc.perboot<-boot(sids, statistic=gearyc.boot, R=niter, listw=col.W,
	n=length(ncCR85.nb), n1=length(ncCR85.nb)-1, S0=Szero(col.W) )
plot(gc.perboot)#Display results

#Multinomial model
gc.mboot<-boot(sids, statistic=gearyc.pboot, sim="parametric", 
	ran.gen=multinom.sim, R=niter, listw=col.W,
        n=length(ncCR85.nb), n1=length(ncCR85.nb)-1, S0=Szero(col.W) )
plot(gc.mboot)#Display results

#Poisson model
gc.pboot<-boot(sids, statistic=gearyc.pboot, sim="parametric", 
	ran.gen=poisson.sim, R=niter, listw=col.W,
	n=length(ncCR85.nb), n1=length(ncCR85.nb)-1, S0=Szero(col.W) )
plot(gc.pboot)#Display results

#Poisson-Gamma model
gc.pgboot<-boot(sids, statistic=gearyc.pboot, sim="parametric", 
	ran.gen=negbin.sim, R=niter, listw=col.W,
	n=length(ncCR85.nb), n1=length(ncCR85.nb)-1, S0=Szero(col.W) )
plot(gc.pgboot)#Display results

}

\references{
Geary, R. C. (1954). The contiguity ratio and statistical mapping. The Incorporated Statistician 5, 115-145.
}

\keyword{spatial}

\eof
\name{gearyc.stat}

\alias{gearyc.stat}

\title{Compute Geary's c autocorrelation statistic}


\description{
Compute Geary's c autocorrelation statistic using either \bold{residuals}
or \bold{SMRs} by means of function \emph{geary} from package \emph{spdep}.
}


\usage{
gearyc.stat(data, applyto="residuals", ...)
}

\arguments{
\item{data}{A dataframe containing the data, as specified in the 
\bold{DCluster} manpage.}
\item{applyto}{A string with the name of the quantity to which
Geary's c is applied. It may be either \emph{residuals}
or \emph{SMR}.}
\item{...}{Additional arguments needed by function \emph{geary} from package
\emph{spdep}.}
}

\seealso{
DCluster, geary, gearyc, gearyc.boot, gearyc.pboot
}


\examples{
library(spdep)
data(nc.sids)
col.W <- nb2listw(ncCR85.nb, zero.policy=TRUE)

sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74))

gearyc.stat(data=sids, listw=col.W, n=length(ncCR85.nb), n1=length(ncCR85.nb)-1,
	S0=Szero(col.W) )

gearyc.stat(data=sids, applyto="SMR", listw=col.W, n=length(ncCR85.nb), 
	n1=length(ncCR85.nb)-1,S0=Szero(col.W) )

}

\references{
Geary, R. C. (1954). The contiguity ratio and statistical mapping. The Incorporated Statistician 5, 115-145.
}

\keyword{spatial}

\eof
\name{kn.iscluster}

\alias{kn.iscluster}

\title{Clustering function for Kulldorff and Nagarwalla's statistic}

\description{
\emph{kn.iscluster} is called from \emph{opgam} when studying the whole
area. At every point of the grid, which may be all the centroids, this
function is called to determine whether it is a cluster or not by
calculating Kulldorff and Nagarwalla's statistic.

See \emph{opgam.iscluster.default} for more details.

}


\usage{
kn.iscluster(data, idx, idxorder, alpha, fractpop, use.poisson=TRUE, model="poisson", R, mle)
}


\arguments{
\item{data}{A dataframe with the data as explained in \emph{DCluster}.}
\item{idx}{A boolean vector to know the areas in the current circle.}
\item{idxorder}{A permutation of the rows of data to order the regions
according to their distance to the current center.}
\item{alpha}{Significance level of the test.}
\item{fractpop}{Maximum fraction of the total population used when
creating the balls.}
\item{use.poisson}{Use the statistic for the Poisson (default) or Bernoulli case.}
\item{model}{The model used to generate random observations. It can be
'permutation', 'multinomial', 'poisson' or 'negbin'. See \emph{observed.sim} manual page for details.}
\item{R}{The number of bootstrap replicates to generate.}
\item{mle}{Parameters needed by the bootstrap procedure.}
}

\value{
A vector of four elements, as described in \emph{iscluster} manual page.
}
\seealso{
DCluster, kullnagar, kullnagar.stat, kullnagar.boot, kullnagar.pboot
}


\examples{
library(boot)
library(spdep)

data(nc.sids)

sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74))
sids<-cbind(sids, Population=nc.sids$BIR74, x=nc.sids$x, y=nc.sids$y)

#K&N's method over the centroids
mle<-calculate.mle(sids, model="poisson")
knresults<-opgam(data=sids, thegrid=sids[,c("x","y")], alpha=.05/100, 
	iscluster=kn.iscluster, fractpop=.5, R=100, model="poisson", mle=mle)

#Plot all centroids and significant ones in red
plot(sids$x, sids$y, main="Kulldorff and Nagarwalla's method")
points(knresults$x, knresults$y, col="red", pch=19)
}

\references{
Kulldorff, Martin and Nagarwalla, Neville (1995). Spatial Disease Clusters: Detection and Inference. Statistics in Medicine 14, 799-810.
}

\keyword{spatial}

\eof
\name{kullnagar}

\alias{kullnagar}

\title{Kulldorff and Nagarwalla's statistic for spatial clustering.}

\description{
This method is based on creating a grid over the study area. Each point of
the grid is taken to be the centre of all circles that contain up to a given
fraction of the total population. This is calculated by summing the
population of the regions whose centroids fall inside the circle. For each one
of these balls, the likelihood ratio of the following hypothesis test is computed:

\tabular{lcl}{
\eqn{H_0}{H_0} \tab : \tab \eqn{p=q}{p=q} \cr
\eqn{H_1}{H_1} \tab : \tab \eqn{p>q}{p>q}
}
where \emph{p} is the probability of being a case inside the ball and
\emph{q} the probability of being a case outside it. Then, the ball
where the maximum of the likelihood ratio is achieved is selected and its
value is tested to assess whether it is significant or not.

There are two possible statistics, depending on the model assumed for the
data, which can be Bernoulli or Poisson. The value of the likelihood ratio 
statistic is

\deqn{\max_{z \in Z}\frac{L(z)}{L_0}}{max_z[L(z)/L_0]}

where \emph{Z} is the set of balls at a given point, \emph{z} an element of
this set, \eqn{L_0}{L_0} is the likelihood under the null hypothesis and
\eqn{L(z)}{L(z)} is the likelihood under the alternative hypothesis. The
actual formulae involved in the calculation can be found in the reference
given below.
}

\seealso{
DCluster, kullnagar.stat, kullnagar.boot, kullnagar.pboot
}

\references{
Kulldorff, Martin and Nagarwalla, Neville (1995). Spatial Disease Clusters: Detection and Inference. Statistics in Medicine 14, 799-810.
}

\keyword{spatial}

\eof
\name{kullnagar.boot}

\alias{kullnagar.boot}
\alias{kullnagar.pboot}

\title{Generate bootstrap replicates of Kulldorff and Nagarwalla's statistic}

\description{
Generate bootstrap replicates of Kulldorff and Nagarwalla's statistic,
by calling functions \emph{boot} and \emph{kullnagar.stat}.

\emph{kullnagar.boot} is used when using non-parametric bootstrap to estimate
the distribution of the statistic.

\emph{kullnagar.pboot} is used when performing parametric bootstrap.
}


\usage{
kullnagar.boot(data, i, ...)
kullnagar.pboot(...)
}


\arguments{
\item{data}{A dataframe with the data as explained in \emph{DCluster}.}
\item{i}{Permutation created in non-parametric bootstrap.}
\item{...}{Additional arguments passed to the functions.}
}



\value{
Both functions return the value of the statistic.
}


\seealso{
DCluster, boot, kullnagar, kullnagar.stat, kn.iscluster
}


\examples{
library(boot)
library(spdep)

data(nc.sids)

sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74))
sids<-cbind(sids, Population=nc.sids$BIR74, x=nc.sids$x, y=nc.sids$y)

niter<-100

#Permutation  model
kn.perboot<-boot(sids, statistic=kullnagar.boot, R=niter, fractpop=.2)
plot(kn.perboot)#Display results

#Multinomial model
kn.mboot<-boot(sids, statistic=kullnagar.pboot, sim="parametric", 
	ran.gen=multinom.sim,  R=niter, fractpop=.2)
plot(kn.mboot)#Display results

#Poisson model
kn.pboot<-boot(sids, statistic=kullnagar.pboot, sim="parametric", 
	ran.gen=poisson.sim,  R=niter, fractpop=.2)
plot(kn.pboot)#Display results

#Poisson-Gamma model
kn.pgboot<-boot(sids, statistic=kullnagar.pboot, sim="parametric", 
	ran.gen=negbin.sim, R=niter, fractpop=.2)
plot(kn.pgboot)#Display results


}



\references{
Kulldorff, Martin and Nagarwalla, Neville (1995). Spatial Disease Clusters: Detection and Inference. Statistics in Medicine 14, 799-810.
}

\keyword{spatial}

\eof
\name{kullnagar.stat}

\alias{kullnagar.stat}
\alias{kullnagar.stat.poisson}
\alias{kullnagar.stat.bern}


\title{Kulldorff and Nagarwalla's statistic for spatial clustering.}

\description{
Compute Kulldorff and Nagarwalla's spatial statistic for cluster detection
around a single region, which is supposed to be the first row of the
dataframe. The other regions are supposed to be sorted by distance to 
the centre in the data frame.

Two functions are provided: \emph{kullnagar.stat.poisson}, for the Poisson
case, and \emph{kullnagar.stat.bern}, for the Bernoulli case.

See \emph{kullnagar} manual page for details.
}


\usage{
kullnagar.stat(data, fractpop, use.poisson=TRUE, log.v=FALSE)
}


\arguments{
\item{data}{A dataframe with the data as explained in \emph{DCluster}.}
\item{fractpop}{Maximum fraction of the total population used when
creating the balls.}
\item{use.poisson}{Use the statistic for the Poisson (default) or Bernoulli case.}
\item{log.v}{Whether the logarithm of the statistic is returned or not.}
}

\value{
Returns a vector of two elements:  the value of the statistic and the
size (in number of regions) of the cluster.
}


\seealso{
DCluster, kullnagar, kullnagar.stat, kullnagar.boot, kullnagar.pboot
}


\examples{
library(spdep)

data(nc.sids)

sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74))
sids<-cbind(sids, Population=nc.sids$BIR74, x=nc.sids$x, y=nc.sids$y)


dist<-(sids$x-sids$x[1])^2+(sids$y-sids$y[1])^2
index<-order(dist)
#Compute the statistic around the first county
kullnagar.stat(sids[index,], fractpop=.5)
}


\references{
Kulldorff, Martin and Nagarwalla, Neville (1995). Spatial Disease Clusters: Detection and Inference. Statistics in Medicine 14, 799-810.
}

\keyword{spatial}

\eof
\name{moranI}

\alias{moranI}

\title{Moran's I autocorrelation statistic}


\description{
Moran's I statistic measures autocorrelation between areas within
a region. It is similar to the correlation coefficient:

\deqn{
I=\frac{n\sum_i\sum_j W_{ij}(Z_i-\overline{Z})(Z_j-\overline{Z})}{2(\sum_i\sum_jW_{ij})\sum_k (Z_k-\overline{Z})^2}
}{
I= n * [sum_i ( sum_j W_ij(Z_i-mean(Z))*(Z_j-mean(Z))]/[2 * (sum_i sum_j W_ij) * sum_k (Z_k-mean(Z))^2]
}

\eqn{W}{W} is a square matrix which represents the relationship between each
pair of regions. A usual approach is to set \eqn{W_{ij}}{W_ij} to 1 if regions
\eqn{i}{i} and \eqn{j}{j} have a common boundary and 0 otherwise, or it may
represent the inverse distance between the centroids of these two regions.

High values of this statistic may indicate the presence of groups of zones
where values are unusually high. On the other hand, low values
of Moran's statistic indicate no correlation between neighbouring
areas, which may point to independence of the observations.


\emph{moranI.stat} is the function to calculate the value of the statistic for
residuals or SMRs of the data.

\emph{moranI.boot} is used when performing a non-parametric bootstrap.

\emph{moranI.pboot} is used when performing a parametric bootstrap.
}


\seealso{
DCluster, moranI.stat, moranI.boot, moranI.pboot
}

\references{
Moran, P. A. P. (1948). The interpretation of statistical maps. Journal of the Royal Statistical Society, Series B 10, 243-251.
}

\keyword{spatial}

\eof
\name{moranI.boot}

\alias{moranI.boot}
\alias{moranI.pboot}

\title{Generate bootstrap replicates of Moran's I autocorrelation statistic}


\description{

Generate bootstrap replicates of Moran's I autocorrelation statistic, by means
of function \emph{boot} from the \emph{boot} library. Notice that these
functions should not be used separately but as argument \emph{statistic} when
calling function \emph{boot}.

\emph{moranI.boot} is used when performing a non-parametric bootstrap.

\emph{moranI.pboot} is used when performing a parametric bootstrap.
}


\usage{
moranI.boot(data, i, ...)
moranI.pboot(...)
}

\arguments{
\item{data}{A dataframe containing the data, as specified in the 
\bold{DCluster} manual page.}
\item{i}{Permutation generated by the bootstrap procedure.}
\item{...}{Additional arguments passed when performing a bootstrap.}
}

\value{
Both functions return the value of the statistic.
}

\seealso{
DCluster, boot, moranI, moranI.stat
}


\examples{
library(boot)
library(spdep)
data(nc.sids)
col.W <- nb2listw(ncCR85.nb, zero.policy=TRUE)

sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74)) 

niter<-100

#Permutation model
moran.boot<-boot(sids, statistic=moranI.boot, R=niter, listw=col.W, 
	n=length(ncCR85.nb), S0=Szero(col.W) )
plot(moran.boot)#Display results

#Multinomial model
moran.mboot<-boot(sids, statistic=moranI.pboot, sim="parametric", 
	ran.gen=multinom.sim,  R=niter, listw=col.W,n=length(ncCR85.nb), 
	S0=Szero(col.W) )
plot(moran.mboot)#Display results

#Poisson model
moran.pboot<-boot(sids, statistic=moranI.pboot, sim="parametric", 
	ran.gen=poisson.sim,  R=niter, listw=col.W,n=length(ncCR85.nb),
	S0=Szero(col.W) )
		
plot(moran.pboot)#Display results

#Poisson-Gamma model
moran.pgboot<-boot(sids, statistic=moranI.pboot, sim="parametric", 
	ran.gen=negbin.sim, R=niter,  listw=col.W,n=length(ncCR85.nb),
	S0=Szero(col.W) )
		
plot(moran.pgboot)#Display results
}

\references{
Moran, P. A. P. (1948). The interpretation of statistical maps. Journal of the Royal Statistical Society, Series B 10, 243-251.
}

\keyword{spatial}

\eof
\name{moranI.stat}

\alias{moranI.stat}

\title{Compute Moran's I autocorrelation statistic}


\description{
Compute Moran's I autocorrelation statistic using \bold{residuals}
or \bold{SMRs} by means of function \emph{moran} from package 
\emph{spdep}.
}


\usage{
moranI.stat(data, applyto="residuals", ...)
}

\arguments{
\item{data}{A dataframe containing the data, as specified in the 
\bold{DCluster}manpage.}
\item{applyto}{A string with the name of the quantity on which
Moran's Index is calculated. It may be either \emph{residuals}
or \emph{SMR}.}
\item{...}{Additional arguments needed by function \emph{moran} from package
\emph{spdep}.}
}


\value{
The value of the statistic computed.
}
\seealso{
DCluster, moran, moranI, moranI.boot, moranI.pboot
}

\examples{
library(spdep)
data(nc.sids)
col.W <- nb2listw(ncCR85.nb, zero.policy=TRUE)


sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74) )

moranI.stat(data=sids, listw=col.W, n=length(ncCR85.nb), S0=Szero(col.W) )

moranI.stat(data=sids, applyto="SMR", listw=col.W, n=length(ncCR85.nb), 
	S0=Szero(col.W) )
}

\references{
Moran, P. A. P. (1948). The interpretation of statistical maps. Journal of the Royal Statistical Society, Series B 10, 243-251.
}

\keyword{spatial}

\eof
\name{observed.sim}

\alias{multinom.sim}
\alias{poisson.sim}
\alias{negbin.sim}

\title{Randomly generate observed cases from different statistical distributions}

\description{
Simulate Observed number of cases according to a Multinomial, Poisson or
Negative Binomial distribution.


These functions are used when performing a parametric bootstrap and
they must be passed as argument \emph{ran.gen} when calling function \emph{boot}.

\emph{multinom.sim} generates observations from a Multinomial distribution.

\emph{poisson.sim} generates observations from a Poisson distribution.

\emph{negbin.sim} generates observations from a Negative Binomial distribution.
}

\usage{
multinom.sim(data, mle=NULL)

poisson.sim(data, mle=NULL)

negbin.sim(data, mle=NULL)
}

\arguments{
\item{data}{A dataframe as described in the \emph{DCluster} manual page.}
\item{mle}{
List containing the parameters of the distributions to be used.  If
they are not provided, then they are calculated from the data. Its value 
is passed through argument \emph{mle} of function \emph{boot}. 

The elements in the list depend on the distribution to be used:

\itemize{
\item Multinomial

Total observed cases (\emph{n}) and vector of probabilities (\emph{p}).

\item Poisson

Total number of regions (\emph{n}) and vector of means (\emph{lambda}).

\item Negative Binomial

Total number of regions (\emph{n}) and parameters \emph{nu} and \emph{alpha}
of the Gamma distribution.
}%\itemize
}

}%\arguments


\value{
A dataframe equal to the argument \emph{data}, but in which the Observed
column has been substituted by sampled observations. See \emph{DCluster}
manual page for more details.
}


\seealso{
DCluster
}

\examples{
library(spdep)

data(nc.sids)

sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74))
sids<-cbind(sids, x=nc.sids$x, y=nc.sids$y)

#Carry out simulations
datasim<-multinom.sim(sids, mle=calculate.mle(sids, model="multinomial") )

#Estimators for Poisson distribution
datasim<-poisson.sim(sids, mle=calculate.mle(sids, model="poisson") )

#Estimators for Negative Binomial distribution
datasim<-negbin.sim(sids, mle=calculate.mle(sids, model="negbin") )
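
#A minimal sketch of what a Poisson simulation step might look like when the
#parameters are supplied by hand, using the list elements described in the
#arguments section (n and lambda). The toy data and the object names are
#illustrative assumptions, and rpois only stands in for the package's own
#sampler; this is not the implementation of poisson.sim.
toy<-data.frame(Observed=c(2, 0, 5, 1), Expected=c(1.5, 2.5, 3, 1))
toy.mle<-list(n=nrow(toy), lambda=toy$Expected)
toy.sim<-toy
#Only the Observed column is replaced by sampled counts
toy.sim$Observed<-rpois(toy.mle$n, toy.mle$lambda)
toy.sim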

}


\keyword{distribution}

\eof
\name{opgam}

\alias{opgam}
\alias{opgam.intern}

\title{Openshaw's GAM}

\description{
Scan an area with Openshaw's Geographical Analysis Machine to look
for clusters.

\emph{opgam} is the main function, while \emph{opgam.intern} is called
from there.
}


\details{
The \emph{Geographical Analysis Machine} was developed by Openshaw
et al. to perform geographical studies of the relationship between
different types of cancer and their proximity to nuclear plants.

In this method, a grid of a fixed step is built along the study region, and
small balls of a given radius are created at each point of the grid. Local
observed and expected number of cases and population are calculated and a
function is used to assess whether the current ball is a cluster or not.  For
more information about this function see \emph{opgam.iscluster.default}, which
is the default function used.

If the observed number of cases exceeds a critical value, which is calculated
by a function passed as an argument, then that circle is marked as a possible
cluster. At the end, all possible clusters are drawn on a map, where they can
be easily identified.


Notice that we have followed a fairly flexible approach, since user-implemented
functions can be used to detect clusters, such as those related to
overdispersion (Pearson's Chi square statistic, Potthoff-Whittinghill's
statistic) or autocorrelation (Moran's I statistic and Geary's c statistic),
or a bootstrap procedure, although the latter is not recommended because it
can be very slow.
}


\usage{
opgam(data, thegrid=NULL, radius=Inf, step=NULL,  alpha, iscluster=opgam.iscluster.default, set.idxorder=TRUE, ...)
opgam.intern(point, data, rr, set.idxorder, iscluster, alpha, ...)
}

\arguments{
\item{data}{A dataframe with the data, as described in \emph{DCluster} manual
page.}

\item{thegrid}{A two-column matrix containing
the points of the grid to be used. If it is NULL, a rectangular grid of step 
\emph{step} is built.}

\item{radius}{The radius of the circles used in the computations.}

\item{step}{The step of the grid.}

\item{alpha}{Significance level of the tests performed.}

\item{iscluster}{Function used to decide whether the current circle
is a possible cluster or not. It must have the same arguments
and return the same object as \emph{opgam.iscluster.default}.}

\item{set.idxorder}{Whether an index for the ordering by distance
to the center of the current ball is calculated or not.}

\item{point}{Point where the current ball is centred.}

\item{rr}{Squared radius, i.e. rr=radius*radius.}

\item{...}{Additional arguments to be passed to \emph{iscluster}.}
}


\value{
A dataframe with five columns:

\item{x}{Easting coordinate of the center of the cluster.}

\item{y}{Northing coordinate of the center of the cluster.}

\item{statistic}{Value of the statistic computed.}

\item{cluster}{Is it a cluster (according to the criteria used)? It should
always be TRUE.}

\item{pvalue}{Significance of the cluster.} 
}


\examples{
library(spdep)

data(nc.sids)

sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74))
sids<-cbind(sids, x=nc.sids$x, y=nc.sids$y)


#GAM using the centroids of the areas in data
sidsgam<-opgam(data=sids,  radius=30, step=10, alpha=.002)

#Plot centroids
plot(sids$x, sids$y, xlab="Easting", ylab="Northing")
#Plot points marked as clusters
points(sidsgam$x, sidsgam$y, col="red", pch="*")

}

\seealso{
DCluster, opgam.iscluster.default
}

\references{
Openshaw, S. and Charlton, M. and Wymer, C. and Craft, A. W. (1987). A mark I geographical analysis machine for the automated analysis of point data sets. International Journal of Geographical Information Systems 1, 335-358.

Waller, Lance A. and Turnbull, Bruce W. and Clark, Larry C. and Nasca, Philip (1994). Spatial Pattern Analyses to Detect Rare Disease Clusters. In 'Case Studies in Biometry'. Chapter 1, 3-23.
} 

\keyword{spatial}

\eof
\name{iscluster}

\alias{opgam.iscluster.default}
\alias{opgam.iscluster.negbin}

\title{Local clustering test function}

\description{
This function is passed to function \emph{opgam} as argument \emph{iscluster}
to decide whether the current circle must be marked as a cluster or not.

\emph{opgam.iscluster.default} is the function used by default, based on 
quantiles of the Poisson distribution.

\emph{opgam.iscluster.negbin} is similar to the previous one but based
on the Negative Binomial distribution. Local significance
is estimated using bootstrap since it involves the sum of Negative Binomial
variables.
}


\details{
These functions take a number of arguments to be able to assess
whether the current ball is a cluster or not. We have followed this 
approach to create a common framework for all scan methods.

The vector returned by these functions may have more than four elements,
but the first four elements must be those stated in this manual
page (and in the same order).

More examples can be found in the implementations of other scan methods,
such as Besag and Newell's, and Kulldorff and Nagarwalla's.
}

\usage{
opgam.iscluster.default(data, idx, idxorder, alpha,  ...)
opgam.iscluster.negbin(data, idx, idxorder, alpha, mle, R=999, ...)
}

\arguments{
\item{data}{A dataframe with the data, as explained in \emph{DCluster} manual page.}
\item{idx}{A boolean vector to know the areas in the current circle.}
\item{idxorder}{A permutation of the rows of data to order the regions
according to their distance to the current center.}
\item{alpha}{Significance level of the test.}
\item{mle}{Estimators of some parameters needed by the Negative Binomial
distribution. See \emph{negbin.sim} manual page for details.}
\item{R}{Number of simulations used to estimate local p-values.}
\item{...}{Any other arguments required.}
}

\value{
A vector with four values:
  \item{statistic}{Value of the statistic computed.}
  \item{result}{A boolean value, which is \emph{TRUE} for clusters.}
  \item{pvalue}{The pvalue obtained for the test performed.}
  \item{size}{Size of the cluster in number of regions from the centre.}
}

\seealso{opgam, besagnewell, bn.iscluster, kullnagar, kn.iscluster, turnbull, tb.iscluster}

\keyword{spatial}

\eof
\name{pottwhitt}

\alias{pottwhitt}

\title{Potthoff-Whittinghill's statistic for overdispersion}


\description{

This statistic can be used to test for homogeneity among all
the relative risks. The test statistic is:


\deqn{E_+ \sum_{i=1}^n \frac{O_i(O_i-1)}{E_i}}{E_+ * sum_i [O_i(O_i-1)/E_i]}

If we suppose that the data are generated from a multinomial model,
this is the locally uniformly most powerful (U.M.P.) test for the following
hypotheses:

\tabular{lcl}{
 \eqn{H_0}{H_0} \tab : \tab \eqn{\theta_1 = \ldots = \theta_n=\lambda}{theta_1 = ... = theta_n = lambda} \cr
 \eqn{H_1}{H_1} \tab : \tab \eqn{\theta_i \sim Ga(\lambda^2/\sigma^2, \lambda/\sigma^2)}{theta_i ~ Ga(lambda^2/sigma^2, lambda/sigma^2)}
}

Notice that in this case, \eqn{\lambda}{lambda} is supposed to be unknown.
The alternative hypothesis means that all relative risks come from a Gamma
distribution with mean \eqn{\lambda}{lambda} and variance
\eqn{\sigma^2}{sigma^2}.

\emph{pottwhitt.stat} is the function that calculates the value of the
statistic for the data.

\emph{pottwhitt.boot} is used when performing a non-parametric bootstrap.

\emph{pottwhitt.pboot} is used when performing a parametric bootstrap.
}


\seealso{
DCluster, pottwhitt.stat, pottwhitt.boot, pottwhitt.pboot
}


\references{
Potthoff, R. F. and Whittinghill, M. (1966). Testing for Homogeneity: I. The Binomial and Multinomial Distributions. Biometrika 53, 167-182.

Potthoff, R. F. and Whittinghill, M. (1966). Testing for Homogeneity: II. The Poisson Distribution. Biometrika 53, 183-190.
}

\keyword{htest}

\eof
\name{pottwhitt.boot}

\alias{pottwhitt.boot}
\alias{pottwhitt.pboot}

\title{Bootstrap replicates of Potthoff-Whittinghill's statistic}


\description{
Generate bootstrap replicates of Potthoff-Whittinghill's statistic (function
\emph{pottwhitt.stat}), by means of function \emph{boot} from the \emph{boot}
library. Notice that these functions should not  be used separately but as
argument \emph{statistic} when calling function \emph{boot}.

\emph{pottwhitt.boot} is used when performing a non-parametric bootstrap.

\emph{pottwhitt.pboot} is used when performing a parametric bootstrap.
}


\usage{
pottwhitt.boot(data, i)
pottwhitt.pboot(...)
}

\arguments{
\item{data}{A dataframe containing the data, as specified in the 
\bold{DCluster}  manual page.}
\item{i}{Permutation generated by the bootstrap procedure.}
\item{...}{Additional arguments passed when performing a bootstrap.}
}

\value{
Both functions return the value of the statistic.
}


\seealso{
DCluster, pottwhitt, pottwhitt.stat
}


\examples{
library(boot)
library(spdep)

data(nc.sids)

sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74))
sids<-cbind(sids, x=nc.sids$x, y=nc.sids$y)

niter<-100


#Permutation model
pw.boot<-boot(sids, statistic=pottwhitt.boot, R=niter)
plot(pw.boot)#Plot results


#Multinomial model
pw.mboot<-boot(sids, statistic=pottwhitt.pboot, sim="parametric", ran.gen=multinom.sim,  R=niter)
plot(pw.mboot)#Plot results

#Poisson model
pw.pboot<-boot(sids, statistic=pottwhitt.pboot, sim="parametric", ran.gen=poisson.sim,  R=niter)
plot(pw.pboot)#Plot results

#Poisson-Gamma model
pw.pgboot<-boot(sids, statistic=pottwhitt.pboot, sim="parametric", ran.gen=negbin.sim, R=niter)
plot(pw.pgboot)#Plot results

}

\references{
Potthoff, R. F. and Whittinghill, M. (1966). Testing for Homogeneity: I. The Binomial and Multinomial Distributions. Biometrika 53, 167-182.

Potthoff, R. F. and Whittinghill, M. (1966). Testing for Homogeneity: II. The Poisson Distribution. Biometrika 53, 183-190.
}

\keyword{htest}

\eof
\name{pottwhitt.stat}

\alias{pottwhitt.stat}

\title{Compute Potthoff-Whittinghill's statistic}


\description{

Compute Potthoff-Whittinghill's statistic.

}


\usage{
pottwhitt.stat(data)
}

\arguments{
\item{data}{A dataframe containing the data, as specified in the 
\bold{DCluster} manual page.}
}

\value{
A list with the following elements:

\item{T}{The value of the statistic.}
\item{asintmean}{Mean of the asymptotic Normal distribution.}
\item{asintvar}{Variance of the asymptotic Normal distribution.}
\item{pvalue}{Significance of the statistic.}
}

\seealso{
DCluster
}


\examples{
library(spdep)

data(nc.sids)

sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74))
sids<-cbind(sids, x=nc.sids$x, y=nc.sids$y)

pottwhitt.stat(sids)
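
#By-hand computation of the formula given in the 'pottwhitt' manual page,
#E_+ * sum_i O_i (O_i - 1) / E_i, on a small made-up data set. This is only
#an illustrative sketch: the toy counts and the names 'toy' and 'pw.byhand'
#are assumptions and are not part of the package.
toy<-data.frame(Observed=c(2, 0, 5, 1), Expected=c(1.5, 2.5, 3, 1))
pw.byhand<-sum(toy$Expected)*sum(toy$Observed*(toy$Observed-1)/toy$Expected)
pw.byhand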
}

\references{
Potthoff, R. F. and Whittinghill, M. (1966). Testing for Homogeneity: I. The Binomial and Multinomial Distributions. Biometrika 53, 167-182.

Potthoff, R. F. and Whittinghill, M. (1966). Testing for Homogeneity: II. The Poisson Distribution. Biometrika 53, 183-190.
}

\keyword{htest}

\eof
\name{rmultin}

\alias{rmultin}

\title{Generate random observations from a multinomial distribution}


\description{
This function generates a random observation from a multinomial
distribution.

}

\usage{
rmultin(n, p)
}

\arguments{
\item{n}{Total size (and NOT the number of variables involved in the multinomial distribution).}
\item{p}{Vector of probabilities. The sum of all its elements must be one.}
}


\value{
A vector with the sample which has been generated.
}

\examples{
for(i in 1:10)
	print(rmultin(10, c(1/3, 1/3, 1/3) ))
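
#For comparison, base R's rmultinom (package 'stats') draws the same kind of
#sample: each column is one draw and its entries add up to the total size.
#This is an aside and does not describe how rmultin is implemented; the
#object name 'draws' is only illustrative.
draws<-rmultinom(5, size=10, prob=c(1/3, 1/3, 1/3))
colSums(draws)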
}


\keyword{distribution}

\eof
\name{stone}

\alias{stone}

\title{Stone's Test}

\description{

Stone's Test is used to assess risk around given locations (i.e., a putative
pollution source). The null hypothesis is that relative risks are constant
across areas, while the alternative is that there is a descending trend in
relative risks as distance to the focus increases. That is

\tabular{lcl}{
\eqn{H_0}{H_0} \tab : \tab \eqn{\theta_1 = \ldots = \theta_n = \lambda}{theta_1 = ... = theta_n = lambda} \cr
\eqn{H_1}{H_1} \tab : \tab \eqn{\theta_1 \geq \ldots \geq \theta_n}{theta_1 >= ... >= theta_n}

}
Supposing data sorted by distance to the putative pollution source, Stone's
statistic is as follows:

\deqn{\max_{j}\left(\frac{\sum_{i=1}^j O_i}{\sum_{i=1}^j E_i}\right)}{max_j [(sum_i=1 ^j O_i) / (sum_i=1 ^j E_i)]}

Depending on whether \eqn{\lambda}{lambda} is known (usually 1) or not,
\eqn{E_i}{E_i} may need a minor correction, which is not done automatically.
See \emph{achisq} manual page for details.
}


\seealso{
DCluster, stone.stat, stone.boot, stone.pboot
}

\references{
Stone, R. A. (1988). Investigations of excess environmental risks around putative sources: Statistical problems and a proposed test. Statistics in Medicine 7, 649-660.
}

\keyword{spatial}

\eof
\name{stone.boot}

\alias{stone.boot}
\alias{stone.pboot}

\title{Generate bootstrap replicates of Stone's statistic}

\description{
Generate bootstrap replicates of Stone's statistic, by means of function
\emph{boot} from the \emph{boot} package. Notice that these functions should not
be used separately but as argument \emph{statistic} when calling function
\emph{boot}.


\emph{stone.boot} is used when performing a non-parametric bootstrap.

\emph{stone.pboot} is used when performing a parametric bootstrap.
}


\usage{
stone.boot(data, i, ...)
stone.pboot(...)
}


\value{
Both functions return the value of the statistic.
}

\arguments{
\item{data}{ A dataframe with all the data, as explained in the \emph{DCluster}
manual page.}
\item{i}{Permutation created in non-parametric bootstrap.}
\item{...}{Additional arguments passed to the functions.}

}


\examples{
library(boot)
library(spdep)

data(nc.sids)

sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74))
sids<-cbind(sids, x=nc.sids$x, y=nc.sids$y)

niter<-100

#All Tests are performed around county 78.


#Permutation  model
st.perboot<-boot(sids, statistic=stone.boot, R=niter, region=78)
plot(st.perboot)#Display results

#Multinomial model
st.mboot<-boot(sids, statistic=stone.pboot, sim="parametric", ran.gen=multinom.sim,  R=niter, region=78)
plot(st.mboot)#Display results

#Poisson model
st.pboot<-boot(sids, statistic=stone.pboot, sim="parametric", ran.gen=poisson.sim,  R=niter, region=78)
plot(st.pboot)#Display results

#Poisson-Gamma model
st.pgboot<-boot(sids, statistic=stone.pboot, sim="parametric", ran.gen=negbin.sim, R=niter, region=78)
plot(st.pgboot)#Display results


}

\seealso{
DCluster, boot, stone.stat
}

\references{
Stone, R. A. (1988). Investigations of excess environmental risks around putative sources: Statistical problems and a proposed test. Statistics in Medicine 7, 649-660.
}

\keyword{spatial}

\eof
\name{stone.stat}

\alias{stone.stat}

\title{Compute Stone's statistic}

\description{
Calculate Stone's statistic. See \emph{stone} manual page for details.
}


\usage{
stone.stat(data, region, sorted=FALSE, lambda)
}


\value{
A vector of two elements with the value of the statistic and the region
(counting from the centre) where it was achieved.
}

\arguments{
\item{data}{A dataframe with all the data, as explained in the \emph{DCluster}
manual page.}
\item{region}{Region around which we want to test for a cluster. It must
be a row number of \emph{data}.}
\item{sorted}{Whether the data are already sorted by distance to \emph{region}.}
\item{lambda}{Value of the relative risk under the null hypothesis. It may be NULL (i.e., not known) or a number.}
}


\seealso{
DCluster
}

\examples{
library(spdep)

data(nc.sids)

sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74))
sids<-cbind(sids, x=nc.sids$x, y=nc.sids$y)

#Compute Stone's statistic around the 78th county
stone.stat(sids, region=78, lambda=1)
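
#By-hand sketch of the statistic in the 'stone' manual page on made-up data:
#regions are sorted by distance to the focus and the maximum ratio of
#cumulative observed to cumulative expected cases is taken. The toy values
#and the object names are assumptions used only for illustration, and no
#correction for lambda is applied.
toy<-data.frame(Observed=c(3, 4, 2, 1), Expected=c(2, 2, 3, 3),
	distance=c(1.2, 0, 4, 2.5))
toy<-toy[order(toy$distance),]
ratios<-cumsum(toy$Observed)/cumsum(toy$Expected)
#Value of the statistic and position (from the centre) where it is achieved
c(max(ratios), which.max(ratios))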
}


\references{
Stone, R. A. (1988). Investigations of excess environmental risks around putative sources: Statistical problems and a proposed test. Statistics in Medicine 7, 649-660.
}

\keyword{spatial}

\eof
\name{tango}

\alias{tango}

\title{Tango's statistic for general clustering}

\description{
Tango's statistic to perform a general clustering test is expressed as
follows:

\deqn{T = (r-p)^{'} A (r-p)}{T = (r-p)' A (r-p)}

where \eqn{r^{'} = [O_1/O_+, \ldots, O_n/O_+]}{r' = [O_1/O_+, ..., O_n/O_+]}, 
\eqn{p^{'}=[E_1/E_+, \ldots, E_n/E_+]}{p' = [E_1/E_+, ..., E_n/E_+]} 
and \eqn{A}{A} is a matrix which measures
the closeness between two zones (the higher the closer).

Tango proposes to take \eqn{A_{ij}=\exp\{-D_{ij}/\phi\}}{A_ij=exp(-D_ij/phi)},
where \eqn{D_{ij}}{D_ij} is the distance between centroids of regions \emph{i}
and \emph{j}, and \eqn{\phi}{phi} is a constant that measures how strong the
relationship between regions is in a general way.

}

\seealso{
DCluster, tango.stat, tango.boot, tango.pboot
}

\references{
Tango, Toshiro (1995). A Class of Tests for Detecting 'General' and 'Focused' Clustering of Rare Diseases. Statistics in Medicine 14, 2323-2334.
}

\keyword{spatial}

\eof
\name{tango.boot}

\alias{tango.boot}
\alias{tango.pboot}

\title{Generate bootstrap replicates of Tango's statistic}

\description{
Generate bootstrap replicates of Tango's statistic for general clustering, by
means of function \emph{boot} from the \emph{boot} library. Notice that these
functions should not be used separately but as argument \emph{statistic} when
calling function \emph{boot}.

\emph{tango.boot} is used when performing non-parametric bootstrap.

\emph{tango.pboot} must be used for parametric bootstrap.
}

\usage{
tango.boot(data, i, ...)
tango.pboot(...)
}

\arguments{
\item{data}{Dataframe with the data as described in \emph{DCluster}.}
\item{i}{Permutation generated by the non-parametric bootstrap procedure.}
\item{...}{Additional arguments passed when performing a bootstrap.}
}


\value{
Both functions return the value of the statistic.
}


\examples{
library(boot)
library(spdep)

data(nc.sids)

sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74) )
sids<-cbind(sids, x=nc.sids$x, y=nc.sids$y)


#Calculate neighbours based on distance
coords<-as.matrix(sids[,c("x", "y")])

dlist<-dnearneigh(coords, 0, Inf)
dlist<-include.self(dlist)
dlist.d<-nbdists(dlist, coords)

#Calculate weights. They are globally standardised but it doesn't
#change significance.
col.W.tango<-nb2listw(dlist, glist=lapply(dlist.d, function(x) {exp(-x)}),
        style="C")
	
niter<-100


#Permutation model
tn.boot<-boot(sids, statistic=tango.boot, R=niter, listw=col.W.tango, 
	zero.policy=TRUE)
plot(tn.boot)#Display results

#Multinomial model
tn.mboot<-boot(sids, statistic=tango.pboot, sim="parametric", 
	ran.gen=multinom.sim,  R=niter, listw=col.W.tango, zero.policy=TRUE)
		
plot(tn.mboot)#Display results

#Poisson model
tn.pboot<-boot(sids, statistic=tango.pboot, sim="parametric", 
	ran.gen=poisson.sim,  R=niter, listw=col.W.tango, zero.policy=TRUE)
		
plot(tn.pboot)#Display results

#Poisson-Gamma model
tn.pgboot<-boot(sids, statistic=tango.pboot, sim="parametric", 
	ran.gen=negbin.sim, R=niter, listw=col.W.tango, zero.policy=TRUE)
plot(tn.pgboot)#Display results
}

\seealso{
DCluster, boot, tango, tango.stat
}

\references{
Tango, Toshiro (1995). A Class of Tests for Detecting 'General' and 'Focused' Clustering of Rare Diseases. Statistics in Medicine 14, 2323-2334.
}

\keyword{spatial}

\eof
\name{tango.stat}

\alias{tango.stat}

\title{Compute Tango's statistic for general clustering}

\description{
Compute Tango's statistic for general clustering. See \emph{tango} manual
page for details.
}

\usage{
tango.stat(data, listw, zero.policy)
}

\arguments{
\item{data}{Dataframe with the data as described in \emph{DCluster}.}
\item{listw}{Neighbours list with spatial weights created, for example,
by 'nb2listw' (package \emph{spdep}).}
\item{zero.policy}{See \emph{nb2listw} in package \emph{spdep}.}
}


\examples{
library(spdep)
data(nc.sids)

sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74) )
sids<-cbind(sids, x=nc.sids$x, y=nc.sids$y)

#Calculate neighbours based on distance
coords<-as.matrix(sids[,c("x", "y")])

dlist<-dnearneigh(coords, 0, Inf)
dlist<-include.self(dlist)
dlist.d<-nbdists(dlist, coords)

#Calculate weights. They are globally standardised but it doesn't
#change significance.
col.W.tango<-nb2listw(dlist, glist=lapply(dlist.d, function(x) {exp(-x)}),
        style="C")

niter<-100

#use exp(-D) as closeness matrix
tango.stat(sids, col.W.tango, zero.policy=TRUE)
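
#By-hand sketch of T = (r-p)' A (r-p) with A_ij = exp(-D_ij/phi), as in the
#'tango' manual page, on three made-up regions lying on a line. The toy
#data, phi=1 and the object names are assumptions; unlike the call above,
#no standardisation of the weights is applied, so values are not comparable.
toy<-data.frame(Observed=c(3, 1, 0), Expected=c(1, 1, 2),
	x=c(0, 1, 3), y=c(0, 0, 0))
D<-as.matrix(dist(toy[,c("x", "y")]))
phi<-1
A<-exp(-D/phi)
d<-toy$Observed/sum(toy$Observed)-toy$Expected/sum(toy$Expected)
#Quadratic form written as an elementwise sum
T.byhand<-sum(outer(d, d)*A)
T.byhand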

}

\seealso{
DCluster, tango, tango.boot, tango.pboot
}

\references{
Tango, Toshiro (1995). A Class of Tests for Detecting 'General' and 'Focused' Clustering of Rare Diseases. Statistics in Medicine 14, 2323-2334.
}

\keyword{spatial}

\eof
\name{whittermore}

\alias{whittermore}

\title{Whittemore's statistic}

\description{
Whittemore's statistic is defined as follows:


\deqn{W=\frac{n-1}{n}r^{'} D r}{W = ((n-1) / n ) r' D r}

where \eqn{r^{'}=[O_1/O_+, \ldots, O_n/O_+]}{r' = [O_1/O_+, ..., O_n/O_+]}
and \eqn{D}{D} is a distance matrix between the centroids of the areas.

It can be used to assess whether the region under study tends to cluster
or not.
}

\seealso{
DCluster, whittermore.stat, whittermore.boot, whittermore.pboot
}

\references{
Whittemore, A. S. and Friend, N. and Brown, B. W. and Holly, E. A. (1987). A test to detect clusters of disease. Biometrika 74, 631-635.
}


\keyword{spatial}

\eof
\name{whittermore.boot}

\alias{whittermore.boot}
\alias{whittermore.pboot}

\title{Generate bootstrap replicates of Whittemore's statistic}

\description{
Generate bootstrap replicates of Whittemore's statistic by means of function
\emph{boot} from the \emph{boot} library. Notice that these functions should not
be used separately but as argument \emph{statistic} when calling function
\emph{boot}.

\emph{whittermore.boot} is used when performing a non-parametric bootstrap.

\emph{whittermore.pboot} is used when performing a parametric bootstrap.
}


\usage{
whittermore.boot(data, i, ...)
whittermore.pboot(...)
}

\arguments{
\item{data}{A data frame with the data as explained in \emph{DCluster}.}
\item{i}{Permutation generated by the non-parametric bootstrap procedure.}
\item{...}{Additional arguments passed on when performing a bootstrap, such as \emph{listw} and \emph{zero.policy}.}
}

\value{
Both functions return the value of the statistic.
}

\seealso{
DCluster, boot, whittermore, whittermore.stat
}

\examples{
library(boot)
library(spdep)

data(nc.sids)

sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74) )
sids<-cbind(sids, x=nc.sids$x, y=nc.sids$y)

#Calculate neighbours based on distance
coords<-as.matrix(sids[,c("x", "y")])

dlist<-dnearneigh(coords, 0, Inf)
dlist<-include.self(dlist)
dlist.d<-nbdists(dlist, coords)

#Calculate weights. They are globally standardised but it doesn't
#change significance.
col.W.whitt<-nb2listw(dlist, glist=dlist.d, style="C")

niter<-100

#Permutation model
wt.boot<-boot(sids, statistic=whittermore.boot, R=niter, listw=col.W.whitt,
	zero.policy=TRUE)
plot(wt.boot)#Display results

#Multinomial model
wt.mboot<-boot(sids, statistic=whittermore.pboot, sim="parametric", 
	ran.gen=multinom.sim,  R=niter,  listw=col.W.whitt, zero.policy=TRUE)
		
plot(wt.mboot)#Display results

#Poisson model
wt.pboot<-boot(sids, statistic=whittermore.pboot, sim="parametric", 
	ran.gen=poisson.sim,  R=niter,  listw=col.W.whitt, zero.policy=TRUE)
		
plot(wt.pboot)#Display results

#Poisson-Gamma model
wt.pgboot<-boot(sids, statistic=whittermore.pboot, sim="parametric", 
	ran.gen=negbin.sim, R=niter, listw=col.W.whitt, zero.policy=TRUE)
plot(wt.pgboot)#Display results
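
#Monte Carlo p-value from the replicates. This is a sketch, not a DCluster
#function: mc.pvalue is a hypothetical helper. It assumes that small values
#of the statistic point to clustering because the weights are distances;
#use lower=FALSE when the weights measure closeness instead.
mc.pvalue<-function(b, lower=TRUE)
{
	extreme<-if(lower) sum(b$t <= b$t0) else sum(b$t >= b$t0)
	(extreme+1)/(length(b$t)+1)
}

#Stand-in object with made-up replicates, only to show the arithmetic
mc.pvalue(list(t0=1, t=matrix(c(0.5, 1.5, 2, 3))))#(1+1)/(4+1) = 0.4

#With the replicates computed above one would call, e.g., mc.pvalue(wt.boot)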
}

\references{
Whittemore, A. S. and Friend, N. and Brown, B. W. and Holly, E. A. (1987). A test to detect clusters of disease. Biometrika 74, 631-635.
}


\keyword{spatial}

\eof
\name{whittermore.stat}

\alias{whittermore.stat}

\title{Compute Whittemore's statistic}

\description{
Compute Whittemore's statistic. See the \emph{whittermore} manual page
for more details.
}


\usage{
whittermore.stat(data, listw, zero.policy=FALSE)
}

\arguments{
\item{data}{Data frame with the data, as described in the \emph{DCluster}
manual page.}
\item{listw}{Neighbours list with spatial weights created, for example,
by 'nb2listw' (package \emph{spdep}).}
\item{zero.policy}{See \emph{nb2listw} in package \emph{spdep}.}
}

\value{
The value of the statistic.
}

\seealso{
DCluster, whittermore, whittermore.boot, whittermore.pboot
}

\examples{
library(spdep)
data(nc.sids)
col.W <- nb2listw(ncCR85.nb, zero.policy=TRUE)

sids<-data.frame(Observed=nc.sids$SID74)
sids<-cbind(sids, Expected=nc.sids$BIR74*sum(nc.sids$SID74)/sum(nc.sids$BIR74) )
sids<-cbind(sids, x=nc.sids$x, y=nc.sids$y)

#Calculate neighbours based on distance
coords<-as.matrix(sids[,c("x", "y")])

dlist<-dnearneigh(coords, 0, Inf)
dlist<-include.self(dlist)
dlist.d<-nbdists(dlist, coords)

#Calculate weights. They are globally standardised but it doesn't
#change significance.
col.W.whitt<-nb2listw(dlist, glist=dlist.d, style="C")


whittermore.stat(sids, col.W.whitt, zero.policy=TRUE)
}

\references{
Whittemore, A. S. and Friend, N. and Brown, B. W. and Holly, E. A. (1987). A test to detect clusters of disease. Biometrika 74, 631-635.
}


\keyword{spatial}

\eof
