UPmaxentropy            package:sampling            R Documentation

_M_a_x_i_m_u_m _e_n_t_r_o_p_y _s_a_m_p_l_i_n_g _w_i_t_h _f_i_x_e_d _s_a_m_p_l_e _s_i_z_e _a_n_d _u_n_e_q_u_a_l _p_r_o_b_a_b_i_l_i_t_i_e_s

_D_e_s_c_r_i_p_t_i_o_n:

     Maximum entropy sampling with fixed sample size and unequal
     probabilities (also called conditional Poisson sampling),
     implemented by means of a sequential method.

_U_s_a_g_e:

     UPmaxentropy(pik) 
     UPmaxentropypi2(pik)
     UPMEqfromw(w,n)
     UPMEpikfromq(q) 
     UPMEpiktildefrompik(pik,eps=1e-6)
     UPMEsfromq(q)
     UPMEpik2frompikw(pik,w)

_A_r_g_u_m_e_n_t_s:

       n: sample size.

     pik: vector of prescribed inclusion probabilities.

     eps: tolerance for Newton's method; the default is 1e-6.

       q: matrix of the conditional selection probabilities for the
          sequential algorithm.

       w: parameter vector of the maximum entropy design.

_D_e_t_a_i_l_s:

     Among all designs with the given fixed sample size and inclusion
     probabilities, maximum entropy sampling maximizes the entropy
     criterion:

                    I(p) =  -sum_s p(s)log[p(s)].
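
     As an illustration of the criterion, the following base-R sketch
     enumerates every sample of size n in a tiny population, builds a
     fixed-size design whose sample probabilities are proportional to
     the product of the weights 'w' of the selected units (the usual
     form of the conditional Poisson design), and evaluates I(p). The
     helpers 'design_entropy' and 'cps_design' and the weight values
     are hypothetical, introduced only for this illustration; they are
     not part of the package.

```r
# Entropy I(p) = -sum_s p(s) log p(s) of a design given its sample probabilities.
design_entropy <- function(p) {
  p <- p[p > 0]                      # convention: 0 * log(0) = 0
  -sum(p * log(p))
}

# Hypothetical helper: fixed-size design with p(s) proportional to prod(w[s]).
# Brute-force enumeration, so only usable for very small populations.
cps_design <- function(w, n) {
  samples <- combn(length(w), n)     # each column lists the units of one sample
  weight <- apply(samples, 2, function(s) prod(w[s]))
  p <- weight / sum(weight)          # normalize to a probability distribution
  # First-order inclusion probability of unit k: total probability of the
  # samples that contain k.
  pik <- sapply(seq_along(w), function(k) sum(p[colSums(samples == k) > 0]))
  list(samples = samples, p = p, pik = pik)
}

d <- cps_design(w = c(0.2, 0.5, 1, 3), n = 2)   # illustrative weights
sum(d$pik)                   # equals the sample size n, up to rounding error
design_entropy(d$p)

# Equal weights give simple random sampling without replacement.
srs <- cps_design(rep(1, 4), 2)
design_entropy(srs$p)        # compare with log(choose(4, 2))
```

     With equal weights every sample is equally likely, so the entropy
     reduces to log(choose(N, n)). The procedures of this package avoid
     the enumeration and therefore scale to realistic population sizes.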

     The main procedure is 'UPmaxentropy', which selects a sample (a
     vector of 0s and 1s) from a given vector of inclusion
     probabilities. The procedure 'UPmaxentropypi2' returns the matrix
     of joint inclusion probabilities computed from the vector of
     first-order inclusion probabilities. The other procedures are
     intermediate steps; they can be useful for running simulations, as
     shown in the examples below.

     Maximum entropy sampling is the Poisson sampling design
     conditioned on the fixed sample size. The procedure
     'UPMEpiktildefrompik' computes the vector of inclusion
     probabilities of that Poisson design (denoted 'pikt') from the
     vector of inclusion probabilities of the maximum entropy design.
     The vector 'w' is then obtained as 'w=pikt/(1-pikt)'. Once 'pikt'
     and 'w' have been deduced from 'pik', the matrix of selection
     probabilities 'q' is derived from the sample size 'n' and the
     vector 'w' by means of the procedure 'UPMEqfromw'. Next, a sample
     is selected from 'q' by using 'UPMEsfromq'. To generate several
     samples, it is more efficient to compute the matrix 'q' once
     (this is the costly step) and then to call 'UPMEsfromq'
     repeatedly.

     The vector of inclusion probabilities can be recomputed from 'q'
     by using 'UPMEpikfromq', which makes it possible to check the
     numerical precision of the algorithm. The procedure
     'UPMEpik2frompikw' computes the matrix of joint inclusion
     probabilities from 'pik' and 'w'.
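
     The chain of intermediate procedures can be checked numerically
     along the following lines. This is a minimal sketch that assumes
     the 'sampling' package is installed; rounding 'sum(pik)' to obtain
     'n' is a precaution added here against floating-point error, not a
     requirement stated in this documentation.

```r
library(sampling)

pik <- c(0.07, 0.17, 0.41, 0.61, 0.83, 0.91)
n <- round(sum(pik))                 # fixed sample size

# pik -> pikt -> w -> q
pikt <- UPMEpiktildefrompik(pik)     # inclusion probabilities of the Poisson design
w <- pikt / (1 - pikt)               # parameter vector of the maximum entropy design
q <- UPMEqfromw(w, n)                # conditional selection probabilities

# Precision check: inclusion probabilities recomputed from q
# should be close to the prescribed pik.
pik_check <- UPMEpikfromq(q)
max(abs(pik_check - pik))

# The two routes to the joint inclusion probabilities should agree closely.
pi2_direct <- UPmaxentropypi2(pik)
pi2_from_w <- UPMEpik2frompikw(pik, w)
max(abs(pi2_direct - pi2_from_w))

# q is computed once and reused to draw several samples (one per column).
s <- replicate(5, UPMEsfromq(q))
colSums(s)                           # size of each selected sample
```

     Reusing 'q' in this way is what makes repeated selection cheap in
     simulation studies.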

_R_e_f_e_r_e_n_c_e_s:

     Chen, S.X., Liu, J.S. (1997). Statistical applications of the
     Poisson-binomial and conditional Bernoulli distributions,
     _Statistica Sinica_, 7, 875-892.

     Deville, J.-C. (2000). _Note sur l'algorithme de Chen, Dempster
     et Liu._ Technical report, CREST-ENSAI, Rennes.

     Matei, A., Tillé, Y. (2005). Evaluation of variance approximations
     and estimators in maximum entropy sampling with unequal
     probability and fixed sample size, _Journal of Official
     Statistics_, 21(4), 543-570.

     Tillé, Y. (2006). _Sampling Algorithms_, Springer.

_E_x_a_m_p_l_e_s:

     ############
     ## Example 1
     ############
     # Simple example - sample selection
     pik=c(0.07,0.17,0.41,0.61,0.83,0.91)
     # First method
     UPmaxentropy(pik)
     # Second method by using the intermediate procedures
     # sample size; rounded to guard against floating-point error in the sum
     n=round(sum(pik))
     pikt=UPMEpiktildefrompik(pik)
     w=pikt/(1-pikt)
     q=UPMEqfromw(w,n)
     UPMEsfromq(q)
     # The matrix of inclusion probabilities
     # First method: direct computation from pik
     UPmaxentropypi2(pik)
     # Second method: computation from pik and w
     UPMEpik2frompikw(pik,w)
     ############
     ## Example 2
     ############
     # Sample of Belgian municipalities
     data(belgianmunicipalities)
     attach(belgianmunicipalities)
     n=200
     pik=inclusionprobabilities(averageincome,n)
     s=UPmaxentropy(pik)
     # names of the selected municipalities
     as.character(Commune[s==1])
     # the joint inclusion probabilities
     pi2=UPmaxentropypi2(pik)
     # check: row k of pi2 should sum to n*pik[k], so each ratio should be 1
     rowSums(pi2)/pik/n
     ############
     ## Example 3
     ############
     # Selection of 200 samples of Belgian municipalities
     # Once matrix q is computed, the selection of a sample is very quick.
     # Simulations are thus possible.
     data(belgianmunicipalities)
     attach(belgianmunicipalities)
     pik=inclusionprobabilities(averageincome,200)
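     # drop the units that are selected with certainty
     pik=pik[pik!=1]
     # sample size among the remaining units; rounded to guard against
     # floating-point error in the sum
     n=round(sum(pik))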
     pikt=UPMEpiktildefrompik(pik)
     w=pikt/(1-pikt)
     q=UPMEqfromw(w,n)
     N=length(pik)
     tt=rep(0,times=N)
     #number of simulations
     sim=200
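     # accumulate the selection indicators over the simulated samples
     for(i in 1:sim) tt = tt+UPMEsfromq(q)
     # empirical inclusion probabilities
     tt=tt/sim
     # total absolute deviation from the prescribed inclusion probabilities
     sum(abs(tt-pik))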

