concoreg               package:concor               R Documentation

_R_e_d_u_n_d_a_n_c_y _o_f _s_e_t_s _y_j _b_y _o_n_e _s_e_t _x

_D_e_s_c_r_i_p_t_i_o_n:

     Regression of several subsets of variables Yj on another set X,
     computed as successive solutions.

_U_s_a_g_e:

     concoreg(x,y,py,r)

_A_r_g_u_m_e_n_t_s:

       x: an n x p matrix of p centered explanatory variables

       y: an n x q matrix of q centered variables

      py: a row vector containing the sizes q_i, i=1,...,ky, of the ky
          subsets y_i of y, so that sum_i q_i = sum(py) = q. py is the
          partition vector of y

       r: the number of successive solutions wanted
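
     As an illustration of how the partition vector py maps onto the
     columns of y, the following minimal sketch recovers the column
     indices of each subset y_i. The helper block_columns is
     hypothetical and is not part of the concor package; it assumes the
     subsets occupy consecutive columns of y in the order given by py.

```r
# Hypothetical helper (not part of concor): turn a partition vector py
# into a list holding the column indices of each subset y_i.
# Assumes the subsets are stored as consecutive column blocks of y.
block_columns <- function(py) {
  ends <- cumsum(py)        # last column of each subset
  starts <- ends - py + 1   # first column of each subset
  Map(seq, starts, ends)
}

py <- c(3, 2, 4)
y <- matrix(rnorm(10 * sum(py)), nrow = 10)

# The partition must account for every column of y
stopifnot(sum(py) == ncol(y))

blocks <- block_columns(py)
y2 <- y[, blocks[[2]], drop = FALSE]   # second subset y_2
```

     With py = c(3,2,4), the three subsets are columns 1 to 3, 4 to 5
     and 6 to 9 of y.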

_D_e_t_a_i_l_s:

     The first solution calculates 1+ky normed vectors: the component
     cx[,1] in R^n, associated with the ky vectors v_i[,1] in R^{q_i},
     by maximizing
     varexp1 = sum_i rho(cx[,1],y_i*v_i[,1])^2 var(y_i*v_i[,1])
     under 1+ky norm constraints. An explanatory component cx[,k] is
     associated with ky partial explained components y_i*v_i[,k] and
     also with a global explained component y*V[,k], where
     rho(cx[,k],y*V[,k])^2 var(y*V[,k]) = varexpk.  The total variance
     explained by the first solution is maximal.
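
     The term contributed by one subset to the criterion can be
     evaluated directly for any candidate vectors. The sketch below is
     not the optimization performed by concoreg; it only computes
     rho^2 * var for an arbitrary centered component cx and an
     arbitrary unit-norm axis v_i. The helper pcov and the use of the
     divisor n (rather than n-1) are assumptions made for this
     illustration.

```r
set.seed(1)
n <- 10
yi <- scale(matrix(runif(n * 3), n, 3), scale = FALSE)  # centered subset y_i
cx <- scale(runif(n), scale = FALSE)[, 1]               # centered candidate component
vi <- c(1, 0, 0)                                        # arbitrary unit-norm axis

# Hypothetical helper: covariance of centered vectors with divisor n
# (assumption: the divisor n, not n - 1, is used throughout)
pcov <- function(a, b) sum(a * b) / length(a)

comp <- as.vector(yi %*% vi)   # partial component y_i * v_i

rho2 <- pcov(cx, comp)^2 / (pcov(cx, cx) * pcov(comp, comp))
varexp_i <- rho2 * pcov(comp, comp)

# rho^2 * var(y_i * v_i) equals cov(cx, y_i * v_i)^2 / var(cx)
stopifnot(isTRUE(all.equal(varexp_i, pcov(cx, comp)^2 / pcov(cx, cx))))
```

     When cx has unit variance, this term reduces to the squared
     covariance between cx and y_i*v_i.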

     The second solution is obtained from the same criterion, but after
     replacing each y_i by y_i-y_i*v_i[,1]*v_i[,1]', and so on for the
     successive solutions 1,2,...,r.  The largest possible number of
     solutions is r=min(n,p,q_i), assuming the matrices x'*y_i have
     full rank.  For a set of r solutions, the matrix (cx)'*y*V is
     diagonal: "on average", the explanatory component of one solution
     is linked only with the components it explains, and not with the
     explained components of the other solutions.  The matrices
     (cx)'*y_j*v_j are triangular: the explanatory component of one
     solution is not linked with any of the partial components
     explained in the following solutions.  From the second solution
     onward, the definition of the explanatory components depends on
     the partition vector py.
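
     The replacement of y_i by y_i-y_i*v_i[,1]*v_i[,1]' used between
     successive solutions can be sketched in base R as follows. The
     axis vi below is an arbitrary unit-norm vector chosen for
     illustration, not an axis computed by concoreg.

```r
set.seed(1)
yi <- scale(matrix(runif(30), 10, 3), scale = FALSE)  # centered subset y_i
vi <- c(2, -1, 2) / 3   # arbitrary unit-norm axis: (4 + 1 + 4) / 9 = 1

# Deflation: remove from y_i its projection onto the axis vi
yi_deflated <- yi - yi %*% vi %*% t(vi)

# The deflated subset has no remaining component along vi
stopifnot(max(abs(yi_deflated %*% vi)) < 1e-12)
```

     Because the deflated subset is orthogonal to the previous axis,
     the next solution has to look for its axis in the remaining
     directions of R^{q_i}.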

     This function uses the concor function.

_V_a_l_u_e:

     A list with the following components:

      cx: the n x r matrix of the r explanatory components

       v: a q x r matrix of ky row blocks v_i (q_i x r) of axes in
          R^{q_i} relative to y_i; v_i'*v_i = Id

       V: a q x r matrix of axes in R^q relative to y; V'*V = Id

  varexp: a ky x r matrix; each column k contains the ky explained
          variances rho(cx[,k],y_i*v_i[,k])^2 var(y_i*v_i[,k])

_R_e_f_e_r_e_n_c_e_s:

     Hanafi & Lafosse (2001) Generalisation de la regression lineaire
     simple pour analyser la dependance de K ensembles de variables
     avec un K+1 eme. Revue de Statistique Appliquee vol.49, n.1.

     Chessel D. & Hanafi M. (1996) Analyses de la Co-inertie de K
     nuages de points.  Revue de Statistique Appliquee vol.44, n.2.
     (This ACOM analysis of one multiset is obtained with the command
     concoreg(Y,Y,py,r).)

_E_x_a_m_p_l_e_s:

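     # Simulate 10 observations: 5 explanatory and 9 explained variables
     x <- matrix(runif(50), 10, 5); y <- matrix(runif(90), 10, 9)
     x <- scale(x); y <- scale(y)
     # Three subsets of y with 3, 2 and 4 columns; two successive solutions
     co <- concoreg(x, y, c(3, 2, 4), 2)
     # Criterion term for subset 1, solution 1, next to varexp[1,1]
     ((t(co$cx[,1]) %*% y[,1:3] %*% co$v[1:3,1]) / 10)^2; co$varexp[1,1]
     # Scaled cross-product matrix of the explanatory components
     t(co$cx) %*% co$cx / 10
     # Squared diagonal of (cx)'*y*V/n, next to the column sums of varexp
     diag(t(co$cx) %*% y %*% co$V / 10)^2
     sum(co$varexp[,1]); sum(co$varexp[,2])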

