consensus          package:relations          R Documentation(utf8)

_C_o_n_s_e_n_s_u_s _R_e_l_a_t_i_o_n_s

_D_e_s_c_r_i_p_t_i_o_n:

     Compute consensus relations of a relation ensemble.

_U_s_a_g_e:

     relation_consensus(x, method = NULL, weights = 1, control = list(), ...)

_A_r_g_u_m_e_n_t_s:

       x: an ensemble of relations, or something which can be coerced
          to such (see 'relation_ensemble').

  method: a character string specifying one of the built-in methods for
          computing consensus relations, or a function to be taken as a
          user-defined method, or 'NULL' (default value).  If a
          character string, its lower-cased version is matched against
          the lower-cased names of the available built-in methods using
          'pmatch'.  See *Details* for available built-in methods and
          defaults.

 weights: a numeric vector with non-negative case weights. Recycled to
          the number of elements in the ensemble given by 'x' if
          necessary.

 control: a list of control parameters.  See *Details*.

     ...: a list of control parameters (overruling those specified in
          'control').
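
     The matching of 'method' and the recycling of 'weights' described
     above can be sketched in base R.  This is only an illustration:
     the vector 'methods' is a hand-picked subset of the built-in names
     listed under *Details*, 'match_method' is a hypothetical helper,
     and the use of 'rep_len' for recycling is an assumption about how
     the recycling could be done, not the package's actual code.

```r
## Illustrative subset of built-in method names (see Details).
methods <- c("Borda", "Copeland", "Condorcet", "CS",
             "manhattan", "euclidean", "majority")

## Hypothetical helper: partial, case-insensitive matching via pmatch().
match_method <- function(method) {
    i <- pmatch(tolower(method), tolower(methods))
    if (is.na(i)) NA_character_ else methods[i]
}

m_unique    <- match_method("bor")  # unique abbreviation
m_exact     <- match_method("CS")   # exact match wins over partial ones
m_ambiguous <- match_method("co")   # "copeland" or "condorcet": no match

## Recycling two case weights to an ensemble of length 5 (assumed
## to behave like rep_len()).
w <- rep_len(c(2, 1), 5)
```

     An ambiguous abbreviation such as "co" yields no match, so
     abbreviations need to be unique among the available names.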

_D_e_t_a_i_l_s:

     Consensus relations "synthesize" the information in the elements
     of a relation ensemble into a single relation, often by minimizing
     a criterion function measuring how dissimilar consensus candidates
     are from the elements of the ensemble (the so-called
     "optimization approach"), typically of the form L(R) = sum_b w_b
     d(R_b, R)^p, where d is a suitable dissimilarity measure (see
     'relation_dissimilarity'), w_b is the case weight given to element
     R_b of the ensemble, and p >= 1.  Such consensus relations are
     called "central relations" in Régnier (1965).  For p = 1, we
     obtain (generalized) medians; p = 2 gives (generalized) means
     (least squares consensus relations).
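
     As a minimal sketch of the criterion function, the following base
     R code evaluates L for two candidate relations.  It assumes crisp
     relations stored as 0/1 incidence matrices and takes d to be the
     symmetric difference distance (the number of differing
     incidences).  The helpers 'symdiff_distance', 'criterion' and
     'linear_order' are illustrative names and not part of the
     package.

```r
## d(A, B): number of tuples on which two crisp relations differ.
symdiff_distance <- function(A, B) sum(abs(A - B))

## L(R) = sum_b w_b * d(R_b, R)^p, with weights recycled as needed.
criterion <- function(R, ensemble, weights = 1, p = 1) {
    w <- rep_len(weights, length(ensemble))
    d <- vapply(ensemble, symdiff_distance, numeric(1), B = R)
    sum(w * d^p)
}

objs <- c("a", "b", "c")
## 0/1 incidence matrix of the linear order listed in 'ranking':
## entry (i, j) is 1 iff object i comes no later than object j.
linear_order <- function(ranking) {
    pos <- match(objs, ranking)
    I <- 1 * outer(pos, pos, "<=")
    dimnames(I) <- list(objs, objs)
    I
}

ensemble <- list(linear_order(c("a", "b", "c")),
                 linear_order(c("a", "c", "b")),
                 linear_order(c("b", "a", "c")))

## Evaluate two candidates; the smaller value is the better fit.
L_abc <- criterion(linear_order(c("a", "b", "c")), ensemble)
L_cba <- criterion(linear_order(c("c", "b", "a")), ensemble)
## The same candidate under the least squares criterion (p = 2).
L_abc_sq <- criterion(linear_order(c("a", "b", "c")), ensemble, p = 2)
```

     Here the candidate order a, b, c gives L = 4, whereas the reversed
     order c, b, a gives L = 14, so the former is the better consensus
     candidate for p = 1.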

     Available built-in methods are as follows.  Apart from Condorcet's
     and the unrestricted Manhattan and Euclidean consensus methods,
     these are applicable to ensembles of endorelations only. 


     '"_B_o_r_d_a"' the consensus method proposed by Borda (1781). For each
          relation R_b and object x, one determines the Borda/Kendall
          scores, i.e., the number of objects y such that y R_b x. 
          These are then aggregated across relations by weighted
          averaging.  Finally, objects are ordered according to their
          aggregated scores.

     '"_C_o_p_e_l_a_n_d"' the consensus method proposed by Copeland (1951). 
          For each relation R_b and object x, one determines the
          Copeland scores, i.e., the number of objects y such that y
          R_b x, minus the number of objects y such that x R_b y.  Like
          the Borda method, these are then aggregated across relations
          by weighted averaging.  Finally, objects are ordered
          according to their aggregated scores.

     '"_C_o_n_d_o_r_c_e_t"' the consensus method proposed by Condorcet (1785). 
          For a given ensemble of crisp relations, this minimizes the
          criterion function L with d as symmetric difference distance
          and p = 1 over all possible crisp  relations.  In the case of
          endorelations, consensus is obtained by weighting voting,
          such that x R y if the weighted number of times that x R_b y
          is no less than the weighted number of times that this is not
          the case.  Even when aggregating linear orders, this can lead
          to intransitive consensus solutions ("effet Condorcet").  One
          can obtain a relation ensemble with _all_ consensus relations
          by setting the control parameter 'all' to 'TRUE'.

     '"_C_S"' the consensus method of Cook and Seiford (1978) which
          determines a linear order minimizing the criterion function L
          with d as generalized Cook-Seiford (ranking) distance and p =
          1 via solving a linear sum assignment problem.  One can
          obtain a relation ensemble with _all_ consensus relations by
          setting the control parameter 'all' to 'TRUE'.

     '"_S_D/_F"' an exact solver for determining the consensus relation of
          an ensemble of crisp endorelations by minimizing the
          criterion function L with d as symmetric difference distance
          ("SD") and p = 1 over a suitable class ("Family") of crisp
          endorelations as indicated by F, with values:

          '_G' general (crisp) endorelations.

          '_A' antisymmetric relations.

          '_C' complete relations.

          '_E' equivalence relations: reflexive, symmetric, and
               transitive.

          '_L' linear orders: complete, reflexive, antisymmetric, and
               transitive.

          '_M' matches: complete and reflexive.

          '_O' partial orders: reflexive, antisymmetric and transitive.

          '_S' symmetric relations.   

          '_T' tournaments: complete, irreflexive and antisymmetric
               (i.e., complete and asymmetric).

          '_W' weak orders (complete preorders, preferences,
               "orderings"): complete, reflexive and transitive. 

          '_p_r_e_o_r_d_e_r' preorders: reflexive and transitive.

          '_t_r_a_n_s_i_t_i_v_e' transitive relations.

          Consensus relations are determined by reformulating the
          consensus problem as a binary program (for the relation
          incidences), see Hornik and Meyer (2007) for details.  The
          solver employed can be specified via the control argument
          'solver', with currently possible values '"glpk"',
          '"lpsolve"', '"symphony"' or '"cplex"' or a unique
          abbreviation thereof, specifying to use the solvers from
          packages 'Rglpk' (default), 'lpSolve', 'Rsymphony', or
          'Rcplex', respectively. Unless control option 'sparse' is
          false, a sparse formulation of the binary program is used,
          which is typically more efficient.

          For fitting equivalences and weak orders (cases 'E' and 'W')
          it is possible to specify the number of classes k using the
          control parameter 'k'.  For fitting weak orders, one can also
          specify the number of elements in the classes via control
          parameter 'l'.

          Additional constraints on the incidences of the consensus
          solution can be given via the control parameter
          'constraints', in the form of a 3-column matrix whose rows
          give row and column indices i and j and the corresponding
          incidence I_{ij}. (I.e., incidences can be constrained to be
          zero or one on an object by object basis.)

          One can obtain a relation ensemble with _all_ consensus
          relations by setting the control parameter 'all' to 'TRUE'. 
          (See the examples.)

     '"_m_a_n_h_a_t_t_a_n"' the (unrestricted) median of the ensemble,
          minimizing L with d as Manhattan (symmetric difference)
          distance and p = 1 over all (possibly fuzzy) relations.

     '"_e_u_c_l_i_d_e_a_n"' the (unrestricted) mean of the ensemble, minimizing
          L with d as Euclidean distance and p = 2 over all (possibly
          fuzzy) relations.

     '"_e_u_c_l_i_d_e_a_n/_F"' an exact solver for determining the restricted
          least squares Euclidean consensus relation of an ensemble of
          endorelations by minimizing the criterion function L with d
          as Euclidean difference distance and p = 2 over a suitable
          family of crisp endorelations as indicated by F, with
          available families and control parameters as for methods
          '"SD/F"'.

     '"_m_a_j_o_r_i_t_y"' a generalized majority method for which the consensus
          relation contains of all tuples occurring with a relative
          frequency of more than 100 p percent (of 100 percent if p =
          1).  The fraction p can be specified via the control
          parameter 'p'.  By default, p = 1/2 is used.

     '"_C_K_S/_F"' an exact solver for determining the consensus relation
          of an ensemble of crisp endorelations by minimizing the
          criterion function L with d as Cook-Kress-Seiford distance
          ("CKS") and p = 1 over a suitable class ("Family") of crisp
          endorelations as indicated by F, with available families and
          control parameters as for methods '"SD/F"'.
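
     The score-based and voting-based methods above can be sketched in
     base R for crisp endorelations stored as 0/1 incidence matrices.
     This is an illustration of the definitions only and not the
     package's implementation: all helper names are hypothetical, and
     sorting objects by increasing aggregated score is an assumption
     made for this sketch.

```r
objs <- c("a", "b", "c")
## 0/1 incidence matrix of the linear order listed in 'ranking':
## entry (i, j) is 1 iff object i comes no later than object j.
linear_order <- function(ranking) {
    pos <- match(objs, ranking)
    I <- 1 * outer(pos, pos, "<=")
    dimnames(I) <- list(objs, objs)
    I
}

ensemble <- list(linear_order(c("a", "b", "c")),
                 linear_order(c("a", "c", "b")),
                 linear_order(c("b", "a", "c")))
w <- c(1, 1, 1)

## Borda/Kendall score of x: number of y with y R_b x (column sums).
borda <- function(I) colSums(I)
## Copeland score of x: (number of y with y R_b x) minus
## (number of y with x R_b y).
copeland <- function(I) colSums(I) - rowSums(I)

## Weighted average of the per-relation scores across the ensemble.
aggregate_scores <- function(score, ensemble, weights) {
    S <- vapply(ensemble, score, numeric(length(objs)))
    setNames(drop(S %*% weights) / sum(weights), objs)
}

borda_avg <- aggregate_scores(borda, ensemble, w)
copeland_avg <- aggregate_scores(copeland, ensemble, w)
## Objects sorted by increasing aggregated score (sketch assumption).
borda_order <- names(sort(borda_avg))
copeland_order <- names(sort(copeland_avg))

## Weighted relative frequency of each tuple (x, y) in the ensemble.
M <- Reduce(`+`, Map(`*`, ensemble, w)) / sum(w)
## Weighted voting as described for "Condorcet": x R y iff the
## weighted count for x R_b y is no less than the count against it.
condorcet <- 1 * (M >= 1/2)
## Generalized majority with fraction p (unanimity if p = 1).
majority <- function(M, p = 1/2) 1 * (if (p == 1) M == 1 else M > p)
unanimous <- majority(M, p = 1)
```

     For this small ensemble, both scores order the objects as a, b, c,
     and the voting rule returns the linear order a, b, c.  With p = 1,
     the majority relation keeps only the four tuples on which all
     three relations agree.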


_V_a_l_u_e:

     The consensus relation(s).

_R_e_f_e_r_e_n_c_e_s:

     J. C. Borda (1781), Mémoire sur les élections au scrutin.
     Histoire de l'Académie Royale des Sciences.

     W. D. Cook and M. Kress (1992), _Ordinal information and
     preference structures: decision models and applications_.
     Prentice-Hall: New York. ISBN: 0-13-630120-7.

     W. D. Cook and L. M. Seiford (1978), Priority ranking and
     consensus formation. _Management Science_, *24*/16, 1721-1732.

     M. J. A. de Condorcet (1785), Essai sur l'application de l'analyse
     à la probabilité des décisions rendues à la pluralité des
     voix.  Paris.

     A. H. Copeland (1951), A Reasonable Social Welfare Function.
     _mimeo_, University of Michigan.

     K. Hornik and D. Meyer (2007), Deriving consensus rankings from
     benchmarking experiments. In R. Decker and H.-J. Lenz, _Advances
     in Data Analysis_. Studies in Classification, Data Analysis, and
     Knowledge Organization. Springer-Verlag: Heidelberg, 163-170.

     F. Marcotorchino and P. Michaud (1982). Agrégation de
     similarités en classification automatique. _Revue de Statistique
     Appliquée_, *30*/2, 21-44. <URL:
     http://www.numdam.org/item?id=RSA_1982__30_2_21_0>.

     S. Régnier (1965), Sur quelques aspects mathématiques des
     problèmes de classification automatique. _ICC Bulletin_, *4*,
     175-191.

_E_x_a_m_p_l_e_s:

     ## Consensus equivalence.
     ## (I.e., in fact, consensus partition.)
     ## Classification of 30 felines, see Marcotorchino and Michaud (1982).
     data("Felines")
     ## Consider each variable an equivalence relation on the objects.
     relations <- as.relation_ensemble(Felines)
     ## This gives a relation ensemble of length 14 (number of variables in
     ## the data set).
     ## Now fit an equivalence relation to this:
     E <- relation_consensus(relations, "SD/E")
     ## And look at the equivalence classes:
     ids <- relation_class_ids(E)
     ## Or, more nicely:
     split(rownames(Felines), ids)
     ## Which is the same as in the paper ...

     ## Consensus linear order.
     ## Example from Cook and Kress, pages 48ff.
     ## Relation from paired comparisons.
     pm <- matrix(c(0, 1, 0, 1, 1,
                    0, 0, 0, 1, 1,
                    1, 1, 0, 0, 0,
                    0, 0, 1, 0, 0,
                    0, 0, 1, 1, 0),
                  nrow = 5,
                  byrow = TRUE,
                  dimnames = list(letters[1:5], letters[1:5]))
     ## Note that this is a Cook and Kress "preference matrix" where entry
     ## (i,j) is one iff object i is preferred to object j (i > j).
     ## Set up the corresponding '<' relation:
     R <- as.relation(t(pm))
     relation_incidence(R)
     relation_is_tournament(R)
     ## Closest linear order:
     L <- relation_consensus(R, "SD/L")
     relation_incidence(L)
     ## Visualize provided that Rgraphviz is available.
     if(require("Rgraphviz")) plot(L)
     ## But note that this linear order is not unique.
     L <- relation_consensus(R, "SD/L", control = list(all = TRUE))
     print(L)
     if(require("Rgraphviz")) plot(L)
     ## (Oh no: c is once first and once last.)
     ## Closest weak order relation with at most 3 indifference classes:
     W3 <- relation_consensus(R, "SD/W", control = list(k = 3))
     relation_incidence(W3)

