pstar                  package:sna                  R Documentation

_F_i_t _a _p*/_E_R_G _M_o_d_e_l _U_s_i_n_g _a _L_o_g_i_s_t_i_c _A_p_p_r_o_x_i_m_a_t_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     Fits a p*/ERG model to the adjacency matrix in 'dat' containing
     the effects listed in 'effects'.  The result is returned as a
     'glm' object.

_U_s_a_g_e:

     pstar(dat, effects=c("choice", "mutuality", "density", "reciprocity",
         "stransitivity", "wtransitivity", "stranstri",  "wtranstri", 
         "outdegree", "indegree", "betweenness", "closeness", 
         "degcentralization", "betcentralization", "clocentralization",
         "connectedness", "hierarchy", "lubness", "efficiency"), 
         attr=NULL, memb=NULL, diag=FALSE, mode="digraph")

_A_r_g_u_m_e_n_t_s:

     dat: A single adjacency matrix 

 effects: A vector of strings indicating which effects should be fit 

    attr: A matrix whose columns contain individual attributes (one row
          per vertex) whose differences should be used as supplemental
          predictors 

    memb: A matrix whose columns contain group memberships whose
          categorical similarities (same group/not same group) should
          be used as supplemental predictors

    diag: A boolean indicating whether or not diagonal entries (loops)
          should be counted as meaningful data 

    mode: '"digraph"' if 'dat' is directed, else '"graph"' 

_D_e_t_a_i_l_s:

     p* (also called the Exponential Random Graph (ERG) family) is an
     exponential family specification for network data.  Under p*, it
     is assumed that 

    p(G=g) propto exp(beta_0 gamma_0(g) + beta_1 gamma_1(g) + ...)

     for all g, where the betas represent real coefficients and the
     gammas represent functions of g.  Unfortunately, the unknown
     normalizing factor in the above expression makes evaluation
     difficult in the general case.  One solution to this problem is to
     operate instead on the edgewise log odds; in this case, the p*
     model can be approximated by a logistic regression of each edge on the
     _differences_ in the gamma scores induced by the presence and
     absence of said edge in the graph (conditional on all other
     edges).  It is this approximation (known as autologistic
     regression, or maximum pseudo-likelihood estimation) which is
     employed here.  
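
     To make the approximation concrete, the conditional log odds
     implied by the expression above can be written out as follows.
     The symbols g_ij^+, g_ij^- and g_ij^c (the graph with the i,j
     edge forced present, the graph with it forced absent, and the
     remainder of the graph, respectively) and the normalizing
     factor kappa(beta) are notation assumed here for illustration;
     they are not part of the function's interface.

```latex
\begin{align*}
\Pr(G = g) &= \frac{1}{\kappa(\beta)}
  \exp\Big(\sum_{k \ge 0} \beta_k \gamma_k(g)\Big), \\
\frac{\Pr(G_{ij} = 1 \mid G_{ij}^c = g_{ij}^c)}
     {\Pr(G_{ij} = 0 \mid G_{ij}^c = g_{ij}^c)}
  &= \frac{\exp\big(\sum_{k \ge 0} \beta_k \gamma_k(g_{ij}^+)\big)}
          {\exp\big(\sum_{k \ge 0} \beta_k \gamma_k(g_{ij}^-)\big)}, \\
\operatorname{logit} \Pr(G_{ij} = 1 \mid G_{ij}^c = g_{ij}^c)
  &= \sum_{k \ge 0} \beta_k
     \big[\gamma_k(g_{ij}^+) - \gamma_k(g_{ij}^-)\big].
\end{align*}
```

     The normalizing factor cancels in the ratio, which is why the
     edgewise form is tractable.  The pseudo-likelihood approach
     then treats the edges as if they were independent given these
     differences, and fits the betas by ordinary logistic
     regression.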

     Using the 'effects' argument, a range of different potential
     parameters can be estimated.  The network measure associated with
     each is, in turn, the edge-perturbed difference in:

        1.  'choice': the number of edges in the graph (acts as a
           constant)

        2.  'mutuality': the number of reciprocated dyads in the graph

        3.  'density': the density of the graph

        4.  'reciprocity': the edgewise reciprocity of the graph

        5.  'stransitivity': the strong transitivity of the graph

        6.  'wtransitivity': the weak transitivity of the graph

        7.  'stranstri': the number of strongly transitive triads in
           the graph

        8.  'wtranstri': the number of weakly transitive triads in the
           graph

        9.  'outdegree': the outdegree of each actor (|V| parameters)

        10.  'indegree': the indegree of each actor (|V| parameters)

        11.  'betweenness': the betweenness of each actor (|V|
           parameters)

        12.  'closeness': the closeness of each actor (|V| parameters)

        13.  'degcentralization': the Freeman degree centralization of
           the graph

        14.  'betcentralization': the betweenness centralization of the
           graph

        15.  'clocentralization': the closeness centralization of the
           graph

        16.  'connectedness': the Krackhardt connectedness of the graph

        17.  'hierarchy': the Krackhardt hierarchy of the graph

        18.  'efficiency': the Krackhardt efficiency of the graph

        19.  'lubness': the Krackhardt LUBness of the graph
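
     The idea of an edge-perturbed difference can be sketched in
     base R for two of the simpler measures, edge count and the
     number of mutual dyads.  This is only an illustration under
     stated assumptions, not the package's own implementation: the
     helpers 'edge_diff', 'n_edges' and 'n_mutual' are hypothetical
     names introduced here, and the test graph is simulated.

```r
# Hypothetical helper (not part of sna): edge-perturbed difference of a
# graph statistic 'stat' for every ordered pair (i, j) with i != j.
edge_diff <- function(g, stat) {
  n <- nrow(g)
  d <- matrix(NA_real_, n, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      if (i == j) next
      g_plus  <- g; g_plus[i, j]  <- 1   # edge forced present
      g_minus <- g; g_minus[i, j] <- 0   # edge forced absent
      d[i, j] <- stat(g_plus) - stat(g_minus)
    }
  }
  d
}

# Two illustrative statistics: edge count and number of mutual dyads
n_edges  <- function(g) sum(g[row(g) != col(g)])
n_mutual <- function(g) sum((g * t(g))[upper.tri(g)])

# Simulated directed graph without loops
set.seed(1)
g <- matrix(rbinom(100, 1, 0.3), 10, 10)
diag(g) <- 0

d_choice <- edge_diff(g, n_edges)    # constant off the diagonal
d_mutual <- edge_diff(g, n_mutual)   # equals the reverse edge j -> i

# Logistic regression of each off-diagonal edge on the differences;
# the constant 'choice' column plays the role of an intercept here.
off <- row(g) != col(g)
dat <- data.frame(y = g[off],
                  choice = d_choice[off],
                  mutuality = d_mutual[off])
fit <- glm(y ~ 0 + choice + mutuality, family = binomial, data = dat)
```

     Because the edge-count difference is the same for every dyad,
     it behaves as a constant term, which matches the description
     of 'choice' in the list above.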

     (Note that some of these do differ somewhat from the common p*
     parameter formulation, e.g. quantities such as density and
     reciprocity are computed as per the 'gden' and 'grecip' functions
     rather than via the unnormalized "choice" and "mutual" quantities
     one often finds in the p* literature.)  _Please do not attempt to
     use all effects simultaneously!!!_  In addition to the above, the
     user may specify a matrix of individual attributes whose absolute
     dyadic differences are to be used as predictors, as well as a
     matrix of individual memberships whose dyadic categorical
     similarities (same/different) are used in the same manner.
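
     The two kinds of supplemental predictor described above can
     be sketched in base R as follows.  The vectors 'age' and
     'group' are hypothetical data, and the 1/0 coding of
     same/different is an assumption made for this example; the
     coding used internally may differ.

```r
# Hypothetical vertex data: one continuous attribute and one group
# membership per vertex (names are illustrative only).
age   <- c(23, 35, 31, 52)
group <- c("a", "b", "a", "b")

# Absolute dyadic difference: entry (i, j) is |age_i - age_j|
age_diff <- abs(outer(age, age, "-"))

# Categorical similarity: 1 if i and j share a group, 0 otherwise
same_group <- outer(group, group, "==") * 1
```

     Each resulting matrix has one entry per dyad, so it lines up
     with the edge variables in the same way as the structural
     predictors.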

     Although the p* framework is quite versatile in its ability to
     accommodate a range of structural predictors, it should be noted
     that the _substantial_ collinearity of many of the standard p*
     predictors can lead to very unstable model fits.  Measurement and
     specification errors compound this problem; thus, it is somewhat
     risky to use p* in an exploratory capacity (i.e., when there is
     little prior knowledge to constrain choice of parameters).  While
     raw instability due to multicollinearity should decline with graph
     size, improper specification will still result in biased
     coefficient estimates so long as an omitted predictor correlates
     with an included predictor.  Caution is advised.
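
     Since the fitted model is an ordinary 'glm' object, the
     collinearity among its predictors can be inspected with
     standard tools.  The sketch below uses simulated data with two
     nearly collinear predictors rather than a real p* fit; the
     variable names are illustrative, and the same calls should
     apply to any 'glm' object.

```r
# Simulated predictors, nearly collinear by construction
set.seed(42)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.05)
y  <- rbinom(n, 1, plogis(0.5 * x1))

fit <- glm(y ~ x1 + x2, family = binomial)

# Predictor columns of the design matrix (intercept dropped)
X <- model.matrix(fit)[, -1]

pred_cor <- cor(X)                   # pairwise predictor correlations
cond_num <- kappa(X, exact = TRUE)   # large values mean ill-conditioning

# Coefficient standard errors, which inflate under near-collinearity
se <- sqrt(diag(vcov(fit)))
```

     Correlations close to 1 in absolute value, or a large
     condition number, suggest that individual coefficients are
     poorly determined even when the overall fit looks reasonable.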

_V_a_l_u_e:

     A 'glm' object

_W_A_R_N_I_N_G:

     Estimation of p* models by maximum pseudo-likelihood is now known
     to be a dangerous practice.  Use at your own risk.

_N_o_t_e:

     In the long run, support will be included for p* models involving
     arbitrary functions (much like the system used with 'cugtest' and
     'qaptest').

_A_u_t_h_o_r(_s):

     Carter T. Butts buttsc@uci.edu

_R_e_f_e_r_e_n_c_e_s:

     Anderson, C.; Wasserman, S.; and Crouch, B. (1999).  ``A p*
     Primer:  Logit Models for Social Networks.''  _Social Networks,_
     21, 37-66.

     Holland, P.W., and Leinhardt, S. (1981).  ``An Exponential Family
     of Probability Distributions for Directed Graphs.''  _Journal of
     the American Statistical Association,_ 76, 33-50.

     Wasserman, S., and Pattison, P. (1996).  ``Logit Models and
     Logistic Regressions for Social Networks:  I.  An Introduction to
     Markov Graphs and p*.''  _Psychometrika,_ 61, 401-425.

_S_e_e _A_l_s_o:

     'eval.edgeperturbation'

_E_x_a_m_p_l_e_s:

     #Create a graph with expansiveness and popularity effects
     in.str<-rnorm(20,0,3)
     out.str<-rnorm(20,0,3)
     tie.str<-outer(out.str,in.str,"+")
     tie.p<-plogis(tie.str)   #Inverse logit, applied elementwise
     g<-rgraph(20,tprob=tie.p)

     #Fit a model with expansiveness only
     p1<-pstar(g,effects="outdegree")
     #Fit a model with expansiveness and popularity
     p2<-pstar(g,effects=c("outdegree","indegree"))
     #Fit a model with expansiveness, popularity, and mutuality
     p3<-pstar(g,effects=c("outdegree","indegree","mutuality"))

     #Compare the model AICs
     extractAIC(p1)
     extractAIC(p2)
     extractAIC(p3)

