hmm                package:hmm.discnp                R Documentation

_F_i_t _a _h_i_d_d_e_n _M_a_r_k_o_v _m_o_d_e_l _t_o _d_i_s_c_r_e_t_e _d_a_t_a.

_D_e_s_c_r_i_p_t_i_o_n:

     Uses the EM algorithm to perform a maximum likelihood fit of a
     hidden Markov model to discrete data where the observations come
     from one of a number of finite discrete distributions, depending
     on the (hidden) state of the Markov chain.  These distributions
     are specified (non-parametrically) by a matrix Rho = [rho_ij]
     where rho_ij = P(Y = y_i | S = j), Y being the observable random
     variable and S being the hidden state.
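For concreteness, an emission matrix 'Rho' for a two-state chain over three observable values might look as follows (the numbers are purely illustrative):

```r
# Illustrative Rho for K = 2 hidden states and three observable
# values: rho_ij = P(Y = y_i | S = j), so each COLUMN of Rho is a
# probability vector (it sums to 1).
Rho <- matrix(c(0.7, 0.2, 0.1,    # emission probabilities in state 1
                0.1, 0.3, 0.6),   # emission probabilities in state 2
              nrow = 3, ncol = 2)
colSums(Rho)    # both columns sum to 1
```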

_U_s_a_g_e:

     hmm(y, yval=NULL, par0=NULL, K=NULL, rand.start=NULL, stationary=TRUE,
         mixture=FALSE, tolerance=1e-4, verbose=FALSE, itmax=200,
         crit='PCLL', keep.y=TRUE, data.name=NULL)

_A_r_g_u_m_e_n_t_s:

       y: A vector or matrix of discrete data; missing values are
          allowed.  If 'y' is a matrix, each column is interpreted as
          an independent replicate of the observation sequence. 

    yval: A vector of possible values for the data; it defaults to the
          sorted unique values of 'y'.  If any value of 'y' does not
          match some value of 'yval', it will be treated as a MISSING
          VALUE. 
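The matching rule can be pictured with base R's 'match()' (hmm() does the equivalent bookkeeping internally; this is only an illustration):

```r
# Observations not found in yval behave like missing values:
# match() returns NA for them.
y    <- c(1, 2, 3, 2, 5)
yval <- c(1, 2, 3)        # 5 is not among the legal values
idx  <- match(y, yval)
idx                       # 1 2 3 2 NA -- the 5 is treated as missing
```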

    par0: An optional (_named_) list of starting values for the
          parameters of the model, with components 'tpm' (transition
          probability matrix) and 'Rho'.  The matrix 'Rho' specifies
          the probability that the observations take on each value in
          yval, given the state of the hidden Markov chain.  The
          columns of 'Rho' correspond to states, the rows to the values
          of 'yval'.

          If 'par0' is not specified, starting values are created by
          the function 'init.all()'. 
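A hand-built starting-value list for a 2-state chain over two observable values might look like this (values are illustrative; only the names 'tpm' and 'Rho' and the probability constraints matter):

```r
# Rows of tpm and columns of Rho must each sum to 1; the list
# components must be named 'tpm' and 'Rho'.
par0 <- list(
  tpm = matrix(c(0.9, 0.1,
                 0.2, 0.8), nrow = 2, byrow = TRUE),
  Rho = matrix(c(0.8, 0.3,     # rows correspond to the values of yval,
                 0.2, 0.7),    # columns to the hidden states
               nrow = 2, byrow = TRUE)
)
```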

       K: The number of states in the hidden Markov chain; if 'par0' is
          not specified 'K' MUST be; if 'par0' is specified, 'K' is
          ignored.

          Note that 'K=1' is acceptable; if 'K' is 1 then all
          observations are treated as being independent and the
          non-parametric estimate of the distribution of the
          observations is calculated in the obvious way. 
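A minimal sketch of "the obvious way": with 'K=1' there is no hidden structure, so the non-parametric estimate is just the empirical distribution of the observations.

```r
# With K = 1 the fitted distribution is the table of relative
# frequencies of the observed values.
y   <- c(1, 1, 2, 3, 3, 3)
Rho <- table(y) / length(y)
Rho    # 1 -> 1/3, 2 -> 1/6, 3 -> 1/2
```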

rand.start: A list of two logical scalars, which must be named 'tpm'
          and 'Rho'.  If 'tpm' is TRUE then the function init.all()
          chooses the entries of the starting value of 'tpm' at
          random; likewise for 'Rho'.  This argument defaults to
          'list(tpm=FALSE,Rho=FALSE)'. 

stationary: Logical scalar.  If 'TRUE' then the model is fitted under
          the stationarity assumption, i.e. that the Markov chain was
          in steady state at the time that observations commenced. In
          this case  the initial state probability distribution is
          estimated as the stationary distribution determined by the
          (estimated) transition probability matrix.  Otherwise the
          initial state probability distribution is estimated as the
          mean of the vectors of conditional probabilities of the
          states, given the observation sequences, at time 't=1'. 
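Under the stationarity assumption, the initial distribution is the row vector pi satisfying pi %*% tpm == pi with sum(pi) == 1.  A plain-R sketch of how such a distribution can be computed (the helper name 'stat.dist' is hypothetical, not part of the package):

```r
# Solve the stacked linear system  t(I - P) %*% pi = 0,  sum(pi) = 1
# by least squares; for an ergodic chain the solution is unique.
stat.dist <- function(P) {
  K <- nrow(P)
  A <- rbind(t(diag(K) - P), rep(1, K))   # K constraint rows + normalisation
  b <- c(rep(0, K), 1)
  qr.solve(A, b)                          # least-squares solve
}

P <- matrix(c(0.9, 0.1,
              0.2, 0.8), nrow = 2, byrow = TRUE)
pi.hat <- stat.dist(P)                    # here (2/3, 1/3)
```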

 mixture: A logical scalar; if TRUE then a mixture model (all rows of
          the transition probability matrix are identical) is fitted
          rather than a general hidden Markov model. 
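When all rows of the transition probability matrix are identical, the state at each time is drawn independently from that common row, so the "chain" has no memory.  A sketch:

```r
# In the mixture case every row of tpm equals the same probability
# vector p, and the hidden states are i.i.d. draws from p.
p   <- c(0.3, 0.7)
tpm <- rbind(p, p)        # both rows identical
set.seed(1)
states <- sample(1:2, size = 10, replace = TRUE, prob = p)
```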

tolerance: If the value of the quantity used for the stopping
          criterion is less than 'tolerance' then the EM algorithm is
          considered to have converged. 

 verbose: A logical scalar determining whether to print out details of
          the progress of the EM algorithm. 

   itmax: If the convergence criterion has not been met by the time
          'itmax' EM steps have been performed, a warning message is
          printed out, and the function stops.  A value is returned by
          the function anyway, with the logical component "converged"
          set to FALSE. 

    crit: The name of the stopping criterion, which must be one of
          "PCLL" (percent change in log-likelihood; the default), "L2"
          (L-2 norm, i.e.  square root of sum of squares of change in
          coefficients), or "Linf" (L-infinity norm, i.e.  maximum
          absolute value of change in coefficients). 
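Written out for a single EM step, the three criteria amount to the following (the numbers are made up; 'theta.old' and 'theta.new' stand for the parameter vectors before and after the step):

```r
# "PCLL": percent change in log-likelihood between successive steps.
ll.old <- -1234.5
ll.new <- -1233.9
pcll   <- 100 * abs(ll.new - ll.old) / abs(ll.old)

# "L2" and "Linf": norms of the change in the coefficients.
theta.old <- c(0.50, 0.30, 0.20)
theta.new <- c(0.52, 0.29, 0.19)
l2   <- sqrt(sum((theta.new - theta.old)^2))   # square root of sum of squares
linf <- max(abs(theta.new - theta.old))        # maximum absolute change
```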

  keep.y: Logical scalar; should the observations 'y' be returned as a
          component of the value of this function?

data.name: An identifying tag for the fit; if omitted, it defaults to
          the name of data set 'y' as determined by
          'deparse(substitute(y))'.

_D_e_t_a_i_l_s:

     The hard work is done by a Fortran subroutine "recurse" (actually
     coded in Ratfor) which is dynamically loaded.

_V_a_l_u_e:

     A list with components:

     Rho: The fitted value of the probability matrix 'Rho' specifying
          the distributions of the observations. 

     tpm: The fitted value of the transition probability matrix 'tpm'. 

    ispd: The fitted initial state probability distribution.  If
          'stationary' is TRUE this is the (unique) stationary
          distribution determined by the fitted transition
          probability matrix 'tpm'; otherwise it is estimated as the
          mean of the vectors of conditional probabilities of the
          states, given the observation sequences, at time 't=1'. 

log.like: The final value of the log likelihood, as calculated through
          recursion. 

converged: A logical scalar saying whether the algorithm satisfied the
          convergence criterion before the maximum of itmax EM steps
          was exceeded. 

   nstep: The number of EM steps performed by the algorithm. 

       y: The observations (argument 'y').  Present only if 'keep.y' is
          'TRUE'.

data.name: An identifying tag, specified as an argument or determined
          from the name of the argument 'y' by
          'deparse(substitute(y))'. 

stationary: The argument 'stationary'.
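The recursion behind 'log.like' is the standard scaled forward recursion.  A plain-R sketch (not the package's compiled routine; the function name 'forward.ll' is hypothetical, and 'y' is assumed to hold row indices into 'Rho'):

```r
# Scaled forward algorithm: accumulate the log of the scaling
# constants, which sums to the log-likelihood of the sequence.
forward.ll <- function(y, tpm, Rho, ispd) {
  alpha <- ispd * Rho[y[1], ]            # unnormalised forward probabilities
  ll    <- log(sum(alpha))
  alpha <- alpha / sum(alpha)            # rescale to avoid underflow
  for (t in seq_along(y)[-1]) {
    alpha <- as.vector(alpha %*% tpm) * Rho[y[t], ]
    ll    <- ll + log(sum(alpha))
    alpha <- alpha / sum(alpha)
  }
  ll
}
```

As a sanity check, with K = 1 the observations are independent and the log-likelihood reduces to the sum of the log emission probabilities.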

_N_o_t_e:

     If 'K=1' then 'tpm', 'ispd', 'converged', and 'nstep' are all set
     equal to 'NA' in the list returned by this function.

_W_a_r_n_i_n_g:

     The ordering of the (hidden) states can be arbitrary.  What the
     estimation procedure decides to call ``state 1'' may not be what
     _you_ think of as being state number 1. The ordering of the states
     will be affected by the starting values used.

_A_u_t_h_o_r(_s):

     Rolf Turner r.turner@auckland.ac.nz
      <URL: http://www.math.unb.ca/~rolf>

_R_e_f_e_r_e_n_c_e_s:

     Rabiner, L. R., "A tutorial on hidden Markov models and selected
     applications in speech recognition," Proc. IEEE vol. 77, pp.
     257-286, 1989.

     Zucchini, W. and Guttorp, P., "A hidden Markov model for
     space-time precipitation," Water Resources Research vol. 27, pp.
     1917-1923, 1991.

     MacDonald, I. L., and Zucchini, W., "Hidden Markov and Other
     Models for Discrete-valued Time Series," Chapman & Hall, London,
     1997.

     Liu, Limin, "Hidden Markov Models for Precipitation in a Region of
     Atlantic Canada", Master's Report, University of New Brunswick,
     1997.

_S_e_e _A_l_s_o:

     'sim.hmm()', 'mps()', 'viterbi()'

_E_x_a_m_p_l_e_s:

     # See the help for sim.hmm() for how to generate y.num.
     ## Not run: 
     fit.num <- hmm(y.num,K=2,verbose=TRUE)
     fit.num.mix <- hmm(y.num,K=2,verbose=TRUE,mixture=TRUE)
     ## End(Not run)

