dthmm              package:HiddenMarkov              R Documentation

_D_i_s_c_r_e_t_e _T_i_m_e _H_M_M _O_b_j_e_c_t

_D_e_s_c_r_i_p_t_i_o_n:

     Creates a discrete time hidden Markov model object with class
     '"dthmm"'.

_U_s_a_g_e:

     dthmm(x, Pi, delta, distn, pm, pn = NULL, discrete = FALSE,
           nonstat = TRUE)

_A_r_g_u_m_e_n_t_s:

       x: is a vector of length n containing the observed process.
          Alternatively, 'x' could be specified as 'NULL', meaning that
          the data will be added later (e.g. simulated).

      Pi: is the m times m transition probability matrix of the
          homogeneous hidden Markov chain.

   delta: is the marginal probability distribution of the m hidden
          states at the first time point.

   distn: is a character string with the distribution name, e.g.
          '"norm"' or '"pois"'. If the distribution is specified as
          '"wxyz"' then a distribution function called '"pwxyz"' should
          be available, in the standard R format (e.g. 'pnorm' or
          'ppois').

      pm: is a list object containing the (Markov dependent) parameter
          values associated with the distribution of the observed
          process (see below).

      pn: is a list object containing the observation dependent
          parameter values associated with the distribution of the
          observed process (see below).

discrete: is logical, and is 'TRUE' if 'distn' is a discrete
          distribution.

 nonstat: is logical, 'TRUE' if the homogeneous Markov chain is assumed
          to be non-stationary, default. See "Details" below.

_V_a_l_u_e:

     A 'list' object with class '"dthmm"', containing the above
     arguments as named components.

_N_o_t_a_t_i_o_n:

        1.  MacDonald & Zucchini (1997) use t to denote the _time_,
           where t = 1, cdots, T. To avoid confusion with other uses of
           't' and 'T' in R\, we use i = 1, cdots, n.

        2.  We denote the observed sequence as {X_i}, i = 1, cdots, n;
           and the hidden Markov chain as {C_i}, i = 1, cdots, n.

        3.  The history of the observed process up to time i is denoted
           by X^{(i)}, i.e.

                     X^{(i)} = (X_1, cdots, X_i)

           where i = 1, cdots, n. Similarly for C^{(i)}.

        4.  The hidden Markov chain has m states denoted by 1, cdots,
           m.

        5.  The Markov chain transition probability matrix is denoted
           by Pi, where the (j, k)th element is

                pi_{jk} = Pr{ C_{i+1}=k , | , C_i=j }

           for all i (i.e. all time points), and j,k = 1, cdots, m.

        6.  The Markov chain is assumed to be _homogeneous_, i.e. for
           each j and k, pi_{jk} is constant over time.

        7.  The Markov chain is said to be _stationary_ if the marginal
           distribution is the same over time, i.e. for each j, delta_j
           = Pr{ C_i = j } is constant for all i. The marginal
           distribution is denoted by delta = (delta_1, cdots,
           delta_m).

_L_i_s_t _O_b_j_e_c_t _p_m:

     The list object 'pm' contains parameter values for the probability
     distribution of the observed process that are dependent on the
     hidden Markov state. These parameters are generally required to be
     estimated. See "Modifications" in topic 'Mstep' when some do not
     require estimation.

     Assume that the hidden Markov chain has m states, and that there
     are ell parameters that are dependent on the hidden Markov state.
     Then the list object 'pm' should contain ell _named_ vector
     components each of length m. The names are determined by the
     required probability distribution.

     For example, if 'distn == "norm"', the arguments names must
     coincide with those used by the functions 'dnorm' or 'rnorm',
     which are 'mean' and 'sd'. Each must be specified in either 'pm'
     or 'pn'. If they both vary according to the hidden Markov state
     then 'pm' should have the _named_ components 'mean' and 'sd'.
     These are both vectors of length m containing the means and
     standard deviations of the observed process when the hidden Markov
     chain is in each of the m states. If, for example, 'sd' was "time"
     dependent, then 'sd' would be contained in 'pn' (see below).

     If 'distn == "pois"', then 'pm' should have one component named
     'lambda', being the parameter name in the function 'dpois'. Even
     if there is only one parameter, the vector component should still
     be within a list and named.

_L_i_s_t _O_b_j_e_c_t _p_n:

     The list object 'pn' contains parameter values of the probability
     distribution for the observed process that are dependent on the
     observation number or "time". These parameters are assumed to be
     _known_.

     Assume that the observed process is of length n, and that there
     are ell parameters that are dependent on the observation number or
     time. Then the list object 'pn' should contain ell _named_ vector
     components each of length n. The names, as in 'pm', are determined
     by the required probability distribution.

     For example, in the observed process we may count the number of
     successes in a _known_ number of Bernoulli trials, i.e. the number
     of Bernoulli trials is known at each time point, but the
     probability of success varies according to a hidden Markov state.
     The 'prob' parameter of 'rbinom' (or 'dbinom') would be specified
     in 'pm' and the 'size' parameter would specified in 'pn'.

     One could also have a situation where the observed process was
     Gaussian, with the means varying according to the hidden Markov
     state, but the variances varying non-randomly according to the
     observation number (or vice versa). Here 'mean' would be specified
     within 'pm' and 'sd' within 'pn'. Note that a given parameter can
     only occur within _one_ of 'pm' or 'pn'.

_C_o_m_p_l_e_t_e _D_a_t_a _L_i_k_e_l_i_h_o_o_d:

     The "complete data likelihood", L_c, is

    L_c = Pr{ X_1=x_1, cdots, X_n=x_n, C_1=c_1, cdots, C_n=c_n },.

     This can be shown to be

 Pr{ X_1=x_1 ,|, C_1=c_1 } Pr{ C_1=c_1 } prod_{i=2}^n Pr{ X_i=x_i ,|, C_i=c_i } Pr{ C_i=c_i ,|, C_{i-1}=c_{i-1} },,

     and hence, substituting model parameters, we get

 L_c = delta_{c_1} pi_{c_1c_2} pi_{c_2c_3} cdots pi_{c_{n-1}c_n} prod_{i=1}^n Pr{ X_i=x_i ,|, C_i=c_i },,

     and so

 log L_c = log delta_{c_1} +  sum_{i=2}^n log pi_{c_{i-1}c_i} + sum_{i=1}^n log Pr{ X_i=x_i ,|, C_i=c_i },.

     Hence the "complete data likelihood" is split into three terms:
     the first relates to parameters of the marginal distribution
     (Markov chain), the second to the transition probabilities, and
     the third to the distribution parameters of the observed random
     variable. When the Markov chain is non-stationary, each term can
     be maximised separately.

     When the hidden Markov chain is assumed to be stationary, delta =
     Pi^prime delta (see topic 'compdelta'), and then the first two
     terms of log L_c determine the transition probabilities Pi. This
     raises more complicated numerical problems, as the first term is
     effectively a constraint. We deal with this in a slightly ad-hoc
     manner by effectively disregarding the first term, which is
     assumed to be relatively small. In the M-step, the transition
     matrix is determined by the second term, then delta is estimated
     using the relation delta = delta Pi.

_R_e_f_e_r_e_n_c_e_s:

     Elliott, R.J.; Aggoun, L. & Moore, J.B. (1994). _Hidden Markov
     Models: Estimation and Control._ Springer-Verlag, New York.

     Harte, D. (2006). _Mathematical Background Notes for Package
     "HiddenMarkov"._ Statistics Research Associates, Wellington. URL:
     <URL:
     http://homepages.paradise.net.nz/david.harte/SSLib/Manuals/notes.pdf>.

     MacDonald, I.L. & Zucchini, W. (1997). _Hidden Markov and Other
     Models for Discrete-valued Time Series._ Chapman and Hall/CRC,
     Boca Raton.

     Rabiner, L.R. (1989). A tutorial on hidden Markov models and
     selected applications in speech recognition. _Proceedings of the
     IEEE_ *77(2)*, 257-286. URL: <URL:
     http://dx.doi.org/10.1109/5.18626>.

     Zucchini, W. (2005). _Hidden Markov Models Short Course, 3-4 April
     2005._ Macquarie University, Sydney.

_E_x_a_m_p_l_e_s:

     #-----  Test Gaussian Distribution -----

     Pi <- matrix(c(1/2, 1/2,   0,
                    1/3, 1/3, 1/3,
                      0, 1/2, 1/2),
                  byrow=TRUE, nrow=3)

     delta <- c(0, 1, 0)

     x <- dthmm(NULL, Pi, delta, "norm",
                list(mean=c(1, 6, 3), sd=c(0.5, 1, 0.5)))

     x <- simulate(x, nsim=1000)

     #    use above parameter values as initial values
     y <- BaumWelch(x)

     print(summary(y))
     print(logLik(y))
     hist(residuals(y))

     #   check parameter estimates
     print(sum(y$delta))
     print(y$Pi %*% rep(1, ncol(y$Pi)))

     #-----  Test Poisson Distribution  -----

     Pi <- matrix(c(0.8, 0.2,
                    0.3, 0.7),
                  byrow=TRUE, nrow=2)

     delta <- c(0, 1)

     x <- dthmm(NULL, Pi, delta, "pois", list(lambda=c(4, 0.1)),
                discrete = TRUE)

     x <- simulate(x, nsim=1000)

     #    use above parameter values as initial values
     y <- BaumWelch(x)

     print(summary(y))
     print(logLik(y))
     hist(residuals(y))

     #   check parameter estimates
     print(sum(y$delta))
     print(y$Pi %*% rep(1, ncol(y$Pi)))

     #-----  Test Exponential Distribution  -----

     Pi <- matrix(c(0.8, 0.2,
                    0.3, 0.7),
                  byrow=TRUE, nrow=2)

     delta <- c(0, 1)

     x <- dthmm(NULL, Pi, delta, "exp", list(rate=c(2, 0.1)))

     x <- simulate(x, nsim=1000)

     #    use above parameter values as initial values
     y <- BaumWelch(x)

     print(summary(y))
     print(logLik(y))
     hist(residuals(y))

     #   check parameter estimates
     print(sum(y$delta))
     print(y$Pi %*% rep(1, ncol(y$Pi)))

     #-----  Test Beta Distribution  -----

     Pi <- matrix(c(0.8, 0.2,
                    0.3, 0.7),
                  byrow=TRUE, nrow=2)

     delta <- c(0, 1)

     x <- dthmm(NULL, Pi, delta, "beta", list(shape1=c(2, 6), shape2=c(6, 2)))

     x <- simulate(x, nsim=1000)

     #    use above parameter values as initial values
     y <- BaumWelch(x)

     print(summary(y))
     print(logLik(y))
     hist(residuals(y))

     #   check parameter estimates
     print(sum(y$delta))
     print(y$Pi %*% rep(1, ncol(y$Pi)))

     #-----  Test Binomial Distribution  -----

     Pi <- matrix(c(0.8, 0.2,
                    0.3, 0.7),
                  byrow=TRUE, nrow=2)

     delta <- c(0, 1)

     #   vector of "fixed & known" number of Bernoulli trials
     pn <- list(size=rpois(1000, 10)+1)

     x <- dthmm(NULL, Pi, delta, "binom", list(prob=c(0.2, 0.8)), pn,
                discrete=TRUE)

     x <- simulate(x, nsim=1000)

     #    use above parameter values as initial values
     y <- BaumWelch(x)

     print(summary(y))
     print(logLik(y))
     hist(residuals(y))

     #   check parameter estimates
     print(sum(y$delta))
     print(y$Pi %*% rep(1, ncol(y$Pi)))

     #-----  Test Gamma Distribution  -----

     Pi <- matrix(c(0.8, 0.2,
                    0.3, 0.7),
                  byrow=TRUE, nrow=2)

     delta <- c(0, 1)

     pm <- list(rate=c(4, 0.5), shape=c(3, 3))

     x <- seq(0.01, 10, 0.01)
     plot(x, dgamma(x, rate=pm$rate[1], shape=pm$shape[1]),
          type="l", col="blue", ylab="Density")
     points(x, dgamma(x, rate=pm$rate[2], shape=pm$shape[2]),
            type="l", col="red")

     x <- dthmm(NULL, Pi, delta, "gamma", pm)

     x <- simulate(x, nsim=1000)

     #    use above parameter values as initial values
     y <- BaumWelch(x)

     print(summary(y))
     print(logLik(y))
     hist(residuals(y))

     #   check parameter estimates
     print(sum(y$delta))
     print(y$Pi %*% rep(1, ncol(y$Pi)))

