Transition Analysis (&  Other Stuff)

This page can essentially be thought of as a tutorial, and in order to work your way through you will probably want to install "R" on your computer.  Note that the following (with the exception of the Sugeno fuzzy integral) is only being applied to a single ordinal age "indicator" with three stages, and for the most part this page is only looking at parametric models (with the exception of the Sugeno fuzzy integral and kernel density estimation).

If you do not already have "R" you can get it from:

http://cran.r-project.org/

Note: the vast majority of the page below uses an add-on "R" package: VGAM.  The easiest way to get this is through the "R" menus once you get "R" started (follow the trail of gingerbread men below):

  1. "Packages" from the menu
  2. "Install package(s)..." from the sub-menu
  3. Select your closest "mirror" (I'm going with Ames, Iowa).
  4. And get VGAM
  5. Type "library(VGAM)" to get started using VGAM
This is shown below.


Package installation screenshots
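
If you prefer the console to the menus, the same installation can be done with two commands (a sketch of the equivalent; you will still be asked to pick a mirror the first time):

install.packages('VGAM')   # choose a CRAN mirror when prompted
library(VGAM)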


 
Getting Some Data

First things first, anything shown in fixed-width green text is intended to be copied and pasted into your R "Gui" (graphical user interface).

Click on the link below to get suture closure data for the pars orbitalia sphenoidal suture (1148 individuals, 1 = "open", 2 = "closing", 3 = "closed")

suture.csv

If you have the *.csv extension associated with a spreadsheet program, then this will open in a spreadsheet.  If so, save the file (with the extension *.csv) in your R working directory.  And then start R and save the data to an object, say (copy and paste the line below to your "R" Gui):

suture=read.csv('suture.csv')
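
Before fitting anything it is worth a quick check that the file was read in correctly.  These lines are just a sanity check (they assume, as the rest of this page does, that age is in the first column and the stage score in the second):

dim(suture)          # there should be 1148 rows
head(suture)         # age is taken as column 1 and stage as column 2 below
table(suture[,2])    # counts in stages 1 ("open"), 2 ("closing"), 3 ("closed")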
 

Fitting Your First Models

The probit (on the straight and on the log age scale)

Mark and copy the one bold green line of text (below) to your "R" Gui, then hit "enter."

vglm(c(0,1,1)[suture[,2]]~suture[,1],family=binomialff(link='probitlink'))
 
This fits a probit model for the probability (given age) that someone has moved out of the "open suture" state into the "closing" or "closed" states.  The suture stage data is in the second column (so suture[,2]) and it is scored as 1, 2, or 3.  We want to rescore that as a binary (0,1) trait where "1" is recoded as "0" and "2" and "3" are recoded as "1", which is handled by c(0,1,1)[suture[,2]].  Now we regress the recoded stage onto age (~suture[,1]) using vglm (vector generalized linear model) with a probit link.  So if you pasted that line into your Gui, here's what you should have gotten:

Call:
vglm(formula = c(0, 1, 1)[suture[, 2]] ~ suture[, 1], family = binomialff(link = "probitlink"))

Coefficients:
(Intercept) suture[, 1]
 -1.6884546   0.0400675

Degrees of Freedom: 1148 Total; 1146 Residual
Residual Deviance: 1230.554
Log-likelihood: -615.2768


Now let's get this into a figure to look at.  The probit works in such a manner that (alpha/beta) is the mean age-at-transition and 1/beta is the standard deviation of the transition ages, where alpha is the intercept ("R" will always return that as negative, which follows an old convention, but we're being unconventional, so take the negative of what "R" gives you) and beta is the slope (shown as "suture[,1]" as that was the "predictor").  So if you paste the following line into your R Gui and hit enter:

plot(seq(0,100,.1),dnorm(seq(0,100,.1),1.68835/.04006,1/.04006),type='l',xlab='Age',ylab='Density',ylim=c(0,.025))

You will get:
probit
Well, that looks a bit problematic, in that there's a non-zero probability at age zero that the suture will be closing or closed.  So now let's say that instead of assuming the transition is normal we want to assume that it is log-normal.  So we resubmit our original analysis but taking the natural logarithm of age:

vglm(c(0,1,1)[suture[,2]]~log(suture[,1]),family=binomialff(link='probitlink'))

and we get:

Call:
vglm(formula = c(0, 1, 1)[suture[, 2]] ~ log(suture[, 1]), family = binomialff(link = "probitlink"))

Coefficients:
     (Intercept) log(suture[, 1])
       -6.804118         1.865738

Degrees of Freedom: 1148 Total; 1146 Residual
Residual Deviance: 1159.053
Log-likelihood: -579.5265


So now paste the following two lines into your Gui and you will get:

lines(seq(0,100,.1),dlnorm(seq(0,100,.1),6.804/1.866,1/1.866),lty=2,lwd=2)
legend(x='topright',legend=c('Straight scale','Log scale'),lty=c(1,2),bty='n',lwd=c(1,2))

Straight and log scale probit
What you've now done is overlaid the log-normal (dashed line), and you can see that it is "behaving itself" in that the probability of the suture being in the closing or closed states is now estimated to be equal to zero at age zero.

The logit

Boldsen et al. (2002) use the logit instead of the probit.  The logit link is the default for the binomial family, so if you paste the following:

vglm(c(0,1,1)[suture[,2]]~suture[,1],family=binomialff)

You will get the logit analysis:

Call:
vglm(formula = c(0, 1, 1)[suture[, 2]] ~ suture[, 1], family = binomialff)

Coefficients:
(Intercept) suture[, 1]
-2.93107724  0.07124533

Degrees of Freedom: 1148 Total; 1146 Residual
Residual Deviance: 1219.952
Log-likelihood: -609.9761


Or if you paste the following (for the log-logistic model) you get:

vglm(c(0,1,1)[suture[,2]]~log(suture[,1]),family=binomialff)

Call:
vglm(formula = c(0, 1, 1)[suture[, 2]] ~ log(suture[, 1]), family = binomialff)

Coefficients:
     (Intercept) log(suture[, 1])
      -11.524805         3.170571

Degrees of Freedom: 1148 Total; 1146 Residual
Residual Deviance: 1156.955
Log-likelihood: -578.4773

Now we are going to plot the normal (from probit) and the logistic (from logit) together on one plot.  So here goes (copy and paste to your R Gui to produce the figure below):

plot(seq(0,100,.1),dlogis(seq(0,100,.1),2.93108/.07125,1/.07125),type='l',xlab='Age',ylab='Density')
lines(seq(0,100,.1),
dnorm(seq(0,100,.1),1.68835/.04006,1/.04006),lty=2,lwd=2)
legend(x='topright',legend=c('Logit','Probit'),lty=c(1,2),bty='n',lwd=c(1,2))

Compare probit and logit
And now we are going to plot the log-normal (from probit) and the log-logistic (from logit) together on one plot.  Various "R" packages have the log-logistic density, but it is a simple enough matter to write our own (that's what the little function dllogis is):

plot.loglogist=function ()
{
dllogis=function(x,a,s){
  return((a/s)*(x/s)^(a-1)/(1+(x/s)^a)^2)
}
age=seq(0,100,.1)
plot(age,dllogis(age,3.171,exp(11.525/3.171)),type='l',
  xlab='Age at suture closure',ylab='Density')
lines(age,dlnorm(age,6.804/1.866,1/1.866),lty=2,lwd=2)
legend(x='topright',c('Log-logistic','Log-normal'),lty=c(1,2),lwd=c(1,2),bty='n')
}

plot.loglogist()

Log-normal and Log-logistic
A warning about interpreting coefficients from the logit (or logistic regression)

The logistic distribution can be parameterized using a mean and a scale parameter, or using a mean and a standard deviation.  The graphs above were drawn using the mean and a scale parameter, because that is how the "dlogis" function works.  But we would do well to compare the probit and logit models for their means and standard deviations, just so we can see how similar they are.  Let's start with the mean transition ages.

The mean transition age is found in the probit model by dividing the intercept by the slope, so we have the mean transition age as:

> 1.6884546/0.0400675
[1] 42.14025

And for the logit we have:

> 2.93107724/.07124533
[1] 41.14062

The standard deviation for the transition age in the probit model is the inverse of the slope:

> 1/0.0400675
[1]  24.95788

For the logit model we can first find the scale parameter as the inverse of the slope:

> 1/.07124533
[1] 14.03601

The standard deviation is then pi times the scale parameter divided by the square root of 3, so:

> 14.03601*pi/sqrt(3)
[1] 25.45851

So to sum that up, for the probit we have a mean and standard deviation of 42.14025 and 24.95788, while for the logit we have a mean and standard deviation of 41.14062 and 25.45851.
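
Incidentally, none of this arithmetic has to be done by hand.  Here is a small sketch that pulls the coefficients out of the fitted models (fit.probit and fit.logit are new object names introduced here, not something created earlier on this page):

fit.probit=vglm(c(0,1,1)[suture[,2]]~suture[,1],family=binomialff(link='probitlink'))
fit.logit=vglm(c(0,1,1)[suture[,2]]~suture[,1],family=binomialff)
cf.p=as.vector(coef(fit.probit)); cf.l=as.vector(coef(fit.logit))
c(mean=-cf.p[1]/cf.p[2], s.d.=1/cf.p[2])             # probit: 42.14 and 24.96
c(mean=-cf.l[1]/cf.l[2], s.d.=pi/(cf.l[2]*sqrt(3)))  # logit: 41.14 and 25.46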



The exponential transition model

Samworth and Gowland (2007) have suggested fitting an exponential model (with a shift of "a" years).  While the exponential is a standard model for hazards analysis, it is not a standard model as far as generalized linear models (the vglm we kept using above) are concerned.  But not to fear, we can write a little R script to fit their model.  Mark, copy, and paste the following into your R Gui:

sam.tran.suture2=function (try=c(16,0.03))
{
####################################################
#      P(i|a)
####################################################
pia=function(stage,age,a,lambda){
if(stage==1) return(exp(-lambda[1]*(age-a)))
sec=0
for(j in 1:(stage-1)){
    p=1
    for(k in 1:(stage-1)){
       if(j!=k) p=p*lambda[k]/(lambda[k]-lambda[j])
     }
    sec=sec + exp(-lambda[j]*(age-a))*p
  }
if(stage==2) return(1-sec)
first=0
for(j in 1:stage){
    p=1
    for(k in 1:stage){
       if(j!=k) p=p*lambda[k]/(lambda[k]-lambda[j])
     }
    first=first + exp(-lambda[j]*(age-a))*p
  }
return(first-sec)
}

loglik=function(x){
# x[1] is the shift "a" and x[2] is the single exponential rate (lambda)
a=x[1]
lambda=x[2]
zlnlk=0
for(i in 1:1148){
  sto=max(pia(c(1,2,2)[suture[i,2]],suture[i,1],a,lambda),1e-300)
  zlnlk=zlnlk+log(sto)
}
print(zlnlk)
flush.console()
return(zlnlk)
}

optim(try,loglik,control=list(fnscale=-1,maxit=5000))

}

sam.tran.suture2()

Which produces:

      .
      .
      .
[1] -573.1214
[1] -573.1213
[1] -573.1214
[1] -573.1213
$par
[1] 16.80512069  0.03472755

$value
[1] -573.1213

$counts
function gradient
      67       NA

$convergence
[1] 0

$message
NULL

The "par" stuff is what we need, so now we can draw the age-to-transition distribution (i.e., the distribution of ages at which the suture starts to close):

plot.sam.tran=function ()
{
    a = 16.80512069
    lambda = 0.03472755
    ages = seq(a, 100, length.out = 101)
    plot(ages, dexp(ages-a, lambda), type = "l", xlim = c(0, 100),
        ylab = "Density", xlab = "Age at suture closure",ylim=c(0,.035))
    lines(c(0, a), c(0, 0))
    b = 1/lambda
#   modal age-at-transition
    lines(rep(a, 2), c(0, dexp(0, lambda)), col = "red",lwd=2)
#   median age-at-transition
    lines(rep(a+b*log(2), 2), c(0, dexp(b*log(2), lambda)), col = "blue",lwd=2)
#   mean age-at-transition
    lines(rep(a + b, 2), c(0, dexp(b, lambda)),col='brown',lwd=2)
    legend(x='topright',c('Modal age','Median age','Mean age'),col=c('red',
     'blue','brown'),lwd=rep(2,3),bty='n')
}


plot.sam.tran()

which produces:

Exponential model
The way this model works, there is zero probability of the suture starting to close until age 16.80512069 years, at which point the density of transition ages jumps to the highest value it will ever take (hence the modal age at suture closure shown in red) and then decreases monotonically. We would maintain that despite Samworth and Gowland's (2007) claim that: "The idea of this model is to try to capture the underlying biological process by which an individual moves between stages over time," the exponential model does not seem to follow most people's ideas about how biological aging occurs.  Compare Samworth and Gowland's (2007) model with the log-normal and see which you think might do a better job of capturing "the underlying biological process."

lines(seq(0,100,.1),dlnorm(seq(0,100,.1),6.804/1.866,1/1.866),lty=2,lwd=2)
legend(60,.025,legend=c('Exponential','Log Normal'),lty=c(1,2),bty='n',lwd=c(1,2))

Exponential and log-normal
where the dashed line is the log-normal model.  O.K., that's enough of the exponential model.  It can be easily extended from binary traits to cover multiple ordered stages (see Samworth and Gowland, 2007) as we will (eventually) do below.

Extending the analysis from binary traits to ordinal categorical traits

As if the above wasn't getting complicated enough, there are a number of different ways that one can extend the analysis to cover ordered trait scores with three or more states.  To keep things (relatively) simple, we will stick with the suture example, this time using three states (1 = "open", 2 = "closing", and 3 = "closed") and we will use primarily the "probit" link function.  In any of the analyses shown below if you leave off the link="probit" part you will get the "logit" link, as that is the default.  This section again makes extensive use of the VGAM package (Yee, 2010), so you will need to first download that if you have not already (see "Packages," "Install package(s)..." in the menu bar for "R" - hey, that rhymes!).  Then type library(VGAM) into your workspace and enter a return.

Cumulative probit ("restricted", "parallel")

WARNING: I have seen cases where vglm gets upset that lines are far from being parallel.  When this happens you cannot necessarily trust the parameter values.  It is much safer to use polr in the library MASS, as in the following:

library(MASS)
age=suture[,1]        # or log(suture[,1]) for the log age scale
stage=suture[,2]
results=polr(factor(stage)~age,method='probit')
beta=as.vector(results$coef)
alpha=as.vector(results$zeta)



The cumulative probit with parallel (probit) regression lines is also known as "restricted cumulative probit" because of the restriction that the slopes must be equal.  The restricted cumulative logit is often referred to as the "proportional odds model," but this name cannot be applied to probit (i.e., "proportional odds probit" cannot be used).  To fit the restricted cumulative probit copy the line below into your workspace and you will get the following output:

vglm(suture[,2]~suture[,1],cumulative(link='probit',parallel = TRUE, reverse = TRUE))

Call:
vglm(formula = suture[, 2] ~ suture[, 1], family = cumulative(link = "probit",
    parallel = TRUE, reverse = TRUE))

Coefficients:
(Intercept):1 (Intercept):2   suture[, 1]
  -1.45404402   -2.26758869    0.03390022

Degrees of Freedom: 2296 Total; 2293 Residual
Residual Deviance: 2018.058
Log-likelihood: -1009.029

What was fit here was a probit analysis with a common slope and two intercepts.  We can plot these two transitions as follows:

plot(seq(0,100,.1),dnorm(seq(0,100,.1),1.45404402/0.03390022,1/0.03390022),type='l',xlab='Age',ylab='Density')
lines(seq(0,100,.1),dnorm(seq(0,100,.1),2.26758869/0.03390022,1/0.03390022))

Proportional odds probit
Now we want to find the probability at any given age that an individual will be in the "open suture" stage, the "closing suture" stage, or the "closed suture" stage.  We do this using 1-F(x), in other words, the "survivorship" functions or the reverse cumulative, which we show graphically below:

plot.cumpo=function ()
{
    plot(seq(0, 100, 0.1), 1 - pnorm(seq(0, 100, 0.1), 1.45404402/0.03390022,
        1/0.03390022), type = "l", ylim = c(0, 1), xlab = "Age",
        ylab = "Probability in Stage")
    lines(seq(0, 100, 0.1), 1 - pnorm(seq(0, 100, 0.1), 2.26758869/0.03390022,
        1/0.03390022))

#   P("open") at age 50

    p = 1 - pnorm(50, 1.45404402/0.03390022, 1/0.03390022)
    bot = p
    text(48, p/2,"P(\"open\")=",adj = 1)
    text(48, p/2-.04,as.character(round(p,4)),adj=1)

#   P("closing") at age 50

    p = 1 - pnorm(50, 2.26758869/0.03390022, 1/0.03390022)
    top = p
    text(48, .14 + bot + (top-bot)/2,"P(\"closing\")=",adj = 1)
    text(48, .14 + bot + (top-bot)/2-.04,as.character(round(top-bot,4)),adj=1)

#   P("closed") at age 50

    text(48, .1+(p+1)/2,"P(\"closed\")=",adj = 1)
    text(48, .1+(p+1)/2-.04,as.character(round(1-p,4)),adj=1)

    lines(c(50,50),c(0,bot),col='green',lwd=3)
    lines(c(50,50),c(bot,top),col='yellow',lwd=3)
    lines(c(50,50),c(top,1),col='red',lwd=3)

    points(c(50,50),c(bot,top), pch = 19)

}

plot.cumpo()

Proportional odds
The two black decreasing curves are for the probability of being in the "open" as versus any higher stage and for being in the "open" or "closing" stage as versus the closed stage.  So by subtraction you can get the probability of being in the "closing" stage (shown in yellow for age 50), as well as the probability of being in the "open" stage (shown in green) or the "closed" stage (shown in red).  Note that the probabilities sum to 1.0 (0.4048+0.3117+0.2835=1.0).  This graph is more typically shown as what Love and Müller (2002) have referred to as the "weight functions" (probabilities of being in each stage at a fixed age).  So let's try that:

plot.cumpo2=function ()
{
    age = seq(0, 100, 0.1)
    p1 = 1 - pnorm(age, 1.45404402/0.03390022, 1/0.03390022)
    p2 = 1 - pnorm(age, 2.26758869/0.03390022, 1/0.03390022)
    plot(age, p1, type = "l", ylim = c(0, 1), xlab = "Age", ylab = "Probability in Stage",
        lwd = 2)
    lines(age, p2 - p1, lty = 3, lwd = 2)
    lines(age, 1 - p2, lty = 2, lwd = 2)
    legend(40, 1, c("P(open)", "P(closing)", "P(closed)"), lwd = rep(2,
        3), lty = c(1, 3, 2),bty='n')
    print(noquote("At age 0:"))
    print(noquote(paste("prob. suture open = ", round(p1[1],
        4))))
    print(noquote(paste("prob. suture closing = ", round(p2[1] -
        p1[1], 4))))
    print(noquote(paste("prob. suture closed = ", round(1 - p2[1],
        4))))
}

plot.cumpo2()


Weight functions from cumulative probit
If it wasn't clear from the first graph, this graph makes it clear that at age zero the probability that the suture is open is less than 1.0.  If you run the script you'll find that the probabilities of being in the "open", "closing", and closed stages at age zero are 0.927, 0.0613, and 0.0117, respectively.  This is an undesirable property of the cumulative probit analysis that comes from assuming a common slope in the probit analysis (i.e., a common standard deviation for the transitions, see the above graph).  We had to assume a common slope because if the probability lines cross in the above graph, then we would get negative probabilities for being in "internal" stages (in this case, the "closing" stage).  That's bad.  But on the other hand, it also does not make sense to have individuals who have already transitioned at birth into higher stages.  One simple way to get around both problems is to assume that transitions are log-normal rather than normal.

vglm(suture[,2]~log(suture[,1]),cumulative(link='probit',parallel = TRUE, reverse = TRUE))

Call:
vglm(formula = suture[, 2] ~ log(suture[, 1]), family = cumulative(link = "probit",
    parallel = TRUE, reverse = TRUE))

Coefficients:
   (Intercept):1    (Intercept):2 log(suture[, 1])
       -6.154022        -7.010010         1.683116

Degrees of Freedom: 2296 Total; 2293 Residual
Residual Deviance: 1932.268
Log-likelihood: -966.1341

So let's plot the transition distributions:

plot(seq(0,100,.1),dlnorm(seq(0,100,.1),6.154022/1.683116,1/1.683116),type='l',xlab='Age',ylab='Density')

lines(seq(0,100,.1),dlnorm(seq(0,100,.1),7.010010/1.683116,1/1.683116))

Log normal cumulative probit
and let's plot the weight functions:

plot.Lcumpo2=function ()
{
    age = seq(0, 100, 0.1)
    p1 = 1 - plnorm(age, 6.154022/1.683116,1/1.683116)
    p2 = 1 - plnorm(age, 7.010010/1.683116,1/1.683116)
    plot(age, p1, type = "l", ylim = c(0, 1), xlab = "Age", ylab = "Probability in Stage",
        lwd = 2)
    lines(age, p2 - p1, lty = 3, lwd = 2)
    lines(age, 1 - p2, lty = 2, lwd = 2)
    legend(40, 1, c("P(open)", "P(closing)", "P(closed)"), lwd = rep(2,
        3), lty = c(1, 3, 2), bty = "n")
    print(noquote("At age 0:"))
    print(noquote(paste("prob. suture open = ", round(p1[1],
        4))))
    print(noquote(paste("prob. suture closing = ", round(p2[1] -
        p1[1], 4))))
    print(noquote(paste("prob. suture closed = ", round(1 - p2[1],
        4))))
}

plot.Lcumpo2()
Weight functions, log cumm. probit
Yippee!  Now the probability at age zero that an individual has a suture that is closing or closed is 0.0.  This is not the only way to skin this particular cat, and indeed there have been other approaches, which we will (for completeness' sake) document below.

Unrestricted cumulative probit

Konigsberg and Herrmann (2002) suggested using an "unrestricted cumulative probit," which fits separate slopes (or equivalently, separate standard deviations for the transition distributions).  At the time that they suggested this the software options for fitting this model were somewhat limited.  CatReg was available for S+ (and now for "R", see http://cfpub.epa.gov/ncea/cfm/recordisplay.cfm?deid=18162).  Konigsberg previously made Fortran code and an executable file  available (at http://konig.la.utk.edu/nphases2.htm, but this is no longer available).  Thankfully, VGAM handles the unrestricted cumulative probit very nicely, so there is absolutely no reason to persist in using Nphases2.  For "historical interest," see the section below which compares output from VGAM, CatReg, and Nphases2.

vglm(suture[,2]~suture[,1],cumulative(link='probit',parallel = FALSE, reverse = TRUE))

Call:
vglm(formula = suture[, 2] ~ suture[, 1], family = cumulative(link = "probit",
    parallel = FALSE, reverse = TRUE))

Coefficients:
(Intercept):1 (Intercept):2 suture[, 1]:1 suture[, 1]:2
  -1.61405531   -1.94941183    0.03820941    0.02775311

Degrees of Freedom: 2296 Total; 2292 Residual
Residual Deviance: 1999.763
Log-likelihood: -999.8814

And let's plot the weight functions:

plot.unrest = function ()
{
    age = seq(0, 100, 0.1)
    p1 = 1 - plnorm(age, 6.154022/1.683116,1/1.683116)
    p2 = 1 - plnorm(age, 7.010010/1.683116,1/1.683116)
    plot(age, p1, type = "l", ylim = c(0, 1),
        xlab = "Age", ylab = "Probability in Stage",col='gray',lwd = 2)
    lines(age, p2 - p1, lty = 3, lwd = 2, col = 'gray')
    lines(age, 1 - p2, lty = 2, lwd = 2, col = 'gray')
    legend(40, 1, c("P(open)", "P(closing)", "P(closed)"), lwd = rep(2,
        3), lty = c(1, 3, 2), bty = "n")
    p1 = 1 - pnorm(age,1.61405531/0.03820941, 1/0.03820941)
    p2 = 1 - pnorm(age,1.94941183/0.02775311, 1/0.02775311)
    lines(age, p1, lwd=2)
    lines(age, p2 - p1, lty = 3, lwd = 2)
    lines(age, 1 - p2 , lty = 2, lwd = 2)
}

plot.unrest()

Unrestricted cumulative probit
where the gray lines are from the log-scale (restricted, i.e., "proportional odds") cumulative probit and the dark lines are from the unrestricted cumulative probit on the straight scale.

Comparison of the output from unrestricted cumulative probit in VGAM, CatReg, and Nphases2

VGAM
Call:
vglm(formula = suture[, 2] ~ suture[, 1], family = cumulative(link = "probit",
    parallel = FALSE, reverse = TRUE))

Coefficients:
(Intercept):1 (Intercept):2 suture[, 1]:1 suture[, 1]:2
  -1.61405531   -1.94941183    0.03820941    0.02775311

Degrees of Freedom: 2296 Total; 2292 Residual
Residual Deviance: 1999.763
Log-likelihood: -999.8814

CatReg
CatReg - R Beta Version 2.1 (01/06/11)

Source data file: tocatreg.csv

Type of analysis: Censored

Input file   : tocatreg.csv
Filtered data: none
Model        : unrestricted cumulative model
Link         : probit
Clustering   : none
Message      :
Iterations   : 43 7
Deviance     : 1999.763
Residual DF  : 2292
AIC          : 2007.763

Scale:
     Concentration: mg/m3
     Duration     : Hours

Stratification:
     No Stratification on Intercept, Concentration and Duration.   

Coefficients:
             Estimate  Std. Error   Z-Test=0    p-value
SEV1      -1.61408605 0.098159531  -16.44350      1e-05
SEV2      -1.94967848 0.119957376  -16.25309      1e-05
CONC:SEV1  0.03821028 0.002200964   17.36071      1e-05
CONC:SEV2  0.02775802 0.002333177   11.89709      1e-05

Analysis of Deviance Statistics:

Generalized R-squared: 0.154

           DF  Deviance Mean.Dev   Gen.F pvalue
Model       2  362.8459  181.423 207.935      0
Residual 2292 1999.7627    0.872              
Total    2294 2362.6086

Nphases2

  Output file name:
suture.txt
  Name of input file:
suture.prn
 I read 1148 records.  If that is wrong, bail!
 (i.e., ctrl-break)

 This system has  3 phases
  Model to fit?
   ("C" or "c" will fit cumulative probit,
    "L" or "l" will fit log age cumulative probit,
    any other single character will fit
    multiple s.d. model):
s
  Bottom truncation age?:
0
  Top truncation age?:
1000
  nopar =   4
 INITIAL FUNCTION VALUE F= 0.3861404213717E+04
 OPTSTP    RELATIVE GRADIENT CLOSE TO ZERO.
 OPTSTP    CURRENT ITERATE IS PROBABLY SOLUTION.
 FUNCTION VALUE F(X)= 0.9998813573202E+03 AT ITERATION   10
   0
       42.242201        1.077599
       70.238370        2.441374
       26.170956        1.507542
       36.025626        3.028228

Note: The parameter values are essentially identical from VGAM, CatReg, and Nphases2, save for the fact that in Nphases2 there is a re-parameterization from probit regression to means and standard deviations of transitions.
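
That re-parameterization is easy to verify from the VGAM coefficients above (this snippet is just arithmetic added here as a check; it is not part of any of the three programs' output):

a=c(1.61405531,1.94941183)   # negatives of the VGAM intercepts
b=c(0.03820941,0.02775311)   # the two slopes
rbind(mean=a/b, s.d.=1/b)    # 42.24 & 70.24 and 26.17 & 36.03, as in Nphases2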
Forward continuation ratio logit

Boldsen et al. (2002) suggested using a forward continuation ratio logit.  Continuation ratios are an alternative to the cumulative probit (or logit) we were using above, and there are also related "stopping rule" models that have not (so far as I know) been applied in age estimation.  Continuation ratios (or stopping rules) use successive conditioning, and are probably simplest to understand graphically, as below:

Continuation Ratios and Stopping Rules
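
In terms of the two conditional probabilities (p.1 and p.2, as they are computed in each of the plotting functions further below), the stage probabilities compose as follows:

# Forward continuation ratio:
#   P(open)    = 1 - p.1
#   P(closing) = p.1*(1 - p.2)
#   P(closed)  = p.1*p.2
# Backward continuation ratio:
#   P(open)    = (1 - p.1)*(1 - p.2)
#   P(closing) = p.1*(1 - p.2)
#   P(closed)  = p.2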

 Here's the code for the forward continuation ratio logit and the plot of the weight functions:

vglm(suture[,2]~suture[,1],cratio(parallel = F, reverse = F))

Call:
vglm(formula = suture[, 2] ~ suture[, 1], family = cratio(parallel = F,
    reverse = F))

Coefficients:
(Intercept):1 (Intercept):2 suture[, 1]:1 suture[, 1]:2
  -2.93107725   -0.87965245    0.07124533    0.01698143

Degrees of Freedom: 2296 Total; 2292 Residual
Residual Deviance: 1979.126
Log-likelihood: -989.5632

forward.cr = function ()
{
alpha=c(-2.93107725,-0.87965245)
beta=c(0.07124533,0.01698143)
expit=function(x) {return(exp(x)/(1+exp(x)))}
age=seq(0,100,.1)


p1 = 1 - plnorm(age, 6.154022/1.683116,1/1.683116)
p2 = 1 - plnorm(age, 7.010010/1.683116,1/1.683116)
plot(age, p1, type = "l", ylim = c(0, 1),
    xlab = "Age", ylab = "Probability in Stage",col='gray',lwd = 2,
    main='Forward Continuation Ratio (Logit)')
lines(age, p2 - p1, lty = 3, lwd = 2, col = 'gray')
lines(age, 1 - p2, lty = 2, lwd = 2, col = 'gray')
legend(40, 1, c("P(open)", "P(closing)", "P(closed)"), lwd = rep(2,
    3), lty = c(1, 3, 2), bty = "n")

p.1=expit(alpha[1]+beta[1]*age)
p.2=expit(alpha[2]+beta[2]*age)

p1 = 1 - p.1
p2 = p.1*(1-p.2)
p3 = p.1*p.2

lines(age,p1,type='l',lwd=2)
lines(age,p2,lty=3,lwd=2)
lines(age,p3,lty=2,lwd=2)
}


forward.cr()

Forward continuation ratio
As before, the gray lines are from the log-scale cumulative probit.

O.K., just to be clear, let's make sure we see how these various conditional probabilities are being used in the forward continuation ratios.  To be concrete about this we will use a specific age, which is 53.4 years.  The "expit" function in the above code is a simple function that finds probabilities from the logit regression.  The p.1 term in the above code is the probability that a 53.4 year-old would have crossed the "open" versus "closing" threshold at an earlier age, and this probability from the estimated model turns out to be 0.70545753. The p.2 term in the above code is the probability that a 53.4 year-old would have crossed the "closing" versus "closed" threshold at an earlier age given that they already crossed the "open" versus "closing" threshold at an even earlier age.  This probability from the estimated model turns out to be 0.5067886.  Now to find the probability that a 53.4 year-old would be in the "open" suture stage we need to find the probability that they have not crossed the first threshold, which is 1 - 0.70545753 = 0.29454247.  The calculation to find the probability that a 53.4 year-old is in the "closing" stage is a bit more complicated.  The probability that said individual will have crossed the first threshold (at an earlier age) is p.1, or 0.70545753.  We can think of this as a proportion, so 0.70545753 of 53.4 year-olds will have crossed the first threshold at an earlier age, meaning that about 70.55% of 53.4 year-olds will be in the "closing" or "closed" states.  Now we need to find the probability that they have not crossed the second threshold, so p.1*(1-p.2) = 0.70545753 × 0.4932114 = 0.3479397.  Next, let's turn to the "closed" stage.  It is the proportion of 53.4 year-olds who have crossed both thresholds, so 0.7054575*0.5067886 = 0.3575178.  Finally, let's check the math for errors: 0.2945425 + 0.3479397 + 0.3575178 = 1, so we're good!
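
If you want to check those numbers yourself, here is a minimal sketch that just re-plugs the fitted coefficients (nothing new is estimated):

expit=function(x) exp(x)/(1+exp(x))
p.1=expit(-2.93107725+0.07124533*53.4)   # crossed the first ("open" vs. "closing") threshold
p.2=expit(-0.87965245+0.01698143*53.4)   # crossed the second threshold, given the first
c(open=1-p.1, closing=p.1*(1-p.2), closed=p.1*p.2)   # 0.2945, 0.3479, 0.3575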


Backward continuation ratio logit

Boldsen et al. (2002) used a forward continuation ratio logit, but suggested that a backward continuation ratio logit could be used instead.  Here is how that looks:

vglm(suture[,2]~suture[,1],cratio(parallel = F, reverse = T))

Call:
vglm(formula = suture[, 2] ~ suture[, 1], family = cratio(parallel = F,
    reverse = T))

Coefficients:
(Intercept):1 (Intercept):2 suture[, 1]:1 suture[, 1]:2
   2.94263897    3.18080637   -0.05587934   -0.04500276

Degrees of Freedom: 2296 Total; 2292 Residual
Residual Deviance: 2013.326
Log-likelihood: -1006.663

backward.cr=function()
{

alpha=-c(2.94263897, 3.18080637)
beta=-c(-0.05587934,-0.04500276)
expit=function(x) {return(exp(x)/(1+exp(x)))}
age=seq(0,100,.1)

p1 = 1 - plnorm(age, 6.154022/1.683116,1/1.683116)
p2 = 1 - plnorm(age, 7.010010/1.683116,1/1.683116)
plot(age, p1, type = "l", ylim = c(0, 1),
    xlab = "Age", ylab = "Probability in Stage",col='gray',lwd = 2,
    main='Backward Continuation Ratio (Logit)')
lines(age, p2 - p1, lty = 3, lwd = 2, col = 'gray')
lines(age, 1 - p2, lty = 2, lwd = 2, col = 'gray')
legend(40, 1, c("P(open)", "P(closing)", "P(closed)"), lwd = rep(2,
    3), lty = c(1, 3, 2), bty = "n")

p.1=expit(alpha[1]+beta[1]*age)
p.2=expit(alpha[2]+beta[2]*age)

p1 = (1-p.1)*(1-p.2)
p2 = p.1*(1-p.2)
p3 = p.2

lines(age,p1,type='l',lwd=2)
lines(age,p2,lty=3,lwd=2)
lines(age,p3,lty=2,lwd=2)
}

backward.cr()

Backward continuation ratios
And once again the gray lines are from the log scale cumulative probit.

Finally, for the continuation ratios, it may be useful to compare the backward versus forward continuation ratios.  They do differ slightly (Bender and Benner, 2000), but there is no real basis to prefer one over the other.  Boldsen et al. (2002) picked the forward continuation ratios:

back.and.forth=function ()
{
 
alpha=-c(2.94263897, 3.18080637)
beta=-c(-0.05587934,-0.04500276)
expit=function(x) {return(exp(x)/(1+exp(x)))}
age=seq(0,100,.1)


# Backward continuation ratios

p.1=expit(alpha[1]+beta[1]*age)
p.2=expit(alpha[2]+beta[2]*age)

p1 = (1-p.1)*(1-p.2)
p2 = p.1*(1-p.2)
p3 = p.2

plot(age, p1, type = "l", ylim = c(0, 1),
    xlab = "Age", ylab = "Probability in Stage",col='gray',lwd = 2,
    main='Comparison of Backward & Forward CRs')
lines(age,p2,lty=3,lwd=2,col='gray')
lines(age,p3,lty=2,lwd=2,col='gray')

# Forward continuation ratios

alpha=c(-2.93107725,-0.87965245)
beta=c(0.07124533,0.01698143)

p.1=expit(alpha[1]+beta[1]*age)
p.2=expit(alpha[2]+beta[2]*age)

p1 = 1 - p.1
p2 = p.1*(1-p.2)
p3 = p.1*p.2

lines(age,p1,type='l',lwd=2)
lines(age,p2,lty=3,lwd=2)
lines(age,p3,lty=2,lwd=2)

legend(50,1,c('Backward CRs','Forward CRs'),text.col=c('grey','black'),bty='n')

}


back.and.forth()

Backward and forwards CRs

Multivariate latent-trait approach (Holman et al., 2002)

Holman et al. (2002) have suggested a particular multivariate latent-trait approach that may be useful for high dimensional problems (i.e., applications with a lot of different age-at-death "indicators").  But it is not clear that there is any advantage to using their method for single ordinal categorical traits.  We show how their model works with the three stage suture closure data.

First the scoring is converted from the 1, 2, 3 scheme to a dummy variable type coding:

State        T1   T2
"open"        0    0
"closing"     1    0
"closed"      1    1

The dummy variables indicate whether or not a threshold has been crossed ("1" means that the threshold has been crossed while "0" means that it has not). Holman et al. (2002) treat these dummy variables as the variables of interest in binary (log-age scale) probit regression analyses.  As a first stab, let's try this model in "R" and assume independence between the two threshold "traits":

latent.trait=function ()
{
age=log(suture[,1])
stage=suture[,2]
N=length(age)


LK = function(x){
 LK=0
 mu1=x[1]
 s1=x[2]
 mu2=x[3]
 s2=x[4]
 for(j in 1:1148){
 if (stage[j]==1) LK = LK + log(pnorm(age[j],mu1,s1,lower.tail=F)*pnorm(age[j],mu2,s2,lower.tail=F))
 if (stage[j]==2) LK = LK + log(pnorm(age[j],mu1,s1)*pnorm(age[j],mu2,s2,lower.tail=F))
 if (stage[j]==3) LK = LK + log(pnorm(age[j],mu1,s1)*pnorm(age[j],mu2,s2))
}
print(LK)
flush.console()
return(LK)
}

optim(c(log(30),.3,log(60),.4),LK,control=list(fnscale=-1))
}

latent.trait()

which gets us to parameter values of 3.646852, 0.535961, 4.221854, and 0.712099.  That said, these parameters probably have no direct interpretation in terms of finding the probabilities that individuals are in particular closure states at given ages.  For example, the first term shown in the log-likelihood above is supposed to be the (log) probability of being in the "open" suture state.  At 53.4 years the estimated parameter values tell us that there is a 0.2684507 chance that the first threshold has not been passed and a 0.6340917 chance that the second threshold has not been passed.  Under the independence assumption these probabilities are multiplied together to get 0.1702224, but we know that if the first threshold has not been passed then the second cannot have been passed.  Holman et al. first suggest that the covariance between the threshold traits needs to be estimated (rather than assumed equal to zero), but estimating a covariance does not account for the structural relationship between the two threshold traits.  Specifically, one cannot cross the second threshold without having first crossed the first threshold, so, thought of in terms of a tetrachoric correlation, there can be no individuals who "have" the second trait but "lack" the first.  Holman et al. (2002) also suggest a proportional hazards model that could have very broad applicability for multiple ordinal traits.  But again, this model does not appear to deal with the structural relationships between thresholds within single ordinal traits.
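
Here is a short sketch that reproduces those numbers from the fitted parameter values (again, nothing new is estimated; it just shows the independence assumption at work for a 53.4 year-old):

p1=pnorm(log(53.4),3.646852,0.535961,lower.tail=FALSE)   # first threshold not yet passed
p2=pnorm(log(53.4),4.221854,0.712099,lower.tail=FALSE)   # second threshold not yet passed
c(p1=p1, p2=p2, product=p1*p2)   # 0.2685, 0.6341, and 0.1702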

Exponential hazard of transition (Samworth and Gowland, 2007)

First we use maximum likelihood estimation to get the parameter estimates.  Incidentally, we are starting very close to the actual parameter values found earlier by iterating from starting values that weren't so close.  Using unreasonable starting values can cause a lot of trouble when trying to search the likelihood surface.

sam.tran.suture = function (try=c(16.68580421,  0.03390403,  0.03756860))
{

####################################################
#      P(i|a)
####################################################
pia=function(stage,age,a,lambda){
if(stage==1) return(exp(-lambda[1]*(age-a)))
sec=0
for(j in 1:(stage-1)){
    p=1
    for(k in 1:(stage-1)){
       if(j!=k) p=p*lambda[k]/(lambda[k]-lambda[j])
     }
    sec=sec + exp(-lambda[j]*(age-a))*p
  }
if(stage==3) return(1-sec)
first=0
for(j in 1:stage){
    p=1
    for(k in 1:stage){
       if(j!=k) p=p*lambda[k]/(lambda[k]-lambda[j])
     }
    first=first + exp(-lambda[j]*(age-a))*p
  }
return(first-sec)
}

loglik=function(x){
lambda=0
a=x[1]
for(i in 1:2) lambda[i]=x[i+1]
zlnlk=0
for(i in 1:1148){
  sto=max(pia(suture[i,2],suture[i,1],a,lambda),1e-300)
  zlnlk=zlnlk+log(sto)
}
print(zlnlk)
flush.console()
return(zlnlk)

}

optim(try,loglik,control=list(fnscale=-1,maxit=5000))

}

sam.tran.suture()

.
.
.
[1] -966.1596
[1] -966.1596
[1] -966.1596
[1] -966.1596
[1] -966.1596
$par
[1] 16.68623929  0.03390035  0.03756875

$value
[1] -966.1596

$counts
function gradient
      97       NA

$convergence
[1] 0

$message
NULL

And then the code to plot the "weight" functions:

plot.sam.tran2 = function (ip=1,bot=0,top=100)
{
a=16.68623929
lambda=c(0.03390035,0.03756875)

####################################################
trans=function(age,stage){
if(stage==1) return(exp(-lambda[1]*((age>a)*(age-a))))
sec=0
for(j in 1:(stage-1)){
    p=1
    for(k in 1:(stage-1)){
       if(j!=k) p=p*lambda[k]/(lambda[k]-lambda[j])
     }
    sec=sec + exp(-lambda[j]*(age>a)*(age-a))*p
  }
if(stage==3) return(1-sec)
first=0
for(j in 1:stage){
    p=1
    for(k in 1:stage){
       if(j!=k) p=p*lambda[k]/(lambda[k]-lambda[j])
     }
    first=first + exp(-lambda[j]*(age>a)*(age-a))*p
  }
return(first-sec)
}
#################END OF FUNCTIONS################################

    age = seq(0, 100, 0.1)
    p1 = 1 - plnorm(age, 6.154022/1.683116,1/1.683116)
    p2 = 1 - plnorm(age, 7.010010/1.683116,1/1.683116)
    plot(age, p1, type = "l", ylim = c(0, 1),
        xlab = "Age", ylab = "Probability in Stage",col='gray',lwd = 2)
    lines(age, p2 - p1, lty = 3, lwd = 2, col = 'gray')
    lines(age, 1 - p2, lty = 2, lwd = 2, col = 'gray')

    type.line=c(1,3,2)
    for(k in 1:3) lines(age,trans(age,k),lty=type.line[k],lwd=2)
    legend(40, 1, c("P(open)", "P(closing)", "P(closed)"), lwd = rep(2,
    3), lty = c(1, 3, 2), bty = "n")

}

plot.sam.tran2()

Samworth and Gowland, 2007
where once again the gray lines are from the restricted cumulative log probit.  Note the "sharp points" at 16.68623929 years for the probability of being in the open stage or the closing stage, a problem we previously pointed out when there were just two stages.  But aside from that, Samworth and Gowland's (2007) model is surprisingly similar to the restricted cumulative log probit.
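
One way to see that similarity is to put the two log-likelihoods, both reported earlier on this page, side by side (no new fitting here):

c(log.cumulative.probit=-966.1341, Samworth.Gowland=-966.1596)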

Kernel density estimation (Lucy et al., 1996, Love and Müller, 2002, Lucy et al., 2002)

To use kernel density estimation you will need to download GenKern and use the command library(GenKern), just as you did for VGAM near the top of this page.  Then you can do the kernel density plot, as follows:

kern.plot=function (xb=4)
{
library(GenKern)
x<-suture$age
y<-suture$stage
p<-matrix(rep(0,101*3),nc=3)
all<-KernSec(x,range.x=0:100,xb=xb)$yden

for(i in 1:3){
  p[,i]<-KernSec(x[y==i],range.x=0:100,xb=xb)$yden/all
}
plot(0:100,p[,1],type='l',ylim=c(0,1),xlab='Age',ylab='Probability',lwd=2)
lines(0:100,p[,2],lty=3,lwd=2)
lines(0:100,p[,3],lty=2,lwd=2)

legend(40, 1, c("P(open)", "P(closing)", "P(closed)"), lwd = rep(2,
3), lty = c(1, 3, 2), bty = "n")

}

kern.plot()

Kernel density

The closing versus closed stages look pretty indistinguishable above, so we might want to collapse those two stages:

kern.plot2=function (xb=4)
{
library(GenKern)
x<-suture$age
y<-c(1,2,2)[suture$stage]
p<-matrix(rep(0,101*2),nc=2)
all<-KernSec(x,range.x=0:100,xb=xb)$yden

for(i in 1:2){
  p[,i]<-KernSec(x[y==i],range.x=0:100,xb=xb)$yden/all
}
plot(0:100,p[,1],type='l',ylim=c(0,1),xlab='Age',ylab='Probability',lwd=2)
lines(0:100,p[,2],lty=3,lwd=2)

legend(40, 1, c("P(open)", "P(closing or closed)"), lwd = rep(2,
2), lty = c(1, 3), bty = "n")

}

kern.plot2()

Kernel density

Sugeno fuzzy integral (Anderson et al., 2010, 2011)

The Sugeno fuzzy integral has been suggested by Anderson et al. (2010, 2011) as a method for combining age ranges from different indicators into a summary graph of the agreement between the indicators.  This is an inherently non-statistical approach.  As Zadeh (1965:340) has noted: "the notion of a fuzzy set is completely nonstatistical in nature."  And given that Anderson et al. begin with age ranges from the literature that are known to be biased by the age structure of the reference samples (see Bocquet-Appel and Masset, 1982 for an explanation of why this happens), the fuzzy integral method perpetuates this bias (see in particular Fig. 4-24 from Anderson, 2008).  But for completeness' sake the Sugeno fuzzy integral is included here.

plot.Sugeno=function (sex='F',scores=c(5,3,7,0))
{
ps.bot=c(18,20,22,25,27,30,35,39,45,50)
ps.top=c(19,21,24,26,30,35,39,44,49,100)

au.bot=c(20,25,30,35,40,45,50,60)
au.top=c(24,29,34,39,44,49,59,100)

va.bot=c(0,rep(18,2),rep(22,4),rep(24,5),rep(24,4),rep(30,3),rep(23,2),40)
va.top=c(49,rep(45,2),rep(48,4),rep(60,5),rep(75,4),rep(71,3),rep(76,2),100)

la.bot=c(0,19,25,rep(23,3),23,rep(32,2),rep(33,2),rep(34,4))
la.top=c(50,48,49,rep(68,3),63,rep(65,2),rep(76,2),rep(68,4))

bot=c(ps.bot[scores[1]],au.bot[scores[2]],va.bot[scores[3]+1],la.bot[scores[4]+1])
top=c(ps.top[scores[1]],au.top[scores[2]],va.top[scores[3]+1],la.top[scores[4]+1])
for(i in 1:4){
print(noquote(paste(bot[i],' - ',top[i])))
}
p=rep(0,100)
for(i in 1:100) p[i]=Sugeno(sex=sex,i,bot,top)
plot(c(1:100,100,1),c(p,0,0),type='l',ylim=c(0,1),ylab='Confidence',xlab='Age',xlim=c(1,100))
polygon(c(1:100,100,1),c(p,0,0),col='gray')
}

Sugeno=function(sex='F',age = 60, bot = c(30, 40, 30, 33), top = c(35, 44, 71, 76))
{

#   h values are 1 if age is in indicator intervals, 0 otherwise

    h = rep(0, 4)
    for (i in 1:4) if (age >= bot[i] & age <= top[i])
        h[i] = h[i] + 1

#   g values are reported correlations of indicators with age from the literature
#   Female values
    if(sex=='F') g = c(0.64, 0.72, 0.34,.34)

#   Male values
    if(sex=='M') g = c(0.57, 0.72, 0.59,.59)

#   Sort on decreasing h (i.e., ones ahead of zeros)

    ord = sort(h, decreasing = TRUE, index.return = TRUE)$ix
    h = h[ord]
    g = g[ord]

#   Lambda (a function of the g values) previously calculated using "uniroot" function in "R"

    if(sex=='M') L=-0.9687146
    if(sex=='F') L=-0.9763193

    g1 = g[1]

    g12 = g[1] + g[2] + L * g[1] * g[2]

    g12.3 = g12 + g[3] + L * g12 * g[3]

    g123.4 = g12.3 + g[4] + L * g12.3 * g[4]

    max(min(h[1], g1), min(h[2], g12), min(h[3], g12.3), min(h[4],
        g123.4))
}


plot.Sugeno()

Sugeno fuzzy integral
which reproduces Fig. 3 from Anderson et al. (2010).

Testing the assumptions of a few of the models (parallel lines?)

If there are "restricted" as versus "unrestricted" versions of various models then this suggests that likelihood ratio tests could be used to see the restricted models have adequate fit relative to the unrestricted versions.  Indeed, this can be done using the difference in the residual deviance between two models where one model is "nested" within the other.  Comparing the straight scale restricted cumulative probit with the unrestricted in "R":

> 2018.058-1999.763
[1] 18.295
> 1-pchisq(18.295,1)
[1] 1.892031e-05

So the parallel slopes assumption is not a good one here.  Even slipping into the log scale will not rescue us from this problem:

> 1932.268-1920.595

[1] 11.673
> 1-pchisq(11.673,1)
[1] 0.0006341367

(Note, the unrestricted cumulative probit on the log scale wasn't fit on this page, but it is a simple modification, sketched below).
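
For reference, here is what that modification looks like (a sketch only; its residual deviance is where the 1920.595 above comes from):

vglm(suture[,2]~log(suture[,1]),cumulative(link='probit',parallel = FALSE, reverse = TRUE))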

Brant (1990) suggested a Wald test (score test) that could be used to test the assumption of parallel lines in a proportional log-odds (logit) model:

brant.logit = function (dat=suture,log.it=T)
{
    library(VGAM)
    age = dat[, 1]
    if(log.it==T) age=log(age)
    stage = dat[, 2]
    N = length(stage)
    T1 = vglm(c(0, 1, 1)[stage] ~ age, family = binomialff)
    T2 = vglm(c(0, 0, 1)[stage] ~ age, family = binomialff)
    wiml = 0
    p1 = 1/(1 + exp(-predict(T1)))
    p2 = 1/(1 + exp(-predict(T2)))
    W11=diag(N); diag(W11) = p1 - p1^2
    W12=diag(N); diag(W12) = p2 - p1 * p2
    W22=diag(N); diag(W22) = p2 - p2^2
    X = cbind(rep(1, N), age)

    B=c(coef(T1)[2],coef(T2)[2])
    V=diag(2)
    v12=(solve(t(X)%*%W11%*%X)%*%(t(X)%*%W12%*%X)%*%solve(t(X)%*%W22%*%X))[-1,-1]
    V[1,2]=v12; V[2,1]=v12
    V[1,1]=vcov(T1)[-1,-1]
    V[2,2]=vcov(T2)[-1,-1]

    D=c(1,-1)
    DB=D%*%B
    X2=as.vector(DB*solve(t(D)%*%V%*%D)*DB)
    return(1-pchisq(X2,1))
}

brant.logit()

which returns a p-value of 9.333906e-06, so once again it looks like the lines are not parallel. 

Specification Tests

As Greene and Hensher (2010) note, applying a specification test is a more direct way of examining the parallel lines assumption.  And there is the added benefit that we can test whether the transition distributions are actually normal, or log-normal, or logistic, or log-logistic.  Glewwe (1996), Johnson (1996), and Weiss (1996) all described a Lagrange multiplier specification test for the ordered probit model (with parallel lines, so restricted cumulative probit).  The code below follows most closely Johnson's description, and includes the special case for binary dependent variables (like suture open versus closed) given by Bera et al. (1984).  Incidentally, the 1996 publications all used Bera et al.'s work as the point of departure.

LM.test=function (dat,collapse=1:10,log.it=T)
{
library(MASS)
age=dat[,1]
if(log.it==T) age=log(age)
stage=dat[,2]
#  Number of individuals
n=NROW(stage)
stage=collapse[stage]
#  Number of states
m=NROW(table(stage))
if(m==2) stage=stage-1
if(m>=3) {

results=polr(factor(stage)~age,method='probit')
beta=as.vector(results$coef)
alpha=as.vector(results$zeta)

E1sum=0
E2sum=0
BOX=matrix(rep(0,(m+2)*(m+2)),nc=(m+2))
for(i in 1:n){
 
  vec=rep(0,m+2)

  if(stage[i]==1){
     hj=alpha[1]-beta*age[i]
     norm=dnorm(hj)/pnorm(hj)
     E1=-(1-hj^2)*norm
     E2=-(3+hj^2)*hj*norm
     vec[1]=norm
     vec[m]=-vec[1]*age[i]
  }

  if(stage[i]==m){
    hjm1=alpha[m-1]-beta*age[i]
    norm=1/(1-pnorm(hjm1))
    E1=((1-hjm1^2)*dnorm(hjm1))*norm
    E2=((3+hjm1^2)*hjm1*dnorm(hjm1))*norm
    vec[m-1]=-dnorm(hjm1)*norm
    vec[m]=-vec[m-1]*age[i]
   
  }

  if(stage[i]>1 & stage[i]<m){
     hj=alpha[stage[i]]-beta*age[i]
     hjm1=alpha[stage[i]-1]-beta*age[i]
     norm=1/(pnorm(hj)-pnorm(hjm1))
     E1=((1-hjm1^2)*dnorm(hjm1)-(1-hj^2)*dnorm(hj))*norm
     E2=((3+hjm1^2)*hjm1*dnorm(hjm1)-(3+hj^2)*hj*dnorm(hj))*norm
     vec[stage[i]-1]=-dnorm(hjm1)*norm
     vec[stage[i]]=dnorm(hj)*norm
     vec[m]=(dnorm(hjm1)-dnorm(hj))*norm*age[i]
    }

    vec[m+1]=E1
    vec[m+2]=E2
    BOX=BOX+(vec%o%vec)

    E1sum=E1sum+E1
    E2sum=E2sum+E2
}
   J=diag(m+2)[,(m+1):(m+2)]
   E=c(E1sum,E2sum)
   x2=as.vector(t(E)%*%t(J)%*%solve(BOX)%*%J%*%E)
   return(pchisq(x2,2,lower.tail=F))
}

#  For binary

cons=as.vector(glm(stage~age,family=binomial(link="probit"))$coeff)
beta=cons[2]
cons=cons[1]
vec=c(0,0)
mat=matrix(rep(0,4*4),nc=4)

for(i in 1:n){
   h=cons+beta*age[i]
   sto=dnorm(h)
   wi=sto
   p=pnorm(h)
   wi=wi/(p*(1-p))
   if(is.finite(wi)){
   vec=vec+wi*(stage[i]-p)*h*c(h,3+h^2)
   Xi=c(age[i],h,h^2-1,h*(3+h^2))
   mat=mat+sto*wi*(Xi%o%Xi)
   }
}

J=diag(4)[,3:4]
x2= as.vector(vec%*%t(J)%*%solve(mat)%*%J%*%vec)
return(pchisq(x2,2,lower.tail=F))
}

LM.test(suture)

which returns a p-value of 0.3579, so the transitions do appear to follow a log-normal distribution with a common variance on the log scale.

References