Mismatch

Analyses Not from Konigsberg L, and Frankenberg S. 2016. Postmarital residence analysis. In: Pilloud MA, and Hefner JT, editors. Biological distance analysis: Forensic and bioarchaeological perspectives. New York: Elsevier Inc. p 335-347.

This book chapter, which is about variation in discrete traits, made reference to some R code for such analyses using metric traits. That R code was on an old server which is no longer running. As a consequence, I am providing the code below before giving the code that is directly relevant to the book chapter. In what follows, copy and paste the green text into your Rgui.

Petersen (2002)

Petersen (2002) gives a parametric and nonparametric bootstrap test as alternatives to an F-test he previously gave for testing equality of two variance-covariance matrices. See the 2002 reference below for a description of these tests as well as prior literature (e.g., Key and Jantz 1990a, 1990b, Zhivotovsky 1988).

Petersen HC (2002) On statistical methods for comparison of intrasample morphometric variability: Zalavár revisted. Am J Phys Anthropol 113:79-84.

First copy and paste the following into your “Rgui” in order to get some data from the Howells' craniometric data:

download.file('http://faculty.las.illinois.edu/lylek/Mismatch/oslo.txt','oslo.txt')
oslo<-dget('oslo.txt')
download.file('http://faculty.las.illinois.edu/lylek/Mismatch/berg.txt','berg.txt')
berg<-dget('berg.txt')
file.remove('oslo.txt');file.remove('berg.txt')

At the prompt type “oslo” and “berg” (without quotes) to make sure you got it right.
Now mark and copy the following text into your “Rgui”:

zhivo<-function (h=berg,w=oslo)
{
SGVDF<-function (DF,p,SGV)
#########################################################################
#       This function takes the DF, p, and SGV and returns the corrected
#       DF and c*SGV (following Zhivotovsky, 1988; per Petersen 2000
#       correction)
########################################################################
{
        gvec<-rep(0,p)
        for(i in 1:p) {
               gvec[i]<-gamma(1/p+(DF-i+1)/2)/gamma((DF-i+1)/2)
                      }
        m1<-2*prod(gvec)
        for(i in 1:p) {
               gvec[i]<-gamma(2/p+(DF-i+1)/2)/gamma((DF-i+1)/2)
                      }
        m2<-4*prod(gvec)
        DFc<-2*m1^2/(m2-m1^2)
        cSGV<-DF/m1*SGV
        return(list(DFc=DFc,cSGV=cSGV))
}
#####################################################################
p<-NCOL(h)
det.ratio<-det(cov(h))/det(cov(w))
SGVh<-det(cov(h))^(1/p)
SGVw<-det(cov(w))^(1/p)
nw<-NROW(w)
nh<-NROW(h)
DFh<-nh-1
DFw<-nw-1
SGVh<-SGVDF(DFh,p,SGVh)
SGVw<-SGVDF(DFw,p,SGVw)
Fratio<-SGVh$cSGV/SGVw$cSGV
p<-pf(Fratio,SGVh$DFc,SGVw$DFc,lower.tail=F)
return(list(det.ratio=det.ratio,Fratio=Fratio,p=p))
}

Now type zhivo() at the R > prompt and you will get the following:

> zhivo()
$det.ratio
[1] 7.506352

$Fratio
[1] 1.220779

$p
[1] 0.01381720

In the above, “det.ratio” is the ratio of determinants of the two variance-covariance matrices, “Fratio” is the F-test, and “p” is the probability value for the F-test. In the call to “zhivo” you can specify different datasets, for example “zhivo(samp1,samp2)” or “zhivo(h=samp1,w=samp2).”

Now it is time to do a nonparametric bootstrap, so paste the following:

bootnon<-function (h=berg,w=oslo,Nboot=9,seed='NonRan')
{

set.seed(as.integer(Sys.time()))
if(seed=='NonRan') set.seed(42)
boot<-rep(0,Nboot+1)
p<-NCOL(h)
nw<-NROW(w)
nh<-NROW(h)
W<-w-rep(1,nw)%o%apply(w,2,mean)
H<-h-rep(1,nh)%o%apply(h,2,mean)
hw<-rbind(H,W)

boot[1]<-det(cov(hw[1:nh,]))/det(cov(hw[(nh+1):(nh+nw),]))

for (i in 2:(Nboot+1)) {
bootit<-sample(1:(nw+nh),replace=T)
hwb<-hw[bootit,1:p]
boot[i]<-det(cov(hwb[1:nh,]))/det(cov(hwb[(nh+1):(nh+nw),]))
}
return(boot)
}

and run as “bootnon()”. This will spit back a 10-element vector:

> bootnon()
[1] 7.5063520 2.3276142 2.6624329 4.2080064 1.0223504 0.1738658 0.7785375 1.2931999 1.7586308 0.8731364

where the first element (=7.5063520 ) is the observed determinant ratio and the following nine are bootstrapped. Nine bootstraps is pretty pathetic, so you’ll want to try something meatier once your sure things are working correctly, like:

> bootnon2<-bootnon(Nboot=999,seed='Ran')

This will store the vector, then you can use “sum(bootnon2>=bootnon2[1])/1000” to get the p-value. The seed='Ran' part uses the computer clock for a seed to the random generator (otherwise the random number sequence always starts at 42).

Now it is time to do a parametric bootstrap, so paste the following:

bootwish<- function (h=berg,w=oslo,Nboot=9,seed='NonRan')
{
set.seed(as.integer(Sys.time()))
if(seed=='NonRan') set.seed(42)

rwishart2<-function(df,SqrtSigma)
#################################################################
# rwishart2 simulates from a wishart distribution using the
# Cholesky decomposition of a variance-covariance matrix
#################################################################
{
p<-NROW(SqrtSigma)
Z <- matrix(0, p, p)
diag(Z) <- sqrt(rchisq(p, df:(df-p+1)))
pseq <- 1:(p-1)
Z[rep(p*pseq, pseq) + unlist(lapply(pseq, seq))] <- rnorm(p*(p-1)/2)
crossprod(Z %*% SqrtSigma)
}
#################################################################

boot<-rep(0,Nboot+1)
p<-NCOL(h)
nw<-NROW(w)
nh<-NROW(h)
DFh<-nh-1
DFw<-nw-1

boot[1]<-det(cov(h))/det(cov(w))
Sp<-(cov(h)*DFh+cov(w)*DFw)/(DFh+DFw)
SqrtSp<-chol(Sp)

for (i in 2:(Nboot+1)) {
boot[i]<-det(rwishart2(DFh,SqrtSp)/DFh)/det(rwishart2(DFw,SqrtSp)/DFw)
}
return(boot)
}

and run as “bootwish()”. This will spit back a 10-element vector:

> bootwish()
[1] 7.5063520 1.0086355 0.1061152 0.4056361 10.2281551 0.2542374 0.4334881 0.2216340 0.5006118 1.3059963

where the first element (=7.5063520 ) is the observed determinant ratio and the following nine are bootstrapped. Nine bootstraps is pretty pathetic, so you’ll want to try something meatier once your sure things are working correctly, like:

> bootwish2<-bootwish(Nboot=999,seed='Ran')

This will store the vector, then you can use “sum(bootwish2>=bootwish2[1])/1000” to get the p-value. The seed='Ran' part uses the computer clock for a seed to the random generator (otherwise the random number sequence always starts at 42).

Analyses from Konigsberg L, and Frankenberg S. 2016. Postmarital residence analysis. In: Pilloud MA, and Hefner JT, editors. Biological distance analysis: Forensic and bioarchaeological perspectives. New York: Elsevier Inc. p 335-347.

You will need the packages "ape" and "shape." Then copy and paste the green stuff below.

read.files=function ()
{
library(ape)
download.file('http://faculty.las.illinois.edu/lylek/Mismatch/Sites.zip','Sites.zip')
unzip('Sites.zip')
file.remove('Sites.zip')
EZ2F<<-read.dna('EZ2F.txt');EZ2M<<-read.dna('EZ2M.txt');EZF<<-read.dna('EZF.txt')
EZM<<-read.dna('EZM.txt');Females<<-read.dna('Females.txt');GIF<<-read.dna('GIF.txt')
GIM<<-read.dna('GIM.txt');HNF<<-read.dna('HNF.txt');HNM<<-read.dna('HNM.txt')

HRF<<-read.dna('HRF.txt');HRM<<-read.dna('HRM.txt');KOF<<-read.dna('KOF.txt')
KOM<<-read.dna('KOM.txt');LDF<<-read.dna('LDF.txt');LDM<<-read.dna('LDM.txt')
Males<<-read.dna('Males.txt');PK2F<<-read.dna('PK2F.txt');PK2M<<-read.dna('PK2M.txt')
PKF<<-read.dna('PKF.txt');PKlunkF<<-read.dna('PKlunkF.txt');PKlunkM<<-read.dna('PKlunkM.txt')
PKM<<-read.dna('PKM.txt');RYF<<-read.dna('RYF.txt');RYM<<-read.dna('RYM.txt')
SH1F<<-read.dna('SH1F.txt');SH1M<<-read.dna('SH1M.txt');SHF<<-read.dna('SHF.txt')
SHM<<-read.dna('SHM.txt');YO1F<<-read.dna('YO1F.txt');YO1M<<-read.dna('YO1M.txt')

YOF<<-read.dna('YOF.txt');YOM<<-read.dna('YOM.txt')

# Remove *.txt files

file.remove('EZ2F.txt');file.remove('EZ2M.txt');file.remove('EZF.txt')
file.remove('EZM.txt');file.remove('Females.txt');file.remove('GIF.txt')
file.remove('GIM.txt');file.remove('HNF.txt');file.remove('HNM.txt')

file.remove('HRF.txt');file.remove('HRM.txt');file.remove('KOF.txt')
file.remove('KOM.txt');file.remove('LDF.txt');file.remove('LDM.txt')
file.remove('Males.txt');file.remove('PK2F.txt');file.remove('PK2M.txt')
file.remove('PKF.txt');file.remove('PKlunkF.txt');file.remove('PKlunkM.txt')
file.remove('PKM.txt');file.remove('RYF.txt');file.remove('RYM.txt')
file.remove('SH1F.txt');file.remove('SH1M.txt');file.remove('SHF.txt')
file.remove('SHM.txt');file.remove('YO1F.txt');file.remove('YO1M.txt')

file.remove('YOF.txt');file.remove('YOM.txt')

}

read.files()

Figure 1

Histograms of 10,000 bootstrap samples (across nucleotides) for mean number of mtDNA pairwise differences within
males and within females for Bolnick and Smith’s (2007) Middle Woodland samples from the Pete Klunk Mound
Group. The dashed vertical lines are the observed male and female values.

Fig.01 = function ()
{
library(ape)
library(MASS)
set.seed(12345)

opar = par(no.readonly = T)

on.exit(par(opar))

par(mfrow=c(2,1))
pairw=0
for(i in 1:10000) pairw[i] = mean(dist.dna(Males[,sample(1:351,replace=T)],"N"))
truehist(pairw,nb=60,prob=T,xlab=' Mean No. Pairwise Differences',
ylab='Probability',col='white',
main='Male DNA sequence data')
actual=mean(dist.dna(Males,"N"))
lines(rep(actual,2),c(0,1000),lwd=2,lty=2)
print(quantile(pairw,c(0.025,0.5,0.975)))
print(mean(pairw))

for(i in 1:10000) pairw[i] = mean(dist.dna(Females[,sample(1:351,replace=T)],"N"))
truehist(pairw,nb=60,prob=T,xlab=' Mean No. Pairwise Differences',
ylab='Probability',col='white',
main='Female DNA sequence data')
actual=mean(dist.dna(Females,"N"))
lines(rep(actual,2),c(0,1000),lwd=2,lty=2)

print(quantile(pairw,c(0.025,0.5,0.975)))
print(mean(pairw))

}

Fig.01()

Figure 2

Empirical cumulative density representation of the 10,000 bootstrap samples for males in Fig. 18.1. The 2.5th, 50th,
and 97.5th percentile values are marked, as is the actual observed value.

Fig.02=function (nboots=10000)
{
library(ape)
library(shape)
set.seed(12345)
pairw=0
actual=mean(dist.dna(Males,"N"))
for(i in 1:nboots) pairw[i] = mean(dist.dna(Males[,sample(1:351,replace=T)],"N"))
lo=min(pairw)
hi=max(pairw)
to.plot=ecdf(pairw)
pnts=knots(to.plot)
lo=pnts[2]
hi=pnts[length(pnts)-1]
height = sum(actual>pairw)/nboots
plot.stepfun(to.plot,xlab=' Mean No. Pairwise Differences',
   ylab='Cumulative density',do.points=F,verticals=T,
   main='Cumulative density mean pair-wise differences (Males)',
   xlim=c(2,12),ylim=c(0,1))
lines(c(0,lo),rep(0,2))
lines(c(hi,14),rep(1,2))
lines(rep(actual,2),c(0,height),lwd=2,lty=2)
CI=as.vector(quantile(pairw,probs=c(0.025,0.5,0.975)))
lines(rep(CI[1],2),c(0,0.025),lwd=2)
lines(rep(CI[2],2),c(0,0.5),lwd=2)
lines(rep(CI[3],2),c(0,0.975),lwd=2)
lines(c(0,CI[1]),rep(0.025,2),lty=2)
lines(c(0,CI[2]),rep(0.5,2),lty=2)
lines(c(0,CI[3]),rep(0.975,2),lty=2)
text(1.8,.04,'0.025',adj=c(0,0))
text(1.8,0.99,'0.975',adj=c(0,0))
text(1.8,0.51,'Median (0.5)',adj=c(0,0))
print(round(CI,3))
Arrows(7.4453,.3796,6.7031,.3250,arr.adj=1)
text(7.6,.38,'actual',adj=c(0,.5))
}

Fig.02()

Figure 3

Empirical cumulative density plot of the 10,000 bootstrap samples where the plotted statistic is the ratio of male-tofemale
mean pairwise differences.

Fig.03 = function (nboots=10000)
{
library(ape)
set.seed(123456)
actual=mean(dist.dna(Males,'N'))/mean(dist.dna(Females,'N'))
pairw=0
for(i in 1:nboots) {
   boot = sample(1:351,replace=T)
   pairw[i] = mean(dist.dna(Males[,boot],"N"))/
              mean(dist.dna(Females[,boot],"N"))
}

height = sum(actual>pairw)/nboots

lo=min(pairw)
hi=max(pairw)
to.plot=ecdf(pairw)
pnts=knots(to.plot)
lo=pnts[2]
hi=pnts[length(pnts)-1]
plot.stepfun(to.plot,xlab=' Ratio of Male to Female Mean Pairwise Differences',
   ylab='Cumulative density',do.points=F,verticals=T,
   main='Bootstrap Ratio',ylim=c(0,1),xlim=c(0.5,2))
lines(rep(actual,2),c(0,height),lwd=2,lty=2)
lines(c(0,lo),rep(0,2))
lines(c(hi,14),rep(1,2))
lines(rep(actual,2),c(0,height),lwd=2,lty=2)
lines(rep(1,2),c(0,1000),lwd=2,lty=3)
CI=as.vector(quantile(pairw,probs=c(0.025,0.5,0.975)))
lines(rep(CI[1],2),c(0,0.025),lwd=2)
lines(rep(CI[3],2),c(0,0.975),lwd=2)
print(round(CI,3))
legend('topleft',legend=c('Null (ratio = 1)','2.5 and 97.5 \n percentiles','actual'),
     bty='n',lty=c(3,1,2),lwd=2,cex=.75)
abline(h=0)

}

Fig.03()

Figure 4

Empirical cumulative density plot of 10,000 bootstrap samples for the Pete Klunk Middle Woodland Site where the
plotted statistic is the ratio of male-to-female mean pairwise differences for cranial discrete traits. Note that the
observed ratio value (0.946) is near the 1.0 ratio, indicating equivalent variation in the sexes.

Fig.04 = function (nboots=10000)
{
library(ape)
set.seed(123456)
actual=mean(dist.dna(PKlunkM,pairwise.deletion=T,'N'))/
        mean(dist.dna(PKlunkF,pairwise.deletion=T,'N'))
print(actual)
pairw=0
for(i in 1:nboots) {
   boot = sample(1:23,replace=T)
   pairw[i] = mean(dist.dna(PKlunkM[,boot],pairwise.deletion=T,"N"))/
              mean(dist.dna(PKlunkF[,boot],pairwise.deletion=T,"N"))
}

height = sum(actual>pairw)/nboots

lo=min(pairw)
hi=max(pairw)
to.plot=ecdf(pairw)
pnts=knots(to.plot)
lo=pnts[2]
hi=pnts[length(pnts)-1]
plot.stepfun(to.plot,xlab=' Ratio of Male to Female Mean Pairwise Differences',
   ylab='Cumulative density',do.points=F,verticals=T,
   main='Pete Klunk Middle Woodland (discrete traits)',ylim=c(0,1),xlim=c(0.5,1.5))
lines(rep(actual,2),c(0,height),lwd=2,lty=2)
lines(c(0,lo),rep(0,2))
lines(c(hi,14),rep(1,2))
lines(rep(1,2),c(0,1000),lwd=2,lty=3)
CI=as.vector(quantile(pairw,probs=c(0.025,0.975)))
lines(rep(CI[1],2),c(0,0.025),lwd=2)
lines(rep(CI[2],2),c(0,0.975),lwd=2)
legend('topleft',legend=c('Null (ratio = 1)','2.5 and 97.5 percentiles','actual'),
     bty='n',lty=c(3,1,2),lwd=2,cex=.75)
abline(h=0)
}

Fig.04()

Figure 5

Empirical cumulative density plot of 10,000 bootstrap samples for 11 pre-Mississippian sites where the plotted statistic
is the ratio of male-to-female mean pairwise differences for cranial discrete traits.

Thanks to Dr. Matthew C. Velasco for catching an error in the previous version of this script, which has now been corrected following his comments.

Fig.05 = function (nboots=10000)
{
library(ape)
set.seed(123456)
NM=c(NROW(EZM),NROW(GIM),NROW(PKM),NROW(RYM),NROW(KOM),NROW(HNM),NROW(LDM),
     NROW(EZ2M),NROW(PK2M),NROW(SH1M),NROW(YO1M))
Np.M=sum(NM*(NM-1)/2)

NF=c(NROW(EZF),NROW(GIF),NROW(PKF),NROW(RYF),NROW(KOF),NROW(HNF),NROW(LDF),
     NROW(EZ2F),NROW(PK2F),NROW(SH1F),NROW(YO1F))
Np.F=sum(NF*(NF-1)/2)

   Sumd.M = sum(dist.dna(EZM,pairwise.deletion=T,"N"))
   Sumd.M = Sumd.M + sum(dist.dna(GIM,pairwise.deletion=T,"N"))
   Sumd.M = Sumd.M + sum(dist.dna(PKM,pairwise.deletion=T,"N"))
   Sumd.M = Sumd.M + sum(dist.dna(RYM,pairwise.deletion=T,"N"))
   Sumd.M = Sumd.M + sum(dist.dna(KOM,pairwise.deletion=T,"N"))
   Sumd.M = Sumd.M + sum(dist.dna(HNM,pairwise.deletion=T,"N"))
   Sumd.M = Sumd.M + sum(dist.dna(LDM,pairwise.deletion=T,"N"))
   Sumd.M = Sumd.M + sum(dist.dna(EZ2M,pairwise.deletion=T,"N"))
   Sumd.M = Sumd.M + sum(dist.dna(PK2M,pairwise.deletion=T,"N"))
   Sumd.M = Sumd.M + sum(dist.dna(SH1M,pairwise.deletion=T,"N"))
   Sumd.M = Sumd.M + sum(dist.dna(YO1M,pairwise.deletion=T,"N"))

   Sumd.F = sum(dist.dna(EZF,pairwise.deletion=T,"N"))
   Sumd.F = Sumd.F + sum(dist.dna(GIF,pairwise.deletion=T,"N"))
   Sumd.F = Sumd.F + sum(dist.dna(PKF,pairwise.deletion=T,"N"))
   Sumd.F = Sumd.F + sum(dist.dna(RYF,pairwise.deletion=T,"N"))
   Sumd.F = Sumd.F + sum(dist.dna(KOF,pairwise.deletion=T,"N"))
   Sumd.F = Sumd.F + sum(dist.dna(HNF,pairwise.deletion=T,"N"))
   Sumd.F = Sumd.F + sum(dist.dna(LDF,pairwise.deletion=T,"N"))
   Sumd.F = Sumd.F + sum(dist.dna(EZ2F,pairwise.deletion=T,"N"))
   Sumd.F = Sumd.F + sum(dist.dna(PK2F,pairwise.deletion=T,"N"))
   Sumd.F = Sumd.F + sum(dist.dna(SH1F,pairwise.deletion=T,"N"))
   Sumd.F = Sumd.F + sum(dist.dna(YO1F,pairwise.deletion=T,"N"))

actual = (Sumd.M/Np.M)/(Sumd.F/Np.F)

pairw=0
for(i in 1:nboots) {
   boot = sample(1:23,replace=T)
   Sumd.M = sum(dist.dna(EZM[,boot],pairwise.deletion=T,"N"))
   Sumd.M = Sumd.M + sum(dist.dna(GIM[,boot],pairwise.deletion=T,"N"))
   Sumd.M = Sumd.M + sum(dist.dna(PKM[,boot],pairwise.deletion=T,"N"))
   Sumd.M = Sumd.M + sum(dist.dna(RYM[,boot],pairwise.deletion=T,"N"))
   Sumd.M = Sumd.M + sum(dist.dna(KOM[,boot],pairwise.deletion=T,"N"))
   Sumd.M = Sumd.M + sum(dist.dna(HNM[,boot],pairwise.deletion=T,"N"))
   Sumd.M = Sumd.M + sum(dist.dna(LDM[,boot],pairwise.deletion=T,"N"))
   Sumd.M = Sumd.M + sum(dist.dna(EZ2M[,boot],pairwise.deletion=T,"N"))
   Sumd.M = Sumd.M + sum(dist.dna(PK2M[,boot],pairwise.deletion=T,"N"))
   Sumd.M = Sumd.M + sum(dist.dna(SH1M[,boot],pairwise.deletion=T,"N"))
   Sumd.M = Sumd.M + sum(dist.dna(YO1M[,boot],pairwise.deletion=T,"N"))

   Sumd.F = sum(dist.dna(EZF[,boot],pairwise.deletion=T,"N"))
   Sumd.F = Sumd.F + sum(dist.dna(GIF[,boot],pairwise.deletion=T,"N"))
   Sumd.F = Sumd.F + sum(dist.dna(PKF[,boot],pairwise.deletion=T,"N"))
   Sumd.F = Sumd.F + sum(dist.dna(RYF[,boot],pairwise.deletion=T,"N"))
   Sumd.F = Sumd.F + sum(dist.dna(KOF[,boot],pairwise.deletion=T,"N"))
   Sumd.F = Sumd.F + sum(dist.dna(HNF[,boot],pairwise.deletion=T,"N"))
   Sumd.F = Sumd.F + sum(dist.dna(LDF[,boot],pairwise.deletion=T,"N"))
   Sumd.F = Sumd.F + sum(dist.dna(EZ2F[,boot],pairwise.deletion=T,"N"))
   Sumd.F = Sumd.F + sum(dist.dna(PK2F[,boot],pairwise.deletion=T,"N"))
   Sumd.F = Sumd.F + sum(dist.dna(SH1F[,boot],pairwise.deletion=T,"N"))
   Sumd.F = Sumd.F + sum(dist.dna(YO1F[,boot],pairwise.deletion=T,"N"))

   pairw[i] = (Sumd.M/Np.M)/(Sumd.F/Np.F)
}

height = sum(actual>pairw)/nboots

lo=min(pairw)
hi=max(pairw)
to.plot=ecdf(pairw)
pnts=knots(to.plot)
lo=pnts[2]
hi=pnts[length(pnts)-1]
plot.stepfun(to.plot,xlab='Ratio of Male to Female Within-Site Mean Pairwise Differences',
   ylab='Cumulative density',do.points=F,verticals=T,
   main='Eleven Pre-Mississippian Sites',ylim=c(0,1),xlim=c(0.7,1.1))
mtext('(Hacker South Mound 2 classified as Mississippian)')
lines(rep(actual,2),c(0,height),lwd=2,lty=2)
lines(c(0,lo),rep(0,2))
lines(c(hi,14),rep(1,2))
lines(rep(1,2),c(0,1000),lwd=2,lty=3)
CI=as.vector(quantile(pairw,probs=c(0.025,0.975)))
lines(rep(CI[1],2),c(0,0.025),lwd=2)
lines(rep(CI[2],2),c(0,0.975),lwd=2)
legend('topleft',legend=c('Null (ratio = 1)','2.5 and 97.5 percentiles','actual'),
     bty='n',lty=c(3,1,2),lwd=2,cex=.75)
abline(h=0)
print(c(CI[1],actual,CI[2]))
}

Fig.05()

Figure 6

Empirical cumulative density plot of 10,000 bootstrap samples for three Mississippian sites where the plotted statistic is
the ratio of male-to-female mean pairwise differences for cranial discrete traits.

Thanks to Dr. Matthew C. Velasco for catching an error in the previous version of this script, which has now been corrected following his comments.

Fig.06 = function (nboots=10000)
{
library(ape)
set.seed(123456)
NM=c(NROW(HRM),NROW(SHM),NROW(YOM))
Np.M=sum(NM*(NM-1)/2)
NF=c(NROW(HRF),NROW(SHF),NROW(YOF))
Np.F=sum(NF*(NF-1)/2)

Sumd.M = sum(dist.dna(HRM,pair=T,"N"))
Sumd.M = Sumd.M + sum(dist.dna(SHM,pair=T,"N"))
Sumd.M = Sumd.M + sum(dist.dna(YOM,pair=T,"N"))

Sumd.F = sum(dist.dna(HRF,pair=T,"N"))
Sumd.F = Sumd.F + sum(dist.dna(SHF,pair=T,"N"))
Sumd.F = Sumd.F + sum(dist.dna(YOF,pair=T,"N"))

actual = (Sumd.M/Np.M)/(Sumd.F/Np.F)

pairw=0
for(i in 1:nboots) {
   boot = sample(1:23,replace=T)
    Sumd.M = sum(dist.dna(HRM[,boot],pair=T,"N"))
    Sumd.M = Sumd.M + sum(dist.dna(SHM[,boot],pair=T,"N"))
    Sumd.M = Sumd.M + sum(dist.dna(YOM[,boot],pair=T,"N"))

    Sumd.F = sum(dist.dna(HRF[,boot],pair=T,"N"))
    Sumd.F = Sumd.F + sum(dist.dna(SHF[,boot],pair=T,"N"))
    Sumd.F = Sumd.F + sum(dist.dna(YOF[,boot],pair=T,"N"))
   pairw[i] = (Sumd.M/Np.M)/(Sumd.F/Np.F)
}

height = sum(actual>pairw)/nboots
lo=min(pairw)
hi=max(pairw)
to.plot=ecdf(pairw)
pnts=knots(to.plot)
lo=pnts[2]
hi=pnts[length(pnts)-1]
plot.stepfun(to.plot,xlab='Ratio of Male to Female Within-Site Mean Pairwise Differences',
   ylab='Cumulative density',do.points=F,verticals=T,
   main='Three Mississippian Period Sites',ylim=c(0,1),xlim=c(0.8,1.4))
mtext('(Hacker South Mound 2 classified as Mississippian)')
lines(rep(actual,2),c(0,height),lwd=2,lty=2)
lines(c(0,lo),rep(0,2))
lines(c(hi,14),rep(1,2))
lines(rep(1,2),c(0,1000),lwd=2,lty=3)
CI=as.vector(quantile(pairw,probs=c(0.025,0.975)))
lines(rep(CI[1],2),c(0,0.025),lwd=2)
lines(rep(CI[2],2),c(0,0.975),lwd=2)

legend('topleft',legend=c('Null (ratio = 1)','2.5 and 97.5 \n percentiles','actual'),
     bty='n',lty=c(3,1,2),lwd=2,cex=.75)
abline(h=0)
print(c(CI[1],actual,CI[2]))
}

Fig.06()