Title: | Estimation of Jacquard's Genetic Identity Coefficients |
---|---|
Description: | Contains procedures to estimate the nine condensed Jacquard genetic identity coefficients (Jacquard, 1974) <doi:10.1007/978-3-642-88415-3> by constrained least squares (Graffelman et al., 2024) <doi:10.1101/2024.03.25.586682> and by the method of moments (Csuros, 2014) <doi:10.1016/j.tpb.2013.11.001>. These procedures require previous estimation of the allele frequencies. Functions are supplied that estimate relationship parameters that derive from the Jacquard coefficients, such as individual inbreeding coefficients and kinship coefficients. |
Authors: | Jan Graffelman [aut, cre] |
Maintainer: | Jan Graffelman <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.0.2 |
Built: | 2025-01-16 05:23:29 UTC |
Source: | https://github.com/cran/Jacquard |
Function BoxplotDelta produces boxplots of the Jacquard coefficients from a list structure containing nine matrices with the pairwise coefficients. The diagonals of J1 and J7 are plotted in separate boxplots.
BoxplotDelta(J, ind.sub = 1:nrow(J[[1]]), ...)
J |
The list structure with nine fields consisting of nine matrices |
ind.sub |
Index for subsetting the individuals |
... |
Additional arguments to pass on to |
NULL
Jan Graffelman ([email protected])
data(DeltaSimulatedPedigree)
BoxplotDelta(DeltaSimulatedPedigree)
Function BoxplotTheta makes boxplots of relatedness parameters from a list object containing estimates of pairwise relatedness parameters.
BoxplotTheta(KS, ind.sub = 1:nrow(KS[[1]]), ...)
KS |
A list object with four fields of matrices: kinship, inbreeding, T3 and T4. |
ind.sub |
Index for subsetting the individuals |
... |
Additional arguments passed on to the function |
For individual inbreeding coefficients, two boxplots are made, the first based on the diagonal of T2, the second on the row means of T2. The remaining boxplots (kinship, T3 and T4) are all pairwise, and exclude self-pairs.
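The two summaries of T2 and the exclusion of self-pairs described above can be sketched in base R. The matrices below hold made-up values rather than package output, the object names are hypothetical, and dropping the diagonal is only one plausible way to exclude self-pairs.

```r
# Hypothetical 3 x 3 matrices standing in for the T2 (inbreeding) and
# kinship fields of a list of relatedness parameters.
T2 <- matrix(c(0.10, 0.12, 0.08,
               0.02, 0.03, 0.01,
               0.20, 0.18, 0.22), nrow = 3, byrow = TRUE)
kinship <- matrix(c(0.55, 0.25, 0.00,
                    0.25, 0.51, 0.12,
                    0.00, 0.12, 0.60), nrow = 3, byrow = TRUE)

# Two summaries of individual inbreeding: the diagonal and the row means.
f.diag <- diag(T2)
f.mean <- rowMeans(T2)

# Pairwise values without self-pairs: keep the off-diagonal entries only.
kin.pairs <- kinship[row(kinship) != col(kinship)]

# Boxplot statistics for both summaries (set plot = TRUE to draw them).
stats <- boxplot(list(diag = f.diag, rowmean = f.mean), plot = FALSE)
```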
NULL
Jan Graffelman ([email protected])
Csuros, M. (2014) Non-identifiability of identity coefficients at biallelic loci. Theoretical Population Biology 92, pp. 22-29. doi:10.1016/j.tpb.2013.11.001.
Graffelman, J., Weir, B.S. and Goudet, J. (2024) Estimation of Jacquard's genetic identity coefficients with bi-allelic variants by constrained least-squares. Preprint at bioRxiv doi:10.1101/2024.03.25.586682.
data(DeltaSimulatedPedigree)
Theta <- CalculateTheta(DeltaSimulatedPedigree)
BoxplotTheta(Theta)
Function CalculateMom computes moment estimators for a set of relatedness parameters (kinship, inbreeding, least one IBD out of three and T4) using the genotype data and the allele frequencies.
CalculateMom(Xgen, mafvec, ind.sub = 1:nrow(Xgen), verbose = TRUE)
Xgen |
Genotype data coded in (0,1,2) format |
mafvec |
A vector of minor allele frequencies |
ind.sub |
Index for subsetting individuals |
verbose |
Print output on the progress of the algorithm if |
A list object with four fields:
T1 |
The pairwise coancestry or kinship coefficients (symmetric) |
T2 |
The pairwise inbreeding coefficients (non-symmetric) |
T3 |
The pairwise Least one IBD out of three (symmetric) |
T4 |
T4 (skew-symmetric) |
Jan Graffelman ([email protected])
Csuros, M. (2014) Non-identifiability of identity coefficients at biallelic loci. Theoretical Population Biology 92, pp. 22-29. doi:10.1016/j.tpb.2013.11.001.
data(SimulatedPedigree)
Xgen <- as.matrix(SimulatedPedigree[,6:ncol(SimulatedPedigree)])
mafvec <- mafvector(Xgen)
Theta.mom <- CalculateMom(Xgen[1:10,],mafvec)
Function CalculateTheta calculates five identifiable relatedness parameters from a list structure containing nine matrices of Jacquard coefficients.
CalculateTheta(J, ind.sub = 1:nrow(J[[1]]))
J |
A list structure with nine matrices of pairwise Jacquard coefficients |
ind.sub |
Index for subsetting the individuals |
CalculateTheta produces four matrices according to the expressions in Graffelman et al. (2024).
A list object with four fields:
T1 |
The pairwise coancestry or kinship coefficients (symmetric) |
T2 |
The pairwise inbreeding coefficients (non-symmetric) |
T3 |
The pairwise Least one IBD out of three (symmetric) |
T4 |
T4 (skew-symmetric) |
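As an illustration of how such parameters derive from the nine coefficients, the sketch below applies the standard textbook relations for kinship, D1 + (D3 + D5 + D7)/2 + D8/4, and for the inbreeding coefficients of the first and second individual, D1 + D2 + D3 + D4 and D1 + D2 + D5 + D6. It is not the package's implementation: the input list is made up, the object names are hypothetical, and T3 and T4 are left out because their expressions are given in Graffelman et al. (2024) rather than here.

```r
# Hypothetical input: nine 2 x 2 matrices of Jacquard coefficients.
# Off-diagonal pairs are non-inbred full sibs (D7 = 0.25, D8 = 0.5, D9 = 0.25);
# diagonal entries are non-inbred self-pairs (D7 = 1).
J <- replicate(9, matrix(0, 2, 2), simplify = FALSE)
J[[7]] <- matrix(c(1, 0.25, 0.25, 1), 2, 2)
J[[8]] <- matrix(c(0, 0.50, 0.50, 0), 2, 2)
J[[9]] <- matrix(c(0, 0.25, 0.25, 0), 2, 2)

# Kinship (coancestry) for every pair, computed elementwise on the matrices.
kinship <- J[[1]] + 0.5 * (J[[3]] + J[[5]] + J[[7]]) + 0.25 * J[[8]]

# Inbreeding coefficients of the first and of the second individual of a pair.
f.first  <- J[[1]] + J[[2]] + J[[3]] + J[[4]]
f.second <- J[[1]] + J[[2]] + J[[5]] + J[[6]]

print(kinship)
```

In this toy input the full-sib pair has kinship 0.25 and each non-inbred self-pair has kinship 0.5, which matches the usual pedigree values.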
Jan Graffelman ([email protected])
Csuros, M. (2014) Non-identifiability of identity coefficients at biallelic loci. Theoretical Population Biology 92, pp. 22-29. doi:10.1016/j.tpb.2013.11.001.
Graffelman, J., Weir, B.S. and Goudet, J. (2024) Estimation of Jacquard's genetic identity coefficients with bi-allelic variants by constrained least-squares. Preprint at bioRxiv doi:10.1101/2024.03.25.586682.
data(DeltaSimulatedPedigree)
Theta <- CalculateTheta(DeltaSimulatedPedigree)
Function DeltaPair extracts from the list object of all pairwise Jacquard coefficients the set of coefficients of a given pair (i,j).
DeltaPair(Delta, i, j, digits = 7)
Delta |
A list with nine matrices of pairwise coefficients. |
i |
Index of the first individual. |
j |
Index of the second individual. |
digits |
Number of digits to which the coefficients are rounded. |
A vector with nine elements
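The extraction amounts to reading entry (i,j) from each of the nine matrices. A minimal base R sketch of that idea follows; the helper name pair_coefficients and the random input are hypothetical and only mimic the behaviour described above.

```r
# Hypothetical list of nine 3 x 3 matrices filled with made-up values.
set.seed(1)
Delta <- replicate(9, matrix(runif(9), 3, 3), simplify = FALSE)

# Collect entry (i, j) from each of the nine matrices and round the result.
pair_coefficients <- function(Delta, i, j, digits = 7) {
  round(vapply(Delta, function(m) m[i, j], numeric(1)), digits)
}

d12 <- pair_coefficients(Delta, 1, 2)
print(d12)
```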
Jan Graffelman ([email protected])
data(DeltaSimulatedPedigree)
DeltaPair(DeltaSimulatedPedigree,1,2)
A list object containing nine matrices with pairwise Jacquard coefficients
data("DeltaSimulatedPedigree")
A list with nine fields.
DeltaSimulatedPedigree can be generated by applying function Jacquard.cls to the data in SimulatedPedigree.
Graffelman, J., Weir, B.S. and Goudet, J. (2024) Estimation of Jacquard's genetic identity coefficients with bi-allelic variants by constrained least-squares. Preprint at bioRxiv doi:10.1101/2024.03.25.586682.
data(DeltaSimulatedPedigree)
Contains a list object with nine fields, consisting of nine lower triangular matrices with the joint genotype counts for 111 individuals.
data("GTC")
A list with nine fields.
Goudet, J. (2022) JGTeach: JG Teaching material. R package version 0.1.9. https://github.com/jgx65
Graffelman, J., Weir, B.S. and Goudet, J. (2024) Estimation of Jacquard's genetic identity coefficients with bi-allelic variants by constrained least-squares. Preprint at bioRxiv doi:10.1101/2024.03.25.586682.
data(GTC)
Function Jacquard.cls estimates the nine condensed Jacquard coefficients of a pair of individuals from their joint genotype probabilities and the allele frequencies by constrained least squares.
Jacquard.cls(Xlist, mafvec = NULL, eps = 1e-06, delta.init = runif(9), Mavg = NULL, inner.iter = 1000, outer.iter = 1000, verbose = TRUE)
Xlist |
A list object with nine fields containing the matrices with joint genotype counts. |
mafvec |
A vector with the minor allele frequencies for all genetic variants. |
eps |
Tolerance criterion for the solver ( |
delta.init |
Initial vector of estimates for the nine condensed Jacquard coefficients. |
Mavg |
A nine by nine matrix of conditional probabilities, allele frequency dependent. This matrix
is calculated by |
inner.iter |
Maximum number of inner iterations for the solver (1000 by default). |
outer.iter |
Maximum number of outer iterations for the solver (1000 by default). |
verbose |
Print output on the progress of the algorithm if |
Function Jacquard.cls relies on the solver solnp from the Rsolnp package.
A list object with fields:
delta |
A list with nine matrices of estimates of pairwise Jacquard coefficients. |
convergence |
A matrix with the convergence status for each pair (0 = converged; 1 = not converged). |
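To give a feel for least squares under the constraints that the coefficients are non-negative and sum to one, the sketch below solves a toy problem in base R. It deliberately swaps in a softmax reparametrization with optim instead of the solnp solver that Jacquard.cls relies on, and the 9 by 9 matrix M is random stand-in data, not the output of MakeM. All object names are hypothetical.

```r
set.seed(42)

# Hypothetical 9 x 9 matrix with columns summing to one; it stands in for a
# matrix of conditional genotype probabilities and is NOT produced by MakeM.
M <- matrix(runif(81), 9, 9)
M <- sweep(M, 2, colSums(M), "/")

# Made-up "true" coefficients on the simplex and the probabilities they imply.
delta.true <- c(0, 0, 0, 0, 0, 0, 0.25, 0.50, 0.25)
y <- as.vector(M %*% delta.true)

# Softmax maps unconstrained z to a vector that is non-negative and sums to one.
softmax <- function(z) {
  e <- exp(z - max(z))
  e / sum(e)
}

# Least-squares criterion expressed in the unconstrained parameters z.
objective <- function(z) sum((y - M %*% softmax(z))^2)

z0 <- rep(0, 9)                      # start from the uniform vector
fit <- optim(z0, objective, method = "BFGS")
delta.hat <- softmax(fit$par)        # estimates that respect both constraints

print(round(delta.hat, 3))
print(fit$convergence)               # optim's own code: 0 means converged
```

The reparametrization keeps every iterate inside the simplex, so no explicit constraint handling is needed; an augmented Lagrangian solver such as solnp handles the same constraints directly.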
Jan Graffelman ([email protected])
Graffelman, J., Weir, B.S. and Goudet, J. (2024) Estimation of Jacquard's genetic identity coefficients with bi-allelic variants by constrained least-squares. Preprint at bioRxiv doi:10.1101/2024.03.25.586682.
Ghalanos, A. and Theussl, S. (2015) Rsolnp: General Non-linear Optimization Using Augmented Lagrange Multiplier Method. R package version 1.16. https://cran.r-project.org/package=Rsolnp
data(SimulatedPedigree)
Xgen <- as.matrix(SimulatedPedigree[,6:ncol(SimulatedPedigree)])
data(GTC)
mafvec <- mafvector(Xgen)
# Restrict the joint genotype counts to the first three individuals
ii <- 1:3
GTCsubset <- vector("list", 9)
for (k in 1:9) {
  GTCsubset[[k]] <- GTC[[k]][ii,ii]
}
output <- Jacquard.cls(GTCsubset,mafvec=mafvec, eps=1e-06)
Delta.cls <- output$delta
print(Delta.cls)
print(output$convergence)
#
# A particular estimate of a Jacquard coefficient for a particular pair can
# be extracted from Delta.cls
#
# E.g., Delta_9 of the first pair of individuals (1,2) can be extracted by
#
D9_12 <- Delta.cls[[9]][1,2]
Function JointGenotypeCounts counts for each pair of individuals in the database their nine joint genotype counts.
JointGenotypeCounts(X.gen, one.is.minor = TRUE)
X.gen |
A matrix with genotype data coded in (0,1,2) format |
one.is.minor |
If |
A list object with nine fields containing:
f0000 |
Matrix of (0/0,0/0) counts for all pairs |
f1111 |
Matrix of (1/1,1/1) counts for all pairs |
f1101 |
Matrix of (1/1,0/1) counts for all pairs |
f0111 |
Matrix of (0/1,1/1) counts for all pairs |
f0101 |
Matrix of (0/1,0/1) counts for all pairs |
f1100 |
Matrix of (1/1,0/0) counts for all pairs |
f0011 |
Matrix of (0/0,1/1) counts for all pairs |
f0100 |
Matrix of (0/1,0/0) counts for all pairs |
f0001 |
Matrix of (0/0,0/1) counts for all pairs |
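For a single pair, the nine joint genotype counts are simply a 3 by 3 cross-tabulation of the two genotype vectors. The base R sketch below shows this for made-up data; the vectors g1 and g2 are hypothetical, and how the cells map onto the field names above depends on the allele coding, which is not modelled here.

```r
# Made-up genotypes (0, 1 or 2 copies of an allele) of two individuals
# at eight markers.
g1 <- c(0, 1, 2, 2, 1, 0, 1, 2)
g2 <- c(0, 1, 2, 1, 2, 0, 0, 2)

# Cross-tabulate with fixed levels so that all nine combinations appear,
# including those with a count of zero.
counts <- table(factor(g1, levels = 0:2), factor(g2, levels = 0:2))

print(counts)
```

Rows refer to the first individual and columns to the second, so the count of markers where the first carries two copies and the second one copy is counts["2", "1"].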
Jan Graffelman ([email protected])
Graffelman, J., Weir, B.S. and Goudet, J. (2024) Estimation of Jacquard's genetic identity coefficients with bi-allelic variants by constrained least-squares. Preprint at bioRxiv. doi:10.1101/2024.03.25.586682
data(SimulatedPedigree)
JC <- JointGenotypeCounts(SimulatedPedigree[1:3,1:100])
print(JC)
Function mafvector calculates genotype counts columnwise and determines the minor allele frequency for each column.
mafvector(X)
X |
A matrix with (0,1,2) genotype data, individuals in rows, markers in columns. |
mafvector calculates the frequency of the minor allele irrespective of the coding; i.e., irrespective of whether the genotype data represent major or minor allele counts. Missing values are discarded for the calculation of the MAF.
a vector
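The coding-independent calculation described above can be sketched in base R as follows. This is an illustration under the assumption that each entry counts copies of one allele; the helper name minor_allele_freq is hypothetical and the package may compute the same quantity through genotype counts instead of column means.

```r
# Made-up genotype matrix: 4 individuals (rows), 3 markers (columns),
# with one missing value.
X <- matrix(c(0, 1, 2, 1,
              2, 2, 1, NA,
              0, 0, 0, 1), nrow = 4)

minor_allele_freq <- function(X) {
  # Frequency of the counted allele per column, ignoring missing genotypes.
  p <- colMeans(X, na.rm = TRUE) / 2
  # Fold at 0.5 so the result is the minor allele frequency whichever
  # allele was counted.
  pmin(p, 1 - p)
}

maf <- minor_allele_freq(X)
print(maf)
```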
Jan Graffelman ([email protected])
data(SimulatedPedigree)
p <- mafvector(SimulatedPedigree[,1:10])
print(p)
Function MakeM creates the matrix of conditional joint genotype probabilities for biallelic markers for a given allele frequency.
MakeM(p)
p |
the allele frequency |
a 9 by 9 matrix
Jan Graffelman ([email protected])
Csuros, M. (2014) Non-identifiability of identity coefficients at biallelic loci. Theoretical Population Biology 92, pp. 22-29. doi:10.1016/j.tpb.2013.11.001.
Graffelman, J., Weir, B.S. and Goudet, J. (2024) Estimation of Jacquard's genetic identity coefficients with bi-allelic variants by constrained least-squares. Preprint at bioRxiv doi:10.1101/2024.03.25.586682.
set.seed(123)
p <- runif(1)
M <- MakeM(p)
Function PairwiseList processes the list structure of Jacquard coefficients and converts it into a table of pairs of individuals with their Jacquard coefficients.
PairwiseList(X, digits = 3)
X |
A list structure of nine matrices of Jacquard coefficients. |
digits |
Jacquard coefficients are rounded to the given number of digits. |
A matrix
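The conversion from nine matrices to one row per pair can be sketched in base R as shown below. The input values are random, the column names are hypothetical, and listing the pairs with i < j is an assumption; the package may select or order pairs differently.

```r
# Hypothetical list of nine 3 x 3 matrices filled with made-up values.
set.seed(2)
X <- replicate(9, matrix(runif(9), 3, 3), simplify = FALSE)

# Row and column indices of all pairs with i < j (assumed pair selection).
idx <- which(upper.tri(X[[1]]), arr.ind = TRUE)

# For each of the nine matrices, read the entries of those pairs and round.
coefs <- sapply(X, function(m) round(m[idx], 3))

# One row per pair: the two indices followed by the nine coefficients.
pair.table <- cbind(idx, coefs)
colnames(pair.table) <- c("i", "j", paste0("D", 1:9))

print(pair.table)
```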
Jan Graffelman ([email protected])
data(DeltaSimulatedPedigree)
# Restrict the coefficient matrices to the first three individuals
ii <- 1:3
SubSet <- vector("list", 9)
for (k in 1:9) {
  SubSet[[k]] <- DeltaSimulatedPedigree[[k]][ii,ii]
}
List <- PairwiseList(SubSet)
print(List)
A matrix containing 111 individuals (rows), pedigree information (first five columns) and 20,000 single nucleotide polymorphisms (remaining columns) coded in (0,1,2) format and simulated according to a pedigree of 20 unrelated founders with six subsequent generations.
data("SimulatedPedigree")
A data frame containing 111 rows and 20,005 columns.
The SNP data was generated using the JGTeach package. The genotype counts in the object SimulatedPedigree represent the counts of the minor allele.
Goudet, J. (2022) JGTeach: JG Teaching material. R package version 0.1.9. https://github.com/jgx65
Graffelman, J., Weir, B.S. and Goudet, J. (2024) Estimation of Jacquard's genetic identity coefficients with bi-allelic variants by constrained least-squares. Preprint at bioRxiv doi:10.1101/2024.03.25.586682.
data(SimulatedPedigree)