Allele manifestation in cancer: analytical expressions

orchidealecian 130 views 22 slides Sep 19, 2024

About This Presentation

Author: Orchidea Maria Lecian
Speaker: Orchidea Maria Lecian
Title: Allele manifestation in cancer: analytical expressions
Abstract: The models of allele manifestation in cancer are newly formulated,
according to which the processes of allele manifestations are
described according to a Markov decisi...


Slide Content

Allele manifestation in cancer: analytical expressions.
Orchidea Maria Lecian
Sapienza University of Rome,
Rome, Italy.
International Meet on Stem cells and Regenerative Medicine
STEMCELLSMEET2024
19 September 2024,
Dubai, UAE.
International Meet on Stem cells and Regenerative Medicine STEMCELLSMEET2024 19 September 2024, Dubai, UAE.Allele manifestation in cancer: analytical expressions. Orchidea Maria Lecian Sapienza University of Rome, Rome, Italy.

Abstract
The models of allele manifestation in cancer are newly formulated:
the processes of allele manifestation are described as a Markov
decision process. For each of the several models, the expressions of
the chains are written analytically, and the manifolds (and their
metrics) on which the graphs live are classified.

Summary
Gene manifestations:
- Low-rank-tensor methods;
- Allele-specific copy number methods.
Allele manifestation: heterogeneous sets of data:
- extract the Hoppe-urn model from the Bayesian calculation;
- define the directed acyclic graphs of the corresponding Markov
(decision) process with Kantorovich metric.
Ranking of cancer genes from integration of heterogeneous data:
- define the hidden Markov model from the latent states;
- take variance into account;
- define the Markov process on the Stiefel manifold with uniform
metric after Gibbs sampling.

Introduction
The prospect of sequencing technologies allowing for the characterization
of cancer genomes is a long-standing question; the investigation of
the mutational 'landscape' of tumours is providing new insights into the
evolution of the cancer genome. Inference is usually used in the data
processing
L.R. Yates, P.J. Campbell, Evolution of the cancer genome, Nat. Rev. Genet. 13, 795 (2012).
in
- multi-sampling strategies, and
- single-cell sequencing.
Implications follow for:
• basic cancer biology, and
• development of anti-tumour therapy.
R. Wooster, K.E. Bachman, Catalogue, cause, complexity and cure; the many uses of
cancer genome sequence, Curr. Opin. Genet. Dev. 336, 41 (2010);
R.A. Burrell, C. Swanton, The evolution of the unstable cancer genome, Curr. Opin.
Genet. Dev. 61,7 (2014).

Gene manifestations
Low-rank-tensor methods
The evolution of cancer phenomena can be modelled as continuous-time
Markov chains.
In
P. Georg, L. Grasedyck, M. Klever et al., Low-rank tensor methods for Markov chains
with applications to tumor progression models, J. Math. Biol. 86, 7 (2023).
transition rates are hypothesised as separable functions, i.e. such that
convergent 'iteration methods' can be used, for which the notion
of distribution is retrieved.
Let $\hat{Q}$ be the fundamental matrix of the chosen Markov chain on a
discrete state space $S$, with initial distributions assumed as defined.

It is here
the entries of $\hat{Q}$ be as infinitesimal.
Let $\hat{P}$ be the probability matrix associated after the fundamental matrix
$\hat{Q}$ and after the new hypothesis: the distributions from the probability
matrix $\hat{P}$ are defined from the initial value $p$, where the latter is written
as the integral
$$p = \int_0^{\infty} e^{\tau[\hat{Q}-\hat{I}]}\, p(0)\, d\tau .$$
The $p(0) \sim o(0)$
proper definition.
$\tau$ is a time variable, and $[\hat{Q}-\hat{I}]$ is a regular operator.
The spectrum $\sigma([\hat{Q}-\hat{I}])$ of the operator $[\hat{Q}-\hat{I}]$ is written from the states $x \in S$
from the definition
$$\sigma([\hat{Q}-\hat{I}]) \subseteq \bigcup_{x \in S} \{z \in \mathbb{C} : |z - Q_{xx}| \le |Q_{xx}|\} \subseteq \{z \in \mathbb{C} : \mathrm{Re}\, z \le 0\}.$$
It is important to remark that
the Markov models differ from the 'dominant-eigenvalue' technique.
OML, Markov models of genomic event, GJMCCR-11-281 (2024).
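The integral above can be checked on a toy example. The following is a minimal sketch, assuming the upper limit of the integral is infinite and a column convention in which every column of the generator sums to zero; under these assumptions the integral has the closed form $p = (\hat{I}-\hat{Q})^{-1} p(0)$. The 3-state matrix `Q`, the vector `p0` and the helper names are hypothetical and are not taken from the slides.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x


def matvec(a, v):
    """Matrix-vector product for nested lists."""
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]


# Hypothetical 3-state generator (assumption: every column sums to 0).
Q = [[-1.0, 0.0, 0.0],
     [0.6, -0.5, 0.0],
     [0.4, 0.5, 0.0]]
p0 = [1.0, 0.0, 0.0]
size = len(Q)
identity = [[float(i == j) for j in range(size)] for i in range(size)]

# Closed form of the integral: p = (I - Q)^{-1} p(0).
i_minus_q = [[identity[i][j] - Q[i][j] for j in range(size)] for i in range(size)]
p_closed = solve(i_minus_q, p0)

# Numerical check: Riemann sum of exp(tau (Q - I)) p(0), stepped with Euler.
q_minus_i = [[Q[i][j] - identity[i][j] for j in range(size)] for i in range(size)]
step, y, p_numeric = 0.01, p0[:], [0.0] * size
for _ in range(4000):
    p_numeric = [acc + step * yi for acc, yi in zip(p_numeric, y)]
    y = [yi + step * dyi for yi, dyi in zip(y, matvec(q_minus_i, y))]

# Gershgorin discs of Q: centre Q_xx, radius |Q_xx|, all inside Re z <= 0.
discs = [(Q[x][x], abs(Q[x][x])) for x in range(size)]

print(p_closed, p_numeric, discs)
```

In this toy case the components of `p_closed` sum to one, which is what one would expect from a probability distribution obtained from a normalised `p0`.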

Allele-specific copy number methods
Allele-specific copy number methods allow one to study copy-number
abnormalities, as from
H. Choo-Wosoba, P.S. Albert, B. Zhu, A hidden Markov modeling approach for
identifying tumor subclones in next-generation sequencing studies, Biostatistics 23, 69
(2022).
For this purpose, a sub-Hidden Markov Model (subHMM) is implemented: it
allows one to consider both the 'subclone region' and the 'region-specific
genotype'.
The hidden-state variable $W_k$ of the state $k$ represents the
'conglomeration of the subclone genotype' and the 'clonal proportion'.
In more detail, the state $W_k[Z_k, U_k, T_k]$ is defined so as to give rise to
time-dependent transition probabilities which can be represented as a
'multinomial distribution'.
The states $W_k$ are specified by $Z_k$, the 'mainclone genotype' of the
locus $k$; $U_k$, the 'indicator' of whether there is a subclone in $k$; and
$T_k$, the 'subclone genotype' (i.e. if the considered subclone exists).
Transitions of the states $W_k$ are considered only for consecutive 'loci'.
A maximum number of copies is assumed.

Therefore,
compose an enveloping algebra.
Under the hypothesis of the 'constant clonal proportion', the
transition probabilities $P_t(z)$ determine that the hidden states are
not observed, and 'allele-specific' elements are considered.
The proof of the Markovian property of the originating chain is
fundamental in the definition of the Markov State Model(s) (MSM) from
which the subHMM is taken. In the case of the Markovian feature,
the possibility to
are defined as orthogonal in the MSM.
OML, Markov models of genomic event, GJMCCR-11-281 (2024).

Allele manifestation from heterogeneous set of
experimental data
Bayesian model for identifying the drivers
• $m_i$: probability of the driver mutation of the gene $i$, and
• $n_i$: 'background mutation probability' of the gene $i$.
The probabilities $P(g_i)$ for the gene $i$ are calculated as
$$P(g_i = 1) = p\, m_i + (1-p)\, n_i,$$
$$P(g_i = 0) = p\,(1-m_i) + (1-p)\,(1-n_i),$$
where the mutation is understood as $g_i = 1$.
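As a quick numerical illustration of the two expressions, the sketch below evaluates them for one gene and confirms that they add up to one. The numerical values are hypothetical, and the weight `p` is treated simply as a mixing probability between the driver and the background terms, which is an assumption, since the slide does not define it.

```python
def mutation_probabilities(p, m_i, n_i):
    """Return (P(g_i = 1), P(g_i = 0)) for one gene.

    p   : mixing weight between driver and background terms (assumed meaning)
    m_i : probability of the driver mutation of gene i
    n_i : background mutation probability of gene i
    """
    p_mutated = p * m_i + (1 - p) * n_i
    p_not_mutated = p * (1 - m_i) + (1 - p) * (1 - n_i)
    return p_mutated, p_not_mutated


# Hypothetical values, chosen only for illustration.
p_mutated, p_not_mutated = mutation_probabilities(p=0.2, m_i=0.6, n_i=0.05)
print(p_mutated, p_not_mutated, p_mutated + p_not_mutated)
```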

In order to establish the Bayesian estimation for the gene $i$ to be mutated
from the heterogeneous set of data, the Bayesian estimation is performed
from the Bayesian estimators $\Theta_{m_i}$ and $\Theta_{n_i}$ in the function $f$ as
$$f(m_1, m_2, ..., m_M, n_1, n_2, ..., n_M, p \mid GAL) = \prod_{i=1}^{N} \prod_{j=1}^{M} [p\, m_i + (1-p)\, n_i]^{g_{ij}}\, [p\,(1-m_i) + (1-p)\,(1-n_i)]^{1-g_{ij}}\, \Theta_{m_i}\, \Theta_{n_i}$$
M. Ma, Z. Zhao, T. Gui, Y. Chen, X. Dang, D. Wilkins, A generative Bayesian model to identify cancer driver
genes, 2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Washington, DC, 2015,
pp. 351-356.
where the functions $\Theta$ are issued from the Hoppe urn model as Beta
distributions, as from
F.L. Rios, J.M. Noble, T.J.T. Koski, A Prior Distribution over Directed Acyclic Graphs for Sparse Bayesian
Networks, e-print arXiv:1504.06701.
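The function $f$ can be evaluated on the logarithmic scale to avoid underflow. The following is a minimal sketch under stated assumptions: the Beta parameters `prior_m` and `prior_n` are hypothetical, the data matrix `g` is made up, and the $\Theta$ factors are kept inside the double product, which is a literal reading of the expression and may differ from the cited model.

```python
import math


def log_beta_pdf(x, a, b):
    """Logarithm of the Beta(a, b) density at 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x)


def log_f(g, m, n, p, prior_m=(2.0, 2.0), prior_n=(1.0, 9.0)):
    """Logarithm of f for binary data g[i][j] (gene i, sample j).

    prior_m and prior_n are hypothetical Beta parameters for Theta_m and
    Theta_n; they are not given on the slides.
    """
    total = 0.0
    for i, row in enumerate(g):
        p_mutated = p * m[i] + (1 - p) * n[i]
        p_not_mutated = p * (1 - m[i]) + (1 - p) * (1 - n[i])
        log_theta = log_beta_pdf(m[i], *prior_m) + log_beta_pdf(n[i], *prior_n)
        for g_ij in row:
            total += g_ij * math.log(p_mutated)
            total += (1 - g_ij) * math.log(p_not_mutated)
            total += log_theta  # literal reading: priors inside the product
    return total


# Hypothetical data: 2 genes, 3 samples.
g = [[1, 0, 1],
     [0, 0, 1]]
value = log_f(g, m=[0.6, 0.3], n=[0.05, 0.1], p=0.2)
print(value)
```

With flat priors, `prior_m=(1.0, 1.0)` and `prior_n=(1.0, 1.0)`, the $\Theta$ terms vanish on the log scale and only the Bernoulli part of the product remains.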

•The Hoppe urn model is newly specified for the selection of the chosen
genes; the steps needed to this determination are followed and explicated:
•it is therefore proven that
it is possible
which a Kantorovich metric
experimental error propagation.
•The Markov decision process defines a
$i \to k$ schematised according to a path of nodes and of edges in which
the mutation of the gene $i$ is found at a node.
O.M. Lecian, Chain and Multi-spacial Markov chains of time dependence of allele manifestation in cancer: metric,
algebras and expectations of the reward, Journal of Medical and Clinical Case Reports 1(8) (2024).

The complete Bayesian network of the complete set of heterogeneous
data is here decomposed for the factorisation.
Directed acyclic graphs with Kantorovich metric
• random variables $(x_1, ..., x_d)$:
the set of random variables can be ordered in $d$ ways, i.e. which
represent the configuration in which there are the chosen number of
mutations of the gene $i$.
• factorisation properties: permutations $\rho$ of $(1, ..., d)$ are expressed in
order to induce the expression of the probability as
$$P_{(x_1, ..., x_d)} \equiv \prod_{j=1}^{d} \left[ P_{\rho(j) \mid Pa(\rho(j))} \right]$$
where $Pa$ is defined after the consideration of $P(\rho(j))$, where
there are the $\rho$ 'predecessors' of $x_{\rho(j)}$, such that $Pa(\rho(j))$ is the smallest
subset of $P_{\rho(j)}$ such that $P_{(x_1, ..., x_d)}$ is obeyed.
• After the study of
N. Ferns, P. Panangaden, D. Precup, Metrics for Finite Markov Decision Processes, UAI '04: Proceedings of the
20th Conference on Uncertainty in artificial intelligence, 162-169 (2004).
• score functions can be defined over the space of the DAGs. The
method compares with a Markov decision model, which is defined
on a Borel subset on which there acts a $\sigma$-algebra, from which the
measure compares with that of a Kantorovich metric when the error
propagation is taken into account.
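The factorisation of the joint probability over a directed acyclic graph can be illustrated on a small example. The sketch below assumes a hypothetical three-node DAG with binary variables and made-up conditional probability tables; it multiplies one factor per node, conditioned on that node's parents, following an ordering compatible with the graph.

```python
from itertools import product

# Hypothetical DAG: x1 -> x2, x1 -> x3, x2 -> x3 (binary variables).
parents = {"x1": (), "x2": ("x1",), "x3": ("x1", "x2")}
order = ("x1", "x2", "x3")  # an ordering rho compatible with the DAG

# cpt[node][parent values] = P(node = 1 | parents); made-up numbers.
cpt = {
    "x1": {(): 0.3},
    "x2": {(0,): 0.1, (1,): 0.7},
    "x3": {(0, 0): 0.05, (0, 1): 0.4, (1, 0): 0.5, (1, 1): 0.9},
}


def joint(assignment):
    """Joint probability as the product of P(x_rho(j) | Pa(rho(j)))."""
    prob = 1.0
    for node in order:
        parent_values = tuple(assignment[pa] for pa in parents[node])
        p_one = cpt[node][parent_values]
        prob *= p_one if assignment[node] == 1 else 1.0 - p_one
    return prob


# The factorised joint sums to one over all 2^3 configurations.
total = sum(joint(dict(zip(order, values)))
            for values in product((0, 1), repeat=len(order)))
all_mutated = joint({"x1": 1, "x2": 1, "x3": 1})
print(total, all_mutated)
```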

Distributions over the graph structures are therefore possible to be
defined.
The 'prior distribution' $P_{pr}(G) \equiv P_G$ over the graph structure $G$ is set, and
$P_{\vec{X}|G}$ is taken as the probability distribution over a set $X$, whose elements (which
will be identified with the Markov states) are ordered as
$\vec{X} \equiv (X_1, ..., X_d)$.
The posterior distribution $P_{po}(G)$ is taken accordingly as
$$P_{po}(G) \equiv P_G\, P_{\vec{X}|G}.$$
OML, The Markov chain of gene manifestation in cancer from a heterogeneous set of experimental data is defined
over directed acyclic graphs with Kantorovich metric, submitted to Journ. AppliedMath JAM.

Ranking of cancer genes from
integration of heterogeneous data
Let $n$ be the total number of genes, indexed by $j = 1, ..., n$, and let $m$ be the
total number of the patients of the sample, indexed by $i = 1, ..., m$. Let $\hat{A}$ be the affinity matrix
of a bipartite graph $G$. The entries $a_{ij}$ are defined as $a_{ij} = 1$ iff the
patient $i$ has a mutation of the gene $j$; $a_{ij} = 0$ otherwise.
The mutual-reinforcement model is implemented as follows.
The driver score $\mu_j$ is proportional to the number of patients having the
mutation $j$.
The patient score $\pi_i$ is directly proportional to the number of mutations
possessed by the patient $i$.
Accordingly, the following summations are written
$$\mu_j \propto \sum_{k:\, a_{kj}=1} \pi_k, \qquad \pi_i \propto \sum_{l:\, a_{il}=1} \mu_l.$$
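The two proportionality relations can be iterated until the scores stabilise. The following is a minimal sketch, assuming rows of the affinity matrix are patients and columns are genes, and assuming that the scores are renormalised to sum to one after each update; the matrix `A` and the number of iterations are hypothetical.

```python
def mutual_reinforcement(a, iterations=100):
    """Iterate driver scores mu (genes) and patient scores pi (patients)."""
    m, n = len(a), len(a[0])
    pi = [1.0 / m] * m
    mu = [1.0 / n] * n
    for _ in range(iterations):
        # Driver score: sum of the scores of the patients carrying gene j.
        mu = [sum(pi[k] for k in range(m) if a[k][j] == 1) for j in range(n)]
        norm = sum(mu)
        mu = [value / norm for value in mu]
        # Patient score: sum of the scores of the genes mutated in patient i.
        pi = [sum(mu[l] for l in range(n) if a[i][l] == 1) for i in range(m)]
        norm = sum(pi)
        pi = [value / norm for value in pi]
    return mu, pi


# Hypothetical affinity matrix: 4 patients (rows), 3 genes (columns).
A = [[1, 1, 0],
     [1, 0, 0],
     [1, 0, 1],
     [0, 1, 0]]
mu, pi = mutual_reinforcement(A)
print(mu, pi)
```

In this toy matrix the first gene is mutated in three of the four patients, so it receives the largest driver score.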

The probability matrix $\hat{A}^r$ that a patient $i$ exhibits a mutation of the gene
$j$ is calculated after the entries $a^r_{ij}$ as
$$a^r_{ij} = \frac{a_{ij}}{\sum_{k=1}^{n} a_{ik}}.$$
The probability matrix $\hat{A}^c$ that a mutation of the gene $j$ is exhibited in a
patient $i$ is calculated after the entries $a^c_{ij}$ as
$$a^c_{ij} = \frac{a_{ij}}{\sum_{k=1}^{m} a_{kj}}.$$
Transition matrices $\hat{B}^r$ are defined for patients as
$$\hat{B}^r = \alpha \hat{A}^r + (1-\alpha) \frac{1}{n} \hat{I}_{m \times n}$$
with $\hat{I}_{m \times n}$ an $m \times n$ matrix consisting of the entries $i^{m \times n}_{ij} \equiv 1\ \forall i, j$.
Transition matrices $\hat{B}^c$ are defined for the mutation of genes as
$$\hat{B}^c = \alpha \hat{A}^c + (1-\alpha) \frac{1}{m} \hat{I}_{m \times n}.$$
The parameter $\alpha$ therefore qualifies the data sampling.
C. Ma, Y. Chen, D. Wilkins, Ranking of cancer genes in Markov chain model through integration of heterogeneous
sources of data, 2014 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Belfast, UK, 248
(2014).
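The construction of the transition matrices can be sketched as follows, reading $\hat{A}^r$ as the affinity matrix normalised by rows (patients) and $\hat{A}^c$ as the affinity matrix normalised by columns (genes). This reading, the value `alpha=0.85` and the toy matrix are assumptions; the sketch also assumes that no patient and no gene has an all-zero row or column, since the normalisation would otherwise divide by zero.

```python
def transition_matrices(a, alpha=0.85):
    """Return (B_r, B_c) built from a binary m x n affinity matrix."""
    m, n = len(a), len(a[0])
    row_sums = [sum(row) for row in a]
    col_sums = [sum(a[i][j] for i in range(m)) for j in range(n)]
    # Row-normalised and column-normalised probability matrices.
    a_r = [[a[i][j] / row_sums[i] for j in range(n)] for i in range(m)]
    a_c = [[a[i][j] / col_sums[j] for j in range(n)] for i in range(m)]
    # Mix with the all-ones matrix, scaled by 1/n (rows) or 1/m (columns).
    b_r = [[alpha * a_r[i][j] + (1 - alpha) / n for j in range(n)]
           for i in range(m)]
    b_c = [[alpha * a_c[i][j] + (1 - alpha) / m for j in range(n)]
           for i in range(m)]
    return b_r, b_c


# Hypothetical affinity matrix: 4 patients (rows), 3 genes (columns).
A = [[1, 1, 0],
     [1, 0, 0],
     [1, 0, 1],
     [0, 1, 0]]
B_r, B_c = transition_matrices(A)
row_totals = [sum(row) for row in B_r]
col_totals = [sum(B_c[i][j] for i in range(len(A))) for j in range(len(A[0]))]
print(row_totals, col_totals)
```

With these scalings each row of `B_r` and each column of `B_c` sums to one, so both can be used as transition probabilities.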

The latent states with Gaussian noise
A 'Latent Markov Model', which describes time-varying 'latent states', is
written as an inhomogeneous first-order Markov process.
Unobserved heterogeneity is accounted for through clustering(s), in a mode in which
random effects are discrete and may evolve in time.
Let there be $\kappa$ latent states: the clustering properties allow that a
proportion of the elements of the sample is clustered with one of the $\kappa$
latent states.
A fixed number of clusters at the chosen time is assumed, and the
discretised-time model is chosen: the time variable is discretised as
$t = 1, 2, ..., \tau$, with $\tau$ the number of measurement possibilities, with
$\kappa_1, \kappa_2, ..., \kappa_\tau$ configurations.
The $\kappa_t$ configurations are described after an $r$-dimensional vector $y_i(t)$ which
accounts for the $i$-th element of the sample at the time $t$.
$U(t)$ indicates the 'unobserved discrete random variable' whose support
can be taken as $1, 2, ..., \kappa_t$.
For $\kappa_{t-1} \neq \kappa_t$, a rectangular transition matrix is necessary.
A positive-definite covariance matrix is introduced.
G. Anderson, A. Farcomeni, M.G. Pittau, R. Zelli, Rectangular latent Markov models for time-specific clustering,
with an analysis of the well being of nations, Journal of the Royal Statistical Society Series C: Applied Statistics 68,
603 (2019).
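The need for a rectangular transition matrix can be seen on a toy case in which the number of clusters changes between two consecutive times. The sketch below assumes two latent states at time $t-1$ and three at time $t$; the transition probabilities and the initial cluster proportions are hypothetical.

```python
def propagate(proportions, transition):
    """Push cluster proportions through a (possibly rectangular) transition matrix."""
    n_next = len(transition[0])
    return [sum(proportions[u] * transition[u][v] for u in range(len(proportions)))
            for v in range(n_next)]


# Hypothetical 2 x 3 transition matrix: kappa_{t-1} = 2 rows, kappa_t = 3 columns.
# Each row is a probability distribution over the clusters at time t.
P_t = [[0.7, 0.2, 0.1],
       [0.1, 0.3, 0.6]]

proportions_prev = [0.4, 0.6]   # cluster proportions at time t-1
proportions_next = propagate(proportions_prev, P_t)
print(proportions_next)
```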

Acyclic directed graph on Stiefel manifold with uniform metric
• A graph $G = (V, E)$ is taken: it is defined as bipartite iff there exists a
bipartition of the vertices $V$ such that every edge in $E$ connects a vertex of
one part to a vertex of the other. The bipartition of the vertices in the graph $G$ is indicated
as $G(V_1, V_2, E)$. If the edges are allowed to connect any number of
vertices, 'hyperedges' arise.
Let $u$ be a node in $V_1$; let $v$ be a node in $V_2$, and let $H$ be a hyperedge.
The 'fill' $F$ is the proportion of edges within the total number of possible
edges.
• The graph $G(V_1, V_2, E)$ is taken to encode the schematisation of the
ranking, in which only edges between nodes in $V_1$ and nodes in $V_2$ are
allowed.

The number of paths defines the number of blocks.
Let $\hat{Z}$ be an $r \times s$ real matrix. The singular-value decomposition is
written as
$$\hat{Z} = \hat{\upsilon}\, \hat{\Sigma}\, \hat{V}^T.$$
The matrices $\hat{\upsilon}$ and $\hat{V}$ are orthogonal matrices (unitary in the complex case):
• $\hat{\upsilon}$ is an $r \times r$ matrix, and
• $\hat{V}$ is an $s \times s$ matrix. Furthermore, $\hat{\Sigma}$ is an $r \times s$ rectangular
diagonal matrix.
The singular-value decomposition of $\hat{\Sigma}$ is studied.
For the adjacency matrix $\hat{A}$, the following decomposition holds
$$\hat{A} = \begin{pmatrix} \hat{0} & \hat{C} \\ \hat{C}^T & \hat{0} \end{pmatrix}.$$
The matrix $\hat{C}$ is the biadjacency matrix of the graph $G$.
The powers of the matrix $\hat{A}$ are defined after the properties of transposition
of $\hat{C}$.
The number of paths of the graph $G$ is therefore calculated.
The number of paths of length $2k$ between two nodes $u$ and $v$ equals $[(CC^T)^k]_{uv}$, $u, v \in V_1$.
The number of paths of length $2k$ between two nodes $u$ and $v$ equals $[(C^T C)^k]_{uv}$, $u, v \in V_2$.
The number of paths of length $2k+1$ between two nodes $u$ and $v$ equals $[(CC^T)^k C]_{uv}$, $u \in V_1$, $v \in V_2$.
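The three path-counting identities can be verified numerically by comparing the blocks of the powers of the block adjacency matrix with the corresponding products of the biadjacency matrix. The sketch below uses a hypothetical biadjacency matrix `C` with two nodes in $V_1$ and three nodes in $V_2$. Note that matrix powers count walks, i.e. paths in which vertices may be revisited.

```python
def matmul(x, y):
    """Product of two matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def transpose(x):
    return [list(column) for column in zip(*x)]


def matpow(x, k):
    """k-th power of a square matrix (k >= 0)."""
    size = len(x)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    for _ in range(k):
        result = matmul(result, x)
    return result


def block_adjacency(c):
    """Build A = [[0, C], [C^T, 0]] from the biadjacency matrix C."""
    r, s = len(c), len(c[0])
    ct = transpose(c)
    top = [[0] * r + c[i] for i in range(r)]
    bottom = [ct[j] + [0] * s for j in range(s)]
    return top + bottom


# Hypothetical biadjacency matrix: |V1| = 2 (rows), |V2| = 3 (columns).
C = [[1, 1, 0],
     [0, 1, 1]]
r, s, k = len(C), len(C[0]), 2
A = block_adjacency(C)
Ct = transpose(C)

even_power = matpow(A, 2 * k)
odd_power = matpow(A, 2 * k + 1)

v1_block = [row[:r] for row in even_power[:r]]       # u, v in V1
v2_block = [row[r:] for row in even_power[r:]]       # u, v in V2
cross_block = [row[r:] for row in odd_power[:r]]     # u in V1, v in V2

cct_k = matpow(matmul(C, Ct), k)
ctc_k = matpow(matmul(Ct, C), k)
print(v1_block == cct_k, v2_block == ctc_k, cross_block == matmul(cct_k, C))
```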

The Markov process on a Stiefel manifold
• The matrix $\hat{T}$ is chosen as
$$\hat{T} = \hat{M} + \hat{H}$$
• $\hat{T}$ is an $m \times n$ rectangular matrix,
• $\hat{M}$ is the unobserved mean matrix of rank $K < m \wedge n$, and
• $\hat{H}$ is the matrix which accounts for the Gaussian noise.
• The matrix $\hat{M}$ is decomposed as
$$\hat{M} = \hat{U}\, \hat{D}\, \hat{W}$$
with $\hat{U}$ an $m \times K$ matrix, $\hat{D}$ a $K \times K$ diagonal matrix, and $\hat{W}$ a $K \times n$ matrix.
The representation of $\hat{T}$ has to be chosen, in the cases in which a very high variance is
present.
• The entries of $\hat{T}$ represent covariance across the $n$ columns of the
$K < n$ unobserved latent factors.
For the rank $K$, let $\hat{U}$ be an $m \times K$ orthonormal matrix. The set of the matrices $\hat{U}$ is
$\nu_{K,m}$, the Stiefel manifold.
• The probability distribution of $\hat{U}$ is the uniform probability distribution
on $\nu_{K,m}$.
The metric of the Stiefel manifold is the uniform metric.
Once the rank $K$ is fixed, the Markov chain is obtained after the Gibbs
sampling.
OML, Ranking of cancer genes from integration of heterogeneous data sources with Gaussian noise, related latent
models and related time-specific clustering(s) Hidden Markov Model(s) admit rectangular-matrix-representation
Markov chain on Stiefel manifold with uniform metric, submitted to Journal of AppliedMath JAM.
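A uniform draw on the Stiefel manifold and the corresponding noisy low-rank matrix can be sketched as follows. The sketch assumes the rank-$K$ shapes $\hat{U}$ ($m \times K$), $\hat{D}$ ($K \times K$) and $\hat{W}$ ($K \times n$), takes $\hat{W}$ with orthonormal rows, and uses Gram-Schmidt on Gaussian columns, a standard way of sampling uniformly distributed orthonormal frames. It does not implement the Gibbs sampler mentioned on the slide; all sizes, singular values and the noise level are hypothetical.

```python
import random


def sample_stiefel(m, k, rng):
    """Draw an m x k matrix with orthonormal columns.

    Gram-Schmidt applied to independent standard Gaussian columns gives a
    frame that is uniformly distributed on the Stiefel manifold.
    """
    columns = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(m)]
        for _pass in range(2):  # second pass improves numerical orthogonality
            for u in columns:
                dot = sum(a * b for a, b in zip(u, v))
                v = [a - dot * b for a, b in zip(v, u)]
        norm = sum(a * a for a in v) ** 0.5
        columns.append([a / norm for a in v])
    return [[columns[c][row] for c in range(k)] for row in range(m)]


def noisy_low_rank(m, n, k, diagonal, sigma, rng):
    """Return (U, T) with T = U D W + H, H being i.i.d. Gaussian noise."""
    u = sample_stiefel(m, k, rng)      # m x K
    w_t = sample_stiefel(n, k, rng)    # n x K, so that W = w_t^T is K x n
    t = []
    for i in range(m):
        row = []
        for j in range(n):
            mean = sum(u[i][a] * diagonal[a] * w_t[j][a] for a in range(k))
            row.append(mean + rng.gauss(0.0, sigma))
        t.append(row)
    return u, t


rng = random.Random(0)
U, T = noisy_low_rank(m=6, n=4, k=2, diagonal=[3.0, 1.0], sigma=0.1, rng=rng)

# Gram matrix U^T U should be the K x K identity.
gram = [[sum(U[row][a] * U[row][b] for row in range(6)) for b in range(2)]
        for a in range(2)]
print(gram)
```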

Outlook
• One of the causes of discussion about the gene manifestation is the
Markov aspects and the non-Markov aspects of the processes: the role of
the experimental-error propagation is crucial, for which several techniques
are developed to amend the experimental techniques: inference,
regressions and numerical simulations are tried.
D. Draper, Assessment and Propagation of Model Uncertainty, Journal of the Royal Statistical Society. Series B
(Methodological) 57(1), 45 (1995).
• The causes of non-Markov-like processes are still under investigation.

Thank You for Your attention.