Tutorial07

Uploaded from authorPOINTLite
Views:
 
Category: Education
     
 

Presentation Description

No description available.

Comments

Presentation Transcript

Slide1: 

Tutorial by Ma’ayan Fishelson Changes made by Anna Tzemach

The Given Problem: 

The Given Problem Input: A pedigree + phenotype information about some of the people. These people are called typed. Output: the probability of the observed data, given some probability model for the transmission of alleles. typed

Slide3: 

Q: What is the probability of the observed data composed of ? A: There are three types of probability functions: founder probabilities, penetrance probabilities, and transmission probabilities.

Founder Probabilities – One Locus: 

Suppose the gene frequency of d is 0.05, then: P(d/h) = 2 * 0.05 * 0.95 Founder Probabilities – One Locus Founders – individuals whose parents are not in the pedigree. We need to assign probabilities to their genotypes. This is done by assuming Hardy-Weinberg equilibrium.

Founder Probabilities – Multiple Loci: 

Founder Probabilities – Multiple Loci According to linkage equilibrium, the probability of the multi-locus genotype of founder k is: Pr(xk) = Pr(xk1) *…* Pr(xkn)

Penetrance Probabilities: 

Penetrance Probabilities Penetrance: the probability of the phenotype, given the genotype. E.g.,dominant disease, complete penetrance: E.g., recessive disease, incomplete penetrance: Can be, for example, sex-dependent, age-dependent, environment-dependent.

Transmission Probabilities: 

Transmission Probabilities Transmission probability: the probability of a child having a certain genotype given the parents’ genotypes. Pr(xc| xm, xf). If we split the ordered genotype xc into the maternal allele xcm and the paternal allele xcf, we get: Pr(xc| xm, xf) = Pr(xcm|xm)Pr(xcf|xf) The inheritance from each parent is independent.

Transmission Probabilities – One locus: 

Transmission Probabilities – One locus The transmission is according to the 1st law of Mendel. Pr(Xc=d/h | Xm=h/h, Xf=d/h) = Pr(Xcm=h | Xm=h/h)*Pr(Xcf=d | Xf=d/h) = 1 * ½ = ½ We also need to add the inheritance probability of the other phase, but we can see that it’s zero !

Transmission Probabilities – One locus: 

Transmission Probabilities – One locus Different children are independent given the genotypes of their parents. Pr(X3=d/h, X4=h/h, x5=d/h | X1=d/h, X2=h/h) = = (1 * ½) * (1 * ½) * (1 * ½)

Transmission Probabilities – Multiple Loci: 

Transmission Probabilities – Multiple Loci Let’s look at paternal inheritance for example. We generate all possible recombination sequences (s1,s2,…,sn), where sl = 1 or sl = -1. (2n sequences for n loci). Each sequence determines a selection of paternal alleles p1,p2,…,pn where: and therefore its probability of inheritance is: We need to sum the probabilities of all 2n recombination sequences.

Calculating the Likelihood of Family Data - Summary: 

Calculating the Likelihood of Family Data - Summary For a pedigree with m people: where x=(x1,…,xm) and g=(g1,…,gm). The likelihood of the data is the probability of the observed data (the known phenotypes), given certain values for the unknown recombination fractions.

Calculating the Likelihood of Family Data - Summary: 

Calculating the Likelihood of Family Data - Summary Gi : genotype vector for individual i Founders: 1..k Non founders: im(i), f(i) Founder priorsby Hardy-Weinberg Recombinationprobabilities Penetrances

Computational Problem: 

Computational Problem Complexity disaster: Exponential in #markers Exponential in #individuals Performing a multiple sum over all possible genotype combinations for all members of the pedigree.

Elston-Stewart algorithm: 

Elston-Stewart algorithm The Elston-Stewart algorithm provides a means for evaluating the multiple sum in a streamlined fashion, for simple pedigrees. More efficient computation Exponential in #markers Linear in #individuals

Simple Pedigree: 

Simple Pedigree No consanguineous marriages, marriages of blood-related individuals ( no loops in the pedigree). There is one pair of founders from which the whole pedigree is generated.

Simple Pedigree: 

Simple Pedigree There is exactly one nuclear family T at the top generation. Every other nuclear family has exactly one parent who is a direct descendant of the two parents in family T and one parent who has no ancestors in the pedigree (such a person is called a founder). There are no multiple marriages. One of the parents in T is treated as the proband.

“Peeling” Order: 

“Peeling” Order Assume that the individuals in the pedigree are ordered such that parents precede their children, then the pedigree likelihood can be represented as: where is: P(gi), if i is a founder, or , otherwise. In this way, we first sum over all possible genotypes of the children and only then on the possible genotypes for the parents.

An Example for “Peeling” Order: 

An Example for “Peeling” Order h(gi) = P(xi|gi) P(gi) h(gm,gf,gc) = P(xc|gc) P(gc|gm,gf)

Elston-Stewart “Peeling” Order: 

Elston-Stewart “Peeling” Order As can be seen, this “peeling” order, “clips off” branches (sibships) of the pedigree, one after the other, in a bottom-up order.

Elston-Stewart – Computational Complexity : 

Elston-Stewart – Computational Complexity The computational complexity of the algorithm is linear in the number of people but exponential in the number of loci.

Variation on the Elston-Stewart Algorithm in Fastlink: 

Variation on the Elston-Stewart Algorithm in Fastlink The pedigree traversal order in Fastlink is some modification of the Elston-Stewart algorithm. Assume no multiple marriages… Nuclear family graph: Vertices: each nuclear family is a vertex. Edges: if some individual is a child in nuclear family x and a parent in nuclear family y, then x and y are connected by and edge x-y which is called a “down” edge w.r.t. x and an “up” edge w.r.t. y.

Traversal Order: 

Traversal Order Visit(w) { While w has an unvisited neighbor x reachable via an up edge: Visit(x); While w has an unvisited neighbor y reachable via a down edge: Visit(y); Update w; } One individual A is chosen to be a “proband”. For each genotype g, the probability is computed that A has genotype g conditioned on the known phenotypes for the rest of the pedigree and the assumed recombination fractions. The first family that is visited is a family containing the proband, preferably, a family in which he is a child.

Traversal Order - Updates: 

Traversal Order - Updates If nuclear family w is reached via a down edge from z, the parent in w that nuclear families w and z share, is updated. If nuclear family w is reached via an up edge from z, then the child that w and z share is updated.

Example 1: 

Example 1

Example 2: 

Example 2