This lecture is part of a course on "Probability, Estimation Theory, and Random Signals." So welcome back to the lecture slide set on multiple random variables. In a previous video, we looked at extending the probability transformation rule from the scalar random variable case to vector random variables, where we have N input random variables and N output random variables. The probability transformation rule was interesting because it introduced the idea of a Jacobian. So in today's video, we're going to consider a slightly different case: what if we're transforming N random variables into M other random variables, where M is less than N? (M and N are both integers.) So, for example, suppose N is equal to two, where the input random variables are X and Y, and there is a single output random variable Z, so M is equal to one. How do we calculate the probability density function of Z? Now, there are a few different ways of doing this, including looking at elemental regions in the probability space. However, what we're going to do in this video is introduce the concept of an auxiliary variable, which actually simplifies the analysis in a great number of cases. So let's start by considering a random variable Z that is one function of two other random variables, X and Y. We're going to introduce an auxiliary random variable W, which means that we are now going from two random variables in the input space, X and Y, to two random variables at the output, Z and W. Now, there is an interesting question about how we choose W, and that's going to be a topic of conversation for today. Example choices might be as simple as W equal to X or W equal to Y. Is it that simple? Well, yes, actually, in many situations it is. So we've now gone from a situation of N equal to two, for X and Y, to M equal to two, for Z and W.
We can use the probability transformation rule which we already have, and that gives us this familiar-looking expression here. But, you might say, I don't want the joint pdf of W and Z; I just want the pdf of Z. So how do you think we could get just the density of Z? I hope you've had a chance to think about that, because yes, we can do this by marginalization: I take the joint pdf of W and Z, and I marginalize over the auxiliary variable. And that's it; it's as simple as that. So let's do an example or two. In this first example, we're going to consider the sum of two random variables. Suppose that X and Y have a joint pdf f_{X,Y}(x, y), and the random variable Z is given by a linear combination Z = aX + bY, where a and b are constants. How do we find the pdf of Z? The solution is to use the auxiliary variable W = Y. Now, you could equally set W = X and you should get the same result; I'll leave that as an exercise for the reader. So I'm just going to write this out again: we've got z = ax + by, and now we've got w = y. There's a single solution to this set of equations: we've got y = w, and then, rearranging to get x, we have ax = z − by, but y = w, therefore x = (z − bw)/a. So it's a single solution, and that simplifies our probability transformation rule. Now we just need to work out the Jacobian for going from x and y to z and w. That's a determinant, and we fill it with dz/dx, dz/dy, dw/dx and dw/dy. We notice that dz/dx is a, dz/dy is b, dw/dx is 0 and dw/dy is 1, and the determinant equals a. Now, when you calculate these Jacobians, notice here I've gone row-wise, but it's actually possible, using the properties of determinants, to write that Jacobian in different ways.
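As a brief aside, the Jacobian just computed can be verified numerically. The sketch below is not part of the lecture; the values a = 2, b = 3 and the evaluation point are arbitrary choices for illustration.

```python
import numpy as np

# Numerical check (a sketch, not from the lecture): for z = a*x + b*y and
# auxiliary w = y, the matrix of partials d(z,w)/d(x,y) is [[a, b], [0, 1]],
# whose determinant is a.
a, b = 2.0, 3.0

def forward(x, y):
    return np.array([a * x + b * y, y])   # (z, w)

def jacobian(x, y, eps=1e-6):
    # Forward finite differences; exact up to rounding for a linear map.
    base = forward(x, y)
    ddx = (forward(x + eps, y) - base) / eps
    ddy = (forward(x, y + eps) - base) / eps
    return np.column_stack([ddx, ddy])    # columns: d/dx, d/dy

J = jacobian(0.7, -1.2)
print(J)                  # ~ [[2, 3], [0, 1]]
print(np.linalg.det(J))   # ~ 2.0, i.e. the value of a
```

Because the transformation is linear, the Jacobian is the same at every point, which is why a single evaluation point suffices here.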
So here's a slightly different way of writing it, where I've gone column-wise: I differentiate z and then w with respect to x, and then z and then w with respect to y, going down in columns. In that case, you basically get the transposed matrix compared to before, and since transposition does not change a determinant, the result is the same. Also notice that we take the magnitude of this determinant. So, putting all of that together, the joint pdf of W and Z is one over the magnitude of a, times the pdf of X and Y, but with x replaced by (z − bw)/a and y replaced by w. We then just need to marginalize, integrating over w, and that gives us this expression here: f_Z(z) = ∫ (1/|a|) f_{X,Y}((z − bw)/a, w) dw. Now, that's quite an elegant expression, and we're going to see in a later video that it will actually simplify further if X and Y are independent. I haven't assumed X and Y are independent here, but if they are, this formula actually simplifies to a convolution; let's come back to that later. Now, as I mentioned earlier, you might be concerned with my choice of W = Y. What if I choose something else, say W = X? The answer is that as long as the auxiliary variable is a function of at least one of the random variables, it doesn't actually matter, because the marginalization stage will yield the same answer, and I'm very happy to show an example of that in a second. Nevertheless, it usually pays to choose the simplest auxiliary variable, to avoid any difficulties in evaluating the marginal pdf. So let's go back to our previous problem, but suppose we choose W = X. Remember, in both cases we have z = ax + by. Therefore the inverse solution is x = w, and from here we see that y = (z − aw)/b. The Jacobian is still built from dz/dx, which is a, and dz/dy, which is b; but this time dw/dx is 1 and dw/dy is 0, and we see that the determinant will equal −b.
So when we apply the probability transformation rule, the pdf of W and Z is one over the magnitude of b, times the pdf of X and Y, now with x replaced by w and y replaced by (z − aw)/b. Through marginalization, we end up with this formula here: f_Z(z) = ∫ (1/|b|) f_{X,Y}(w, (z − aw)/b) dw. Now, the question is, how does that relate to the formula on the previous page? Here is the previous result. Well, it turns out that these are in fact identical. I apply the substitution ŵ = (z − aw)/b, and what you soon discover when doing this substitution is that I'm kind of unravelling, or reversing, some of the work that we did above. If I rearrange that formula, you can show that w = (z − bŵ)/a, and also that dŵ = (a/b) dw, up to a sign that is absorbed when the limits of integration are reordered. So when you substitute back into this formula here, you will see that w becomes (z − bŵ)/a, that (z − aw)/b is just ŵ, and that dw is (b/a) dŵ. And if you deal carefully with the magnitude signs, you'll see they cancel, and this is equivalent to the previous result. So what's really happening is that, through a substitution in the marginalization step, the different auxiliary variables end up giving equivalent expressions. In fact, if you're really keen, you can have a look in the handout, where I've tried an even more exotic substitution of W = X/Y. That is an incredibly complicated substitution, completely unnecessary, but the handout does show that it gives exactly the same result as the one we've got at the bottom of this page. So finally, to finish this example: if you want to prove this to yourself, I suggest that you don't take one as complicated as the one in the handout, but consider, for example, the substitution W = X − Y, and see if you can prove the same result. So now, here's a more complicated example, which is in the handout, but which I'm going to work through here.
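The equivalence of the two auxiliary-variable choices can also be seen numerically. The sketch below is an illustration, not part of the lecture: it assumes, purely for the demo, that X and Y are independent standard normals, so that Z = aX + bY is normal with variance a² + b², and evaluates both marginalization integrals, one with W = Y and one with W = X.

```python
import numpy as np

# Illustrative assumption (not from the lecture): X, Y are independent
# standard normals, so Z = aX + bY is N(0, a^2 + b^2).
a, b = 2.0, 3.0
w = np.linspace(-12.0, 12.0, 20001)
dw = w[1] - w[0]

def f_xy(x, y):
    # Joint pdf of two independent standard normals.
    return np.exp(-0.5 * (x ** 2 + y ** 2)) / (2.0 * np.pi)

def f_z_with_w_eq_y(z):
    # f_Z(z) = (1/|a|) * integral of f_XY((z - b*w)/a, w) dw
    return np.sum(f_xy((z - b * w) / a, w)) * dw / abs(a)

def f_z_with_w_eq_x(z):
    # f_Z(z) = (1/|b|) * integral of f_XY(w, (z - a*w)/b) dw
    return np.sum(f_xy(w, (z - a * w) / b)) * dw / abs(b)

exact = 1.0 / np.sqrt(2.0 * np.pi * (a ** 2 + b ** 2))   # N(0, 13) at z = 0
print(f_z_with_w_eq_y(0.0), f_z_with_w_eq_x(0.0), exact)  # all three agree
```

Both integrals converge to the same marginal density, as the substitution argument above predicts; the known normal density gives an independent cross-check.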
We have two random variables, X and Y, and they are independent with Rayleigh densities. The Rayleigh density, by the way, has an x e^(−x²) form, so if you were to plot it, it looks very much like this. The question says: if I have a new random variable Z which is the ratio of those two variables (and finding the ratio of two random variables is quite common), show that the density of that ratio is given by this interesting-looking expression here. It's interesting because it's effectively an algebraic form: notice that the density of the ratio of two random variables from the exponential family is no longer from the exponential family. Using this result, we then obtain the final formula here, which I'll come back to in a moment. So let's start off by finding what the probability density function of a ratio of two random variables is. We're going to start with Z = X/Y, and I'm going to do this generically first, working on the whiteboard, and try to keep this as simple as possible. I need to choose an auxiliary variable, and we can choose either W = X or W = Y; you could even choose a more exotic form. But for the moment I'm going to pick one, and I'm actually going to pick it specifically to be different from the one in the handout, just to make this video different from the handout. So I'm going to choose W = Y; in the handout, I use W = X. The first stage is to work out the inverse solutions. They are y = w, and, from this expression here, x = zy, or in other words x = zw. There's one solution. The second stage is to work out the Jacobian. Now, you can do this in one of two ways, and I'm going to do it differently to the handout again. You could work out the Jacobian for going from the (x, y) space to the (w, z) space, but that would involve calculating, for example, dz/dx, which is fine.
But dz/dy is beginning to look messy. Therefore, I notice that the Jacobian for going from (w, z) to (x, y) looks particularly elegant, because that involves calculating the expressions dx/dz, dx/dw, dy/dz and dy/dw. So dx/dz is w, dx/dw is z, dy/dz is 0, and dy/dw is 1, and you can see that the determinant equals w. So that's a fairly simple expression, and notice that the Jacobian for going from (x, y) to (z, w) is one over the Jacobian for going from (z, w) to (x, y). So now we have our probability transformation: the pdf of W and Z is one over the Jacobian, which is one over (one over w), which is w, times the pdf of X and Y, but with x replaced by wz and y replaced by w. So therefore the pdf of Z is the integral of this function over w. Now let's, for a moment, return to the problem. We're actually told that X and Y are independent and Rayleigh distributed with parameters α and β respectively, and the Rayleigh distribution is only defined for positive values of x and y. That actually simplifies our problem slightly, and the joint pdf becomes the pdf of X times the pdf of Y. Now, before we substitute in the pdfs, we also need to work out the range of the variables. So remember that W was equal to Y, and Y is between 0 and infinity; that means W is between 0 and infinity. And Z was equal to X over Y, where both X and Y can go between 0 and infinity. So what does that mean? Well, consider a fixed value of y: then z also takes values between 0 and infinity. And you might say, what happens when y goes to 0, where you'd get a divide by zero? Well, that's accounted for in the limit at infinity, so we're OK. So that means what we should do is simply substitute the densities into this integral here: f_Z(z) is the integral from 0 to infinity of w times f_X(wz) times f_Y(w) dw. Now, f_X(x) was (x/α²) e^(−x²/(2α²)), which with x replaced by wz becomes (wz/α²) e^(−w²z²/(2α²)).
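As an aside, before completing the substitution: the generic ratio formula f_Z(z) = ∫₀^∞ w f_X(wz) f_Y(w) dw can be sanity-checked by simulation. The sketch below is not part of the lecture; the parameter values α = 1.5, β = 0.8 and the evaluation point are arbitrary, and NumPy's built-in Rayleigh sampler is used for the Monte Carlo side.

```python
import numpy as np

alpha, beta = 1.5, 0.8           # arbitrary Rayleigh parameters for the demo
rng = np.random.default_rng(1)

def rayleigh_pdf(x, s):
    # Rayleigh density with parameter s, defined for x >= 0.
    return (x / s ** 2) * np.exp(-x ** 2 / (2.0 * s ** 2))

def f_z(z, n=200001, hi=20.0):
    # f_Z(z) = integral_0^inf of w * f_X(w*z) * f_Y(w) dw  (Riemann sum)
    w = np.linspace(0.0, hi, n)
    vals = w * rayleigh_pdf(w * z, alpha) * rayleigh_pdf(w, beta)
    return np.sum(vals) * (w[1] - w[0])

# Monte Carlo density estimate for Z = X/Y near z0, via a narrow bin.
x = rng.rayleigh(alpha, 1_000_000)
y = rng.rayleigh(beta, 1_000_000)
z0, h = 1.0, 0.02
mc = np.mean(np.abs(x / y - z0) < h) / (2.0 * h)
print(f_z(z0), mc)   # the two estimates should agree to about two decimals
```

The integral and the simulated density agree, which is reassuring before we push the algebra through by hand.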
f_Y(y) was (y/β²) e^(−y²/(2β²)), which with y replaced by w is just (w/β²) e^(−w²/(2β²)), and all of that is integrated dw. Now we're going to try and combine some terms. We've got three w's and a z, so the front factor is w³z/(α²β²), and then we're going to combine the exponentials. What I'm going to do is put my w² terms together, because in a moment that's what we're going to be integrating, and we get exp(−(w²/2)(z²/α² + 1/β²)) dw. Naturally, these integrals require a little bit of insight, but we can also rely on some identities, which we can take, for example, from Wikipedia or somewhere like that. Now, one identity which we can use is that for a Rayleigh distribution, in this case X with parameter α, the expected value of X² is actually equal to 2α². Notice here that the pdf of X was (x/α²) e^(−x²/(2α²)), so it's possible to use this identity. In other words, writing out the expression for the second moment, the integral from 0 to infinity of (x³/α²) e^(−x²/(2α²)) dx is equal to 2α². So we can use that identity; but this formula here is not quite in that form. So how do we make this look like that? Well, that's where we do a substitution. I'm going to define α̂ by setting 1/α̂² equal to this term here, z²/α² + 1/β². And note this is not a substitution of the variable of integration; it's literally a substitution into the expression for the integrand. If I do that, I can take out the front factor z/(α²β²) (you'll see why in a moment), so the pdf of Z becomes z/(α²β²) times the integral from 0 to infinity of w³ exp(−w²/(2α̂²)) dw.
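The second-moment identity quoted above is easy to confirm numerically. A small sketch, not part of the lecture; α = 1.5 is an arbitrary choice for the check.

```python
import numpy as np

# Check: for a Rayleigh density with parameter alpha,
#   E[X^2] = integral_0^inf (x^3 / alpha^2) exp(-x^2 / (2 alpha^2)) dx
#          = 2 * alpha^2.
alpha = 1.5                                   # arbitrary parameter
x = np.linspace(0.0, 30.0 * alpha, 1_000_001)  # integrand is ~0 well before the limit
integrand = (x ** 3 / alpha ** 2) * np.exp(-x ** 2 / (2.0 * alpha ** 2))
second_moment = np.sum(integrand) * (x[1] - x[0])
print(second_moment, 2.0 * alpha ** 2)        # both ~4.5
```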
Now, to make this look like the second-moment identity, I need to put an α̂² out front, so the integral becomes α̂² times the integral from 0 to infinity of (w³/α̂²) exp(−w²/(2α̂²)) dw, and we know that second integral is equal to 2α̂². So, working through all of that, we now have f_Z(z) = 2z α̂⁴/(α²β²). Well, remember, α̂ is defined by this term here, so we now have the expression f_Z(z) = 2z/(α²β² (z²/α² + 1/β²)²), and if I multiply top and bottom by α⁴β⁴, we finally have the answer we were looking for: f_Z(z) = 2z α²β²/(β²z² + α²)², for z taking positive values. Now, in the final part of the question, we want to find the probability that the random variable X takes on a value less than or equal to a proportion of the random variable Y. This does have many applications; basically, if you took a proportion k of Y, what's the probability that X is less than that? So the probability that X ≤ kY is the same as the probability that X/Y ≤ k, which equals the probability that Z ≤ k. Now, that then equals the integral of the pdf between 0 and k: the integral from 0 to k of f_Z(z) dz, which is the integral from 0 to k of 2z α²β²/(β²z² + α²)² dz. Now, that's one of those integrals that you might initially look at and think is extremely hard. But what's interesting to note is that the order of the numerator in terms of z is lower than that of the denominator, which suggests the integrand is the derivative of something quite straightforward. So you can work through this very carefully yourself: notice that if the denominator were just (β²z² + α²), rather than its square, you could consider differentiating that first-power term.
So, differentiating 1/(β²z² + α²), you would get −2β²z/(β²z² + α²)², which is our integrand up to a couple of constant factors. So I just need those factors out front: the integral from 0 to k of 2α²β² z/(β²z² + α²)² dz equals α² times [−1/(β²z² + α²)] evaluated between 0 and k. (Sometimes, when I handwrite these solutions, you can't always tell the difference between a 2 and a z, so do be careful.) Putting in the limits, at z = 0 the bracket gives 1/α², so the probability that Z is less than or equal to k becomes α²(1/α² − 1/(β²k² + α²)) = 1 − α²/(β²k² + α²). And then, finally, with a small amount of manipulation, you can get the final answer: P(Z ≤ k) = β²k²/(β²k² + α²) = k²/(k² + α²/β²). Okay, so that example was really there just to show you that these examples are possible, and well within your capabilities, using auxiliary variables. And I would encourage you to work through this solution again on your own, without following the handout or the video, just to make sure you're getting comfortable with those expressions. So, in summary, this topic has introduced the use of auxiliary variables and their applications: a very powerful tool when going from a number of random variables to fewer random variables. It's worth mentioning one more thing. You might ask, what if you went the other way, from, say, two random variables at the input to four random variables at the output? Well, I haven't covered that case, primarily because there is a deterministic relationship in that problem. If I've got one random variable, for example, and I generate two new random variables from it, then those two random variables are 100% correlated, and one can be re-expressed deterministically in terms of the other. So that scenario is relatively trivial. So I'd encourage you to go and practice some other questions in the handout which get you to use auxiliary variables. In the meantime,
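Both closed-form results from this example, the density f_Z(z) = 2zα²β²/(β²z² + α²)² and the probability P(Z ≤ k) = k²/(k² + α²/β²), can be verified by simulation. A sketch, not part of the lecture; the parameter values are arbitrary choices for the check.

```python
import numpy as np

# Closed-form results derived above, for independent Rayleigh X, Y:
#   f_Z(z)   = 2 z alpha^2 beta^2 / (beta^2 z^2 + alpha^2)^2,  z >= 0
#   P(Z<=k)  = k^2 / (k^2 + alpha^2 / beta^2)
alpha, beta = 1.5, 0.8            # arbitrary parameters for the check
rng = np.random.default_rng(3)

def f_z(z):
    return 2.0 * z * alpha ** 2 * beta ** 2 / (beta ** 2 * z ** 2 + alpha ** 2) ** 2

# Normalization check: the density has a 1/z^3 tail, so integrate far out.
z = np.linspace(0.0, 500.0, 2_000_001)
total = np.sum(f_z(z)) * (z[1] - z[0])
print(total)                       # close to 1

# Monte Carlo check of P(X <= k*Y) against the closed form.
x = rng.rayleigh(alpha, 1_000_000)
y = rng.rayleigh(beta, 1_000_000)
for k in (0.5, 1.0, 2.0):
    closed_form = k ** 2 / (k ** 2 + alpha ** 2 / beta ** 2)
    print(k, np.mean(x <= k * y), closed_form)   # pairs should agree closely
```

Note how slowly the tail decays: truncating the normalization integral at a small upper limit would visibly undershoot 1, which is one practical consequence of the ratio density leaving the exponential family.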
Thank you very much.