For any binary classification dataset, let 𝑆𝐵 ∈ ℝ𝑑×𝑑 and 𝑆𝑊 ∈ ℝ𝑑×𝑑 be the between-class and…

Question

For any binary classification dataset, let $S_B \in \mathbb{R}^{d \times d}$ and $S_W \in \mathbb{R}^{d \times d}$ be the between-class and within-class scatter (covariance) matrices, respectively. The Fisher linear discriminant is defined by the vector $u^* \in \mathbb{R}^d$ that maximizes

$J(u) = \frac{u^T S_B u}{u^T S_W u} $

If $\lambda = J(u^*)$, $S_W$ is non-singular, and $S_B \neq 0$, then $(u^*, \lambda)$ must satisfy which ONE of the following equations?

Note: ℝ denotes the set of real numbers.

Accepted Answer

Correct answer: A

Key result: at a stationary point $u^*$ of $J(u)$ we have $S_B u^* = \lambda S_W u^*$. Because $S_W$ is invertible, this is equivalent to $S_W^{-1} S_B u^* = \lambda u^*$, so $u^*$ is an eigenvector of $S_W^{-1} S_B$ with eigenvalue $\lambda = J(u^*)$.

Derivation (sketch): $J(u) = \frac{u^T S_B u}{u^T S_W u}$ is a generalized Rayleigh quotient. To find its stationary points, use a Lagrange multiplier or differentiate $J(u)$ directly. Since $S_B$ and $S_W$ are symmetric, the gradient is

$\nabla J(u) = \frac{2 \left[ (u^T S_W u) S_B u - (u^T S_B u) S_W u \right]}{(u^T S_W u)^2}$

Setting the gradient to zero and dividing by $u^T S_W u$ gives $S_B u = J(u) S_W u$, that is, $(S_B - \lambda S_W) u = 0$ with $\lambda = J(u)$ at the stationary point. This is the generalized eigenvalue problem. Since $S_W$ is non-singular, multiplying on the left by $S_W^{-1}$ gives $S_W^{-1} S_B u = \lambda u$, an ordinary eigenvalue problem for $S_W^{-1} S_B$. The maximizing direction $u^*$ is the eigenvector of $S_W^{-1} S_B$ corresponding to the largest eigenvalue $\lambda$.

Why the other options are incorrect:

B. $S_W u^* = \lambda S_B u^*$ swaps the roles of the two matrices. Multiplying on the left by $u^{*T}$ gives $u^{*T} S_W u^* = \lambda u^{*T} S_B u^*$, so $\lambda$ would equal $1/J(u^*)$, not $J(u^*)$ as defined.

C. $S_B S_W u^* = \lambda u^*$ replaces $S_W^{-1} S_B$ with the product $S_B S_W$. The stationary condition does not produce this equation; it involves the inverse of $S_W$, and the order of matrix multiplication matters.

D. $u^{*T} u^* = \lambda^2$ ties the squared norm of $u^*$ to $\lambda$. It does not follow from the derivation: $J(cu) = J(u)$ for any scalar $c \neq 0$, so the norm of $u^*$ is arbitrary and cannot be determined by $\lambda$.
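The eigenvector condition can also be checked numerically. The following is a minimal sketch, assuming NumPy is available; the synthetic two-class data and the helper names (`scatter_matrices`, `fisher_direction`, `J`) are illustrative assumptions and not part of the question. It builds $S_B$ and $S_W$, takes the top eigenpair of $S_W^{-1} S_B$, and compares it against the stationary condition.

```python
import numpy as np


def scatter_matrices(X0, X1):
    """Return (S_B, S_W) for two classes given as (n_i, d) arrays."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    diff = (m1 - m0).reshape(-1, 1)
    S_B = diff @ diff.T  # between-class scatter (rank one for two classes)
    S_W = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)  # within-class scatter
    return S_B, S_W


def fisher_direction(S_B, S_W):
    """Top eigenpair of S_W^{-1} S_B, assuming S_W is non-singular."""
    M = np.linalg.solve(S_W, S_B)  # S_W^{-1} S_B without forming the inverse
    eigvals, eigvecs = np.linalg.eig(M)
    k = np.argmax(eigvals.real)  # index of the largest eigenvalue
    return eigvecs[:, k].real, eigvals[k].real


def J(u, S_B, S_W):
    """Fisher criterion (generalized Rayleigh quotient)."""
    return (u @ S_B @ u) / (u @ S_W @ u)


# Hypothetical synthetic data: two Gaussian classes in 3 dimensions.
rng = np.random.default_rng(0)
X0 = rng.normal(size=(200, 3))
X1 = rng.normal(size=(200, 3)) + np.array([2.0, 0.5, -1.0])

S_B, S_W = scatter_matrices(X0, X1)
u_star, lam = fisher_direction(S_B, S_W)

# Option A in its equivalent form: S_B u* = lambda S_W u*, with lambda = J(u*).
print(np.allclose(S_B @ u_star, lam * (S_W @ u_star)))
print(np.isclose(J(u_star, S_B, S_W), lam))

# No randomly drawn direction should score higher than u*.
trials = rng.normal(size=(1000, 3))
print(all(J(u, S_B, S_W) <= lam + 1e-12 for u in trials))
```

Using `np.linalg.solve` rather than an explicit inverse is a design choice for numerical stability; the result is the same matrix $S_W^{-1} S_B$ that appears in option A.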

Answer

A. $S_W^{-1} S_B u^* = \lambda u^* $

Answer

B. $S_W u^* = \lambda S_B u^* $

Answer

C. $S_B S_W u^* = \lambda u^* $

Answer

D. $u^{*T} u^* = \lambda^2 $
