Evaluation Metrics for Disentanglement

Posted by Wenzhao Wei on 2021-02-05

2021-02-04

Evaluation Metrics

Unsupervised metrics:

  • Mutual Information (MI): measured between the input image and its embedded low-dimensional representation, estimated with MINE. However, the MI metric used previously is not normalized; it can be normalized with the formula below.

    Since mutual information is the overlap between the entropies of \(X\) and \(Y\), it is computed as \(MI(X,Y) = H(X) - H(X|Y)\), also written \(I(X;Y)\). Normalizing it by the two entropies gives a value in \([0, 1]\):

$$
NMI(X,Y) = \frac{2 \times I(Y;X)}{H(Y) + H(X)}
$$
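The normalization can be checked on small discrete samples. The sketch below is illustrative only: it uses plug-in (count-based) entropy estimates on discrete values rather than the MINE estimator mentioned above, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(samples):
    """Plug-in Shannon entropy (in nats) of a sequence of discrete, hashable values."""
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in Counter(samples).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), which equals H(X) - H(X|Y)."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def normalized_mutual_information(x, y):
    """NMI(X,Y) = 2 I(X;Y) / (H(X) + H(Y))."""
    denom = entropy(x) + entropy(y)
    if denom == 0.0:
        # Both variables are constant; returning 0 is a convention chosen here.
        return 0.0
    return 2.0 * mutual_information(x, y) / denom


x = [0, 0, 1, 1]
y = [0, 1, 0, 1]
nmi_identical = normalized_mutual_information(x, x)    # X fully determines X
nmi_independent = normalized_mutual_information(x, y)  # X tells nothing about Y
```

Identical variables sit at the top of the scale and independent ones at the bottom, which is what makes scores comparable across representations with different entropies.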

  • Continuity

  • Trustworthiness

  • LCMC (Local Continuity Meta-Criterion)

Supervised metrics (require ground-truth factors):

  • Modularity

    Assumption: ideally, each embedding dimension has high mutual information with a single factor and zero mutual information with all other factors. Modularity measures whether each dimension of the latent space depends on only one attribute.

    First, measure the MI \(I_{ig}\) between each embedding dimension \(i\) and each ground-truth factor \(g\). Then find \(\theta_i = \max_g I_{ig}\) and define a template vector
    $$
    t_{if} =
    \begin{cases}
    \theta_i & \text{if } f = \arg\max_g I_{ig} \\
    0 & \text{otherwise}
    \end{cases}
    $$
    The modularity is then computed from how far each dimension's MI values deviate from this template.
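One common way to turn the template into a score follows Ridgeway and Mozer's definition of modularity; this is an assumption here rather than something stated in these notes. The deviation for dimension \(i\) is \(\delta_i = \frac{\sum_f (I_{if} - t_{if})^2}{\theta_i^2 (N - 1)}\) for \(N\) factors, and the modularity of that dimension is \(1 - \delta_i\). A minimal sketch, assuming the MI matrix has already been estimated and using the hypothetical name `modularity_score`:

```python
def modularity_score(mi):
    """Mean over latent dimensions of 1 - delta_i.

    mi[i][g] is the (pre-computed) MI between latent dimension i and factor g.
    """
    n_factors = len(mi[0])
    scores = []
    for row in mi:
        theta = max(row)         # theta_i = max_g I_ig
        best = row.index(theta)  # argmax_g I_ig (first index on ties)
        template = [theta if g == best else 0.0 for g in range(n_factors)]
        sq_dev = sum((m - t) ** 2 for m, t in zip(row, template))
        denom = theta ** 2 * (n_factors - 1)
        # A dimension with zero MI everywhere has no deviation to measure;
        # treating its delta as 0 is a convention chosen for this sketch.
        delta = sq_dev / denom if denom > 0 else 0.0
        scores.append(1.0 - delta)
    return sum(scores) / len(scores)


perfect = modularity_score([[1.0, 0.0], [0.0, 2.0]])  # each dimension tied to one factor
mixed = modularity_score([[1.0, 1.0], [0.0, 2.0]])    # first dimension shared by both factors
```

In the second example the first dimension is equally informative about both factors, so its modularity drops to 0 and the average falls to 0.5.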

  • Mutual Information Gap (MIG) from the Beta-TCVAE paper

    Estimate the MI between a latent variable \(z_j\) and a ground-truth factor \(v_k\) using their joint distribution.

    A single factor can have high mutual information with multiple latent variables. Axis-alignment is enforced by measuring, for each factor, the difference between the two latent variables with the highest mutual information.

    Note that MIG is normalized and lies in \([0, 1]\).
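The gap computation can be sketched as follows. The sketch assumes the MI values and the factor entropies \(H(v_k)\) are already estimated, and it assumes the normalization divides each gap by \(H(v_k)\) before averaging over factors, as in the Beta-TCVAE paper's definition. The function name is hypothetical.

```python
def mutual_information_gap(mi, factor_entropy):
    """Average normalized gap between the top two latents for each factor.

    mi[j][k] is the MI between latent z_j and factor v_k;
    factor_entropy[k] is H(v_k). At least two latent variables are required.
    """
    gaps = []
    for k, h_k in enumerate(factor_entropy):
        # MI of every latent with factor k, largest first.
        column = sorted((row[k] for row in mi), reverse=True)
        gaps.append((column[0] - column[1]) / h_k)
    return sum(gaps) / len(gaps)


mi = [
    [1.0, 0.1],  # z_0
    [0.2, 0.5],  # z_1
    [0.0, 0.0],  # z_2
]
mig = mutual_information_gap(mi, factor_entropy=[1.0, 0.5])
```

Because the MI between a latent and a factor cannot exceed the factor's entropy, dividing by \(H(v_k)\) is what keeps each gap inside \([0, 1]\).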

  • Separated Attribute Predictability (SAP) from the DIP-VAE paper

    First, construct a score matrix \(S\) of size \(d \times k\), whose entry \(S_{ij}\) holds the result of a linear fit predicting the \(j^{th}\) factor from the single latent \(z_i\). The score is the \(R^2\) value, which measures how well the line is fitted.
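A rough sketch of the score matrix for continuous factors follows. It relies on the fact that the \(R^2\) of a one-variable least-squares line equals the squared Pearson correlation. The final aggregation (per factor, the difference between the two highest scores in its column, averaged over factors) follows the DIP-VAE paper's definition and is not spelled out in the notes above. The function names and the column-wise data layout are assumptions made for illustration.

```python
def r_squared(x, y):
    """R^2 of a least-squares line through (x, y): the squared Pearson correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # a constant variable cannot be fitted; score it as 0
    return sxy ** 2 / (sxx * syy)


def sap_score(latents, factors):
    """Return the d x k score matrix and the SAP score.

    latents[i] holds the values of z_i over all samples;
    factors[j] holds the values of the j-th factor over the same samples.
    """
    scores = [[r_squared(z, v) for v in factors] for z in latents]
    gaps = []
    for j in range(len(factors)):
        column = sorted((row[j] for row in scores), reverse=True)
        gaps.append(column[0] - column[1])  # top score minus runner-up
    return scores, sum(gaps) / len(gaps)


v0 = [1.0, -1.0, 1.0, -1.0]
v1 = [1.0, 1.0, -1.0, -1.0]
score_matrix, sap = sap_score(latents=[v0, v1], factors=[v0, v1])
```

Here each latent copies exactly one factor and is uncorrelated with the other, so the score matrix is the identity and the gap is maximal.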

  • Spearman Correlation (SCC)

    Computes the maximum value of the Spearman correlation coefficient between an attribute and each dimension of the latent space.
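As a sketch, Spearman correlation is the Pearson correlation of the rank-transformed values, so the score can be computed without third-party libraries. Taking the absolute value before the maximum is an assumption made here so that a strong decreasing relation also counts. The function names are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def scc_score(latents, attribute):
    """Maximum absolute Spearman correlation between the attribute and any latent.

    latents[i] holds the values of latent dimension i over all samples.
    """
    attr_ranks = average_ranks(attribute)
    return max(abs(pearson(average_ranks(z), attr_ranks)) for z in latents)


attribute = [1.0, 2.0, 3.0, 4.0]
z0 = [1.0, 8.0, 27.0, 64.0]  # monotonic but non-linear in the attribute
z1 = [2.0, 1.0, 2.0, 1.0]    # weakly related to the attribute
scc = scc_score([z0, z1], attribute)
```

Because only ranks matter, the cubic relation in `z0` still counts as a perfect monotonic match.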

  • Interpretability:

    Measures the ability to predict a given attribute using only one dimension of the latent space.

    From the \(k\) latent dimensions of \(z\), select the dimension \(z_i\) (with \(i \leq k\)) that carries maximal information about a ground-truth factor \(v_j\): \(i = \arg\max_i I(v_j; z_i \mid x_t)\). Evaluate the interpretability of \(z_i\) with respect to \(v_j\) by measuring \(p(s_j \mid z_i)\), summing the logarithms of the resulting probabilities over every test sample point for dimension \(j\) of the side information.

    By aggregating the scores over all dimensions of the side information \(s\), we get the interpretability score.