[WIP] FastICA rotation #9

Open

wants to merge 40 commits into base: master

Commits (40)
2ee1e45
Added changes. Started Univariate Tests.
jejjohnson Apr 8, 2019
cc4420c
Made small changes. started inverseCDF.
jejjohnson Jun 19, 2019
bf8a3f4
Working marginal uniformization.
jejjohnson Jun 19, 2019
a32c350
Some changes.
jejjohnson Jul 2, 2019
485232e
Made progress with refactoring. Working.
jejjohnson Jul 2, 2019
f06638c
Working functions demo notebook. TODO: LogDet
jejjohnson Jul 2, 2019
3857f67
Working Implementation of Linear Trans.
jejjohnson Oct 19, 2019
35a6f68
Same changes as before.
jejjohnson Oct 19, 2019
2843679
Minor changes.
jejjohnson Oct 19, 2019
885b37a
Updated notebooks.
jejjohnson Oct 20, 2019
caadd17
Added density mixin object with sampling method.
jejjohnson Oct 20, 2019
013ca0c
Added new naive quantile transformer.
jejjohnson Oct 20, 2019
e9e43f6
Added RBIG block.
jejjohnson Oct 20, 2019
9096a1d
Started Marginal Transform Base Class.
jejjohnson Oct 20, 2019
9c2fb5a
Added RBIG Flow model.
jejjohnson Oct 20, 2019
fd509ae
Working notebooks.
jejjohnson Oct 20, 2019
a676011
Got the quantile transform working.
jejjohnson Oct 20, 2019
79f4016
Still leakage. Will fix later.
jejjohnson Oct 20, 2019
484754e
Fixed Leak. Updated demos.
jejjohnson Oct 20, 2019
6cca69b
Made a few minor changes to custom quantiler.
jejjohnson Oct 21, 2019
0c1e0a5
Major changes.
jejjohnson Oct 21, 2019
9745cba
naive implementation of mi.
jejjohnson Oct 22, 2019
0015b6f
small change.
jejjohnson Oct 22, 2019
f3671af
Working exponential family entropy.
jejjohnson Oct 23, 2019
551423a
Final Exponential family approximation.
jejjohnson Oct 23, 2019
013560a
Converted MI to Base Estimator.
jejjohnson Nov 8, 2019
0a0a839
Added docs.
jejjohnson Mar 11, 2020
dea0a6e
Added uniformization notes.
jejjohnson Mar 11, 2020
8945183
Big changes
jejjohnson Mar 12, 2020
9d7f55b
Added ignore for pytest cache
jejjohnson Mar 12, 2020
ba9c2b3
Removed doc files.
jejjohnson Mar 12, 2020
8b66a39
Revert "Removed doc files."
jejjohnson Mar 12, 2020
f888866
Added pics.
jejjohnson Mar 12, 2020
3628fa3
Added no jekyll file.
jejjohnson Mar 12, 2020
b30d715
Updates.
jejjohnson Mar 19, 2020
617cded
a lot of refactoring.
jejjohnson Apr 1, 2020
4804aa4
Merge branch 'refactoring' of https://mutenroshi.uv.es/gitlab/emmanue…
jejjohnson Apr 2, 2020
b682fdc
Working RBIG Flow Models
jejjohnson Apr 2, 2020
e50abeb
deep rearranging.
jejjohnson Apr 3, 2020
e052455
Merge branch 'master' of github.com:IPL-UV/rbig into fastica
jejjohnson Nov 20, 2020
1 change: 1 addition & 0 deletions .env
@@ -0,0 +1 @@
PYTHONPATH="${workspaceFolder}/."
3 changes: 1 addition & 2 deletions .gitignore
@@ -4,7 +4,6 @@ code_through/
\.idea/
\__pycache__/
*.log

*.tex
.idea/misc.xml
.idea/rbig.iml
@@ -18,4 +17,4 @@ code_through/
*.csv

\.eggs/
\site/
\site/
Empty file added docs/.nojekyll
Empty file.
16 changes: 16 additions & 0 deletions docs/README.md
@@ -0,0 +1,16 @@
# Rotation-Based Iterative Gaussianization


A method that provides a transformation scheme from any distribution to a Gaussian distribution. This repository facilitates translating the original MATLAB code into a Python implementation compatible with the scikit-learn framework.


### Resources

* Original Webpage - [ISP](http://isp.uv.es/rbig.html)
* Original MATLAB Code - [webpage](http://isp.uv.es/code/featureextraction/RBIG_toolbox.zip)
* Original Python Code - [github](https://github.com/spencerkent/pyRBIG)
* [Paper](https://arxiv.org/abs/1602.00229) - Iterative Gaussianization: from ICA to Random Rotations

Abstract From Paper

> Most signal processing problems involve the challenging task of multidimensional probability density function (PDF) estimation. In this work, we propose a solution to this problem by using a family of Rotation-based Iterative Gaussianization (RBIG) transforms. The general framework consists of the sequential application of a univariate marginal Gaussianization transform followed by an orthonormal transform. The proposed procedure looks for differentiable transforms to a known PDF so that the unknown PDF can be estimated at any point of the original domain. In particular, we aim at a zero mean unit covariance Gaussian for convenience. RBIG is formally similar to classical iterative Projection Pursuit (PP) algorithms. However, we show that, unlike in PP methods, the particular class of rotations used has no special qualitative relevance in this context, since looking for interestingness is not a critical issue for PDF estimation. The key difference is that our approach focuses on the univariate part (marginal Gaussianization) of the problem rather than on the multivariate part (rotation). This difference implies that one may select the most convenient rotation suited to each practical application. The differentiability, invertibility and convergence of RBIG are theoretically and experimentally analyzed. Relation to other methods, such as Radial Gaussianization (RG), one-class support vector domain description (SVDD), and deep neural networks (DNN) is also pointed out. The practical performance of RBIG is successfully illustrated in a number of multidimensional problems such as image synthesis, classification, denoising, and multi-information estimation.
24 changes: 24 additions & 0 deletions docs/_sidebar.md
@@ -0,0 +1,24 @@
<!-- docs/_sidebar.md -->

* [pyRBIG](README.md)

**Theory**
* [Literature](literature.md)
* [What is Gaussianization?](gaussianization.md)
* [What is RBIG?](rbig.md)
* [Normalizing Flows](nfs.md)

**Demos**
* [Gaussianization](/)
* [Information Theory](/)

**Walk-Throughs**
* [Uniformization](mu.md)
* [Marginal Gaussianization](mg.md)
* [Rotation](rotation.md)

**Supplementary**
* [Information Theory](itm.md)
* [Gaussian Distribution](gaussian.md)
* [Uniform Distribution](uniform.md)
* [Exponential Family](exponential.md)
27 changes: 27 additions & 0 deletions docs/dds.md
@@ -0,0 +1,27 @@
# Density Destructors


## Main Idea


## Forward Approach

We can view modeling from two perspectives: constructive or destructive. A constructive process learns how to build an exact sequence of transformations from $z$ to $x$. The destructive process does the opposite: it builds a sequence of transforms from $x$ to $z$ while remembering each transform exactly, so that the sequence can be reversed.

We can write some equations to illustrate exactly what we mean by these two terms. Let's define two spaces: one is our data space $\mathcal X$ and the other is the base space $\mathcal Z$. We want to learn a transformation $f_\theta$ that maps us from $\mathcal X$ to $\mathcal Z$, $f_\theta : \mathcal X \rightarrow \mathcal Z$. We also want a function $\mathcal G_\theta$ that maps us from $\mathcal Z$ to $\mathcal X$, $\mathcal G_\theta : \mathcal Z \rightarrow \mathcal X$.

**TODO: Plot**

More concretely, let's define the following pair of equations:

$$z \sim \mathcal{P}_\mathcal{Z}$$
$$\hat x = \mathcal G_\theta (z)$$

This is called the generative step: how well can we fit our parameters such that $x \approx \hat x$? We can define the alternative step below:

$$x \sim \mathcal{P}_\mathcal{X}$$
$$\hat z = f_\theta (x)$$

This is called the inference step: how well can we fit the parameters of our transformation $f_\theta$ such that $z \approx \hat z$? There are immediately some things to notice. Depending on the method used in the deep learning community, the functions $\mathcal G_\theta$ and $f_\theta$ can be defined differently. Typically we are interested in the class of algorithms where $f_\theta = \mathcal G_\theta^{-1}$. In this ideal scenario we only need to learn one transformation instead of two, and with this requirement we can compute likelihood values exactly. The likelihood of a value $x$ under the transformation $f_\theta = \mathcal G_\theta^{-1}$ is given by:

$$\mathcal P_{x}(x)=\mathcal P_{z} \left( f_\theta (x) \right)\left| \det \mathbf J_{f_\theta}(x) \right|$$
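
As a sanity check of this change-of-variables formula, here is a minimal sketch (illustrative only, not the repository API) where the destructive transform $f$ is a marginal standardization and the base density $\mathcal P_z$ is a standard Gaussian:

```python
import numpy as np
from scipy import stats

def f(x, mu, sigma):
    """Destructor: data space -> base space, z = (x - mu) / sigma."""
    return (x - mu) / sigma

def log_likelihood(x, mu, sigma):
    """log p_x(x) = log p_z(f(x)) + log|det J_f(x)| for a diagonal Jacobian."""
    z = f(x, mu, sigma)
    log_pz = stats.norm.logpdf(z).sum(axis=-1)    # standard Gaussian base density
    log_det_jac = -np.log(sigma).sum()            # d f_i / d x_i = 1 / sigma_i
    return log_pz + log_det_jac

rng = np.random.RandomState(0)
mu, sigma = np.array([1.0, -2.0]), np.array([0.5, 3.0])
x = mu + sigma * rng.randn(1000, 2)               # samples from the data density

# matches the exact Gaussian log-density at those points
exact = stats.norm(loc=mu, scale=sigma).logpdf(x).sum(axis=-1)
print(np.allclose(log_likelihood(x, mu, sigma), exact))   # True
```

Because the Jacobian here is diagonal, its log-determinant is simply the sum of the per-dimension log-scales.
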
57 changes: 57 additions & 0 deletions docs/demo_innf.md
@@ -0,0 +1,57 @@
# Demo: Gaussianization



## Data

```python

```

## RBIG Model

### Initialize Model

```python
# NOTE: the import path is an assumption; adjust to wherever RBIG lives in this package
from rbig import RBIG

# rbig parameters
n_layers = 1
rotation_type = 'PCA'
random_state = 123
zero_tolerance = 100
base = 'gauss'

# initialize RBIG Class
rbig_clf = RBIG(
    n_layers=n_layers,
    rotation_type=rotation_type,
    random_state=random_state,
    zero_tolerance=zero_tolerance,
    base=base,
)
```

### Fit Model to Data

```python
# run RBIG model
rbig_clf.fit(X);
```

### Visualization


#### 1. Marginal Gaussianization

```python
# first-layer rotation matrix V (features x features)
V = rbig_clf.rotation_matrix[0]

# perform rotation
data_marg_gauss = X @ V
```

#### 2. Rotation

```python

```
51 changes: 51 additions & 0 deletions docs/exponential.md
@@ -0,0 +1,51 @@
# Exponential Family of Distributions



This is the closed-form expression for the Sharma-Mittal entropy of exponential families. The Sharma-Mittal entropy is a generalization of the Shannon, Rényi, and Tsallis entropy measures. The distribution is fit to the data $Y$ via maximum likelihood, and the entropy is then computed with the analytical formula for the exponential family.



**Source Parameters, $\theta$**

$$\theta = (\mu, \Sigma)$$

where $\mu \in \mathbb{R}^{d}$ and $\Sigma \succ 0$ (positive definite)

**Natural Parameters, $\eta$**

$$\eta = \left( \theta_2^{-1}\theta_1, \frac{1}{2}\theta_2^{-1} \right)$$

**Expectation Parameters, $\nabla F(\eta)$**

These are the expected sufficient statistics; for the Gaussian, with sufficient statistics $(x, -xx^\top)$, they are $\left(\mu,\; -\left(\Sigma + \mu\mu^\top\right)\right)$.


**Log Normalizer, $F(\eta)$**

Also known as the log partition function.

$$F(\eta) = \frac{1}{4} \text{tr}\left( \eta_1^\top \eta_2^{-1} \eta_1 \right) - \frac{1}{2} \log|\eta_2| + \frac{d}{2}\log \pi$$


**Gradient Log Normalizer, $\nabla F(\eta)$**

$$\nabla F(\eta) = \left( \frac{1}{2} \eta_2^{-1}\eta_1,\; -\frac{1}{2} \eta_2^{-1}- \frac{1}{4}(\eta_2^{-1}\eta_1)(\eta_2^{-1}\eta_1)^\top \right)$$

**Log Normalizer, $F(\theta)$**

Also known as the log partition function.

$$F(\theta) = \frac{1}{2} \theta_1^\top \theta_2^{-1} \theta_1 + \frac{1}{2} \log|\theta_2| + \frac{d}{2}\log(2\pi) $$

**Final Entropy Calculation**

$$H = F(\eta) - \langle \eta, \nabla F(\eta) \rangle$$

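As a quick numerical check of the formulas above, here is a minimal sketch (illustrative only, not part of the repository code) that evaluates $H = F(\eta) - \langle \eta, \nabla F(\eta) \rangle$ for a multivariate Gaussian and compares it against the familiar closed form $\frac{1}{2}\log|2\pi e\Sigma|$:

```python
import numpy as np

# Entropy of a multivariate Gaussian via its natural parameters
# (a sketch of the formulas above, not repository code).
rng = np.random.RandomState(0)
d = 3
mu = rng.randn(d)
A = rng.randn(d, d)
Sigma = A @ A.T + d * np.eye(d)           # positive-definite covariance

# natural parameters: eta1 = Sigma^{-1} mu, eta2 = 0.5 * Sigma^{-1}
eta1 = np.linalg.solve(Sigma, mu)
eta2 = 0.5 * np.linalg.inv(Sigma)
eta2_inv = np.linalg.inv(eta2)

# log normalizer F(eta) and its gradient
F = (0.25 * eta1 @ eta2_inv @ eta1
     - 0.5 * np.linalg.slogdet(eta2)[1]
     + 0.5 * d * np.log(np.pi))
grad1 = 0.5 * eta2_inv @ eta1
grad2 = -0.5 * eta2_inv - 0.25 * np.outer(eta2_inv @ eta1, eta2_inv @ eta1)

# H = F(eta) - <eta, grad F(eta)>, summing the inner product over both blocks
H = F - (eta1 @ grad1 + np.sum(eta2 * grad2))

# closed form: 0.5 * log det(2 * pi * e * Sigma)
H_closed = 0.5 * np.linalg.slogdet(2 * np.pi * np.e * Sigma)[1]
print(H, H_closed)                        # the two values agree
```
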

## Resources

* A closed-form expression for the Sharma-Mittal entropy of exponential families - Nielsen & Nock (2012) - [Paper]()
* Statistical exponential families: A digest with flash cards - [Paper](https://arxiv.org/pdf/0911.4863.pdf)
* The Exponential Family: Getting Weird Expectations! - [Blog](https://zhiyzuo.github.io/Exponential-Family-Distributions/)
* Deep Exponential Family - [Code](https://github.com/tensorflow/probability/blob/master/tensorflow_probability/examples/deep_exponential_family.py)
* PyMEF: A Framework for Exponential Families in Python - [Code](https://github.com/pbrod/pymef) | [Paper](http://www-connex.lip6.fr/~schwander/articles/ssp2011.pdf)
98 changes: 98 additions & 0 deletions docs/gaussian.md
@@ -0,0 +1,98 @@
# Gaussian Distribution



### **PDF**

$$f(X)=
\frac{1}{\sqrt{(2\pi)^D|\Sigma|}}
\text{exp}\left( -\frac{1}{2} (x-\mu)^\top \Sigma^{-1} (x-\mu)\right)$$

### **Likelihood**

$$- \ln L = \frac{1}{2}\ln|\Sigma| + \frac{1}{2}(x-\mu)^\top \Sigma^{-1} (x - \mu) + \frac{D}{2}\ln 2\pi $$

### Alternative Representation

$$X \sim \mathcal{N}(\mu, \Sigma)$$

where $\mu$ is the mean and $\Sigma$ is the covariance. Let's decompose $\Sigma$ with an eigendecomposition like so

$$\Sigma = U\Lambda U^\top = U \Lambda^{1/2}(U\Lambda^{1/2})^\top$$

Now we can represent our Normal distribution as:

$$X \sim \mu + U\Lambda^{1/2}Z$$



where:

* $U$ is a rotation matrix
* $\Lambda^{1/2}$ is a scale matrix
* $\mu$ is a translation vector
* $Z \sim \mathcal{N}(0,I)$

or also

$$X \sim \mu + UZ$$

where:

* $U$ is a rotation matrix
* $\Lambda$ is a (diagonal) scale matrix, absorbed here into the covariance of $Z$
* $\mu$ is a translation vector
* $Z \sim \mathcal{N}(0,\Lambda)$


#### Reparameterization

In deep learning we often learn this distribution through a reparameterization like so:

$$X = \mu + AZ $$

where:

* $\mu \in \mathbb{R}^{d}$
* $A \in \mathbb{R}^{d\times l}$
* $Z \sim \mathcal{N}(0, I)$
* $\Sigma=AA^\top$, e.g. with $A$ the Cholesky factor of $\Sigma$

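A small sampling sketch (illustrative only, not repository code) drawing from $\mathcal{N}(\mu, \Sigma)$ via $X = \mu + AZ$, with $A$ taken either from the Cholesky factorization or as $U\Lambda^{1/2}$ from the eigendecomposition:

```python
import numpy as np

rng = np.random.RandomState(42)

mu = np.array([1.0, -1.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

# Option 1: A from the Cholesky decomposition, Sigma = A A^T
A = np.linalg.cholesky(Sigma)

# Option 2: A = U Lambda^{1/2} from the eigendecomposition Sigma = U Lambda U^T
lam, U = np.linalg.eigh(Sigma)
A_eig = U @ np.diag(np.sqrt(lam))

# X = mu + A Z with Z ~ N(0, I), applied row-wise
Z = rng.randn(100_000, 2)
X = mu + Z @ A.T

print(X.mean(axis=0))                 # close to mu
print(np.cov(X, rowvar=False))        # close to Sigma
```

Either choice of $A$ yields the same distribution, since only $AA^\top = \Sigma$ matters.
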


---
### **Entropy**

**1 dimensional**

$$H(X) = \frac{1}{2} \log(2\pi e \sigma^2)$$

**D dimensional**
$$H(X) = \frac{D}{2} + \frac{D}{2} \ln(2\pi) + \frac{1}{2}\ln|\Sigma|$$
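
A quick numerical check of the $D$-dimensional formula (a sketch assuming `scipy` is available):

```python
import numpy as np
from scipy import stats

Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
D = Sigma.shape[0]

# H(X) = D/2 + (D/2) ln(2 pi) + 0.5 ln |Sigma|
H_formula = 0.5 * D + 0.5 * D * np.log(2 * np.pi) + 0.5 * np.linalg.slogdet(Sigma)[1]
H_scipy = stats.multivariate_normal(np.zeros(D), Sigma).entropy()
print(H_formula, H_scipy)   # the two values agree
```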


### **KL-Divergence (Relative Entropy)**

$$
KLD(\mathcal{N}_0||\mathcal{N}_1) = \frac{1}{2}
\left[
\text{tr}(\Sigma_1^{-1}\Sigma_0) +
(\mu_1 - \mu_0)^\top \Sigma_1^{-1} (\mu_1 - \mu_0) -
D + \ln \frac{|\Sigma_1|}{|\Sigma_0|}
\right]
$$

if $\mu_1=\mu_0$ then:

$$
KLD(\Sigma_0||\Sigma_1) = \frac{1}{2} \left[
\text{tr}(\Sigma_1^{-1} \Sigma_0) - D + \ln \frac{|\Sigma_1|}{|\Sigma_0|} \right]
$$
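
A numerical sanity check of this expression (a sketch, not repository code), comparing the closed form against a Monte Carlo estimate of $\mathbb{E}_{p_0}[\log p_0(x) - \log p_1(x)]$:

```python
import numpy as np
from scipy import stats

def kl_gauss(mu0, S0, mu1, S1):
    """KL( N(mu0, S0) || N(mu1, S1) ) using the closed-form expression above."""
    D = len(mu0)
    S1_inv = np.linalg.inv(S1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(S1_inv @ S0)
                  + diff @ S1_inv @ diff
                  - D
                  + np.linalg.slogdet(S1)[1] - np.linalg.slogdet(S0)[1])

mu0, S0 = np.zeros(2), np.array([[1.0, 0.3], [0.3, 0.5]])
mu1, S1 = np.array([0.5, -0.2]), np.array([[1.5, -0.2], [-0.2, 1.0]])

p0 = stats.multivariate_normal(mu0, S0)
p1 = stats.multivariate_normal(mu1, S1)
x = p0.rvs(200_000, random_state=0)

print(kl_gauss(mu0, S0, mu1, S1))              # closed form
print(np.mean(p0.logpdf(x) - p1.logpdf(x)))    # Monte Carlo estimate (close)
```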

**Mutual Information**

$$I(X)= - \frac{1}{2} \ln | \rho_0 |$$

where $\rho_0$ is the correlation matrix from $\Sigma_0$.
