
Neuralized K-Means #197

Open · jackmcrider wants to merge 10 commits into master
Conversation

@jackmcrider jackmcrider commented Aug 16, 2023

Hi chr5tphr!

I started an attempt to implement (deep) neuralized k-means (https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9817459), since more people want to use it and are asking for code.

I took the SoftplusCanonizer from the docs as a starting point.

Main changes:

  • Add Distance, NeuralizedKMeans and LogMeanExpPool in zennit.layer (the construction behind NeuralizedKMeans is sketched below)
  • Add KMeansCanonizer in zennit.canonizers
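
For orientation, a minimal standalone sketch of the neuralization construction (hypothetical names, not the exact code in this PR): the assignment argmin_k ||x - mu_k||^2 == c is equivalent to min_{k != c} (w_k · x + b_k) > 0, because differences of squared distances are affine in x.

```python
import torch

def neuralize_kmeans(centroids, c):
    # Expanding ||x - mu_k||^2 - ||x - mu_c||^2 gives the affine form
    #   2 * (mu_c - mu_k) @ x + (||mu_k||^2 - ||mu_c||^2)
    others = torch.cat([centroids[:c], centroids[c + 1:]])  # (K-1, D)
    W = 2 * (centroids[c] - others)                         # (K-1, D)
    b = (others ** 2).sum(1) - (centroids[c] ** 2).sum()    # (K-1,)
    return W, b

# x is assigned to cluster c exactly when (W @ x + b).min() > 0
centroids = torch.randn(4, 2)
x = torch.randn(2)
W, b = neuralize_kmeans(centroids, c=2)
score = (W @ x + b).min()
```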

Some things can be optimized:

  • add tests (I wrote a few, but then changed things and did not rewrite them; I will once I see a chance of this being merged)
  • copy.deepcopy in KMeansCanonizer.register is probably not ideal
    • alternative: clone distance_module.centroids and import Distance from layer.py; then, create new Distance instance in remove()
  • KMeansCanonizer.apply scans for Distance layers, but there can be distance layers in some architectures that don't belong to k-means
  • one idea would be to add a "contrastive layer" and identify k-means as a composition of Distance followed by Contrastive, similar to MergeBatchNorm:
    • contrastive computes (out[:,None,:] - out[None,:,:])[mask].reshape(K,K-1,D), cf. lines 379-384 in canonizers.py (sketched after this list)
  • several advantages:
    • the outputs of k-means and neuralized k-means are identical
    • a contrastive layer could also be applied to classifiers to get class-contrastive explanations (cf. Fig. 10.5 in https://iphome.hhi.de/samek/pdf/MonXAI19.pdf)
    • it could replace the NeuralizedKMeans layer with something more general (differences between squared distances and differences between linear layers are both linear, i.e. a contrastive layer covers both; two separate canonizers would still be needed)
  • the scanning in KMeansCanonizer.apply does not work if Distance is the first layer (k-means in input space); one can wrap it as Sequential(Distance(centroids)) as a trick, but that is not ideal
  • if I understand correctly, KMeansCanonizer.register is executed each time one calls with Gradient(...) as attributor; this could be a bottleneck if the number of clusters is large
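
A quick standalone sketch of the pairwise-difference computation from the bullet above (hedged: `out` stands in for the K per-cluster outputs, and the off-diagonal boolean `mask` is my assumption about the referenced lines in canonizers.py):

```python
import torch

K, D = 4, 8
out = torch.randn(K, D)  # K per-cluster (or per-class) outputs of dimension D

# all pairwise differences out[c] - out[k], then drop the diagonal (c == k)
diff = out[:, None, :] - out[None, :, :]       # (K, K, D)
mask = ~torch.eye(K, dtype=torch.bool)         # True everywhere off the diagonal
contrastive = diff[mask].reshape(K, K - 1, D)  # (K, K-1, D)
```

Since every entry is a difference of two outputs of the same layer, the operation stays linear whether `out` holds (negative) squared distances or classifier logits, which is why one contrastive layer could cover both cases.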

Closes #198

@jackmcrider jackmcrider requested a review from p16i August 16, 2023 21:10
@jackmcrider jackmcrider marked this pull request as draft August 16, 2023 21:17
@jackmcrider (Author)

I just found the contributing guide and converted this to draft for now since I broke every single guideline.

@p16i I clicked somewhere and triggered a review request. Please ignore.

@jackmcrider jackmcrider force-pushed the master branch 3 times, most recently from 1c64129 to cbb350e Compare August 17, 2023 11:48
- documentation in numpydoc format
- pylint + flake8 stuff
- KMeansCanonizer
- NeuralizedKMeans layer
- LogMeanExpPool layer
- Distance layer
- Distance type
@jackmcrider jackmcrider marked this pull request as ready for review August 17, 2023 13:51
@jackmcrider (Author)

I tried to merge everything into one commit, extended the documentation, and made sure that all checks pass.
This could be reviewed now, but there are no functional changes.

@jackmcrider jackmcrider force-pushed the master branch 6 times, most recently from 2bf6f52 to 7047178 Compare August 18, 2023 14:43
- Explaining Deep Cluster Assignments with Neuralized K-Means on Image Data
- I tried to adhere to the guidelines
- That means: random data, random weights
- Code for real data and real weights is in comments
- Runs on Colab; did not test Binder
- also adds the reference to docs/source/tutorial/index.rst
@jackmcrider (Author)

Checks pass 👍

It's quite challenging to get reproducible tox results for the tutorials (e.g. I had to manually fiddle with metadata.kernelspec.name in the raw .ipynb before committing). This is probably a limitation of tox, but it could be documented for future contributors; I'm not sure where, maybe in Contributing#continuous-integration.

I'm gonna freeze this branch for now, unless something comes up.

@chr5tphr (Owner) left a comment

Hey @jackmcrider

thanks a lot for the contribution!
I have looked at your implementation and left a few comments.

I have not yet looked at the tutorial.

(Review threads on src/zennit/layer.py and src/zennit/canonizers.py; most marked outdated or resolved.)
jackmcrider and others added 7 commits August 24, 2023 09:45

- change `torch.log(torch.tensor(n_dims, dtype=...))` to `math.log(n_dims)` (Co-authored-by: Christopher <[email protected]>)
- change `setattr(parent_module, ...)` to `parent_module.add_module(...)` (Co-authored-by: Christopher <[email protected]>)
- add spaces around binary operators (Co-authored-by: Christopher <[email protected]>)

A larger follow-up commit (Co-authored-by: Christopher <[email protected]>):
- rename Distance to PairwiseCentroidDistance
- remove LogMeanExpPool (might become relevant again, but not for now)
- add MinPool1d and MinPool2d in layer.py
- add MinTakesMost1d, MaxTakesMost1d, MinTakesMost2d, MaxTakesMost2d rules
  - largely untested, especially kernel_size as int vs. kernel_size as tuple
  - in principle, MaxTakesMost2d should also work for MaxPool2d layers in
    standard conv nets, but needs some testing
- add abstract TakesMostBase class
- remove type definition for Distance in types.py
- adapt KMeans canonizer:
  - replace LogMeanExpPool with MinPool1d followed by torch.nn.Flatten
  - remove beta parameter; beta now sits in MinTakesMost1d
  - remove deepcopy and simply return the module itself
- update docs/src/tutorials/deep_kmeans.ipynb
- doc strings
- merge changes coming from the GitHub web interface
- various non-functional changes
@jackmcrider (Author)

I have committed a new version with roughly these changes:

  • remove LogMeanExpPool
  • add layers MinPool1d and MinPool2d (simple inheritance from the PyTorch MaxPool* classes; see the sketch below)
  • add rules MinTakesMost1d, MinTakesMost2d, MaxTakesMost1d, MaxTakesMost2d
  • change KMeansCanonizer to use MinPool1d instead of LogMeanExpPool
  • change the tutorial to use MinTakesMost1d at the output layer
  • apply changes requested in review (rename Distance to PairwiseCentroidDistance, use self.parent_module.add_module(...) instead of setattr(self.parent_module, ...))
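
For reference, a hypothetical sketch of the "simple inheritance" mentioned above (min-pooling as sign-flipped max-pooling; not necessarily the exact code in this PR):

```python
import torch

class MinPool1d(torch.nn.MaxPool1d):
    """Min-pooling via the identity min(x) == -max(-x)."""

    def forward(self, input):
        return -super().forward(-input)

pool = MinPool1d(kernel_size=3)
x = torch.randn(2, 4, 9)  # (batch, channels, length)
y = pool(x)               # (2, 4, 3): per-window minima
```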

I'm not sure whether we want four separate *TakesMost* rules, or whether one rule with mode='max'/mode='min' and some auto-detection of 1d vs. 2d would be better.

Successfully merging this pull request may close these issues.

Neuralized K-Means: make k-means amenable to neural network explanations