(fix): obs filtering #148

Merged
merged 17 commits into from
Oct 16, 2024

ilan-gold
Contributor

Fixes #146

@ilan-gold ilan-gold changed the title (fix): fix obs filtering (fix): obs filtering Oct 15, 2024
@ilan-gold ilan-gold requested a review from gtca October 15, 2024 10:08
@ilan-gold ilan-gold mentioned this pull request Oct 15, 2024
# Collect elements to subset
# NOTE: accessing them after subsetting .obs
# will fail due to _validate_value()
obsm = dict(data.obsm)
Contributor Author

We probably need to use _remove_unused_categories on the dataframe. Please open an issue in anndata about making this public.
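To illustrate what dropping unused categories means after subsetting, here is a minimal sketch using only public pandas API. It does not call anndata's private _remove_unused_categories; the helper name drop_unused_categories and the toy obs table are hypothetical, chosen for illustration only.

```python
import pandas as pd


def drop_unused_categories(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df whose categorical columns keep only observed categories.

    Hypothetical helper for illustration; not anndata's private implementation.
    """
    out = df.copy()
    for col in out.columns:
        if isinstance(out[col].dtype, pd.CategoricalDtype):
            out[col] = out[col].cat.remove_unused_categories()
    return out


# Toy .obs-like table (assumed data, not taken from the PR)
obs = pd.DataFrame(
    {
        "cell_type": pd.Categorical(["B", "T", "NK", "T"]),
        "n_genes": [10, 20, 30, 40],
    },
    index=["c1", "c2", "c3", "c4"],
)

# Boolean subsetting keeps "B" as a category even though no row uses it
subset = obs[obs["n_genes"] > 15]

# After cleaning, only the categories still present in the rows remain
cleaned = drop_unused_categories(subset)
```

Without the clean-up step, the filtered table still advertises categories that no longer occur in any row, which is the kind of stale state the comment above is concerned with.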

Comment on lines 149 to 156
A = pbmc3k_processed[:500,].copy()
B = pbmc3k_processed[500:,].copy()
mdata = mu.MuData({"A": A, "B": B})
np.random.seed(42)
var_sel = np.random.choice(np.array([0, 1]), size=mdata.n_vars, replace=True)
mdata.var["sel"] = var_sel
mu.pp.filter_var(mdata, "sel", lambda y: y == 1)
assert mdata.shape[1] == int(np.sum(var_sel))
Contributor Author

Update this test as well to check the full object, not just its shape, e.g. assert_equal(mdata["B"], B_subset)
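The difference between a shape-only check and a full-object check can be sketched on a plain pandas table. This is an analogy under stated assumptions: filter_rows is a hypothetical stand-in for the filtering function under test, and pandas' assert_frame_equal plays the role that assert_equal plays for AnnData objects in the suggestion above.

```python
import numpy as np
import pandas as pd
from pandas.testing import assert_frame_equal

# Toy table with a random 0/1 selection column (assumed data)
rng = np.random.default_rng(42)
obs = pd.DataFrame(
    {"sel": rng.integers(0, 2, size=8), "score": np.arange(8.0)},
    index=[f"cell{i}" for i in range(8)],
)

# Build the expected result independently, before filtering
expected = obs.loc[obs["sel"] == 1].copy()


def filter_rows(df: pd.DataFrame, column: str, predicate) -> pd.DataFrame:
    """Hypothetical stand-in for the filter under test: keep rows where predicate holds."""
    return df.loc[predicate(df[column])]


filtered = filter_rows(obs, "sel", lambda y: y == 1)

# Shape-only check: passes even if the wrong rows with the right count were kept
assert filtered.shape[0] == int(obs["sel"].sum())

# Full-object check: index, columns, dtypes, and values must all match
assert_frame_equal(filtered, expected)
```

The full comparison catches bugs that a count cannot, such as keeping the right number of rows but the wrong ones, or leaving other attributes out of sync.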

@@ -756,6 +764,18 @@ def func(x):
data.raw._n_obs = data.raw.X.shape[0]

else:
    # Subset .obs
    data._obs = data.obs[obs_subset]
    data._n_obs = data.obs.shape[0]
Contributor Author

@ilan-gold ilan-gold Oct 16, 2024

Not sure this is necessary for AnnData. Also, please consider de-duplicating all this logic: you can use getattr(data, f"{key}m") and make key: Literal["obs", "var"] an argument to a helper.
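A minimal sketch of the suggested de-duplication, assuming a toy container in place of AnnData. ToyData, subset_axis_arrays, and the array names are hypothetical; only the getattr(data, f"{key}m") pattern and the key: Literal["obs", "var"] argument come from the comment above.

```python
from typing import Literal

import numpy as np


class ToyData:
    """Hypothetical stand-in exposing .obsm and .varm as dicts of arrays."""

    def __init__(self, obsm: dict, varm: dict):
        self.obsm = obsm
        self.varm = varm


def subset_axis_arrays(data, key: Literal["obs", "var"], mask: np.ndarray) -> dict:
    """Subset every array in data.obsm or data.varm along its first axis.

    One helper serves both axes: the attribute name is built from key,
    so the obs and var code paths no longer need to be written twice.
    """
    arrays = getattr(data, f"{key}m")  # resolves to data.obsm or data.varm
    return {name: arr[mask] for name, arr in arrays.items()}


toy = ToyData(
    obsm={"X_pca": np.arange(12).reshape(4, 3)},
    varm={"PCs": np.arange(10).reshape(5, 2)},
)

obs_mask = np.array([True, False, True, False])
var_mask = np.array([False, True, True, False, True])

obsm_subset = subset_axis_arrays(toy, "obs", obs_mask)
varm_subset = subset_axis_arrays(toy, "var", var_mask)
```

The same key argument could drive the other per-axis attributes as well (for example f"{key}p"), though whether that fits the real function depends on details not shown in this thread.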

@gtca gtca merged commit c7461aa into main Oct 16, 2024
4 checks passed
@gtca gtca deleted the ig/fix_filter branch October 16, 2024 23:20
Successfully merging this pull request may close these issues.

filter_obs is broken with latest AnnData