
GridsearchCV and pipeline: input dimensionality #263

Open

FlyingFordAnglia opened this issue Nov 11, 2024 · 1 comment

@FlyingFordAnglia commented Nov 11, 2024

Hi! I am trying to fit a GLM to spiking data from a population of neurons. My design matrix is the binned spike counts of all neurons, and my y is the spike counts of the neuron I am interested in. Before fitting the GLM, I wanted to run a grid search for hyperparameter tuning.
When I run the attached code, I get the following error:

TypeError: Input dimensionality mismatch. This basis evaluation requires 1 inputs, 15 inputs provided instead.

From what I can gather, it appears that the fit_transform method that GridSearchCV uses internally expects a design matrix with a single column, not one of shape (n_samples, n_features). How can I get this to work?

    # region Hyperparameter tuning
    num_bases = 10
    print(f'Number of basis functions: {num_bases}')
    basis = nemos.basis.RaisedCosineBasisLinear(n_basis_funcs=num_bases, mode="conv", window_size=filter_size)
    transformer_basis = basis.to_transformer()
    neuron = 15
    print(f'{neuron} Neurons considered = {neurons_slice[0:neuron]}')
    spike_counts = spike_dat[:][neurons_slice[0:neuron], :time_vec_cut_index].T
    train_spike_counts = spike_counts[0:int(len(spike_counts) * 0.7), :]
    pipeline = Pipeline(
        [
            ("transformerbasis", transformer_basis),
            ("glm", nemos.glm.GLM(regularizer_strength=0.5, regularizer="Ridge", solver_kwargs={'verbose': True})),
        ]
    )
    param_grid = dict(
        glm__regularizer_strength=(0.1, 0.01, 0.001, 1e-5),
        transformerbasis__n_basis_funcs=(5, 10, 15, 20),
    )
    gridsearch = GridSearchCV(
        pipeline,
        param_grid=param_grid,
        cv=2,
    )
    gridsearch.fit(train_spike_counts, train_spike_counts[:, glm_neuron_id].flatten())
    cvdf = pd.DataFrame(gridsearch.cv_results_)

    cvdf_wide = cvdf.pivot(
        index="param_transformerbasis__n_basis_funcs",
        columns="param_glm__regularizer_strength",
        values="mean_test_score",
    )
    plot_heatmap_cv_results(cvdf_wide)
    # best_params = hyper_param_tuning()
    sys.exit()
    # endregion
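For what it's worth, the shape the Pipeline hands to its first step is easy to check with a toy transformer — a minimal sketch using only scikit-learn and NumPy (ShapeRecorder and the random data here are made up for illustration, not part of my actual code):

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline

class ShapeRecorder(BaseEstimator, TransformerMixin):
    """Toy transformer that records the shape the Pipeline hands to fit."""
    def fit(self, X, y=None):
        self.seen_shape_ = np.asarray(X).shape
        return self

    def transform(self, X):
        return np.asarray(X)

rng = np.random.default_rng(0)
X = rng.poisson(1.0, size=(100, 15))  # 100 time bins, 15 neurons
y = X[:, 0]

pipe = Pipeline([("rec", ShapeRecorder()), ("ridge", Ridge())])
pipe.fit(X, y)
print(pipe.named_steps["rec"].seen_shape_)  # (100, 15)
```

So the first step always receives the full (n_samples, n_features) matrix, which is what a single nemos basis seems to choke on.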

My installed nemos version is 0.1.6 and my sklearn version is 1.5.0.

A tangential question: How do I integrate batch gradient descent with this pipeline?

Any help would be appreciated, thanks!

@sjvenditto
Collaborator

Currently, basis objects assume a single input, and addressing this is a work in progress. The current workaround is to define a basis in param_grid that matches the dimensionality of the input; in your case, that is an additive basis with the number of components (RaisedCosineBasisLinear bases) equal to the number of neurons. This will look like:

    param_grid = dict(
        glm__regularizer_strength=(0.1, 0.01, 0.001, 1e-5),
        transformerbasis__basis=[basis * neuron],
    )

where basis*neuron is shorthand for adding basis together neuron times. Unfortunately, this solution will raise another error on both the main and dev branches, having to do with transformer basis property names (and the shorthand does not exist there yet). This is being fixed in PR #235, but you can try it out in the meantime by using the fix_transformer branch of nemos if you've installed it from source. Let me know if this works for you!
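Until the shorthand lands, the additive basis can be built explicitly by summing one copy of the basis per neuron. A minimal sketch of the pattern — ToyBasis is a made-up stand-in (it only assumes, as with nemos bases, that "+" combines two bases into an additive one), so the pattern runs self-contained:

```python
from functools import reduce
import operator

class ToyBasis:
    """Stand-in for a nemos basis: '+' combines bases into an additive basis."""
    def __init__(self, n_inputs=1):
        self.n_inputs = n_inputs

    def __add__(self, other):
        return ToyBasis(self.n_inputs + other.n_inputs)

neuron = 15
# With a real nemos basis this would read reduce(operator.add, [basis] * neuron),
# i.e. what the basis * neuron shorthand is meant to expand to.
additive_basis = reduce(operator.add, [ToyBasis() for _ in range(neuron)])
print(additive_basis.n_inputs)  # 15: one component per neuron
```

The resulting additive basis then matches the 15-column design matrix, which is the dimensionality check the original error message was failing.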
