singularity on HPC with slurm #837

Open
laurie-tonon opened this issue Aug 5, 2024 · 8 comments

@laurie-tonon

Hello,

I would like to create Singularity containers running RStudio Server for my projects and use them on our HPC cluster.
I started by following your tutorial here: https://rocker-project.org/use/singularity.html, but when I run the job, it stops instantly.
I've tried running an interactive job and then executing the script commands, and I get the following error:

TTY detected. Printing informational message about logging configuration. Logging configuration loaded from '/etc/rstudio/logging.conf'. Logging to 'syslog'.
2024-08-05T13:46:31.825727Z [rserver] ERROR Attempt to run server as user 'rstudio-server' (uid 999) from account 'tonon' (uid 10004) without privilege, which is required to run as a different uid; LOGGED FROM: virtual rstudio::core::ProgramStatus rstudio::server::Options::read(int, char* const*, std::ostream&) src/cpp/server/ServerOptions.cpp:318

If I understand correctly, inside the container I am still myself (user tonon), but only the user rstudio-server is allowed to launch the server?

How can I start the rstudio-server with my account?

Thanks a lot

Laurie

eddelbuettel transferred this issue from rocker-org/rocker Aug 5, 2024
@benz0li
Contributor

benz0li commented Aug 5, 2024

@eddelbuettel
Member

eddelbuettel commented Aug 5, 2024

I can never shake the feeling that there is a bit of an impedance mismatch here. When I used Slurm for batch jobs, those were never interactive. Then again, the 'rm' in 'slurm' stands for 'resource manager', so maybe it is appropriate to try to manage RStudio sessions that way, but it still feels a bit odd. I haven't worked in an HPC setting in some time, though, so maybe things are different now, or different at your site.

@cboettig
Member

cboettig commented Aug 6, 2024

@laurie-tonon thanks for reporting. I think this is a limitation of the rstudio-server setup. It is possible to run an RStudio session as non-root, though; e.g. the rocker/binder image is already set up to do this (via JupyterHub and rsession-proxy). Can you give that a try?
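For illustration, something along these lines might work with Singularity (an untested sketch; it assumes the rocker/binder image ships JupyterLab and jupyter-rsession-proxy, and the tag, port, and file names are placeholders):

# pull the binder image (tag is illustrative)
singularity pull binder.sif docker://rocker/binder:4.4.1

# start JupyterLab as your own (non-root) user; RStudio should then be
# reachable through rsession-proxy at http://<node>:8888/rstudio/
singularity exec binder.sif jupyter lab --no-browser --ip=0.0.0.0 --port=8888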

@eddelbuettel I totally hear you on this mismatch between HPC and interactive use, but in my experience I think this issue often reflects a mismatch between computing needs in many research settings and what computing providers offer. Many university HPC centers have been providing batch-based compute since long before interactive computing was even a thing, and most continue to operate exclusively this way. I have seen very few HPC managers say "sure, we'll set up a k8s cluster for on-demand needs!" I have seen many respond "we can run any compute you need, so long as it runs on our SLURM queue." For instance, take a look at Lawrence Berkeley National Lab's recent efforts with "Open OnDemand" https://it.lbl.gov/service/scienceit/high-performance-computing/lrc/open-ondemand/.

Personally I agree with you that this seems a bit of an impedance mismatch, and most of the time everyone would be much better off if more university HPC centers considered running one of the highly polished Kubernetes solutions out there...

@nathanweeks

@laurie-tonon This looks like the error that was resolved in rocker-org/website#92. I only now noticed that PR updated just the "Running a Rocker Singularity container with password authentication" section, though the rserver --server-user=$(whoami) (or equivalent) option is required for the rserver commands in the other sections of that guide as well (mea culpa for not looking at the PR more closely).

Could you try adding --server-user=$(whoami) to the rserver command to see if that resolves the issue?
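For example, something along these lines (a sketch only; the image name and port are placeholders, and the usual auth options and bind mounts are omitted):

singularity exec rstudio_4.4.2.sif \
    rserver --www-port 8787 --server-user=$(whoami)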

@mdsumner
Contributor

I build for Singularity like this from Docker Hub, fwiw:

module load singularity/4.1.0-slurm

singularity pull --force --dir $MYSOFTWARE/sif_lib docker://project/image:main

Then I launch it using a script provided here:

https://pawsey.atlassian.net/wiki/download/attachments/51925972/rstudio-on-singularity.slm?version=1&modificationDate=1655193233392&cacheVersion=1&api=v2

Instructions here (probably not your system, but it might help):

https://pawsey.atlassian.net/wiki/spaces/US/pages/51925972/How+to+Run+RStudio
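In case it helps to see it inline, here is roughly what such a job script looks like (an untested sketch along the lines of the Pawsey script and the Rocker Singularity guide; the module name, image path, port, and resource requests are placeholders to adjust for your site):

#!/bin/bash
#SBATCH --job-name=rstudio
#SBATCH --cpus-per-task=2
#SBATCH --mem=8G
#SBATCH --time=04:00:00

module load singularity/4.1.0-slurm

# writable directories that rstudio-server expects inside the read-only image
workdir=$(mktemp -d)
mkdir -p "$workdir"/{run,var-lib-rstudio-server,tmp}

# the rocker images read the password for PAM auth from this variable
export PASSWORD=$(openssl rand -base64 12)
echo "RStudio Server on $(hostname):8787  user=$USER  password=$PASSWORD"

singularity exec \
    --bind "$workdir"/run:/run,"$workdir"/var-lib-rstudio-server:/var/lib/rstudio-server,"$workdir"/tmp:/tmp \
    "$MYSOFTWARE"/sif_lib/rstudio_4.4.2.sif \
    rserver --www-port 8787 \
            --auth-none=0 --auth-pam-helper-path=pam-helper \
            --server-user=$USER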

@mdsumner
Contributor

mdsumner commented Aug 28, 2024

I only just figured out there's no need for the Docker Hub intermediary; we can pull straight from ghcr.io:

## example pulling one of the packages published here
singularity pull --dir <my_sif_dir> docker://ghcr.io/rocker-org/r-ver:4.4.1
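(For what it's worth, the pulled file should be named after the image and tag, so a quick sanity check might be:)

singularity exec <my_sif_dir>/r-ver_4.4.1.sif R --version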

@drkrynstrng

Just wanted to confirm here that the solution is indeed to specify the server user. Something like:

rserver --server-user=$USER

@nathanweeks

An update to the Rocker Singularity guide that includes (among various other updates) rserver --server-user in the job script has been proposed: rocker-org/website#120
Comments or suggested changes would be appreciated.

eitsupi added a commit to rocker-org/website that referenced this issue Dec 14, 2024
Updates the Singularity guide:

* Set `rserver --server-user` in the job script
  (rocker-org/rocker-versioned2#837)
* Show where R_PROFILE_USER and R_LIBS_USER can be set
  (rocker-org/rocker-versioned2#855)
* Use rocker/rstudio:4.4.2
* Specify `python3` instead of `python` (the latter may be Python 2 or absent on some hosts)
* Mention Apptainer
* mktemp is simpler and portable enough (GNU coreutils) for creating temp dirs
* Simplify creation of the various writable directories in the container with `--scratch` and `--workdir`
* Ensure the R_LIBS_USER directory exists
* No longer need to create a readable database.conf (sqlite seems to be the default)
* SLURM_CPUS_ON_NODE is technically more correct than SLURM_JOB_CPUS_PER_NODE (though it shouldn't make a difference with a single-node allocation)

---------

Co-authored-by: eitsupi <[email protected]>
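To illustrate the --scratch/--workdir and R_LIBS_USER points from that commit message, a rough sketch (not the PR's actual wording; the library path, image name, and port are placeholders):

# throw-away host directory backing /tmp, /var/tmp and the --scratch dirs
workdir=$(mktemp -d)

# per-user library; create it so install.packages() has somewhere to write
export R_LIBS_USER=$HOME/R/rocker-rstudio/4.4.2
mkdir -p "$R_LIBS_USER"

singularity exec \
    --scratch /run --scratch /var/lib/rstudio-server \
    --workdir "$workdir" \
    rstudio_4.4.2.sif \
    rserver --www-port 8787 --server-user=$USER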