'condor_watch_q -user <some-other-user>' will error out for non-privileged users #51

Open
elin1231 opened this issue Sep 30, 2020 · 2 comments


@elin1231
Contributor

Bert DeKnuydt reported an issue as follows:

But 'condor_watch_q -user <some-other-user>' will error out
for non-privileged users as follows:

(And often for privileged users too, if the logfiles are over
e.g. NFS)

~ » id
uid=4653(deknuydt) gid=5800(visics) groups=5800(visics),1207(vdi-users),6200(xvisics)
~ » id xma
uid=35402(xma) gid=5800(visics) groups=5800(visics),1207(vdi-users)
~ » condor_watch_q -user xma
WARNING: Could not open event log at /esat/vauxite/xma/sing_images/ch-resnet-cifar10/scripts/condor/condor_files/2020-09-30-01-58-vgg11-adam-0.01-0.089spjpe9/condor_job.log for reading, so it will be ignored. Reason: JobEventLog not initialized. Check the debug log, looking for ReadUserLog or FileModifiedTrigger. (Or call htcondor.enable_debug() and try again.)
WARNING: Could not open event log at /esat/vauxite/xma/sing_images/ch-resnet-cifar10/scripts/condor/condor_files/2020-09-30-01-58-resnet8-sgd-0.1-0.001btxpvurh/condor_job.log for reading, so it will be ignored. Reason: JobEventLog not initialized. Check the debug log, looking for ReadUserLog or FileModifiedTrigger. (Or call htcondor.enable_debug() and try again.)
Processing new events...ERROR: Unhandled error: list index out of range. Re-run with -debug for a full stack trace.

The reason is simply that the logfiles are not readable (permission denied).

  • The above would make you think something is seriously wrong, but it isn't.
    You could easily clarify the error a bit, but that does not solve the real
    problem of permissions; probably not fixable at all. Or do you see
    a workaround?
@elin1231
Contributor Author

He has highlighted the importance of this feature for the faculty, so I think I should try and find a workaround if possible.

@bbockelm
Owner

Looks like two issues:

  1. If HTCondor isn't doing a good job providing error messages about unreadable files, then we should simply test-open the file first using a normal python open.
  2. Whatever is causing this issue (ERROR: Unhandled error: list index out of range) likely needs to be fixed. Maybe something is assuming the list of usable files is non-empty?
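A minimal sketch of both points, assuming the tool holds a plain list of event log paths before it starts reading events. The function names `split_readable` and `check_event_logs` are hypothetical and are not part of condor_watch_q or the HTCondor Python bindings; only the standard library is used.

```python
import sys
from pathlib import Path
from typing import List, Tuple


def split_readable(paths: List[Path]) -> Tuple[List[Path], List[Tuple[Path, str]]]:
    """Test-open each event log with a plain open().

    Returns (readable_paths, [(unreadable_path, reason), ...]).
    """
    readable = []
    unreadable = []
    for path in paths:
        try:
            # Open and immediately close; we only care whether it succeeds.
            with open(path, mode="rb"):
                pass
        except OSError as e:
            # Covers permission denied, missing file, stale NFS handle, etc.
            unreadable.append((path, e.strerror or str(e)))
        else:
            readable.append(path)
    return readable, unreadable


def check_event_logs(paths: List[Path]) -> List[Path]:
    """Warn about unreadable logs and stop early if none are usable."""
    readable, unreadable = split_readable(paths)
    for path, reason in unreadable:
        print(
            f"WARNING: Could not open event log at {path} for reading: {reason}",
            file=sys.stderr,
        )
    if not readable:
        # Guard against the empty-list case instead of indexing into it later.
        raise SystemExit("ERROR: None of the event logs could be opened for reading.")
    return readable
```

With this kind of pre-check, a permission problem would surface as the operating system's own reason (for example "Permission denied") rather than "JobEventLog not initialized", and an empty list of usable files would end with an explicit message rather than an index error.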
