Add warning message to log this error. #118
Conversation
c1a127f to 69e85c2
Sorry, I don't understand this warning/error. How is this related to the model being initialized?
I added the description in the PR details. I think the error might be caused by a model initialization failure, so I added this warning message for future debugging.
@philschmid Please let us know if any other work is needed for this PR. Thanks.
TBH I am not sure there is any value in this if it is always logged; it might confuse people more.
This warning message will only show up when the model fails to initialize in the first place. We need an indicator for when the inference toolkit operates in recovery mode.
@philschmid could you please review after @chen3933's updated comment above? |
As said before, this PR makes little sense to me since
Added a check to not show the warning for the no-preload-model use case (https://github.com/awslabs/multi-model-server/blob/master/mms/model_service_worker.py#L107). Now this warning message only shows up when the first initialization failed and we load the model again when the first request comes in.
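The check described above can be sketched as a single predicate. This is a hypothetical illustration, not the toolkit's actual code: `maybe_warn` and its parameters are invented names, and the assumption is that the warning should fire only when a preload was attempted and failed, staying silent in the no-preload (lazy-loading) case.

```python
import logging

logger = logging.getLogger("handler")

def maybe_warn(attempted_preload: bool, initialized: bool) -> bool:
    # Emit the recovery-mode warning only when a preload was attempted
    # and failed; lazy loading without a preload is the expected path
    # and should stay silent.
    if attempted_preload and not initialized:
        logger.warning(
            "Initial model load failed; loading again on first request."
        )
        return True
    return False

maybe_warn(True, False)   # preload failed -> warn
maybe_warn(False, False)  # no preload attempted -> silent
maybe_warn(True, True)    # already initialized -> silent
```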
Just chiming in here, but it looks like @chen3933 is using MMS. In MMS, there are three possible ways for the model to be loaded:
I believe @chen3933's change is referring to the 3rd possibility.
Wondering about the status of this PR and whether there are any further changes, or can we get this merged @philschmid?
Issue #, if available:
(Potential) Root cause:
If the model load fails the first time initialize() is called, self.load is overwritten by the customer's model_fn, but self.initialized remains False:
https://github.com/aws/sagemaker-huggingface-inference-toolkit/blob/main/src/sagemaker_huggingface_inference_toolkit/handler_service.py#L85-L89
Then, when the customer sends the first request, self.initialize() is called again: https://github.com/aws/sagemaker-huggingface-inference-toolkit/blob/main/src/sagemaker_huggingface_inference_toolkit/handler_service.py#L248. Since self.model is the same as model_fn, we will always pass in context.
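The failure-then-recovery flow above can be sketched in a few lines. This is a simplified illustration, not the toolkit's actual handler_service.py: the class, the flaky_load helper, and the return values are invented for the example; only the pattern (startup initialize() fails, self.initialized stays False, and the first request re-triggers initialization with the new warning) mirrors the description.

```python
import logging

logger = logging.getLogger("handler")

class HandlerService:
    # Sketch of the flow described above; attribute names mirror the
    # real handler service, but the bodies are illustrative only.
    def __init__(self, load_fn):
        self.initialized = False
        self.load = load_fn  # may be overwritten by a customer model_fn
        self.model = None

    def initialize(self, context):
        self.model = self.load(context)
        self.initialized = True

    def handle(self, data, context):
        if not self.initialized:
            # Recovery mode: startup initialization failed, so the
            # first request re-triggers it -- the case the PR logs.
            logger.warning(
                "Model was not initialized at startup; "
                "retrying initialization on first request."
            )
            self.initialize(context)
        return self.model, data

# Simulate a load that fails at startup, then succeeds on retry.
calls = {"n": 0}

def flaky_load(context):
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("simulated startup failure")
    return "loaded-model"

svc = HandlerService(flaky_load)
try:
    svc.initialize(context=None)   # startup attempt fails
except RuntimeError:
    pass
# self.initialized stays False, exactly the state described above.
model, _ = svc.handle("payload", context=None)  # recovery on first request
```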
Description of changes:
Add a warning message for future debugging.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.