
Add warning message to log this error. #118

Merged: 1 commit into aws:main on Jun 24, 2024

Conversation

@chen3933 (Contributor) commented May 1, 2024

Issue #, if available:

The customer started receiving 4xx errors with the following traceback:

2024-02-29T11:07:21,222 [INFO ] W-model-2-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Traceback (most recent call last):
2024-02-29T11:07:21,222 [INFO ] W-model-2-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.10/site-packages/mms/service.py", line 108, in predict
2024-02-29T11:07:21,222 [INFO ] W-model-2-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - ret = self._entry_point(input_batch, self.context)
2024-02-29T11:07:21,222 [INFO ] W-model-2-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - File "/opt/conda/lib/python3.10/site-packages/sagemaker_huggingface_inference_toolkit/handler_service.py", line 267, in handle
2024-02-29T11:07:21,222 [INFO ] W-model-2-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - raise PredictionException(str(e), 400)
2024-02-29T11:07:21,222 [INFO ] W-model-2-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - mms.service.PredictionException: model_fn() takes 1 positional argument but 2 were given : 400
2024-02-29T11:07:21,315 [INFO ] W-model-4-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Prediction error

(Potential) Root cause:

If the model load fails the first time initialize() is called, self.load has already been overwritten by the customer's model_fn, but self.initialized is still False:
https://github.com/aws/sagemaker-huggingface-inference-toolkit/blob/main/src/sagemaker_huggingface_inference_toolkit/handler_service.py#L85-L89

Then, when the customer sends the first request, self.initialize() gets called again (https://github.com/aws/sagemaker-huggingface-inference-toolkit/blob/main/src/sagemaker_huggingface_inference_toolkit/handler_service.py#L248). Since self.load is now the same function as model_fn, the signature comparison matches and we always pass in context.
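
To make the failure mode concrete, here is a minimal, self-contained sketch. It is illustrative only: function_extra_arg mirrors the toolkit's signature-checking pattern, but the names and logic are simplified and are not the toolkit's actual code.

```python
from inspect import signature

class HandlerService:
    def __init__(self):
        self.initialized = False
        self.model = None
        self.load = self.default_model_fn      # default loader takes 2 params

    def default_model_fn(self, model_dir, context):
        return "default model"

    def function_extra_arg(self, default_fn, user_fn, context):
        # Pass context only if the user fn has as many params as the default.
        if len(signature(default_fn).parameters) == len(signature(user_fn).parameters):
            return [context]
        return []

    def initialize(self, context):
        extra = self.function_extra_arg(self.load, user_model_fn, context)
        self.load = user_model_fn               # overwrite survives a failure
        self.model = self.load("/opt/ml/model", *extra)
        self.initialized = True                 # never reached if load raises

    def handle(self, data, context):
        if not self.initialized:                # retry path on the first request
            self.initialize(context)
        return self.model

# Hypothetical customer handler: takes only (model_dir), fails on first call.
calls = {"n": 0}
def user_model_fn(model_dir):
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("transient failure on first load")
    return "customer model"

svc = HandlerService()
try:
    svc.initialize("ctx")    # 1st attempt: extra == [], model_fn raises
except RuntimeError:
    pass
try:
    svc.handle(None, "ctx")  # 2nd attempt: self.load is now user_model_fn,
except TypeError as e:       # so extra == [context] and Python raises:
    print(e)  # "user_model_fn() takes 1 positional argument but 2 were given"
```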

Description of changes:

Add a warning message to aid future debugging.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@chen3933 force-pushed the main branch 2 times, most recently from c1a127f to 69e85c2 on May 2, 2024 17:35
@chen3933 changed the title from "Add default_handler function to track function signature." to "Add warning message to log this error." on May 2, 2024
@chen3933 requested a review from @philschmid on May 2, 2024 18:18

@philschmid (Collaborator) left a comment

Sorry, I don't understand this warning/error. Why is it related to whether the model is initialized?

@chen3933 (Contributor, Author)

> Sorry, I don't understand this warning/error. Why is it related to whether the model is initialized?

I added the description in the PR details. I think the error might be caused by a model initialization failure. I added this warning message for future debugging.

@lxning commented May 16, 2024

> > Sorry, I don't understand this warning/error. Why is it related to whether the model is initialized?
>
> I added the description in the PR details. I think the error might be caused by a model initialization failure. I added this warning message for future debugging.

@philschmid Please let us know if there is any other work needed for this PR. Thanks.

@philschmid (Collaborator)

TBH I am not sure there is any value in this if it is always logged; it might confuse people more.

@chen3933 (Contributor, Author)

> TBH I am not sure there is any value in this if it is always logged; it might confuse people more.

This warning message will only show up when model initialization failed in the first place. We need an indicator for when the inference toolkit operates in recovery mode.

@nikhil-sk

@philschmid could you please review after @chen3933's updated comment above?

@philschmid (Collaborator)

As said before, this PR makes little sense to me, since self.initialize will always run before the first request, and the model will be initialized on the first request. So every user would receive that warning on their first request.

@chen3933 (Contributor, Author)

> As said before, this PR makes little sense to me, since self.initialize will always run before the first request, and the model will be initialized on the first request. So every user would receive that warning on their first request.

Added a check to not show the warning for the non-preload model use case (https://github.com/awslabs/multi-model-server/blob/master/mms/model_service_worker.py#L107).

Now this warning message only shows up when the first initialize failed and we have to load the model again when the first request comes in.
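
For context, here is a hedged sketch of the kind of guard being described: a hypothetical attempted_init flag distinguishes the recovery path from a normal first load. This is illustrative only, not necessarily what the merged commit implements.

```python
import logging

logger = logging.getLogger(__name__)

class HandlerService:
    def __init__(self):
        self.initialized = False
        self.attempted_init = False  # hypothetical flag: set once initialize() has run

    def initialize(self, context):
        if self.attempted_init and not self.initialized:
            # Only the recovery path reaches this branch: a previous
            # initialize() started but never finished, so the toolkit is
            # re-initializing lazily while serving the first request.
            logger.warning(
                "Model was not initialized on a previous attempt; "
                "re-initializing while serving the first request."
            )
        self.attempted_init = True
        self.model = self.load_model(context)  # may raise, leaving initialized False
        self.initialized = True

    def load_model(self, context):
        return "model"  # stand-in for the real loading logic
```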

@btruhand commented Jun 1, 2024

> As said before, this PR makes little sense to me, since self.initialize will always run before the first request, and the model will be initialized on the first request. So every user would receive that warning on their first request.

Just chiming in here, but it looks like @chen3933 is using MMS. In MMS, there are three possible ways for the model to be loaded:

  1. Explicitly preloading the model when instantiating the Python backend
  2. Receiving a load-model request sent on the initial connection by the Java frontend
  3. Implicitly loading during a prediction request if 1/2 failed

I believe @chen3933's change is referring to the 3rd possibility (see the sketch below).
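
For readers unfamiliar with MMS, here is a rough sketch of how those three paths converge on the same initialize() call. The method names are illustrative, not MMS's actual API.

```python
class ModelServiceWorker:
    """Illustrative only: the three MMS load paths described above all
    funnel into the same service.initialize() call."""

    def __init__(self, service, context, preload=False):
        self.service = service
        if preload:                              # path 1: explicit preload when
            self.service.initialize(context)     # the Python backend starts

    def on_load_model_request(self, context):
        self.service.initialize(context)         # path 2: load request sent by
                                                 # the Java frontend on connect

    def on_predict_request(self, data, context):
        if not self.service.initialized:         # path 3: implicit load during
            self.service.initialize(context)     # prediction, if 1/2 failed
        return self.service.handle(data, context)
```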

@udaij12 commented Jun 17, 2024

Wondering about the status of this PR: are there any further changes, or can we get this merged? @philschmid

@chen3933 merged commit 45d94b7 into aws:main on Jun 24, 2024
2 checks passed