Add warning message to log this error. #118
Conversation
c1a127f to 69e85c2
Sorry, I don't understand this warning/error. How is this related to the model being initialized?
I added the description in the PR details. I think the error might be caused by a model initialization failure, so I added this warning message for future debugging.
@philschmid Please let us know if any other work is needed for this PR. Thanks.
TBH I am not sure there is any value in this if it is always logged; it might confuse people more.
This warning message will only show up when the model fails to initialize in the first place. We need an indicator for when the inference toolkit operates in recovery mode.
@philschmid could you please review after @chen3933's updated comment above? |
As said before, this PR makes little sense to me since
Added a check to not show the warning for the no-preload-model use case (https://github.com/awslabs/multi-model-server/blob/master/mms/model_service_worker.py#L107). Now this warning message only shows up when the first initialization failed and we load the model again when the first request comes in.
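The check described above can be sketched as a single predicate. This is a hypothetical illustration, not the toolkit's actual code: `maybe_warn` and its parameters are invented names, and the assumption is that the warning should fire only when a preload was attempted and failed, staying silent in the no-preload (lazy-loading) case.

```python
import logging

logger = logging.getLogger("handler")

def maybe_warn(attempted_preload: bool, initialized: bool) -> bool:
    # Emit the recovery-mode warning only when a preload was attempted
    # and failed; lazy loading without a preload is the expected path
    # and should stay silent.
    if attempted_preload and not initialized:
        logger.warning(
            "Initial model load failed; loading again on first request."
        )
        return True
    return False

maybe_warn(True, False)   # preload failed -> warn
maybe_warn(False, False)  # no preload attempted -> silent
maybe_warn(True, True)    # already initialized -> silent
```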
Just chiming in here, but it looks like @chen3933 is using MMS. In MMS, there are three possible ways for the model to be loaded:
I believe @chen3933's change is referring to the 3rd possibility.
Wondering about the status of this PR and whether there are any further changes, or can we get this merged @philschmid?
Issue #, if available:
(Potential) Root cause:
If the model load fails the first time initialize() is called, self.load is overwritten by the customer's model_fn, but self.initialized remains False:
https://github.com/aws/sagemaker-huggingface-inference-toolkit/blob/main/src/sagemaker_huggingface_inference_toolkit/handler_service.py#L85-L89
Then, when the customer sends the first request, self.initialize() is called again: https://github.com/aws/sagemaker-huggingface-inference-toolkit/blob/main/src/sagemaker_huggingface_inference_toolkit/handler_service.py#L248. Since self.model is the same as model_fn, we will always pass in context.
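The failure-then-recovery flow above can be sketched in a few lines. This is a simplified illustration, not the toolkit's actual handler_service.py: the class, the flaky_load helper, and the return values are invented for the example; only the pattern (startup initialize() fails, self.initialized stays False, and the first request re-triggers initialization with the new warning) mirrors the description.

```python
import logging

logger = logging.getLogger("handler")

class HandlerService:
    # Sketch of the flow described above; attribute names mirror the
    # real handler service, but the bodies are illustrative only.
    def __init__(self, load_fn):
        self.initialized = False
        self.load = load_fn  # may be overwritten by a customer model_fn
        self.model = None

    def initialize(self, context):
        self.model = self.load(context)
        self.initialized = True

    def handle(self, data, context):
        if not self.initialized:
            # Recovery mode: startup initialization failed, so the
            # first request re-triggers it -- the case the PR logs.
            logger.warning(
                "Model was not initialized at startup; "
                "retrying initialization on first request."
            )
            self.initialize(context)
        return self.model, data

# Simulate a load that fails at startup, then succeeds on retry.
calls = {"n": 0}

def flaky_load(context):
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("simulated startup failure")
    return "loaded-model"

svc = HandlerService(flaky_load)
try:
    svc.initialize(context=None)   # startup attempt fails
except RuntimeError:
    pass
# self.initialized stays False, exactly the state described above.
model, _ = svc.handle("payload", context=None)  # recovery on first request
```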
Description of changes:
Add a warning message for future debugging.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.