Skip to content

MXNetModel deployment error #1831

@goldmermaid

Description

@goldmermaid

Describe the bug
Deployment fails when instance_type='local'.

To reproduce

import sagemaker
from sagemaker.mxnet.model import MXNetModel

s3_path = "s3://{}/{}".format(sess.default_bucket(), s3_bucket_name)
sagemaker_model = MXNetModel(model_data=s3_model_path, #train_data_upload,
                             image="kdd20200813:latest", # "{}/{}".format(ecr_image_name, "latest"),
                             role=sagemaker.get_execution_role(), 
                             py_version='py3',            # python version
                             framework_version="1.6",
                             entry_point='serve.py',
                             source_dir='.')

Expected behavior
Deployment works correctly local and at remote GPU

Screenshots or logs

Attaching to tmp454nevxt_algo-1-qjr8q_1
algo-1-qjr8q_1  | Warning: Calling MMS with mxnet-model-server. Please move to multi-model-server.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,467 [INFO ] main com.amazonaws.ml.mms.ModelServer - 
algo-1-qjr8q_1  | MMS Home: /usr/local/lib/python3.6/site-packages
algo-1-qjr8q_1  | Current directory: /
algo-1-qjr8q_1  | Temp directory: /home/model-server/tmp
algo-1-qjr8q_1  | Number of GPUs: 0
algo-1-qjr8q_1  | Number of CPUs: 8
algo-1-qjr8q_1  | Max heap size: 13646 M
algo-1-qjr8q_1  | Python executable: /usr/local/bin/python3.6
algo-1-qjr8q_1  | Config file: /etc/sagemaker-mms.properties
algo-1-qjr8q_1  | Inference address: http://0.0.0.0:8080
algo-1-qjr8q_1  | Management address: http://0.0.0.0:8080
algo-1-qjr8q_1  | Model Store: /.sagemaker/mms/models
algo-1-qjr8q_1  | Initial Models: ALL
algo-1-qjr8q_1  | Log dir: /logs
algo-1-qjr8q_1  | Metrics dir: /logs
algo-1-qjr8q_1  | Netty threads: 0
algo-1-qjr8q_1  | Netty client threads: 0
algo-1-qjr8q_1  | Default workers per model: 8
algo-1-qjr8q_1  | Blacklist Regex: N/A
algo-1-qjr8q_1  | Maximum Response Size: 6553500
algo-1-qjr8q_1  | Maximum Request Size: 6553500
algo-1-qjr8q_1  | Preload model: false
algo-1-qjr8q_1  | Prefer direct buffer: false
algo-1-qjr8q_1  | 2020-08-13 20:05:51,555 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-9000-model
algo-1-qjr8q_1  | 2020-08-13 20:05:51,705 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - model_service_worker started with args: --sock-type unix --sock-name /home/model-server/tmp/.mms.sock.9000 --handler sagemaker_mxnet_serving_container.handler_service --model-path /.sagemaker/mms/models/model --model-name model --preload-model false --tmp-dir /home/model-server/tmp
algo-1-qjr8q_1  | 2020-08-13 20:05:51,706 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Listening on port: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1  | 2020-08-13 20:05:51,706 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - [PID] 71
algo-1-qjr8q_1  | 2020-08-13 20:05:51,706 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - MMS worker started.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,706 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Python runtime: 3.6.10
algo-1-qjr8q_1  | 2020-08-13 20:05:51,707 [INFO ] main com.amazonaws.ml.mms.wlm.ModelManager - Model model loaded.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,713 [INFO ] main com.amazonaws.ml.mms.ModelServer - Initialize Inference server with: EpollServerSocketChannel.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,724 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1  | 2020-08-13 20:05:51,724 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1  | 2020-08-13 20:05:51,724 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1  | 2020-08-13 20:05:51,724 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1  | 2020-08-13 20:05:51,725 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1  | 2020-08-13 20:05:51,725 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1  | 2020-08-13 20:05:51,724 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1  | 2020-08-13 20:05:51,724 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Connecting to: /home/model-server/tmp/.mms.sock.9000
algo-1-qjr8q_1  | 2020-08-13 20:05:51,797 [INFO ] main com.amazonaws.ml.mms.ModelServer - Inference API bind to: http://0.0.0.0:8080
algo-1-qjr8q_1  | Model server started.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,802 [WARN ] pool-2-thread-1 com.amazonaws.ml.mms.metrics.MetricCollector - worker pid is not available yet.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,810 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,810 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,810 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,811 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,813 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,815 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,817 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
algo-1-qjr8q_1  | 2020-08-13 20:05:51,819 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Connection accepted: /home/model-server/tmp/.mms.sock.9000.
algo-1-qjr8q_1  | 2020-08-13 20:05:54,764 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=0242acfffe120002-00000031-00000000-17ec808cce3dc3f5-fc4cfd4e
algo-1-qjr8q_1  | 2020-08-13 20:05:54,806 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 2901
algo-1-qjr8q_1  | 2020-08-13 20:05:54,809 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-7
algo-1-qjr8q_1  | 2020-08-13 20:05:54,809 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=0242acfffe120002-00000031-00000003-7d6a408cce3dc3f5-98e5582f
algo-1-qjr8q_1  | 2020-08-13 20:05:54,823 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 2918
algo-1-qjr8q_1  | 2020-08-13 20:05:54,823 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-1
algo-1-qjr8q_1  | 2020-08-13 20:05:54,845 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=0242acfffe120002-00000031-00000002-fe83808cce3dc3f5-c2242e3f
algo-1-qjr8q_1  | 2020-08-13 20:05:54,846 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 2941
algo-1-qjr8q_1  | 2020-08-13 20:05:54,850 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-8
algo-1-qjr8q_1  | 2020-08-13 20:05:54,849 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=0242acfffe120002-00000031-00000001-1dda808cce3dc3f5-8771b606
algo-1-qjr8q_1  | 2020-08-13 20:05:54,850 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 2947
algo-1-qjr8q_1  | 2020-08-13 20:05:54,851 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-2
algo-1-qjr8q_1  | 2020-08-13 20:05:54,852 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=0242acfffe120002-00000031-00000007-1d3bc08cce3dc3f5-2c312a2b
algo-1-qjr8q_1  | 2020-08-13 20:05:54,853 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 2939
algo-1-qjr8q_1  | 2020-08-13 20:05:54,853 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-4
algo-1-qjr8q_1  | 2020-08-13 20:05:54,906 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=0242acfffe120002-00000031-00000005-7888c08cce3dc3f5-db7c23ef
algo-1-qjr8q_1  | 2020-08-13 20:05:54,906 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 3001
algo-1-qjr8q_1  | 2020-08-13 20:05:54,906 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-3
algo-1-qjr8q_1  | 2020-08-13 20:05:54,909 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=0242acfffe120002-00000031-00000008-5c4fc08cce3dc3f5-26e3f942
algo-1-qjr8q_1  | 2020-08-13 20:05:54,909 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 2970
algo-1-qjr8q_1  | 2020-08-13 20:05:54,910 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-5
algo-1-qjr8q_1  | 2020-08-13 20:05:54,937 [INFO ] W-9000-model-stdout com.amazonaws.ml.mms.wlm.WorkerLifeCycle - Model model loaded io_fd=0242acfffe120002-00000031-00000006-d83ac08cce3dc3f5-941d61cb
algo-1-qjr8q_1  | 2020-08-13 20:05:54,937 [INFO ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerThread - Backend response time: 3032
algo-1-qjr8q_1  | 2020-08-13 20:05:54,937 [WARN ] W-9000-model com.amazonaws.ml.mms.wlm.WorkerLifeCycle - attachIOStreams() threadName=W-model-6
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-19-fdda3d4ea866> in <module>
----> 1 predictor = sagemaker_model.deploy(initial_instance_count=1, instance_type='local') # 'ml.c4.xlarge')

~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/sagemaker/model.py in deploy(self, initial_instance_count, instance_type, accelerator_type, endpoint_name, update_endpoint, tags, kms_key, wait, data_capture_config)
    515                 kms_key=kms_key,
    516                 wait=wait,
--> 517                 data_capture_config_dict=data_capture_config_dict,
    518             )
    519 

~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/sagemaker/session.py in endpoint_from_production_variants(self, name, production_variants, tags, kms_key, wait, data_capture_config_dict)
   2903 
   2904             self.sagemaker_client.create_endpoint_config(**config_options)
-> 2905         return self.create_endpoint(endpoint_name=name, config_name=name, tags=tags, wait=wait)
   2906 
   2907     def expand_role(self, role):

~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/sagemaker/session.py in create_endpoint(self, endpoint_name, config_name, tags, wait)
   2420 
   2421         self.sagemaker_client.create_endpoint(
-> 2422             EndpointName=endpoint_name, EndpointConfigName=config_name, Tags=tags
   2423         )
   2424         if wait:

~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/sagemaker/local/local_session.py in create_endpoint(self, EndpointName, EndpointConfigName, Tags)
    264         endpoint = _LocalEndpoint(EndpointName, EndpointConfigName, Tags, self.sagemaker_session)
    265         LocalSagemakerClient._endpoints[EndpointName] = endpoint
--> 266         endpoint.serve()
    267 
    268     def update_endpoint(self, EndpointName, EndpointConfigName):  # pylint: disable=unused-argument

~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/sagemaker/local/entities.py in serve(self)
    472 
    473         serving_port = get_config_value("local.serving_port", self.local_session.config) or 8080
--> 474         _wait_for_serving_container(serving_port)
    475         # the container is running and it passed the healthcheck status is now InService
    476         self.state = _LocalEndpoint._IN_SERVICE

~/anaconda3/envs/mxnet_p36/lib/python3.6/site-packages/sagemaker/local/entities.py in _wait_for_serving_container(serving_port)
    507         i += 5
    508         if i >= HEALTH_CHECK_TIMEOUT_LIMIT:
--> 509             raise RuntimeError("Giving up, endpoint didn't launch correctly")
    510 
    511         logger.info("Checking if serving container is up, attempt: %s", i)

RuntimeError: Giving up, endpoint didn't launch correctly

-System information

  • SageMaker Python SDK version: sagemaker 1.71.0

  • Framework name (eg. PyTorch) or algorithm (eg. KMeans):

  • Framework version:

  • Python version:

  • CPU or GPU:

  • Custom Docker image (Y/N):

Additional context
I am new to Sagemaker SDK. So I wonder if something I was missing. Thanks for the help!

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions