I am currently experimenting with different CNN and LSTM model architectures for my multivariate time series classification problem. I can achieve validation accuracy of better than 50%. I would like to lock down an exact architecture at some stage instead of experimenting endlessly. In order to decide this, I also want to tune my hyperparameters. Question: How do I balance the need to experiment with different models, such as a standalone CNN versus a CNN with LSTM, against hyperparameter tuning? …
It looks like we don't really use an IDE in any of the machine learning workflow stages if we use AWS SageMaker; the entire workflow is done in a Jupyter notebook. Is this correct?
We followed these steps: trained 5 TensorFlow models on a local machine using 5 different training sets; saved them in .h5 format; converted them into tar.gz archives (Model1.tar.gz, ..., Model5.tar.gz) and uploaded them to an S3 bucket; then successfully deployed a single model to an endpoint using the following code:

from sagemaker.tensorflow import TensorFlowModel
sagemaker_model = TensorFlowModel(model_data = tarS3Path + 'model{}.tar.gz'.format(1),
                                  role = role,
                                  framework_version = '1.13',
                                  sagemaker_session = sagemaker_session)
predictor = sagemaker_model.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')
predictor.predict(data.values[:,0:])

The output was: {'predictions': [[153.55], [79.8196], [45.2843]]} Now the problem …
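Since the five archives follow one naming scheme, the single-model code above can be generalized with a loop. A minimal sketch, assuming the hypothetical helper name `model_keys` and placeholder bucket path (neither is from the original question):

```python
# Hypothetical sketch: build the S3 keys for all five archives from the shared
# naming scheme, then reuse the same deployment code for each of them.
def model_keys(tar_s3_path, n_models=5):
    return [tar_s3_path + 'model{}.tar.gz'.format(i) for i in range(1, n_models + 1)]

keys = model_keys('s3://my-bucket/models/')
print(keys[0])  # s3://my-bucket/models/model1.tar.gz

# Each key could then be passed as model_data and deployed to its own endpoint:
# for key in keys:
#     TensorFlowModel(model_data=key, role=role, framework_version='1.13',
#                     sagemaker_session=sagemaker_session).deploy(
#         initial_instance_count=1, instance_type='ml.m4.xlarge')
```

Note that each `deploy` call creates (and bills for) a separate instance, so five always-on endpoints may be worth weighing against alternatives.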
Hello guys, I have a question. We want to start working with AWS SageMaker. I understand that I can open a Jupyter notebook and work as if it were on my own computer. But I know pandas runs on a single node: when I work on my machine, for example, my 64 GB of memory is the limit for pandas because it isn't parallel. AWS is parallel, though, so how does pandas work there?
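To make the single-node point concrete: pandas stays single-process even on SageMaker, so a DataFrame must fit in the RAM of the one notebook instance; AWS does not parallelize it for you. One common workaround is chunked reading, which streams a file instead of loading it all at once. A minimal sketch (the in-memory CSV stands in for a large file on disk):

```python
import io
import pandas as pd

# Stand-in for a file too large to load whole: a 10-row CSV with one column.
csv = io.StringIO("x\n" + "\n".join(str(i) for i in range(10)))

# chunksize keeps at most 4 rows in memory per iteration, so the peak memory
# footprint depends on the chunk size, not the total file size.
total = 0
for chunk in pd.read_csv(csv, chunksize=4):
    total += chunk["x"].sum()

print(total)  # 45
```

For data that genuinely exceeds one instance's memory, the usual options are a bigger instance type or a distributed framework rather than plain pandas.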
I ran a complete AWS SageMaker Autopilot experiment. I now want to generate batch forecasts using this model, but I get the error: "No finished training job found associated with this estimator. Please make sure this estimator is only used for building workflow config". I'm using this tutorial as a reference. Here's my SageMaker Studio notebook Python code:

import sagemaker
from sagemaker import get_execution_role
import boto3
import os
from time import gmtime, strftime, sleep

session = sagemaker.Session()
bucket = sagemaker.Session().default_bucket()
prefix …
I ran my CNN on a SageMaker notebook and it started training, but I had to restart the kernel because AWS disconnected. However, when I tried to rerun my code I received an OOM error, and training never started again. I tried restarting the kernel and restarting the AWS machine, but the error still persisted. I find this strange given that it ran before.

ResourceExhaustedError: OOM when allocating tensor with shape[262145,25600] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator …
So, I recently created a job using AWS SageMaker Ground Truth for NER purposes, and received the output in the form of a manifest file. I'm now trying to process the manifest file into a dataframe, without success. The JSON file is incredibly complex. Here's an example of it based on the documentation: { "source": "Amazon SageMaker is a cloud machine-learning platform that was launched in November 2017. SageMaker enables developers to create, train, and deploy machine-learning (ML) …
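A manifest file is JSON Lines (one JSON object per line), so it can be flattened record by record. A hedged sketch, assuming the named-entity output schema from the Ground Truth docs (a label-attribute key named after the labeling job, here the made-up `my-ner-job`, holding `annotations.entities` with `startOffset`/`endOffset`/`label`); the keys would need adjusting to match the actual file:

```python
import json
import pandas as pd

# Stand-in for the lines of output.manifest; real files have one such JSON
# object per labeled document, plus a "-metadata" key omitted here.
manifest_lines = [
    json.dumps({
        "source": "Amazon SageMaker is a cloud machine-learning platform.",
        "my-ner-job": {
            "annotations": {
                "entities": [
                    {"label": "PRODUCT", "startOffset": 0, "endOffset": 16}
                ]
            }
        },
    })
]

# Flatten to one row per annotated entity, recovering the entity text by
# slicing the source string with the character offsets.
rows = []
for line in manifest_lines:
    record = json.loads(line)
    text = record["source"]
    for ent in record["my-ner-job"]["annotations"]["entities"]:
        rows.append({
            "text": text,
            "label": ent["label"],
            "entity": text[ent["startOffset"]:ent["endOffset"]],
        })

df = pd.DataFrame(rows)
print(df.loc[0, "entity"])  # Amazon SageMaker
```

Reading a real file is the same loop over `open("output.manifest")` instead of the in-memory list.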
I am trying to write a Dockerfile for an Amazon SageMaker container. As a first step I am following this link: https://towardsdatascience.com/brewing-up-custom-ml-models-on-aws-sagemaker-e09b64627722 In that link's section "Creating Your Own Docker Container", the last command of the Docker image is COPY xgboost /opt/program. I have no idea what the xgboost file there is for. Because of this my docker build is failing; here is the Dockerfile:

FROM ubuntu:latest
MAINTAINER Amazon AI <[email protected]>
RUN apt-get -y update …
I am training a DeepAR model (arXiv) in a Jupyter Notebook, following this tutorial. I create a collection of time series (concat_df), as needed by the DeepAR method: each row is a time series. This collection is used to train the DeepAR model. The input format expected by DeepAR is a list of series, so I create this from the above data frame:

time_series = []
for index, row in concat_df.iterrows():
    time_series.append(row)

With this list of time series I …
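For the SageMaker DeepAR training channels, the list of series eventually has to become JSON Lines, one object per series with at least a "start" timestamp and a "target" array. A minimal sketch of that conversion (the two-row frame and the start timestamp are illustrative, not from the original data):

```python
import json
import pandas as pd

# Illustrative stand-in for concat_df: each row is one short time series.
concat_df = pd.DataFrame([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

# DeepAR's file format is JSON Lines: {"start": ..., "target": [...]} per line.
# The start timestamp here is assumed; real data would carry its own.
json_lines = []
for _, row in concat_df.iterrows():
    json_lines.append(json.dumps({
        "start": "2020-01-01 00:00:00",
        "target": row.tolist(),
    }))

print(json_lines[0])
# Writing "\n".join(json_lines) to a file and uploading it to the S3 training
# channel gives DeepAR the shape of input it expects.
```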
It looks like there are different routes to deploying an ML model on SageMaker. You can: pre-train a model, create a deployment archive, then deploy create an estimator, train the model on SageMaker with a script, then deploy My question is: are there benefits of taking the second approach? To me, it seems like writing a training script would require a bit of trial and error and perhaps some extra work to package it all up neatly. Why not just …
I fine-tuned an Inception V3 model provided in AWS SageMaker to detect COVID-19 rapid test results (see the image below for an example). I provided about 20 pictures of negative and about 20 pictures of positive tests for training. All pictures were taken at slightly varying angles and positions. However, when testing the fine-tuned model, the recognition did not work at all. Is the deviation between the two image classes too small (only the red bars change)? Is there any …
I'm following this tutorial but I keep getting the error: "The number of input images must be bigger or equal to the mini_batch_size." I've tried a number of different permutations of data and hyperparameters, but I keep getting this error.

# The algorithm supports multiple network depths (numbers of layers): 18, 34, 50, 101, 152 and 200.
# For this training, we will use 18 layers
num_layers = 18
# we need to specify the input image shape …
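The error message itself suggests a quick sanity check before launching the job: every input channel has to contain at least `mini_batch_size` images, and a small validation set is a common way to trip this. A sketch with a hypothetical helper and illustrative counts:

```python
# Hypothetical helper: the job fails if any channel (train or validation) has
# fewer images than mini_batch_size, so check the smallest channel up front.
def valid_batch_size(num_train, num_validation, mini_batch_size):
    return min(num_train, num_validation) >= mini_batch_size

# Illustrative counts: 900 training images but only 100 validation images.
print(valid_batch_size(900, 100, 128))  # False: 100 < 128, the job would fail
print(valid_batch_size(900, 100, 32))   # True: both channels cover a batch
```

If the check fails, either lower `mini_batch_size` or add images to the offending channel.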
I am currently training images for classification. Initially I was using a local machine with a decent GPU. The number of training images is around 3561, and I used a batch size of 64. The output during training looks like this:

Epoch 47/100
3561/3561 [==============================] - 1643s 461ms/sample - loss: 0.7066 - acc: 0.5906 - val_loss: 2.0526 - val_acc: 0.5577
Epoch 48/100
3561/3561 [==============================] - 2221s 624ms/sample - loss: 0.7076 - acc: 0.5883 - val_loss: 2.8131 - val_acc: 0.5653

As you …
I followed the instructions from this article about creating a code-free machine learning pipeline. I already had a working pipeline offline using the same data in TPOT (autoML). I uploaded my data to AWS to try their autoML offering. I did the exact steps described in the article and uploaded my _train and _test CSV files, both with a column named 'target' that contains the target value. The following error message was returned as the failure reason: AlgorithmError: …
I'm trying to calculate the effect of operational parameters on the thickness of a wall. Each operation thins the wall, and at some point the wall is replaced and the operation starts again. My operational parameters change daily and are collected on a daily basis. However, the thickness is measured only at the end of the operation, approximately after 60 days. Therefore, for each thickness value I have 60 rows of parameter data. I'm new to machine learning. I've …
I want to use a lifecycle configuration in SageMaker Studio so that it runs on start of a user's notebook. My lifecycle configuration has a shell script that launches a cron job running a Python script to report the attached notebook's running duration.

#!/bin/bash
set -e
# PARAMETERS
IDLE_TIME=120
echo "Fetching the autostop script"
aws s3 cp s3://testing-west2/duration-check.py .
aws s3 cp s3://testing-west2/on-start.sh .
echo "Starting the SageMaker autostop script in cron"
(crontab -l 2>/dev/null; echo "*/1 * * * * …
I wanted to create SHAP values for my predictions in SageMaker. I found out that I can use the "Clarify" functionality in SageMaker to get SHAP values. However, I want to get point predictions as well. The bias config in Clarify can include a prediction function, but it seems it's just for the training data. I wonder how I can include the predict function in the SHAP config part so that it gives me both individual predictions and SHAP …
I'm using AWS SageMaker to build my model. I want to store the model in S3 for later use. How do you save your model to S3 with Amazon SageMaker? I know this seems trivial, but I didn't understand the sources/documentation I've read.
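For a model trained directly in the notebook (rather than via a SageMaker training job, which writes to S3 on its own), one route is to serialize the model, pack it into a `.tar.gz` archive, and upload that with boto3. A hedged sketch: the file names and bucket are placeholders, and the serialization step (e.g. `model.save("model.h5")`) is assumed to have happened already, with a dummy file standing in for it here:

```python
import os
import tarfile
import tempfile

def archive_model(model_path, archive_path):
    """Pack a serialized model file into the .tar.gz layout S3-hosted models use."""
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(model_path, arcname=os.path.basename(model_path))
    return archive_path

workdir = tempfile.mkdtemp()
model_file = os.path.join(workdir, "model.h5")
with open(model_file, "wb") as f:
    f.write(b"dummy weights")  # stand-in for a real saved model file

archive = archive_model(model_file, os.path.join(workdir, "model.tar.gz"))
print(tarfile.open(archive).getnames())  # ['model.h5']

# The upload itself is then one boto3 call (bucket and key are placeholders):
# import boto3
# boto3.client("s3").upload_file(archive, "my-bucket", "models/model.tar.gz")
```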
Is there a size limit imposed on models deployed on AWS SageMaker as endpoints? I first tried to deploy a simple TensorFlow/Keras Iris classification model by converting to protobuf, tarring the model, and deploying. The size of the tarred file was around 10KB, and I was able to deploy that successfully as an endpoint. However, I tried the same process with a Nasnet model where the size of the tarred file ended up being around 350MB, and I got the …
Does anyone know if the rank:ndcg objective is available on AWS SageMaker? I am currently trying to run a model, but it seems it's not implemented. Am I using an older XGBoost version? I'm kinda new to AWS SageMaker, so if you have any tutorials/docs, shoot them over.