Compressing profiles with a large number of dimensions

I think this is a related question, but I'm not sure how to apply it. I'm trying to build a very crude recommendation system using Amazon ML, Facebook likes, and historical actions. Let's say we have a number of users within a system that promotes products in several categories. To better predict which categories of items to present to a user, we will consider their past interactions with specific items, and the past interactions of other users who share …
Category: Data Science

Docker File AWS

I am trying to write a Dockerfile for an Amazon SageMaker container. As a first step, I am following this link: https://towardsdatascience.com/brewing-up-custom-ml-models-on-aws-sagemaker-e09b64627722 In that article's section "Creating Your Own Docker Container", the last command of the Docker image is COPY xgboost /opt/program. I don't know what the xgboost file or directory in this command refers to, and because of it my docker build is failing. Please see the image below of the Dockerfile and its build output. Docker Image FROM ubuntu:latest MAINTAINER Amazon AI <[email protected]> RUN apt-get -y update …
Category: Data Science

Layman's comparison of RMSE

I don't have a maths / stats / data science background and need to work out which of the two evaluations below (numerical regression on Amazon Machine Learning) predicts more accurately. Both models use the same data set, but they look at different time frames for both the independent and dependent variables. How can I evaluate which of the two models is more accurate? And is there a way to tell how accurate these two models are in general (e.g. …
Category: Data Science
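
For the RMSE question above, a minimal sketch of how RMSE is computed may help: it is the square root of the mean squared difference between actual and predicted values, so it is expressed in the same units as the target, and lower is better. The numbers and the names `model_a` and `model_b` below are made up for illustration and are not taken from the question. Note that two RMSE values are only directly comparable when both models predict the same target on the same scale, which may not hold if the dependent variables cover different time frames.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Square each error so that large misses are penalised more heavily.
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical values, for illustration only.
actual = [10.0, 12.0, 14.0]
model_a = [11.0, 12.0, 13.0]  # off by 1 on two of three points
model_b = [13.0, 12.0, 11.0]  # off by 3 on two of three points

rmse_a = rmse(actual, model_a)
rmse_b = rmse(actual, model_b)

# The model with the lower RMSE fits these particular points more closely.
better = "model_a" if rmse_a < rmse_b else "model_b"
print(rmse_a, rmse_b, better)
```

Dividing the RMSE by the typical size of the target (its mean or its range) gives a rough, scale-free sense of how large the error is.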

Image Feature Vectors

I have downloaded a dataset from Amazon: http://jmcauley.ucsd.edu/data/amazon/ The dataset contains feature vectors of images. There are around 1.5 M feature vectors. Each record consists of 10 characters (the product ID) followed by 4096 floats, repeated for every product, so every product image has a feature vector of size 4096x1 made up of floating-point numbers. What do these numbers mean? What I understood is that there are 4096 features in total, and each index of the feature vector indicates a specific feature. The values in …
Category: Data Science
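
The record layout described in the image feature question (a 10-character product ID followed by 4096 floats, repeated per product) can be read with a short sketch like the one below. It assumes each float is a 4-byte little-endian float32, which the excerpt does not state, and the function name `read_image_features` is hypothetical. The demo parses an in-memory fake record instead of the real file.

```python
import io
import struct

ID_BYTES = 10        # product ID length, as described in the question
N_FEATURES = 4096    # floats per product, as described in the question
# Assumption: 4-byte little-endian floats (not confirmed by the excerpt).
RECORD_FORMAT = "<%df" % N_FEATURES


def read_image_features(stream):
    """Yield (product_id, feature_tuple) pairs from a binary stream."""
    payload_size = struct.calcsize(RECORD_FORMAT)
    while True:
        product_id = stream.read(ID_BYTES)
        if len(product_id) < ID_BYTES:
            break  # clean end of file
        payload = stream.read(payload_size)
        if len(payload) < payload_size:
            raise ValueError("truncated record for %r" % product_id)
        yield product_id.decode("ascii"), struct.unpack(RECORD_FORMAT, payload)


# Demo on a fabricated record; the ID and values are placeholders.
fake_record = b"B000000001" + struct.pack(RECORD_FORMAT, *([0.5] * N_FEATURES))
records = list(read_image_features(io.BytesIO(fake_record)))
print(records[0][0], len(records[0][1]))
```

Reading one record at a time keeps memory use low, which matters with roughly 1.5 M vectors of 4096 values each.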

Which Amazon EC2 instance for Deep Learning tasks?

I have discovered that Amazon has a dedicated Deep Learning AMI with TensorFlow, Keras, etc. preinstalled (not to mention other prebuilt custom AMIs). I tried it out with a typical job on several GPU-based instances to compare performance. There are five such instance types in the Ireland region (other regions may have even more; I don't know, and this variance is a bit confusing): g2.2xlarge, g2.8xlarge, p2.xlarge, p2.8xlarge, p2.16xlarge. My first question is, what is the difference between the two groups …
Category: Data Science

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.