Why does my model train slower on GCP than on my local machine?

I'm using tensorflow-cloud to train a 3D voxel CNN. My local machine: NVIDIA GeForce RTX 2080 Ti 11 GB, Intel Core i7 3 GHz, 32 GB RAM. This is my machine config on tfc: tfc.MachineConfig(cpu_cores=8, memory=30, accelerator_type=tfc.AcceleratorType.NVIDIA_TESLA_T4, accelerator_count=1). To me this looks comparable. However, the training job takes 2-3 times as long as on my local machine. Do I share the cloud machine with other training jobs? Also, the job might be IO limited; on my local machine my training set …
Category: Data Science
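One way to check the "IO limited" suspicion raised above is to time batch loading separately from the training step on both machines. The following is a minimal sketch, not part of the original question: `time_input_vs_step`, `batches`, and `train_step` are hypothetical names, and it assumes `train_step` blocks until the step has finished (with asynchronous GPU execution the step time would otherwise be underestimated).

```python
import time
from typing import Any, Callable, Iterable, Tuple


def time_input_vs_step(
    batches: Iterable[Any],
    train_step: Callable[[Any], Any],
    max_batches: int = 50,
) -> Tuple[float, float]:
    """Return (seconds spent waiting for data, seconds spent in train_step)."""
    load_s = 0.0
    step_s = 0.0
    iterator = iter(batches)
    for _ in range(max_batches):
        t0 = time.perf_counter()
        try:
            batch = next(iterator)  # time spent waiting on the input pipeline
        except StopIteration:
            break
        t1 = time.perf_counter()
        train_step(batch)  # assumed to block until the step is done
        t2 = time.perf_counter()
        load_s += t1 - t0
        step_s += t2 - t1
    return load_s, step_s


def slow_batches(n: int, delay_s: float):
    """Hypothetical stand-in for a slow data source (e.g. remote storage)."""
    for i in range(n):
        time.sleep(delay_s)
        yield i


if __name__ == "__main__":
    load_s, step_s = time_input_vs_step(slow_batches(5, 0.01), lambda batch: None)
    print(f"loading: {load_s:.3f}s, training step: {step_s:.3f}s")
```

If loading time dominates on the cloud machine but not locally, the slowdown is more likely caused by data access than by the GPU.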

Best way to represent a version feature based on percentiles

We're training a binary classifier in AutoML, and one of the features consists of browser versions. Currently these versions are provided "normalized" to the model, according to the percentile of the browser the current observation falls into. For example, if the percentiles of some specific browser's versions are:

percentile  version
p25         34
p50         45
p75         53
p99         70

then an observation with said browser and version=54 would be represented as:

p25  p50  p75  p99
1    1    1    0

My question …
Category: Data Science
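The percentile representation described above can be sketched in a few lines of Python. This is an illustration only: the function name `encode_version` and the dictionary `BROWSER_PERCENTILES` are hypothetical, and treating a version equal to a cutoff as "reached" (`>=`) is an assumption, since the question does not say how ties are handled.

```python
from typing import Dict

# Percentile cutoffs taken from the example in the question.
BROWSER_PERCENTILES: Dict[str, int] = {"p25": 34, "p50": 45, "p75": 53, "p99": 70}


def encode_version(version: int, cutoffs: Dict[str, int]) -> Dict[str, int]:
    """Set each percentile flag to 1 if the version reaches that cutoff, else 0."""
    # Assumption: ">=" at the boundary; the source does not specify tie handling.
    return {name: int(version >= cutoff) for name, cutoff in cutoffs.items()}


if __name__ == "__main__":
    print(encode_version(54, BROWSER_PERCENTILES))
    # {'p25': 1, 'p50': 1, 'p75': 1, 'p99': 0}
```

For version=54 this reproduces the 1 1 1 0 row from the question.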

Query Google Trends using Google BigQuery

I need help with Google BigQuery. I am using BigQuery to query data from Google Trends. Now I want to get data for a specific keyword, for example spiderman, and get the result by region, like the "interest over time" CSV that can be downloaded from Google Trends. But Google Trends only provides this code, which shows the top 25 trending terms: SELECT * FROM `bigquery-public-data.google_trends.top_terms` WHERE refresh_date = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY) I want to use the same syntax to get data for a specific term/keyword.
Category: Data Science
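A minimal sketch of how the query from the question could be narrowed to one keyword, written in Python with the google-cloud-bigquery client. It assumes the table has a column named `term` (not confirmed by the question), and the helper names `build_term_query` and `run_term_query` are hypothetical. Because the table holds only top-trending terms, an arbitrary keyword may simply return no rows.

```python
TOP_TERMS_TABLE = "bigquery-public-data.google_trends.top_terms"


def build_term_query(table: str = TOP_TERMS_TABLE) -> str:
    """Return the question's query with an extra filter on a named parameter."""
    return (
        f"SELECT * FROM `{table}` "
        "WHERE refresh_date = DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY) "
        # Assumption: the keyword column is called `term`.
        "AND LOWER(term) = LOWER(@term)"
    )


def run_term_query(term: str):
    """Run the query; needs google-cloud-bigquery and GCP credentials."""
    from google.cloud import bigquery  # imported lazily so the sketch loads without it

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(
        query_parameters=[bigquery.ScalarQueryParameter("term", "STRING", term)]
    )
    return list(client.query(build_term_query(), job_config=job_config).result())


if __name__ == "__main__":
    print(build_term_query())
```

Passing the keyword as a query parameter rather than formatting it into the SQL string avoids quoting problems, for example `run_term_query("spiderman")`.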

Feature set choice in Google's Vertex AI/AutoML

This is a subjective question on utilizing Vertex AI/AutoML in practice. I posted it on Stack Overflow and it was closed. I hope it is within scope here. I'm using the Tabular dataset models in Google's Vertex AI/AutoML to learn a regression problem on structured data with human-engineered features. It's a score/ranking problem and the training target values are either 0 or 1. Our constructed features are often correlated, sometimes the same data point normalized on different dimensions, e.g. number of …
Category: Data Science

How to schedule importing data files from SFTP server located on compute engine instance into BigQuery?

What I want to achieve: Transfer data files that arrive hourly, from several different feeds, on an SFTP file server located on a Compute Engine VM into BigQuery, with real-time updates, effectively and cost-efficiently. Context: The software I am trying to import data from is old legacy software and does not support direct exports to the cloud, so a direct connection from the software to the cloud isn't an option. It does, however, support exporting data to an SFTP server. Which is not available …
Category: Data Science

How can I create a VM instance with GPUs on Google Cloud Platform?

How can I create a VM instance with GPUs on Google Cloud Platform? When I go to https://console.cloud.google.com/compute -> CREATE INSTANCE, I only see CPUs and no GPUs. I did select a region and zone that is supposed to have GPUs according to https://cloud.google.com/compute/docs/gpus. I see that some VMs from the marketplace come with GPUs, but I'd prefer to configure the VM myself.
Category: Data Science
