I am fine-tuning a masked language model based on XLM-RoBERTa large on a Google Cloud machine (spec below). When I copy the model from the container to a GCP bucket using gsutil via subprocess, it gives me an error.

Versions:
torch==1.11.0+cu113
torchvision==0.12.0+cu113
torchaudio==0.11.0+cu113
transformers==4.17.0

I am using a pre-trained Hugging Face model. I launch it as a train.py file, which I copy inside the Docker image, and use Vertex AI (GCP) to launch it with a ContainerSpec:

machineSpec = MachineSpec(machine_type="a2-highgpu-4g", accelerator_count=4, accelerator_type="NVIDIA_TESLA_A100")

python -m torch.distributed.launch --nproc_per_node 4 train.py --bf16

I am …
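Roughly, the copy step I mean looks like the sketch below (bucket name and local path are placeholders, and I am assuming that in a torch.distributed run only rank 0 should do the upload):

```python
import os
import subprocess

# Placeholder paths; the real bucket and output directory differ.
local_dir = "/app/model_output"
gcs_dest = "gs://my-bucket/xlmr-finetuned/"

# With torch.distributed.launch each process gets a RANK env var;
# only rank 0 uploads, so the four workers don't copy concurrently.
if int(os.environ.get("RANK", "0")) == 0:
    subprocess.run(["gsutil", "-m", "cp", "-r", local_dir, gcs_dest], check=True)
```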
I am using Data Studio for a project and I am connecting to my BigQuery table. My table contains the following columns: Date, Store_name, Footfall. I'd love to compare the footfall of two stores in Data Studio, but apparently I can't do that! Any hints, or should I just switch to another viz tool?
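Outside Data Studio I can get the comparison I'm after with a simple pivot, for example in pandas via the BigQuery client (a sketch; the table and store names below are placeholders):

```python
from google.cloud import bigquery

client = bigquery.Client()  # assumes default project credentials

# Placeholder table name; one row per store per day.
query = """
    SELECT Date, Store_name, Footfall
    FROM `my_project.my_dataset.footfall`
    WHERE Store_name IN ('Store A', 'Store B')
"""
df = client.query(query).to_dataframe()

# One column per store, indexed by date, so the two series line up for comparison.
comparison = df.pivot(index="Date", columns="Store_name", values="Footfall")
print(comparison.head())
```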
Especially when considering GCP, Google's analytics offering is quite compelling. Why would you go with Databricks? GCP also has great integration between its tools, as well as strong support for ML/AI, etc.
I am interested in using Google Colab for data modeling. How do I install conda, create an environment, and run Python in a notebook? I did some searching and found some helpful hints, but ran into several issues. So far I can only get a partially functional environment: I get stuck when running another cell in the same environment, since switching cells seems to reset the environment back to default.
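This is roughly what I have been trying (a sketch, assuming the Miniconda installer and a /usr/local prefix; the package and path choices are just examples):

```python
# Run in a Colab cell. Installs Miniconda into /usr/local so conda's site-packages
# can be picked up by the notebook's Python.
!wget -q https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
!bash Miniconda3-latest-Linux-x86_64.sh -b -f -p /usr/local
!conda install -y -q numpy

# Note: each "!" line runs in its own shell, so "conda activate my_env" in one cell
# does not persist to the next -- which matches the reset I'm seeing.
import sys
sys.path.append("/usr/local/lib/python3.10/site-packages")  # adjust to conda's Python version
```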
I have been trying to join two tables from different datasets that are in different locations but in the same project. However, I keep getting the error: dataset not found in US location. The datasets' locations are US and us-east1. Here is what I am doing:

select a.*, b.*
from `project.dataset1.table1` a
join `project.dataset2.table2` b
  on a.common_col = b.common_col

Please help me out on this.
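For completeness, this is roughly how I run the query from Python (a sketch; the client uses my project's default credentials). The job location can only be set once, so it matches one dataset's region but not the other:

```python
from google.cloud import bigquery

client = bigquery.Client()  # default project credentials

query = """
    SELECT a.*, b.*
    FROM `project.dataset1.table1` a
    JOIN `project.dataset2.table2` b
      ON a.common_col = b.common_col
"""

# The location below matches dataset1 (US) but not dataset2 (us-east1).
job = client.query(query, location="US")
print(job.result().to_dataframe().head())
```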
So I want to test a lot of hyperparameters for an XGBoost classification model and also do cross-validation for all of them. To do this I use a grid search. To speed up the process I want to use as many CPU cores as possible, so I set the n_jobs parameter to the number of available CPU cores on the system. This all works perfectly fine; see the code below.

xgb_model = XGBClassifier(use_label_encoder=False, eval_metric='auc')
njobs = os.cpu_count()
gsearch = GridSearchCV(estimator=xgb_model, param_grid=param_tuning, …
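For reference, a self-contained version of the setup I mean (the data and parameter grid below are placeholders; the real grid is much larger):

```python
import os
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Placeholder grid; the real one covers many more hyperparameters.
param_tuning = {
    "max_depth": [3, 5, 7],
    "learning_rate": [0.05, 0.1],
    "n_estimators": [100, 200],
}

xgb_model = XGBClassifier(use_label_encoder=False, eval_metric="auc")
njobs = os.cpu_count()

gsearch = GridSearchCV(
    estimator=xgb_model,
    param_grid=param_tuning,
    scoring="roc_auc",
    cv=5,
    n_jobs=njobs,  # one worker per available CPU core
)
gsearch.fit(X, y)
print(gsearch.best_params_)
```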
I am using the pretrained ResNet50 model to train on my images with TensorFlow. I have 70k images and upgraded to Google Colab Pro, but I am still facing a memory error. How many images can I train on in Google Colab, and how much RAM is needed for 70k images? This is how I label and load the images from Drive:

labels = []
imagePaths_generater = paths.list_images(Config.DATASET_PATH)
imagePaths = []
for item in imagePaths_generater:
    imagePaths.append(item)
for imagePath in imagePaths:
    label …
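For context, this is the kind of streaming loader I have been considering instead of reading everything into RAM up front (a sketch; the directory layout, path and class count are assumptions on my part):

```python
import tensorflow as tf

DATASET_PATH = "/content/drive/MyDrive/dataset"  # placeholder path
NUM_CLASSES = 10  # placeholder

# Assumes images are organised as DATASET_PATH/<class_name>/<image>.jpg;
# batches are streamed from disk rather than held in memory all at once.
train_ds = tf.keras.utils.image_dataset_from_directory(
    DATASET_PATH,
    image_size=(224, 224),  # ResNet50 input size
    batch_size=32,
).prefetch(tf.data.AUTOTUNE)

base = tf.keras.applications.ResNet50(include_top=False, weights="imagenet", pooling="avg")
model = tf.keras.Sequential([base, tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, epochs=5)
```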
I have a recommendation system that recommends articles to different users. I am planning to provide the recommendations in an offline fashion: I already have a table in BigQuery that holds the recommendations, and an API call returns the recommendations for each page on the website. Now I want to have another table, user_profile, which stores information about which articles were shown to and clicked by each user (user_id | shown | clicked). This should happen in real time. I looked into https://cloud.google.com/bigquery/streaming-data-into-bigquery but it has limitations. …
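For reference, the streaming path I have been testing looks roughly like this (a sketch; the table name and the flat user_id/shown/clicked schema are simplifications):

```python
from google.cloud import bigquery

client = bigquery.Client()  # default project credentials
table_id = "my_project.my_dataset.user_profile"  # placeholder

# One event per impression, written as it happens.
rows = [
    {"user_id": "u123", "shown": "article_42", "clicked": "article_42"},
]

errors = client.insert_rows_json(table_id, rows)  # streaming insert
if errors:
    print("Streaming insert errors:", errors)
```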
I'm trying to build a Google Dataflow pipeline by following one of the posts on Medium: https://levelup.gitconnected.com/scaling-scikit-learn-with-apache-beam-251eb6fcf75b However, it seems like I'm missing the project argument, and it throws the following error. I'd appreciate your help guiding me through it.

Error:
ERROR:apache_beam.runners.direct.executor:Giving up after 4 attempts.
WARNING:apache_beam.runners.direct.executor:A task failed with exception: Missing executing project information. Please use the --project command line option to specify it.

Code:
import apache_beam as beam
import argparse
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.options.pipeline_options import …
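For what it's worth, this is the kind of options setup I have been experimenting with to get the project into the pipeline (a sketch; the project ID is a placeholder):

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import GoogleCloudOptions, PipelineOptions

# Placeholder project ID; normally this would come from the --project flag.
options = PipelineOptions(["--project=my-gcp-project"])
print(options.view_as(GoogleCloudOptions).project)  # sanity check that the project is set

with beam.Pipeline(options=options) as p:
    (p
     | "Create" >> beam.Create([1, 2, 3])
     | "Print" >> beam.Map(print))
```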
So, I'm trying to use Google BigQuery for the first time for a project of mine, and I'm a bit confused. The documentation isn't helping much, and it looks like all the Google employees are gone thanks to the current epidemic, judging by the blog post from Google. I've got several .csv files containing tables of unlabeled numerical data uploaded to Google Cloud Storage, on their Australian servers. What I want to do is use BigQuery to perform k-means clustering …
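For reference, the kind of statement I'm hoping to run is the BigQuery ML k-means model creation, roughly like this sketch (dataset and table names are placeholders, and I'm assuming the dataset lives in the same Australian region as the Cloud Storage data):

```python
from google.cloud import bigquery

client = bigquery.Client()  # default project credentials

# Placeholder dataset/table names; num_clusters is just an example value.
sql = """
CREATE OR REPLACE MODEL `my_dataset.kmeans_model`
OPTIONS (model_type = 'kmeans', num_clusters = 4) AS
SELECT * FROM `my_dataset.my_numeric_table`
"""
client.query(sql, location="australia-southeast1").result()
```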
I have a gaming rig with an i9 CPU, 32 GB RAM and an RTX 2080, and I have a GCP VM with 4 vCPUs, 52 GB RAM and a V100. I'm training on the same dataset with the same toolchain on both machines, and these are my ETAs:

GCP VM: 16 days
Gaming rig: 5 days

How can a single $600 GPU outperform a $10k GPU? What's going on here? And what should I even expect?
I am looking to rent a GPU instance in Google Cloud for casual deep learning model training and wondered about the differences between the available Nvidia Tesla versions: Nvidia Tesla T4, Nvidia Tesla P4, Nvidia Tesla V100, Nvidia Tesla P100 and Nvidia Tesla K80. Here is the GPU pricing page. If anyone has used them, is currently using them, or knows which GPU is best to rent, I'd appreciate it if you could share your experience or knowledge.