Azure Cloud SQL - Querying large number of rows with Python

I have a Python Flask application that connects to an Azure Cloud SQL Database, and uses the Pandas read_sql method with SQLAlchemy to perform a select operation on a table and load it into a dataframe: recordsdf = pd.read_sql(recordstable.select(), connection). The recordstable has around 5000 records, and the function takes around 10 seconds to execute (I have to pull all records every time). However, the exact same operation with the same data takes around 0.5 seconds when I'm selecting …
Category: Data Science

Score Columns in Azure ML Studio

I have a data set that I have successfully used to train a model, with decent results. I am using a Two-Class Boosted Decision Tree for a Boolean output. So far so good. I now want to analyze each column of my data set and remove any column that does not meaningfully influence the outcome. I can see statistics on the columns in my data set, but I don't see whether a column has a strong relationship with …
Category: Data Science

Azure ML / AutoML: problem with univariate time series forecasting

I'm having trouble generating univariate time series forecasts with Azure Automated Machine Learning (I know...). What I'm doing: I have about 5 years' worth of monthly observations in a dataframe that looks like this:

date        target_value
2015-02-01  123
2015-03-01  456
2015-04-01  789
...         ...

I want to forecast target_value based on past values of target_value, i.e. univariate forecasting like ARIMA for instance. So I am setting up the AutoML forecast like this: # that's the dataframe as shown above …
Category: Data Science

No result for scored labels in Azure ML Web Service

I am trying to predict scored labels using regression. But when I try to get the result from the Azure ML Web Service in Excel 2016, no result appears in the Scored Labels column. How should I fix this? Below is my whole process... Here is the problem I always get. As you can see, there is no result in the Scored Labels column when I try to predict.
Category: Data Science

Mean Absolute Error increasing with more correlated factors

I am using Microsoft Azure Machine Learning Studio to predict stock market prices. We have the variables index price (the target to be predicted), low price, high price, dates and days. We use a split of 0.7 and run linear regression. We get a mean absolute error of 109. We then add more variables (macroeconomic factors that positively affect the index prices), which are correlated with the target variable and should improve the predictions, but we find that the mean absolute error increases to 110. I have attached the …
Category: Data Science

Deployment in AzureML for NLP with fastText

I am new to Azure ML. I am working on sentiment analysis on a small tweet dataset with the help of fastText embeddings (the fastText file 'wiki-news-300d-1M.vec' is around 2.3 GB, which I downloaded to my folder). When I run the program in the Jupyter notebook everything runs well. But when I try to deploy the model in Azure ML and attempt to run the experiment: run = exp.start_logging() run.log("Experiment start time", str(datetime.datetime.now())) I am getting the error message: While …
Category: Data Science

PicklingError in pyspark (PicklingError: Can't pickle <class '__main__.Person'>: attribute lookup Person on __main__ failed)

I am unable to pickle the class below. I am using Databricks 6.5 ML (includes Apache Spark 2.4.5, Scala 2.11).

```
import pickle

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

p1 = Person("John", 36)
pickle.dump(p1, open('d.pkl', 'wb'))
```

PicklingError: Can't pickle <class '__main__.Person'>: attribute lookup Person on __main__ failed
Category: Data Science
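
For the PicklingError question above, one possible workaround is to pickle the object's plain attribute data rather than the instance itself. pickle stores classes by reference (module and name) and looks that reference up when dumping, so serializing a dict of attributes avoids the lookup that fails here. The following is a minimal sketch under that assumption; the helper names `dump_person_state` and `load_person` are hypothetical, not part of the original question, and the approach has not been verified on Databricks 6.5 ML.

```python
import pickle


class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age


def dump_person_state(person):
    """Serialize only the instance attributes (a plain dict), not the class.

    Hypothetical helper: the payload holds built-in types only, so pickle
    never needs to resolve __main__.Person.
    """
    return pickle.dumps(vars(person))


def load_person(payload):
    """Rebuild a Person from the pickled attribute dict."""
    state = pickle.loads(payload)
    return Person(**state)


p1 = Person("John", 36)
payload = dump_person_state(p1)
restored = load_person(payload)
print(restored.name, restored.age)
```

Another option that may apply is moving the class into an importable module, so that pickle can resolve it by name instead of through `__main__`.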

Is it possible to export a model from Azure Cognitive Services, import it in Azure ML Studio and then train it from scratch?

I have access to both Azure Machine Learning Studio and Azure Cognitive Services. Ideally I'd like to export from Azure Cognitive Services any model that does a good job of detecting objects of a certain class in a picture, then import that model into Models in Azure Machine Learning Studio and train it from scratch on my own dataset. My question is: is that possible? If the answer is 'no' then what would be the …
Topic: azure-ml
Category: Data Science

Is it possible to use TensorFlow inside a python script in Azure Machine Learning Studio?

I'm trying to get TensorFlow running inside a python script in Azure Machine Learning Studio. As TensorFlow is not part of Azure Machine Learning Studio, I needed to import it using a zip file. I followed the instructions here: https://stackoverflow.com/questions/44593469/how-can-certain-python-libraries-be-imported-in-azure-mllike-the-line-import-hu However, when trying to import TensorFlow, I get: ImportError: No module named _pywrap_tensorflow_internal Failed to load the native TensorFlow runtime. It seems like TensorFlow is much more than just a python library. It seems like it needs a native library …
Category: Data Science

Estimating the uncertainty of regression models

Given a regression model with n features, how can I measure the uncertainty or confidence of the model for each prediction? Suppose for a specific prediction the accuracy is amazing, but for another it's not. I would like to find a metric that will let me decide if, for each frame, I would like to "listen" to the model or not.
Category: Data Science
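
One common way to approach the per-prediction uncertainty asked about above is a bootstrap ensemble: refit the model on resampled copies of the training data and use the spread of the resulting predictions as a rough confidence signal. The sketch below is only an illustration under simplifying assumptions: it uses a single feature and ordinary least squares for brevity (the question has n features), the toy data is synthetic, and the names `fit_line` and `bootstrap_predict` are hypothetical. The spread reflects uncertainty in the fitted model, not the noise in the target itself.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def bootstrap_predict(xs, ys, x_new, n_models=200, seed=0):
    """Return (mean prediction, standard deviation) over bootstrap refits."""
    rng = random.Random(seed)
    n = len(xs)
    preds = []
    for _ in range(n_models):
        # Resample the training rows with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # skip degenerate resamples where no line can be fit
        slope, intercept = fit_line(bx, by)
        preds.append(slope * x_new + intercept)
    return statistics.fmean(preds), statistics.stdev(preds)


# Synthetic toy data (an assumption for illustration): y = 2x + 1 plus noise.
data_rng = random.Random(42)
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + data_rng.gauss(0.0, 1.0) for x in xs]

mean_near, spread_near = bootstrap_predict(xs, ys, 10.0)   # inside the training range
mean_far, spread_far = bootstrap_predict(xs, ys, 100.0)    # far outside it
print(f"x=10:  {mean_near:.2f} +/- {spread_near:.2f}")
print(f"x=100: {mean_far:.2f} +/- {spread_far:.2f}")
```

A threshold on the spread could then serve as the "listen or not" rule; where to set it depends on the application.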

Azure AutoML time series endpoint data input

I am using Azure ML Studio AutoML to train a best time series model with the TCNForecaster algorithm and deploy it as a web service. Since this uses a deep learning algorithm, the request format is different from that of a simple algorithm, and I do not know how to enter my request data for forecasting below. I have tried lots of ways but always got "error":"'date'". { "data": [ { "_automl_target_col_WASNULL": 0, "_automl_target_col_season": 0, "_automl_target_col_trend": 0, "_automl_year": 0, "_automl_half": 0, "_automl_quarter": 0, "_automl_month": 0, …
Topic: azure-ml
Category: Data Science

Service Request classification, questionnaire filling and call logging

I am very new to machine learning. I just went through some of the tutorials in Azure and completed one practice workflow (car price prediction). I hope I can ask basic questions here. Scenario: We get service requests from our customers via email. Each request has fields like customer name, user name, email id, equipment affected, type of call and issue experienced (this is a free text area). The employee reads this email, mainly the issue experienced. Based on the issue experienced …
Category: Data Science

Multiple merges make a pandas data frame explode and cause a memory issue in Jupyter notebook

I have made multiple merges using pandas data frames (refer to the example script below). This made the data frame explode and consume more memory, as the records reach 18 billion in df3, which is then merged with 5 lakh (500,000) records in df4. This causes the memory issue: it consumes the whole RAM (140 GB) and the session gets killed. df = df1[df1_columns].merge(df2[df2_columns], how='left', left_on='col1', right_on='col2').merge(df3[df3_columns], how='left', on='ID').merge(df4[df4_columns], how='left', on='ID') Appreciate …
Category: Data Science
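
A left merge multiplies rows whenever the join key is duplicated on the right-hand side, which is one plausible cause of the blow-up described above. Before running an expensive merge, the resulting row count can be estimated from the key columns alone. The sketch below is a plain-Python illustration, not the asker's code; the function name `left_join_row_count` and the sample keys are hypothetical, and missing-value keys are not given any special treatment.

```python
from collections import Counter


def left_join_row_count(left_keys, right_keys):
    """Estimate the number of rows a left join on a single key would produce.

    Each left row yields one output row per matching right row, or exactly
    one row when the key has no match on the right.
    """
    right_counts = Counter(right_keys)
    return sum(max(1, right_counts[key]) for key in left_keys)


# Hypothetical key columns: key 1 appears twice on the left, three times on the right.
left_ids = [1, 1, 2, 3]
right_ids = [1, 1, 1, 2]

estimated_rows = left_join_row_count(left_ids, right_ids)
print(estimated_rows)  # 2 * 3 (key 1) + 1 (key 2) + 1 (key 3, unmatched) = 8
```

Any iterable of key values can be passed in, so with pandas the key columns themselves (for example `df3["ID"]` and `df4["ID"]` from the question) could be supplied directly. If the estimate is far larger than expected, de-duplicating or aggregating the right-hand keys before merging is worth considering.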

Test dataset contains invalid data. ( Error 0018 ) in Azure ML Studio Evaluate Recommender

I am building a crop recommender system using the Matchbox recommender in Azure ML Studio. Splitting the dataset with Recommender Split does not split it, but splitting with Split Rows works. However, when evaluating the recommender, I get an error like 'Test dataset contains invalid data'. How can I overcome this issue?
Category: Data Science

Hidden integers being attached to my data in Azure ML

I've been trying to solve an issue with a piece of time data for a while now. I cannot convert it to DateTime using the Edit Metadata module, or turn it into a numeric value, or bin it. However, every time I enter it into a model to be trained, the time values return with these underscored integers attached to them as _1, _2, _3, etc. They come at the end of a normal value - for example 01/02/2021 08:33_1. …
Category: Data Science

Improve a regression model and feature selection

I am working in Azure ML Studio and trying to create a regression model to predict a numerical value. I will try to describe my features and what I have done until now. My data has about 3 million rows. Features: 8 integer features from 1 to 25; 2 boolean features with 0 and 1; 3 integer features from 1 to 10; 2 integer features from 0 to 500,000 (and 1,000,000 respectively) with about 4,500 unique values; 1 integer …
Category: Data Science

What is the default number of cross-validation folds with the "Tune Model Hyperparameters" module in Azure ML studio?

I presume the Azure ML studio's "Tune Model Hyperparameters" module is performing cross-validation, since it shows "average test" metrics like accuracy and precision: However, I don't see a parameter for setting the number of CV folds nor any info in the docs about what this "test" set is. In the pipeline, we are only providing training data (no validation/test set): So is the module performing CV by default? If so, what is the default number of folds? I understand how …
Topic: azure-ml
Category: Data Science

xgboost in R has different results compared to boosted decision tree in Azure ML

I have a small data set (4000 records with 10 features) and I used XGBoost in R as well as the Boosted Decision Tree model in Azure ML Studio. Unfortunately the results are different. I would like to optimize recall; I can pick that as a measure in Azure but I cannot do so in R. I used the same parameters on both platforms. I know the seeds might be different, but I tried many of them. I always have a …
Category: Data Science

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.