How to find average lag time with variance & confidence of two time series

I have two time series, one a consequence of the other, and I would like to find the average time delay between a change in the independent variable and the response of the dependent variable. Additionally, I would like to quantify the variance of that lag and attach a confidence level to it. I am unsure how to go about this in a statistically valid way, but I am using Python. Currently I have used np.diff(np.sign(np.diff(df))) to isolate …
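A common starting point (a sketch, not the asker's code; the series and true_lag below are illustrative) is the cross-correlation of the two mean-removed series: the lag at which it peaks estimates the average delay.

```python
import numpy as np
from scipy import signal

# Hypothetical example: y is x delayed by `true_lag` steps plus noise.
rng = np.random.default_rng(0)
true_lag = 12
x = rng.normal(size=500)
y = np.roll(x, true_lag) + 0.1 * rng.normal(size=500)

# The peak of the full cross-correlation estimates the delay of y behind x.
corr = signal.correlate(y - y.mean(), x - x.mean(), mode='full')
lags = signal.correlation_lags(len(y), len(x), mode='full')
lag_estimate = lags[np.argmax(corr)]
print(lag_estimate)  # close to 12
```

For the variance and confidence level, one option is to repeat the estimate on block-bootstrap resamples of the paired series and take percentiles of the resulting lag distribution.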
Category: Data Science

Model prediction on meshgrid in python

Suppose I have data with two independent variables $X_1$, $X_2$ and one dependent variable $y$, as follows: $X_1$: $x_{1,1}$, $x_{1,2}$, $x_{1,3}$; $X_2$: $x_{2,1}$, $x_{2,2}$, $x_{2,3}$; $y$: $y_1$, $y_2$, $y_3$. I built a machine learning model that performs well. Now I want to generate predictions not just for the test data but for all possible combinations of the test values. For example, if our test data looks like $X_1$: $a$, $b$, $c$ and $X_2$: $p$, $q$, $r$, then I want predictions …
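A minimal sketch of the usual approach, assuming a scikit-learn-style fitted estimator `model` (hypothetical) and stand-in test columns: build every pairwise combination with np.meshgrid, flatten, and predict once on the stacked grid.

```python
import numpy as np

# Stand-ins for the test columns a, b, c and p, q, r.
X1_test = np.array([1.0, 2.0, 3.0])
X2_test = np.array([10.0, 20.0, 30.0])

# All pairwise combinations of the two test columns.
g1, g2 = np.meshgrid(X1_test, X2_test)
grid = np.column_stack([g1.ravel(), g2.ravel()])  # shape (9, 2)

# predictions = model.predict(grid)  # assuming a fitted estimator `model`
```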
Category: Data Science

Is it possible to implement a vectorized version of a Maxout activation function?

I want to implement an efficient and vectorized Maxout activation function using Python and numpy. Here is the paper in which "Maxout Network" was introduced (by Goodfellow et al.). For example, if k = 2:

```python
def maxout(x, W1, b1, W2, b2):
    return np.maximum(np.dot(W1.T, x) + b1, np.dot(W2.T, x) + b2)
```

where x is an N*D matrix. Suppose k is an arbitrary value (say 5). Is it possible to avoid for loops when calculating each wx + b? I couldn't come up with any …
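One way to drop the loop (a sketch under the assumption that the k weight matrices are stacked into one 3-D array of shape (k, D, M) and x has shape (N, D)) is a single einsum over the piece dimension followed by a max:

```python
import numpy as np

def maxout(x, W, b):
    # einsum computes all k affine maps at once: z has shape (k, N, M)
    z = np.einsum('kdm,nd->knm', W, x) + b[:, None, :]
    return z.max(axis=0)  # elementwise max over the k pieces, shape (N, M)

# Illustrative usage with made-up sizes:
x = np.random.rand(8, 10)      # N=8 samples, D=10 features
W = np.random.rand(5, 10, 3)   # k=5 pieces, M=3 output units
b = np.random.rand(5, 3)
out = maxout(x, W, b)          # shape (8, 3)
```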
Category: Data Science

Python: calculate the weighted average correlation coefficient

I am calculating the volatility (standard deviation) of returns of a portfolio of assets using the variance-covariance approach. Correlation coefficients and asset volatilities have been estimated from historical returns. Now what I'd like to do is compute the average correlation coefficient, that is, the single common correlation coefficient across all asset pairs that reproduces the same overall portfolio volatility. I could of course take an iterative approach, but was wondering if there was something simpler / out of the box …
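There is in fact a closed form: under a common correlation $\rho$, the portfolio variance is $\sigma_p^2 = \sum_i w_i^2\sigma_i^2 + \rho\sum_{i\ne j} w_iw_j\sigma_i\sigma_j$, so $\rho$ can be solved for directly. A sketch with made-up weights, volatilities, and correlations:

```python
import numpy as np

# Hypothetical inputs: weights, volatilities, estimated correlation matrix.
w = np.array([0.5, 0.3, 0.2])
vol = np.array([0.20, 0.15, 0.10])
corr = np.array([[1.0, 0.6, 0.3],
                 [0.6, 1.0, 0.4],
                 [0.3, 0.4, 1.0]])
cov = np.outer(vol, vol) * corr

port_var = w @ cov @ w             # portfolio variance under the estimated matrix
own_var = np.sum((w * vol) ** 2)   # sum_i w_i^2 sigma_i^2
cross = (w @ vol) ** 2 - own_var   # sum over i != j of w_i w_j sigma_i sigma_j
rho_avg = (port_var - own_var) / cross
```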
Category: Data Science

Proper data shape and model architecture for recognizing highs and lows in a chart

I am using a Keras LSTM model to try to pinpoint the highs and lows (relative high points and low points) in a chart; I need the actual coordinates of those highs and lows, not just an image. The training process runs without errors, but the prediction output is completely unrelated to the training output. What I've done so far: I created the output data by feeding the input data to SciPy's argrelextrema algorithm. For …
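For context, a minimal sketch of how argrelextrema is typically used to label the targets (the series and the `order` value below are illustrative, not the asker's data):

```python
import numpy as np
from scipy.signal import argrelextrema

rng = np.random.default_rng(0)
prices = np.sin(np.linspace(0, 20, 200)) + 0.05 * rng.normal(size=200)

order = 5  # how many neighbors on each side a point must beat
highs = argrelextrema(prices, np.greater, order=order)[0]  # indices of relative maxima
lows = argrelextrema(prices, np.less, order=order)[0]      # indices of relative minima

# Integer targets per time step: 0 = neither, 1 = high, 2 = low
labels = np.zeros(len(prices), dtype=int)
labels[highs] = 1
labels[lows] = 2
```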
Category: Data Science

Pre-process data images before training OneClassSVM and decrease number of features

I want to train a OneClassSVM() using sklearn, and I have a set of around 800 images in my training set. I am using opencv to read the images, resize them to constant dimensions (960x540), and then add them to a numpy array. The images are RGB and thus have 3 channels. For that, I am reshaping the numpy array after reading all the images:

```python
# Assume X is my numpy array which contains all the images before reshaping
# Now I reshape …
```
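One common way to cut the feature count before the SVM (a sketch with a small stand-in array; the real set would have shape (800, 540, 960, 3)) is to flatten each image to a row and run PCA first:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X = rng.random((40, 54, 96, 3))   # small stand-in for the real image array
X_flat = X.reshape(len(X), -1)    # one row per image

pca = PCA(n_components=0.95)      # keep 95% of the variance
X_reduced = pca.fit_transform(X_flat)

clf = OneClassSVM(gamma='auto').fit(X_reduced)
```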
Category: Data Science

Different approaches of creating the test set

I came across different approaches to creating a test set. Theoretically it's quite simple: just pick some instances at random, typically 20% of the dataset, and set them aside. Below are the approaches. The naive way of creating the test set is:

```python
def split_train_test(data, test_set_ratio):
    # create shuffled indices
    shuffled_indices = np.random.permutation(len(data))
    test_set_size = int(len(data) * test_set_ratio)
    test_set_indices = shuffled_indices[:test_set_size]
    train_set_indices = shuffled_indices[test_set_size:]
    return data.iloc[train_set_indices], data.iloc[test_set_indices]
```

The above splitting mechanism works, but if the program is run again and again, it will generate a different …
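A common fix (a sketch, assuming each row carries a stable identifier column) is to decide test membership from a hash of the id, so the same rows land in the test set on every run, even as the dataset grows:

```python
import numpy as np
import pandas as pd
from zlib import crc32

def is_in_test_set(identifier, test_ratio):
    # a row is in the test set if the hash of its id falls in the
    # lowest test_ratio fraction of the 32-bit range
    return crc32(np.int64(identifier)) & 0xffffffff < test_ratio * 2**32

def split_train_test_by_id(data, test_ratio, id_column):
    in_test = data[id_column].apply(lambda id_: is_in_test_set(id_, test_ratio))
    return data.loc[~in_test], data.loc[in_test]

# Illustrative usage on a made-up frame with an `id` column:
df = pd.DataFrame({'id': range(100), 'value': np.random.rand(100)})
train_set, test_set = split_train_test_by_id(df, 0.2, 'id')
```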
Category: Data Science

Identifying this dataset for sanitising

I am a beginner here, starting with data science for analytics. I am trying to figure out what dataset this is and how to read it from Python. I have an idea of the steps but am not sure how to code it in Python (a rough sketch follows the list):

1. Open and read the file.
2. Search for keywords based on another file.
3. If a keyword is found, search for "Term" from that line up and copy the value of "id:", which is below it.
4. If more than one keyword …
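A rough sketch under heavy assumptions (the file format is unknown here): the data file is plain text, the keywords sit one per line in a hypothetical keywords.txt, and each matching record has an "id:" line somewhere below the match.

```python
# Load the keyword list from the second file (assumed one keyword per line).
with open('keywords.txt') as f:
    keywords = [line.strip() for line in f if line.strip()]

with open('data.txt') as f:
    lines = f.readlines()

found_ids = []
for i, line in enumerate(lines):
    if any(keyword in line for keyword in keywords):
        # scan downward from the match for the id: value
        for later in lines[i:]:
            if later.strip().startswith('id:'):
                found_ids.append(later.split('id:', 1)[1].strip())
                break
```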
Category: Data Science

Unable to pass X_train and y_train to my regressor variable; I got a ValueError

```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

data = pd.read_csv('housing.csv')
data.drop('ocean_proximity', axis=1, inplace=True)
data.head()
```

```
   longitude  latitude  housing_median_age  total_rooms  total_bedrooms  population  households  median_income  median_house_value
0    -122.23     37.88                41.0        880.0           129.0       322.0       126.0         8.3252            452600.0
1    -122.22     37.86                21.0       7099.0          1106.0      2401.0      1138.0         8.3014            358500.0
2    -122.24     37.85                52.0       1467.0           190.0       496.0       177.0         7.2574            352100.0
3    -122.25     37.85                52.0       1274.0           235.0       558.0       219.0         5.6431            341300.0
4    -122.25     37.85                52.0       1627.0           280.0       565.0       259.0         3.8462            342200.0
```

…
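The actual traceback is cut off, but a guess at the usual culprit with this dataset (hedged, not the asker's confirmed error): housing.csv has missing values in total_bedrooms, and scikit-learn estimators raise a ValueError when they see NaNs. A sketch of the fix, continuing from the `data` frame above:

```python
from sklearn.linear_model import LinearRegression

# Drop (or impute) the rows with missing total_bedrooms before fitting.
data = data.dropna(subset=['total_bedrooms'])

X_train = data.drop('median_house_value', axis=1)
y_train = data['median_house_value']

regressor = LinearRegression().fit(X_train, y_train)
```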
Category: Data Science

Slice NumPy arrays differently along axes (without looping)

I am trying to analyze a temporal signal sampled by a 2D sensor. Effectively, this means integrating the signal values for each sensor pixel (array row/column coordinate) at the times each pixel is active. Since the start time and duration that each pixel is active are different, I effectively need to slice the signal at different values along each row and column.

```python
# Here is the setup for the problem
import numpy as np

def signal(t):
    return np.sin(t/2) * np.exp(-t/8)

t = …
```
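A loop-free pattern for "a different slice per pixel" (a sketch; the start/stop arrays below are made up) is to build a boolean mask by broadcasting the per-pixel windows against a shared time axis, then reduce along it:

```python
import numpy as np

t = np.linspace(0, 20, 100)
sig = np.sin(t / 2) * np.exp(-t / 8)

# Hypothetical per-pixel active windows as indices into t.
rng = np.random.default_rng(0)
start = rng.integers(0, 50, size=(4, 4))
stop = start + rng.integers(10, 40, size=(4, 4))

# mask[i, j, k] is True while pixel (i, j) is active at time index k.
idx = np.arange(len(t))
mask = (idx >= start[..., None]) & (idx < stop[..., None])

# Integrate the signal over each pixel's own window in one shot.
integrated = (sig * mask).sum(axis=-1) * (t[1] - t[0])  # shape (4, 4)
```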
Category: Data Science

How to solve this ValueError: Dimensions must be equal

I'm trying to train an autoencoder model with colored image samples, but I got this error:

```
ValueError: Dimensions must be equal, but are 476 and 480 for
'{{node mean_squared_error/SquaredDifference}} = SquaredDifference[T=DT_FLOAT](model_4/conv2d_28/BiasAdd, IteratorGetNext:1)'
with input shapes: [?,476,476,1], [?,480,480,3].
```

although I have checked the dimensions of the test and training sets and all are (480,480,3).

```python
from matplotlib import image, pyplot
import cv2

IMG_HEIGHT = 480
IMG_WIDTH = 480

def prepro_resize(input_img):
    oimg = cv2.imread(input_img, cv2.COLOR_BGR2RGB)
    return cv2.resize(oimg, (IMG_HEIGHT, IMG_WIDTH), interpolation=cv2.INTER_AREA)

x_train_ = [(prepro_resize(x_train[i])).astype('float32')/255.0 for i in range(len(x_train))]
x_test_ …
```
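The model code is cut off, but the output shape [?, 476, 476, 1] points at convolutions without padding and a single-channel final layer, not at the data. A sketch of a decoder that keeps the output at (480, 480, 3) via 'same' padding and 3 final filters (an assumed architecture, not the asker's):

```python
import tensorflow as tf
from tensorflow.keras import layers

inp = layers.Input(shape=(480, 480, 3))
x = layers.Conv2D(32, 3, activation='relu', padding='same')(inp)
x = layers.MaxPooling2D(2, padding='same')(x)   # 480 -> 240
x = layers.Conv2D(32, 3, activation='relu', padding='same')(x)
x = layers.UpSampling2D(2)(x)                   # 240 -> 480
out = layers.Conv2D(3, 3, activation='sigmoid', padding='same')(x)  # 3 channels

autoencoder = tf.keras.Model(inp, out)
autoencoder.compile(optimizer='adam', loss='mse')
```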
Category: Data Science

OpenCV warpAffine error during image augmentation using Albumentations

I have been trying to do image augmentation using a library called Albumentations, but I got an error from OpenCV while transforming the images. I ran the code below on Kaggle's notebook. The dataset is called "Intel Image Classification" on Kaggle; it has 6 classes, and each image is 150 * 150 * 3.

```python
import numpy as np
import tensorflow as tf
import albumentations as a

train_data = tf.keras.utils.image_dataset_from_directory(
    x_train_path, seed=123, image_size=(150, 150), batch_size=128)

x_train_path = "../input/intel-image-classification/seg_train/seg_train"

transforms = Compose([
    a.Rotate(limit=40),
    …
```
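Since the traceback is cut off, one diagnostic worth trying (an assumption: OpenCV's warpAffine is picky about dtype and layout, expecting uint8 or float32 HWC arrays) is to apply the transform to a single correctly-typed image outside the tf.data pipeline:

```python
import numpy as np
import albumentations as a

# Isolate the augmentation from the pipeline with one synthetic image.
transforms = a.Compose([a.Rotate(limit=40)])

img = np.random.randint(0, 256, size=(150, 150, 3), dtype=np.uint8)
augmented = transforms(image=img)["image"]
print(augmented.shape)  # (150, 150, 3)
```

If this runs, the problem is likely the tensors coming out of image_dataset_from_directory (float32 batches, not single uint8 images) rather than Albumentations itself.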
Category: Data Science

How to solve MemoryError problem

I've created and normalized my colored image dataset of 3716 samples with size 493*491 as x_train; its type is list. I'm trying to convert it into a numpy array as follows:

```python
from matplotlib import image
import numpy as np
import cv2

def prepro_resize(input_img):
    oimg = image.imread(input_img)
    return cv2.resize(oimg, (IMG_HEIGHT, IMG_WIDTH), interpolation=cv2.INTER_AREA)

x_train_ = [(prepro_resize(x_train[i])).astype('float32')/255.0 for i in range(len(x_train))]
x_train_ = np.array(x_train_)  # L1
# print(x_train_.shape)
```

but I get the following error when L1 runs:

```
MemoryError: Unable to allocate 10.1 GiB for an array with …
```
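The arithmetic matches the error: 3716 × 493 × 491 × 3 float32 values is roughly 10.1 GiB, so the list and the array cannot both live in RAM. One workaround (a sketch reusing the question's `prepro_resize` and `x_train` names, and assuming IMG_HEIGHT/IMG_WIDTH match the 493 × 491 target) is to write images straight into a disk-backed memmap instead of materializing the list first:

```python
import numpy as np

n, h, w, c = 3716, 493, 491, 3
x_mm = np.lib.format.open_memmap('x_train.npy', mode='w+',
                                 dtype=np.float32, shape=(n, h, w, c))
for i in range(n):
    # each image is normalized and flushed to disk, never held all at once
    x_mm[i] = prepro_resize(x_train[i]).astype('float32') / 255.0
```

Alternatively, keeping the array as uint8 and normalizing per batch at training time cuts the footprint by 4x.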
Category: Data Science

Are there any graph embedding algorithms like this already?

I wrote an algorithm for generating node embeddings based on the graph's topology. Most of the explanation is in the readme file and the examples. The question is: am I reinventing the wheel? Does this approach have any practical advantages over existing solutions for embedding generation? Yes, I'm aware there are many algorithms for this based on random walks, but this one is purely deterministic linear algebra, and it is quite simple from my perspective. In short, the algorithm …
Category: Data Science

How to run list comprehensions on GPU?

Is there a way to run complex list comprehensions like the following on a GPU?

```python
[[x[index] if x[index] > len(x) else x[index] - 1 for x in slice]
 if len(slice) == 1 else slice
 for slice, index in zip(slices, indices)]
```

To what degree is it possible? Do I have to convert it to some kind of numpy expression, and if so, which part is specifically possible or necessary? The goal is performance optimization on large data lists/arrays.
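The general pattern (a sketch under simplifying assumptions: the ragged slices are padded into one 2-D array, and the names below are illustrative) is to replace per-element if/else with array-wide operations, which is what GPU libraries like CuPy can execute; CuPy mirrors the NumPy API:

```python
import cupy as cp

PAD = -1
padded = cp.asarray([[5, 9, PAD], [2, 7, 4]])  # padded stand-in for `slices`
indices = cp.asarray([0, 2])

vals = padded[cp.arange(len(padded)), indices]  # x[index] per row
lengths = (padded != PAD).sum(axis=1)           # effective row lengths
out = cp.where(vals > lengths, vals, vals - 1)  # elementwise if/else
```

Python-level comprehensions themselves never run on the GPU; only the array operations they are rewritten into do.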
Category: Data Science

Integration of NLP and Angular application

I'm doing a small POC in which I've trained my machine learning model (Naive Bayes) and saved it in ".pkl" (pickle) format. Now my next task is to develop a web application that asks the user to enter text for text classification analysis. This newly entered "TEXT" will be the test input, which can be fed to the Naive Bayes model built in the earlier stage to make a prediction on the "text" taken …
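The usual bridge between a pickled model and an Angular front end (a sketch, not the asker's code; the path and route names are hypothetical, and model.predict assumes a fitted sklearn pipeline that includes the vectorizer) is a small HTTP endpoint the app can POST text to:

```python
import pickle
from flask import Flask, request, jsonify

app = Flask(__name__)
with open('model.pkl', 'rb') as f:    # hypothetical path to the saved model
    model = pickle.load(f)

@app.route('/predict', methods=['POST'])
def predict():
    text = request.json['text']            # text sent by the Angular app
    label = model.predict([text])[0]       # assumes vectorizer inside the pipeline
    return jsonify({'label': str(label)})
```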
Category: Data Science

Tensor dot product with rank one tensor from vector

I'm trying to compute an inner product between tensors in numpy. I have a vector $x$ of shape $(n,)$ and a tensor $y$ with $d$ axes of length $n$ each, $d > 1$, and would like to compute $\langle y, x^{\otimes d} \rangle$. That is, I want to compute the sum $$\langle y, x^{\otimes d} \rangle= \sum_{i_1,\dots,i_d\in\{1,\dots,n\}}y[i_1, \dots, i_d]\,x[i_1]\cdots x[i_d].$$ A working implementation I have uses a function to first compute $x^{\otimes d}$ and then uses np.tensordot:

```python
def d_fold_tensor_product(x, d) -> np.ndarray:
    """
    …
```
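The $n^d$ temporary can be avoided entirely (a sketch): contracting one axis of $y$ with $x$ at a time computes the same sum without ever materializing $x^{\otimes d}$.

```python
import numpy as np

def inner_with_tensor_power(y: np.ndarray, x: np.ndarray) -> float:
    # repeatedly contract the last axis of y with x; after ndim steps
    # a 0-d array (the scalar inner product) remains
    out = y
    for _ in range(y.ndim):
        out = np.tensordot(out, x, axes=([-1], [0]))
    return float(out)

# Quick check against the naive construction for d = 3, n = 4:
rng = np.random.default_rng(0)
x = rng.normal(size=4)
y = rng.normal(size=(4, 4, 4))
naive = np.sum(y * np.einsum('i,j,k->ijk', x, x, x))
assert np.isclose(inner_with_tensor_power(y, x), naive)
```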
Topic: numpy python
Category: Data Science

How to create a complex Gaussian random noise with a specific covariance matrix

I am trying to generate complex Gaussian white noise with zero mean whose covariance matrix equals a specific matrix, which is assumed to be given. Let i be a point on the grid of the x axis, where there are N points on the axis. The problem is to generate a complex-valued random value at each point (call the value at point i $y_i$) that obeys a Gaussian distribution …
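A standard recipe (a sketch with an assumed example covariance, since the real one is given elsewhere): draw i.i.d. unit-variance circularly-symmetric complex normals $w$ and color them with the Cholesky factor $L$ of the covariance $C$, so that $\mathbb{E}[yy^H] = LL^H = C$.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4

# Hypothetical Hermitian positive-definite covariance for illustration.
A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
C = A @ A.conj().T + N * np.eye(N)

L = np.linalg.cholesky(C)
# Unit-variance complex white noise: real and imaginary parts each N(0, 1/2).
w = (rng.normal(size=N) + 1j * rng.normal(size=N)) / np.sqrt(2)
y = L @ w   # colored complex Gaussian noise with covariance C
```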
Category: Data Science

Why does Fourier transform extrapolation go to extremes at the edges but not in the middle, and how can it be fixed?

Why does Fourier transform extrapolation go to extremes at the edges but not in the middle, and how can it be fixed with Python?

```python
# Code to create the Fourier transform
data_FT = dataset_ex_df[['Date', 'GS']]
close_fft = np.fft.fft(np.asarray(data_FT['GS'].tolist()))
fft_df = pd.DataFrame({'fft': close_fft})
fft_df['absolute'] = fft_df['fft'].apply(lambda x: np.abs(x))
fft_df['angle'] = fft_df['fft'].apply(lambda x: np.angle(x))

plt.figure(figsize=(14, 7), dpi=100)
fft_list = np.asarray(fft_df['fft'].tolist())
for num_ in [3, 6, 9, 100]:
    fft_list_m10 = np.copy(fft_list)
    fft_list_m10[num_:-num_] = 0
    plt.plot(np.fft.ifft(fft_list_m10), label='Fourier transform with {} components'.format(num_))
plt.plot(data_FT['GS'], label='Real')
plt.xlabel('Days')
plt.ylabel('USD')
plt.title('Figure 3: Goldman Sachs (close) stock …
```
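A sketch of one common mitigation (an assumed diagnosis): the FFT treats the series as periodic, so any mismatch between the first and last values behaves like a jump discontinuity at the boundary, producing violent swings at the two ends after low-pass filtering while the middle stays well behaved. Removing a linear trend before filtering, then adding it back, usually tames the edges; the series below is a stand-in for the close prices.

```python
import numpy as np

rng = np.random.default_rng(0)
series = np.cumsum(rng.normal(size=200))   # stand-in for data_FT['GS']

n = len(series)
idx = np.arange(n)
trend = np.polyval(np.polyfit(idx, series, 1), idx)

fft = np.fft.fft(series - trend)           # transform the detrended residual
fft[10:-10] = 0                            # keep only the lowest components
smoothed = np.fft.ifft(fft).real + trend   # restore the trend afterwards
```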
Category: Data Science

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.