multivariate-distribution

Forecasting on multivariate time series containing quaternions

chronosynclastic

June 3, 2022 12:50

I have a multivariate time series containing 3D position data $(x, y, z)$ and orientation data (as quaternions) obtained from motion sensors. My goal is to forecast the future position/orientation, and for this I'm looking into using sequence models, especially LSTMs. A quaternion has 4 elements, one denoting the real/scalar part (say $q_w$) and the other three denoting the imaginary/vector part (say $q_x, q_y, q_z$). So my time series has 7 columns in total. My question: Considering that quaternion elements …

Topic: multivariate-distribution lstm sequence

Category: Data Science
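
A minimal preprocessing sketch for the quaternion columns described above, assuming the orientation data is stored as an array of shape (T, 4) ordered as $(q_w, q_x, q_y, q_z)$. The function name `prepare_quaternions` and the sample values are hypothetical. It relies on two general facts: rotation quaternions have unit norm, and $q$ and $-q$ encode the same rotation, so sign flips between consecutive samples can be removed before the series is fed to a sequence model.

```python
import numpy as np

def prepare_quaternions(q):
    """Normalize quaternions and make consecutive signs consistent.

    q: array of shape (T, 4), ordered as (q_w, q_x, q_y, q_z) (assumed layout).
    """
    q = np.asarray(q, dtype=float)
    # Rotation quaternions must have unit length.
    out = q / np.linalg.norm(q, axis=1, keepdims=True)
    for t in range(1, len(out)):
        # q and -q are the same rotation; keep the sign closest to the previous step.
        if np.dot(out[t - 1], out[t]) < 0:
            out[t] = -out[t]
    return out

# Hypothetical sample: the second row has a flipped sign and non-unit length.
raw = np.array([
    [2.0, 0.0, 0.0, 0.0],
    [-0.9, -0.1, 0.0, 0.0],
    [0.8, 0.2, 0.0, 0.0],
])
clean = prepare_quaternions(raw)
```

This only removes artificial discontinuities; whether to predict the four components directly or use another rotation representation remains an open modelling choice.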

Polynomial regression with two variables. How can I find expressions to describe the coefficients?

Ant

May 25, 2022 20:59

I'm not sure if this is an appropriate place for this question, so please feel free to redirect me if it is not. I just moved it from Super User, where there didn't seem to be many similar questions. Please also feel free to suggest tags. I'm trying to modify part of some old code. It uses regression to describe the relationship between two variables (described as "a fourth order power series in X and y"). I know very little …

Topic: multivariate-distribution matlab regression python statistics

Category: Data Science

Which dataset for multivariate time series forecasting

Djakarta_zero

May 20, 2022 08:44

I'm trying to forecast real estate prices. It's not a prediction but a forecast, like the price of an apartment in 2023 or 2024. I'm asking what my dataset should look like. Can I use a dataset from 2018 to 2021 with 13 columns? You can find the dataset here: https://www.kaggle.com/datasets/mrdaniilak/russia-real-estate-20182021 Date, area, kitchen_are, nb_rooms. Please note that every row is a new house, independent of the others. I obtained this dataset by scraping a website of ads …

Topic: multivariate-distribution forecast arima regression time-series

Category: Data Science

Getting vague results using VAR time series forecasting in python!

The P Guy

May 18, 2022 16:32

Firstly, I am a beginner in the field of data science and have tried to implement some time series models for wind speed forecasting. I am aware that some regression models might give better results, but my aim is still to solve this with the help of VAR. I tried to implement multivariate time series forecasting with VAR in Python. To start with, I followed the code in this article: https://towardsdatascience.com/simple-multivariate-time-series-forecasting-7fa0e05579b2 However, the forecasted value …

Topic: multivariate-distribution forecasting implementation time-series python

Category: Data Science

Confidence intervals in multivariate linear regression

Jsevillamol

May 11, 2022 06:04

I am fitting my data to a multivariate linear regression $Y = BX + \Xi$, where the response is bivariate $Y\in R^{n\times 2}$, and the predictor is univariate but elevated to the projective plane to account for the intercept $X\in R^{n\times 2}$. Now, finding the best fit reduces to $\hat B = (X^T X)^{-1}X^T Y$. But I am interested in finding a $0.7$ confidence region around $\hat B$. How do I do that?

Topic: multivariate-distribution linear-regression regression

Category: Data Science
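
A small sketch of the least-squares estimate $\hat B = (X^T X)^{-1}X^T Y$ quoted above, on synthetic data. It assumes the row-wise convention $Y \approx X B$ (one observation per row), which is the convention that matches the quoted estimator and the stated shapes; `B_true`, the noise level, and the sample size are made up for illustration. It does not construct the confidence region asked about.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Univariate predictor plus a column of ones for the intercept: X has shape (n, 2).
x = rng.uniform(-1.0, 1.0, size=n)
X = np.column_stack([np.ones(n), x])

# Hypothetical true coefficients and a bivariate response Y of shape (n, 2).
B_true = np.array([[1.0, -2.0], [0.5, 3.0]])
Y = X @ B_true + 0.1 * rng.standard_normal((n, 2))

# Numerically stable least-squares solve.
B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Same estimate via the normal equations (X^T X) B = X^T Y.
B_normal = np.linalg.solve(X.T @ X, X.T @ Y)
```

Both routes give the same $2\times 2$ coefficient matrix; `lstsq` is usually preferred numerically over forming $(X^T X)^{-1}$ explicitly.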

multi variate time forecasting

JanWillem

May 3, 2022 17:00

I want to forecast the 'output' in a time series. From the past I have the correlated time series 'output', 'capacity' and 'load'. I also know the 'capacity' and 'load' time series for the near future. See picture. I'm looking for a solution to this problem in Python. All variables have the same unit, man-hours per hour (mh/h). For your interest: the output is the work that is finished in a skill group based on the baseline. …

Topic: multivariate-distribution time-series python

Category: Data Science

How to build multiple variable regression having a mix of numerical & categorical features?

Артём Ощепков

April 26, 2022 22:02

There is a need to estimate Annual Average Daily Traffic Volume (AADT). We have a bunch of data about vehicles' speeds over several years. It has been noticed that AADT depends on the average number of such samples during some time, so a regression model $Y = f(x_1)$ could help estimate the AADT. The problem is that there are other features affecting the dependency, both numerical $(x_2, \dots, x_k)$ and categorical $(c_1 = data\ provider, c_2 = road\ class, \dots, c_m)$. …

Topic: multivariate-distribution features regression categorical-data

Category: Data Science

Getting a balanced sample across many variables

user

April 24, 2022 11:04

Let’s say each element in my population has several attributes. Let’s call them A, B, C, D, E, F. Let’s say, for simplicity, each attribute has 10 values (but it could be any number between 2 and 30). Now I want to get a sample such that the distribution is the same across all features. So, for example, if about 15% of people in the whole population have value 1 for feature A, my sample should have the same proportion. What should …

Topic: multivariate-distribution distribution sampling statistics

Category: Data Science

regarding computing the centroid of high dimensional data

user297850

April 23, 2022 04:44

In scikit-learn or other Python libraries, are there any existing implementations for computing the centroid of high-dimensional data sets?

Topic: multivariate-distribution anomaly-detection scikit-learn clustering machine-learning

Category: Data Science
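
For reference, a sketch under the assumption that "centroid" means the arithmetic mean of the points along each dimension: with the data stored as a NumPy array of shape (n_samples, n_features), this is a column-wise mean regardless of how many dimensions there are. The sample points are made up.

```python
import numpy as np

# Hypothetical data: 3 points in 4 dimensions, one point per row.
points = np.array([
    [0.0, 0.0, 0.0, 0.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 2.0, 0.0, 4.0],
])

# The centroid is the mean of each column (feature).
centroid = points.mean(axis=0)
```

The same call works unchanged for thousands of features; only the array shape changes.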

Can the dependency between variables be deduced from data? And if so, how?

coliva

April 13, 2022 20:08

I have a data set $X$ that consists of $m$ vectors $\vec{x}$ of $n$ real-valued components. Each vector component lies within a corresponding predefined interval of valid values, which is the same for all vectors in $X$. The assumption is that there exists a dependency graph between the components of each vector, which is also the same for all vectors; for example, the value of the component $x_k$ (maybe) depends on the values of both components $x_p$ and $x_q$ for …

Topic: multivariate-distribution data-mining machine-learning

Category: Data Science

Multivariate data preprocessing

Canovich

March 31, 2022 09:16

I am trying to understand how multivariate data preprocessing works, but I have some questions. For example, I can do data smoothing, transformation (Box-Cox, differencing), and noise removal on univariate data (for any machine learning problem, not only time series forecasting). But what if one variable is noisy and the other is not? Or one is smooth and the other is not (I will need to apply a sliding window average to one variable but not the other …

Topic: multivariate-distribution regression classification time-series data-cleaning

Category: Data Science

Sampling trying to keep as much multivariate variance as possible

PascalVKooten

March 11, 2022 07:06

I was wondering whether anyone has considered a sampling technique that aims to keep as much of the variance as possible (e.g. as many unique values as possible, or very widely distributed continuous variables). The benefit might be that it allows code to be developed around the sample while really working with the edge cases in the data. You can always take a representative sample later. So, I am wondering if people have tried to sample for maximum variance before …

Topic: multivariate-distribution variance sampling

Category: Data Science

Interpretation of PCA/FAMD results

Inuraghe

March 3, 2022 14:00

I wrote some code for a mixed PCA (FAMD, factor analysis of mixed data), where I have a dataset with some categorical variables and some numerical variables. This is my example code in R: library(dplyr) library(PCAmixdata) data <- starwars db_quali <- as.data.frame(starwars[,4:6]) db_quanti <- as.data.frame(starwars[,2:3]) pca_table <- PCAmix(X.quanti = db_quanti, X.quali = db_quali, rename.level=TRUE, graph = TRUE) Gender <- factor(data$gender) par(xpd=TRUE,mar=rep(8,4)) plot(pca_table ,choice="ind",label=FALSE, posleg=xy.coords(2,-10), main="Observations", coloring.ind = Gender) and the output graph is: How does this method calculate the coordinate …

Topic: pcamixdata multivariate-distribution pca r

Category: Data Science

Getting mean and covariance matrix for multivariate normal from keras model

Tanzin Farhat

February 19, 2022 16:05

I have a dataset that has 6 input features and 5 output features. I want to use a keras sequential model to estimate the mean vector and covariance matrix from any row of input features assuming the output features to be following Multivariate Normal Distribution. That is for my dataset for any row of 6 input features, I want to get a mean vector of 5 values and a 5*5 covariance matrix. sample=pd.DataFrame({'X1':[1,2,3,4,5,6], 'X2':[1,3,1,5,2,7], 'X3':[3,0,0,7,5,0], 'X4':[0,4,3,2,5,8], 'X5':[9,7,0,2,4,5], 'X6':[1,1,8,7,0,0], 'Y1':[0.5,1.2,6.3,4.5,1.5,6.6], 'Y2':[6.1,4.3,2.1,1.5,4.2,8.7], …

Topic: multivariate-distribution keras tensorflow python

Category: Data Science

Should I concat multiple stock timeseries datasets into one?

Ubler

February 8, 2022 01:00

I have several timeseries datasets of stock data, with fundamental indicators. I would like to build a model that selects stocks for buy and hold. I understand that to perform this task I have two options: Train a model for each stock: This way, I understand that it is the most practical, however, the amount of data for each model will be very reduced (Each dataset has less than 1000 lines). Putting all the data together in a single dataset: …

Topic: multivariate-distribution finance time-series machine-learning

Category: Data Science

Two variables polynomial fit with Python

Filippo Caleca

December 18, 2021 17:27

I have two numpy arrays (the first is 2D, the second 1D) in the form: $X = [[x_1,y_1],[x_2,y_2],[x_3,y_3],...]$ $Z = [z_1,z_2,z_3,...]$ I would like to fit them as I expect they respect a polynomial law. $z = A xy + B x + C y + D$ (the model is separately linear in $x$ and $y$) So I would like a function which takes the two arrays and gives the coefficients $A,B,C$ and $D$. Is there any way to do …

Topic: multivariate-distribution linear-regression python

Category: Data Science
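
The model $z = A xy + B x + C y + D$ above is linear in the coefficients, so one possible approach is ordinary least squares on a design matrix with columns $xy$, $x$, $y$, and $1$. The sketch below assumes the array layout described in the question; the helper name `fit_bilinear` and the sample data are hypothetical.

```python
import numpy as np

def fit_bilinear(X, Z):
    """Least-squares fit of z = A*x*y + B*x + C*y + D.

    X: array of shape (n, 2) with rows [x_i, y_i]; Z: array of shape (n,).
    Returns the coefficients in the order (A, B, C, D).
    """
    X = np.asarray(X, dtype=float)
    Z = np.asarray(Z, dtype=float)
    x, y = X[:, 0], X[:, 1]
    # One column per term of the model.
    design = np.column_stack([x * y, x, y, np.ones_like(x)])
    coeffs, *_ = np.linalg.lstsq(design, Z, rcond=None)
    return coeffs

# Hypothetical noise-free data generated with A=2, B=-1, C=0.5, D=3.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 3.0]])
Z = 2.0 * X[:, 0] * X[:, 1] - 1.0 * X[:, 0] + 0.5 * X[:, 1] + 3.0
A, B, C, D = fit_bilinear(X, Z)
```

At least four points that make the design matrix full rank are needed; with noisy data the same call returns the least-squares estimate.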

Tensorflow Probability Implementation of Automatic Differentiation Variational Inference with Mixtures

jonas

August 25, 2021 14:35

In this paper, the authors suggest using the following loss instead of the traditional ELBO in order to train what basically is a Variational Autoencoder with a Gaussian Mixture Model instead of a single, normal distribution: $$ \mathcal{L}_{SIWAE}^T(\phi)=\mathbb{E}_{\{z_{kt}\sim q_{k,\phi}(z|x)\}_{k=1,t=1}^{K,T}}\left[\log\frac{1}{T}\sum_{t=1}^T\sum_{k=1}^K\alpha_{k,\phi}(x)\frac{p(x|z_{k,t})r(z_{kt})}{q_\phi(z_{kt}|x)}\right] $$ They also provide the following code which is supposed to be a tensorflow probability implementation: def siwae(prior, likelihood, posterior, x, T): q = posterior(x) z = q.components_dist.sample(T) z = tf.transpose(z, perm=[2, 0, 1, 3]) loss_n = tf.math.reduce_logsumexp( (-tf.math.log(T) + …

Topic: multivariate-distribution vae tensorflow probability machine-learning

Category: Data Science

How to find mixing ratios in a mixture model with known parameters?

Jona Engel

July 18, 2021 17:43

This question does not ask for a formal solution or rephrasing, but for a practical implementation. That is why I am asking here and not on [Cross Validated](https://stats.stackexchange.com). Let us assume I have $y$ observations and a mixture model of $g$ Normally distributed components with mixing ratios $\lambda$ and I know their parameters $\theta$. How can I estimate only the ratios $\lambda$ and not the parameters $\theta$? So far I have only managed to estimate the entire mixture model, meaning …

Topic: normal multivariate-distribution r

Category: Data Science
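
One possible approach to the setup above is an EM iteration in which the component parameters stay fixed and only the mixing ratios $\lambda$ are updated. The sketch below assumes univariate Normal components with known means and standard deviations; it is written in Python rather than R, and the function names and simulated data are hypothetical.

```python
import numpy as np

def normal_pdf(y, mean, std):
    """Density of a univariate Normal distribution."""
    return np.exp(-0.5 * ((y - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def estimate_mixing_ratios(y, means, stds, n_iter=200):
    """EM updates for the mixing ratios only; means and stds are held fixed."""
    y = np.asarray(y, dtype=float)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    # Component densities never change, so compute them once: shape (n, g).
    dens = normal_pdf(y[:, None], means[None, :], stds[None, :])
    lam = np.full(len(means), 1.0 / len(means))  # start from equal ratios
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = dens * lam
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: new ratios are the average responsibilities.
        lam = resp.mean(axis=0)
    return lam

# Simulated data with true ratios 0.3 and 0.7 (assumed for illustration).
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0.0, 1.0, 3000), rng.normal(5.0, 1.0, 7000)])
lam_hat = estimate_mixing_ratios(y, means=[0.0, 5.0], stds=[1.0, 1.0])
```

Because the densities are fixed, each iteration only reweights precomputed values, and the returned ratios always sum to one.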

Meaning of the covariance matrix?

Ben

June 24, 2021 13:37

I wonder about the heavy use of the covariance matrix across all kinds of machine learning tools. So far, for me, the covariance is just a preliminary step to get to the correlation. Since there is an obvious reason for using the correlation itself, I wonder why I encounter the covariance so often, and more generally why it is used so much. What are the purposes of the covariance matrix?

Topic: multivariate-distribution correlation machine-learning

Category: Data Science
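
To make the relationship mentioned above concrete, here is a small sketch on made-up data showing that the correlation matrix is the covariance matrix rescaled by the standard deviations, $\rho_{ij} = \Sigma_{ij} / (\sigma_i \sigma_j)$. The column scales are arbitrary assumptions chosen so that the covariance entries differ widely while the correlations do not.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 500 samples, 3 features with very different scales.
data = rng.standard_normal((500, 3)) * np.array([1.0, 10.0, 0.1])

# Covariance matrix (features are columns, hence rowvar=False).
cov = np.cov(data, rowvar=False)

# Rescale by the standard deviations to obtain the correlation matrix.
std = np.sqrt(np.diag(cov))
corr_from_cov = cov / np.outer(std, std)

# NumPy's own correlation matrix, for comparison.
corr = np.corrcoef(data, rowvar=False)
```

The covariance keeps the scale of each variable, which the correlation discards.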

MLE for Poisson conditioned on multivariate Gaussian?

olympiader

May 19, 2021 23:46

I am writing some Python code to fit 2D Gaussians to fluorescent emitters on a dark background to determine the subpixel-resolution (x, y) position of the fluorescent emitter. The crude, pixel-resolution (x, y) locations of the pixels are stored in a list xy. The height of the Gaussian represents the predicted pixel intensity at that location. Each 2D Gaussian has 5 parameters, and my end goal is to find the optimal value of those 5 parameters for each peak using …

Topic: poisson multivariate-distribution density-estimation parameter-estimation python

Category: Data Science
