TL;DR: I'm doing a fairly basic project which involves exercise. It seems that descriptive statistics and basic data visualisation (e.g., a line graph) would be most appropriate for this project, but I wonder if you have any recommendations for analyses. For this project, I am performing the same set of 15 single-joint exercises each week (we'll call these "Exercises"). Every 4 weeks, I'm performing 3 different multi-joint exercises (we'll call these "Lifts"). My goals are to: track my progress (strength gains) …
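If it helps, here is a minimal sketch of the descriptive-statistics side. The weights and the 8-week window are made up; the least-squares slope stands in for the trend line you would draw on a progress graph.

```python
from statistics import mean

# Hypothetical weekly top-set weights (kg) for one Exercise over 8 weeks
weeks = list(range(1, 9))
weights = [40, 42.5, 42.5, 45, 45, 47.5, 50, 50]

# Descriptive summary
print(f"start={weights[0]}, end={weights[-1]}, mean={mean(weights):.1f}")

# Overall strength gain as a percentage of the starting weight
gain_pct = 100 * (weights[-1] - weights[0]) / weights[0]
print(f"total gain: {gain_pct:.1f}%")

# Least-squares slope: average weekly progression rate (kg/week)
xbar, ybar = mean(weeks), mean(weights)
slope = sum((x - xbar) * (y - ybar) for x, y in zip(weeks, weights)) / \
        sum((x - xbar) ** 2 for x in weeks)
print(f"trend: {slope:.2f} kg/week")
```

The same slope per Exercise lets you compare which movements are progressing fastest, which is often more informative than any single chart.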
I am hoping somebody can help a (hopefully) future data scientist: I am looking for a master's course in data science at a school in Europe. I have a bachelor's in Media Management (PR & Communication) and have since worked for several years as a marketplace manager for an international online retailer. I am now looking to pivot more into the data science part of this and am searching for a suitable master's program. Given that I do not …
I'm not sure if this is the right place to ask this question, but is there any online source that provides a complete, in-depth explanation of machine learning algorithms, all in one place, but not too complicated for a beginner to understand? Every source I refer to either covers the topics superficially or focuses on only one aspect of the algorithm, which makes me waste a big chunk of my study time going through different websites and videos looking for …
I am wondering if there are any books/articles/tutorials about "on-line machine learning"? For example, this website has nice lecture notes (from lec16) on some of the aspects: https://web.eecs.umich.edu/~jabernet/eecs598course/fall2015/web/ or this book: https://ii.uni.wroc.pl/~lukstafi/pmwiki/uploads/AGT/Prediction_Learning_and_Games.pdf I can't seem to find many resources on this. I'm trying to understand the basics, not read research papers. If anyone can share resources, that would be great.
I'm trying to understand figure 12.1 in Goodfellow, available here. I'm not able to reproduce figure 12.1, and I'm wondering what it is I'm missing. The denominator of equation 12.3 is a constant, and thus equation 12.3 reduces to a subtraction and a scaling. I find it hard to believe that this will map the points to a sphere/circle as shown in figure 12.1; I'd expect something non-linear to do that. What am I missing? My code is: …
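One hedged guess at the missing piece: in global contrast normalization the denominator is recomputed for each example (each example's own contrast/norm), not one dataset-wide constant, so the map is non-linear even though it reads as "subtract and scale". A toy 2-D version dividing each point by its own L2 norm (the data here are synthetic):

```python
import numpy as np

# If the scaling factor is per-example rather than global, every example is
# projected onto a sphere (a circle in 2-D), which is what the figure shows.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2)) * [3.0, 0.5]       # anisotropic toy cloud

def per_example_normalize(X, s=1.0, eps=1e-8):
    norms = np.linalg.norm(X, axis=1, keepdims=True)   # one norm per row
    return s * X / np.maximum(norms, eps)

Xn = per_example_normalize(X)
print(np.allclose(np.linalg.norm(Xn, axis=1), 1.0))   # every point on the unit circle
```

If your code reused a single denominator for all points, that would explain getting only a linear rescaling of the cloud.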
Can anyone recommend some material (books, blogs, YouTube channels, ...) for studying statistics, machine learning and data science topics in general? Thanks
In the "boosting from existing prediction" tutorial for lightGBM in R, there is an init_score parameter in the setinfo function. I am wondering what init_score means. The help page says: init_score: initial score is the base prediction lightgbm will boost from. Another thing: what does "boost" mean in lightGBM?
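On what "boost" means: gradient boosting builds an additive model in which each new tree fits the residual (gradient) of the current prediction, and init_score is simply the prediction you start from, so the trees only need to learn the correction on top of it. A lightgbm-free numpy caricature (squared-error loss, scalar updates standing in for trees; the numbers are made up):

```python
import numpy as np

# Each boosting step fits the residual between the target and the current
# prediction; init_score is just the starting prediction being boosted from.
y = np.array([3.0, 5.0, 7.0, 9.0])
init_score = np.full_like(y, y.mean())   # e.g. an existing model's output

pred = init_score.copy()
learning_rate = 0.5
for _ in range(20):
    residual = y - pred                 # negative gradient of squared error
    pred += learning_rate * residual    # a real booster fits a tree to this

print(np.max(np.abs(y - pred)))         # residuals shrink toward 0
```

In lightGBM, setting init_score to another model's raw predictions makes the new trees refine that model instead of starting from scratch.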
I would like to develop a model that uses convolutional neural networks for image classification. From the many different network structures described in papers and articles online, I would like to choose, as a starting point, the one that best suits my problem and dataset. I know that there is no certain answer and that the best structure is highly dependent on each problem, but I imagine that there is some method behind building such a network beyond pure chance and …
Can anyone tell me about different techniques/algorithms based on random forest? I know random forest is itself an algorithm/model, but I'm looking for other versions of it, as we have with decision trees. Is there a list of algorithms based on random forest? Thanks
Over the past two years, I have been working as a full-time data scientist for a government company. As the sole data science team in the organization, our job is a hybrid between data science and machine learning engineering: we need to research and develop ML solutions for the organization's business problems as well as implement them in production environments. The problem is that I'm feeling stuck knowledge-wise and I don't know what I can do about it. Let me explain. …
I'm working on an unassessed homework problem from unpublished course notes of a statistics module in a second-year university mathematics course. I'm trying to plot a 2-parameter full linear model and a 1-parameter reduced linear model for the same data set. I can't figure out how to plot the 1-parameter model; all attempts so far have either given errors or a non-constant slope. xs <- c(0,1,2) ys <- c(0,1,1) Data <- data.frame(x = xs, y = ys) mf <- …
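In case it helps, here is a hedged numpy translation of the two fits, assuming the reduced 1-parameter model is the slope-only model y = βx (R's lm(y ~ x - 1)); if instead it is intercept-only (y ~ 1), the fit is just mean(ys) and plots as a horizontal line via abline(h = ...).

```python
import numpy as np

xs = np.array([0.0, 1.0, 2.0])
ys = np.array([0.0, 1.0, 1.0])

# Full model: y = a + b*x (two columns in the design matrix)
A_full = np.column_stack([np.ones_like(xs), xs])
a, b = np.linalg.lstsq(A_full, ys, rcond=None)[0]

# Reduced model: y = beta*x. The design matrix has a single column, so the
# fitted line is constrained through the origin with one constant slope.
beta = np.linalg.lstsq(xs[:, None], ys, rcond=None)[0][0]

print(a, b)    # 1/6 and 1/2
print(beta)    # 3/5
```

A "non-constant slope" usually means R fitted extra terms; dropping the intercept explicitly (y ~ x - 1, or y ~ 1 for intercept-only) pins the model down to one parameter.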
I have some specific questions for which I could not find answers in textbooks/research articles; I shall be grateful for answers. These are: Are there ML techniques that can be directly applied to class-imbalanced datasets, or is it standard practice to balance the dataset, either with a weighted approach or with SMOTE-style methods? What is the standard way for real datasets/industries? I am referring to fraud detection, anomaly detection and water-leak detection, where inherently the dataset would always be …
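On the weighted approach: many learners accept per-class (or per-sample) weights directly, so no resampling is needed. A minimal sketch of the usual "balanced" heuristic, n_samples / (n_classes * n_in_class), which is the formula behind scikit-learn's class_weight="balanced" (the counts below are made up):

```python
from collections import Counter

# Give each class a weight inversely proportional to its frequency, so the
# minority class contributes as much total loss as the majority class.
labels = [0] * 950 + [1] * 50          # hypothetical 95/5 imbalance
counts = Counter(labels)
n, k = len(labels), len(counts)

weights = {cls: n / (k * cnt) for cls, cnt in counts.items()}
print(weights)   # the minority class gets the larger weight
```

Boosting libraries expose the same idea through parameters such as scale_pos_weight, which is often the first thing tried in fraud/anomaly settings before any resampling.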
I have a dataset with 203 variables. Like age>40 (0 -yes, 1-no), gender (0 or 1), used or not 200 types of drugs (one-hot encoded into 200 variables), and one target variable (0 or 1). This is an imbalanced dataset where the class counts are Counter({0: 5607, 1: 1717}). May I know what kind of resampling strategy I should adopt for this kind of dataset? Is this dataset considered numerical or categorical? I tried random under-sampling and over-sampling, but not …
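Since all 203 columns are 0/1 indicators, the data are categorical in coding even though they are stored as numbers; note that vanilla SMOTE interpolates between rows and so produces non-binary values for such columns, which is why SMOTE-NC or plain random resampling is usually a better fit. A toy sketch of both plain strategies (the data below are synthetic, not your dataset):

```python
import random

random.seed(0)
# Toy imbalanced dataset: (row_id, label) with a 100/20 class split
data = [(i, 0) for i in range(100)] + [(i, 1) for i in range(20)]
majority = [d for d in data if d[1] == 0]
minority = [d for d in data if d[1] == 1]

# Random under-sampling: shrink the majority class to the minority size
under = random.sample(majority, len(minority)) + minority

# Random over-sampling: duplicate minority rows up to the majority size
over = majority + random.choices(minority, k=len(majority))

print(len(under), len(over))   # 40, 200 — both now balanced
```

With binary features, duplication (over-sampling) keeps rows valid, whereas interpolation does not; class weights (no resampling at all) are the other common option.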
Ensembling is getting more and more popular. I understand that there are in general three big fields of ensembling: bagging, boosting and stacking. My question is: does ensembling always, at least in practice, increase performance? I guess mathematically it is not true; I am just asking about real-life situations. For example, I could train 10 base learners and then stack them with another learner at the 2nd level. Does this 2nd-level learner always outperform the …
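As a toy illustration that the answer is "not always, but usually on average": the sketch below uses a simple-averaging ensemble rather than stacking, but the variance-reduction mechanism is the same, and all numbers are synthetic.

```python
import numpy as np

# Average 10 noisy "base learners" estimating a true value, over many trials.
rng = np.random.default_rng(42)
true = 1.0
trials = 2000
base = true + rng.normal(0, 1, size=(trials, 10))   # 10 learners per trial

single_err = (base[:, 0] - true) ** 2               # one learner's squared error
ens_err = (base.mean(axis=1) - true) ** 2           # simple-average ensemble

print(single_err.mean(), ens_err.mean())  # ensemble far smaller on average
print((ens_err > single_err).mean())      # ...yet it still loses on some trials
```

So even in this idealised setting the ensemble is not guaranteed to win on any given dataset, only in expectation; a 2nd-level stacker can likewise underperform its best base learner, especially when it overfits the base predictions.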
I am currently reading 'Mastering Machine Learning with scikit-learn', 2E, by Packt. In the 'Lazy Learning and Non-Parametric Models' topic in Chapter 3, 'Classification and Regression with k-Nearest Neighbors', there is a paragraph stating: Non-parametric models can be useful when training data is abundant and you have little prior knowledge about the relationship between the response and the explanatory variables. kNN makes only one assumption: instances that are near each other are likely to have similar values of the response variable. …
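The "only one assumption" line can be made concrete with a minimal from-scratch kNN (toy data below): the training set itself is the model, nothing is summarised into a fixed set of parameters, which is what makes it non-parametric.

```python
import math
from collections import Counter

# Minimal k-nearest-neighbours classifier: predict by majority vote among
# the k training points closest to the query — nearby instances are assumed
# to share similar response values.
def knn_predict(X_train, y_train, x, k=3):
    dists = sorted((math.dist(xi, x), yi) for xi, yi in zip(X_train, y_train))
    top = [yi for _, yi in dists[:k]]
    return Counter(top).most_common(1)[0][0]

X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(X, y, (0.5, 0.5)))   # "a" — its neighbourhood is class a
print(knn_predict(X, y, (5.5, 5.5)))   # "b"
```

Nothing about the shape of the decision boundary is assumed in advance; it emerges entirely from where the training points sit.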
I have encountered the word oracle in the following context: Given an $\alpha$-approximate oracle for stochastic optimization we show how to implement an $\alpha$-approximate solution for robust optimization under a necessary extension, and illustrate its effectiveness in applications. I saw this question, but it doesn't seem to have the same meaning. I was wondering what oracle means in this context. Edit: I found the following definition in this paper: A $\rho$-approximate Bayesian optimization oracle is a function $\mathcal{O}_{\rho}:(\Theta \rightarrow …
Following Andrew Ng's machine learning course, he explains SVM kernels by manually selecting 3 landmarks and defining 3 Gaussian functions based on them. Then he says that we are actually defining 3 new features, $f_1$, $f_2$ and $f_3$. By applying these 3 Gaussian functions to every input point: $$x=(x_1,x_2)\to \hat{x}=(f_1(x), f_2(x), f_3(x))$$ it seems that we are mapping our data from $\mathbb R^2$ space to a $\mathbb R^3$ space. Now our goal is to find a …
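A small numeric sketch of that mapping (the landmark positions and bandwidth sigma below are made up):

```python
import numpy as np

# Three Gaussian (RBF) similarity features map each 2-D point into R^3:
# f_i(x) = exp(-||x - l_i||^2 / (2 sigma^2)) for landmarks l_1, l_2, l_3.
landmarks = np.array([[1.0, 1.0], [3.0, 0.0], [0.0, 3.0]])
sigma = 1.0

def features(x):
    d2 = ((landmarks - x) ** 2).sum(axis=1)   # squared distance to each landmark
    return np.exp(-d2 / (2 * sigma ** 2))

x = np.array([1.0, 1.0])
print(features(x))   # f_1 = 1 exactly at the first landmark, others near 0
```

Each feature is 1 at its own landmark and decays with distance, so a linear classifier in the new $\mathbb R^3$ space can carve out non-linear regions in the original $\mathbb R^2$.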
I'm trying to analyse whether the fatality rate in my country (a third-world country) varies significantly from the world's fatality rate. So I'd basically have two samples, labeled (Philippines) and (World excluding the Philippines); then I can compute the fatality rate for the two groups. Does McNemar's test apply here for checking whether the fatality rate in the Philippines is higher, or do you have any suggestions? Thanks
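McNemar's test is for paired observations (the same subjects measured under two conditions), which doesn't match two disjoint groups; a two-proportion z-test is the usual choice for comparing an unpaired rate between groups. A hedged sketch with made-up counts (not real fatality data):

```python
import math

# Two-proportion z-test with pooled variance; one-sided alternative
# "rate in group A is higher than in group B".
deaths_a, cases_a = 120, 2000        # "Philippines" toy numbers
deaths_b, cases_b = 4000, 100000     # "rest of world" toy numbers

p_a, p_b = deaths_a / cases_a, deaths_b / cases_b
p_pool = (deaths_a + deaths_b) / (cases_a + cases_b)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / cases_a + 1 / cases_b))
z = (p_a - p_b) / se
p_value = 0.5 * math.erfc(z / math.sqrt(2))   # one-sided normal tail

print(f"z={z:.2f}, one-sided p={p_value:.6f}")
```

With counts this large the normal approximation is comfortable; for very small counts Fisher's exact test on the 2x2 table is the safer alternative.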
I hope I came to the right place to ask this question. Back when I was at college I studied machine and deep learning in depth; my whole programme was based on those areas. I knew all the underlying maths, and even today I know how to derive backpropagation for any feed-forward network. Well, maybe I would need to take a peek, but I still understand the math and can follow without problems. Back then (2017) I was even doing something with …
I am currently working on a four-parameter self-starting function which I based on SSfpl but with a different formula. This is the formula for my self-starting function: (b1 * ((b2 * x)^b4)) / (1 + ((b2 * x)^b4))^(b3 / b4) The code below is for SSfpl, with the formula A + (B - A)/(1 + exp((xmid - input)/scal)): ir <- as.vector(coef(lm(x ~ I(log(prop/(1-prop))), data = xy))) pars <- as.vector(coef(nls(y ~ cbind(1, 1/(1 + exp((xmid - x)/ exp(lscal)))), data …
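For sanity-checking candidate starting values, here is a direct Python translation of the formula above (the parameter values are made up, and this is only the curve, not the self-start logic):

```python
def f(x, b1, b2, b3, b4):
    # (b1 * (b2*x)^b4) / (1 + (b2*x)^b4)^(b3/b4), assuming b1..b4 > 0, x > 0
    t = (b2 * x) ** b4
    return (b1 * t) / (1 + t) ** (b3 / b4)

# With b3 == b4 the denominator exponent is 1 and the curve is a simple
# saturating function of (b2*x)^b4: here 2 * 1 / 2 = 1.0
print(f(1.0, b1=2.0, b2=1.0, b3=1.0, b4=1.0))
```

Evaluating the formula at a few x values with hand-picked parameters is a quick way to check that the initial-value code in the self-starting function lands in a sensible region before nls takes over.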