I am trying to analyze a temporal signal sampled by a 2D sensor. This means integrating the signal values for each sensor pixel (array row/column coordinate) over the times that pixel is active. Since each pixel's start time and active duration differ, I need to slice the signal at different values along each row and column.

    # Here is the setup for the problem
    import numpy as np

    def signal(t):
        return np.sin(t/2)*np.exp(-t/8)

    t = …
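A minimal sketch of one way to do this, assuming a shared sample grid t and per-pixel start times and durations (the arrays starts and durations and the trapezoidal integration are my assumptions, not part of the original setup):

    import numpy as np

    def signal(t):
        return np.sin(t/2)*np.exp(-t/8)

    t = np.linspace(0, 20, 2001)       # shared time axis (assumed)
    rows, cols = 4, 5                  # small sensor for illustration
    starts = np.random.uniform(0, 10, (rows, cols))    # per-pixel start time (assumed)
    durations = np.random.uniform(1, 5, (rows, cols))  # per-pixel active window (assumed)

    # Integrate the signal over each pixel's active window.
    out = np.empty((rows, cols))
    for i in range(rows):
        for j in range(cols):
            lo = np.searchsorted(t, starts[i, j])                     # first active sample
            hi = np.searchsorted(t, starts[i, j] + durations[i, j])   # one past the last
            out[i, j] = np.trapz(signal(t[lo:hi]), t[lo:hi])

The loop over pixels is for clarity; applying np.searchsorted to the flattened start and end arrays would vectorize the slice boundaries.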
I have a dataframe with rows indexed 0 to 128 and a smaller dataframe with indices 4, 8, 105, and 107. I made edits to the rows in the smaller dataframe and am now trying to replace rows 4, 8, 105, and 107 in the large dataframe with the corresponding rows from the smaller dataframe. Why can I not just do:

    bigDF[smallDF.index] = smallDF

How would I accomplish this replacement? Thank you!
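For context, a sketch of why the attempt fails and what does work (bigDF and smallDF here are stand-ins built to match the description): square-bracket assignment on a DataFrame addresses columns, so bigDF[smallDF.index] tries to create columns named 4, 8, 105, and 107. Row-wise replacement goes through .loc, which aligns on both row labels and column names:

    import numpy as np
    import pandas as pd

    bigDF = pd.DataFrame(np.zeros((129, 2)), columns=['a', 'b'])
    smallDF = pd.DataFrame(np.ones((4, 2)), columns=['a', 'b'],
                           index=[4, 8, 105, 107])

    # Overwrite exactly those four rows, aligned by label.
    bigDF.loc[smallDF.index] = smallDF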
I have a multi-index dataframe used in a block of code. Its index looks like this:

    MultiIndex([('American Indian or Alaska Native',  '1-4 years'),
                ('American Indian or Alaska Native', '10-14 years'),
                ('American Indian or Alaska Native', '15-17 years'),
                ('American Indian or Alaska Native', '18-19 years'),
                ('American Indian or Alaska Native', '20-24 years'),
                ('American Indian or Alaska Native', '25-29 years'),
                ('American Indian or Alaska Native', '30-34 years'),
                ('American Indian or Alaska Native', '35-39 years'),
                ('American Indian or Alaska Native', '40-44 years'),
                ('American …
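For anyone wanting to reproduce the shape of that index, a minimal sketch (the level names and the from_product construction are my assumptions; the original may have been built differently):

    import pandas as pd

    races = ['American Indian or Alaska Native']   # further groups elided
    ages = ['1-4 years', '10-14 years', '15-17 years', '18-19 years',
            '20-24 years', '25-29 years', '30-34 years', '35-39 years',
            '40-44 years']
    idx = pd.MultiIndex.from_product([races, ages], names=['race', 'age_group'])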
I have created a training dataframe Traindata as follows:

    dataFile = '/content/drive/Colab Notebooks/.../Normal_Anomalous_8Digits.csv'
    data8 = pd.read_csv(dataFile)

Traindata looks like the following, where Output is the predicted variable, which is not included in the test data:

              Col1      Col2  Output
    0     0.001655  0.464986       1
    1     0.943110  0.902166       0
    2     0.071235  0.674283       1
    ...        ...       ...     ...
    1007  0.698048  0.058458       1
    1008  0.289333  0.702763       1

    1009 rows × 3 columns

Now the model is trained with the following commands:

    from pgmpy.models import BayesianModel, BayesianNetwork
    from pgmpy.estimators import MaximumLikelihoodEstimator
    model …
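A sketch of how such a model is typically constructed and fitted with pgmpy (the edge list is an assumption; the question is truncated before the model definition). Note that MaximumLikelihoodEstimator expects discrete variables, so continuous columns like these would normally be binned first:

    from pgmpy.models import BayesianNetwork
    from pgmpy.estimators import MaximumLikelihoodEstimator

    # Assumed structure: both features are parents of Output.
    model = BayesianNetwork([('Col1', 'Output'), ('Col2', 'Output')])
    model.fit(data8, estimator=MaximumLikelihoodEstimator)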
Hello, this might be a stupid question but I need some help indexing a MATLAB matrix consisting of several sub-matrices.

    for k = 1:tf-1
        r(k) = rand(1)/4;
        u(k+1) = 0.5;
        x1(k+1) = A(1,1)*x1(k) + A(1,2)*x2(k) + B(1,1)*u(k);
        x2(k+1) = A(2,1)*x1(k) + A(2,2)*x2(k) + B(2,1)*u(k);
        x = [x1(k) x2(k)]';
        y(k) = C*x + r(k);
        P_prior(k+1) = A*P(k)*A.' + Q;
        K(k+1) = P_prior(k+1)*C.'/(C*P_prior(k+1)*C.' + R);
        xhat(k+1) = x(k+1) + K(k+1)*(y(k) - C*x(k+1));
        P(k+1) = (eye(size(1,1)) - K(k+1)*C)*P_prior(k+1);
    end

For example I want …
I would like some suggestions on possible avenues that would make sense in the following context. Three optimal clusters have been identified in a list of 5,000 customers using k-means; the data model has 30 features, and a PCA was performed prior to k-means. I would like to further break down each of these 3 clusters into smaller tiers. These tiers would serve to rank each customer within his cluster. For example: Cluster 1, 2, 3 could …
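One common avenue, sketched below: rank customers inside each cluster by distance to the cluster centroid and cut the ranking into quantile tiers (the tier count of 4, the use of KMeans.transform, and the random stand-in data are my choices for illustration):

    import numpy as np
    import pandas as pd
    from sklearn.cluster import KMeans

    X_pca = np.random.rand(5000, 10)   # stand-in for the PCA-reduced features
    km = KMeans(n_clusters=3, n_init=10).fit(X_pca)
    labels = km.labels_

    # Distance of each customer to its own cluster's centroid.
    dists = km.transform(X_pca)[np.arange(len(X_pca)), labels]

    # Quartile tiers within each cluster: tier 1 = closest to the centroid.
    tiers = np.empty(len(X_pca), dtype=int)
    for c in range(3):
        mask = labels == c
        tiers[mask] = pd.qcut(dists[mask], 4, labels=False) + 1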
In many databases (MongoDB comes to mind) there's a way to specify a partial unique index, which expresses the sentiment: "Please make sure no two records in this table are duplicates with respect to this set of fields, as long as this condition on the record holds true (otherwise don't consider this record in the uniqueness constraint)." Does Microsoft Access have a way of expressing this kind of constraint?
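For reference, the MongoDB version of the constraint described above, sketched through pymongo (the collection, field names, and filter are illustrative):

    from pymongo import ASCENDING, MongoClient

    coll = MongoClient()['mydb']['users']
    # Unique on email, but only among documents where active is true;
    # inactive documents are ignored by the uniqueness check.
    coll.create_index([('email', ASCENDING)], unique=True,
                      partialFilterExpression={'active': True})

Access indexes do have an Ignore Nulls property, which is a narrower form of the same idea: null values are left out of the index and hence out of the unique check.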
I am looking to take a dataset largely derived from user input in categorical form. This sign-up sheet asks for many data points, such as age group, race, and sign-up date, as well as a few others. My goal is to create a weighted system to choose users equitably based on their responses. I've tried a frequency approach, but there are pitfalls to that: if 65% of the sign-ups are White/Caucasian, there will be a disproportionate number …
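A minimal sketch of inverse-frequency weighting, one standard way to counter the imbalance described (the column name race and the single-category weighting are assumptions; a real scheme would likely combine several fields):

    import pandas as pd

    signups = pd.DataFrame({'race': ['White'] * 65 + ['Black'] * 20 + ['Asian'] * 15})

    # Weight each user by the inverse of their category's frequency,
    # so every category carries equal total weight.
    freq = signups['race'].map(signups['race'].value_counts(normalize=True))
    weights = (1 / freq) / (1 / freq).sum()

    # Draw 10 users without replacement using those weights.
    chosen = signups.sample(n=10, weights=weights, replace=False)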
I'm looking for a way to avoid removing an ending s when the s isn't a suffix. To do that, I first check whether a word exists in my index: if it does, I don't remove the ending s, but if it doesn't, I go ahead and remove the ending s and add the result to the index. But the problem is what to do when starting to build the index. Imagine we encounter books: I remove s and add book …
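A sketch of the procedure as described, which also makes the bootstrap problem visible (stemming by stripping a single trailing s is the rule from the question, not a general stemmer):

    index = set()

    def add_word(word):
        # If the word is already known, keep it as-is;
        # otherwise strip a trailing 's' and index the stem.
        if word in index:
            return word
        stem = word[:-1] if word.endswith('s') else word
        index.add(stem)
        return stem

    add_word('books')   # index is empty, so this stores 'book'
    add_word('bus')     # 'bus' is not in the index, so this wrongly stores 'bu'

The second call shows the order-dependence: with an empty index there is no evidence yet that bus is not a plural.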
Is it possible to perform book index searching using machine learning algorithms?

Inputs:
1. Book pages, with page numbers, as images.
2. Index words in the book.

Output: tracing the page number(s) for the index words provided.
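As a non-ML baseline worth trying first, OCR alone gets most of the way: recognize the text on each page image and record which pages each index word appears on. A sketch, assuming pytesseract and a list of (page_number, image) pairs as the input format:

    import pytesseract

    def build_index(pages, index_words):
        # pages: list of (page_number, PIL.Image) tuples (assumed input format)
        hits = {w: [] for w in index_words}
        for page_no, img in pages:
            text = pytesseract.image_to_string(img).lower()
            for w in index_words:
                if w.lower() in text:
                    hits[w].append(page_no)
        return hits

Machine learning would enter mainly inside the OCR step itself, or for fuzzy matching of word variants.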
I have a data frame which looks like this:

    FRUIT  ID   COLOR  WEIGHT
    Apple  142  Red    Heavy
    Mango  231  Red    Light
    Apple  764  Green  Light
    Apple  543  Green  Heavy

And I want the following result:

    FRUIT          COUNT
    Apple  COLOR   Red    1
                   Green  2
           WEIGHT  Heavy  2
                   Light  1
    Mango  COLOR   Red    1
                   Green  0
           WEIGHT  Heavy  0
                   Light  1

I tried different variations of set_index, groupby() and unstack() on the dataframe in combination with ['ID'].count() and .size(), but my grouping …
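One way to get that shape, sketched below: melt COLOR and WEIGHT into a single attribute/value pair, count per fruit, then round-trip through unstack so missing fruit/value combinations show as 0 (the melt-then-unstack approach is my suggestion, not from the question):

    import pandas as pd

    df = pd.DataFrame({'FRUIT': ['Apple', 'Mango', 'Apple', 'Apple'],
                       'ID': [142, 231, 764, 543],
                       'COLOR': ['Red', 'Red', 'Green', 'Green'],
                       'WEIGHT': ['Heavy', 'Light', 'Light', 'Heavy']})

    melted = df.melt(id_vars='FRUIT', value_vars=['COLOR', 'WEIGHT'],
                     var_name='ATTR', value_name='VALUE')
    counts = (melted.groupby(['FRUIT', 'ATTR', 'VALUE']).size()
                    .unstack('FRUIT', fill_value=0)   # one column per fruit, 0 where absent
                    .stack()                          # back to a Series
                    .reorder_levels(['FRUIT', 'ATTR', 'VALUE'])
                    .sort_index())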
I would like to ask you two questions about indexing:

1) Since a primary index, or clustering index, stores the tuples of a relation in the primary index itself (though a primary index might also be separate from the file containing the tuples), how can we implement this kind of index?

2) When we associate a primary index with a file, the file itself must be sequentially ordered. Is it true that a primary index (not separated from the file) is always an …
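A toy sketch of the non-separated case, where the sequentially ordered file plus a sparse block index together act as the primary index (the blocks, keys, and sizes are all invented for illustration):

    import bisect

    # The 'file': records kept sorted by key, grouped into fixed-size blocks.
    records = sorted((k, f'row-{k}') for k in range(0, 100, 3))
    BLOCK = 8
    blocks = [records[i:i + BLOCK] for i in range(0, len(records), BLOCK)]

    # Sparse primary index: one entry per block, the first key in the block.
    index_keys = [blk[0][0] for blk in blocks]

    def lookup(key):
        # Find the one block that could contain the key, then scan it.
        b = max(bisect.bisect_right(index_keys, key) - 1, 0)
        return next((v for k, v in blocks[b] if k == key), None)

    lookup(42)   # 'row-42'

Because the file is ordered on the key, one index entry per block suffices, which is exactly what makes the index primary.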
I am studying the physical organization of databases, and right now I am trying to understand the concept of a primary index, or clustering index. The book states that a primary index can be realized by storing the tuples in the index itself (the index stores the tuples). According to the book, in this case (the index is not separated from the file containing the tuples) the primary index is so called because the storing method can be done by storing …
I am taking a class in information retrieval. We learned that the index of a search engine has (possibly among other things):

- A vocabulary, mapping terms to their statistics (frequency, type, ...), and
- A posting list, mapping terms to the documents where they occur (with or without positions, fields, ...).

These are separate data structures. I understand why this information is needed and what for, but I don't understand why we want to keep them separate. Why can't we …
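A tiny sketch of the two structures side by side, to make the separation concrete (the layout is a common textbook arrangement, not any specific engine's):

    from collections import defaultdict

    docs = {1: 'the cat sat', 2: 'the cat ran', 3: 'a dog ran'}

    vocabulary = {}                 # term -> statistics (here: document frequency)
    postings = defaultdict(list)    # term -> sorted list of doc ids

    for doc_id, text in sorted(docs.items()):
        for term in set(text.split()):
            vocabulary[term] = vocabulary.get(term, 0) + 1
            postings[term].append(doc_id)

    # Query-time: statistics alone often suffice (e.g. for idf weighting),
    # so the much larger posting lists can live on disk separately.
    print(vocabulary['cat'], postings['cat'])   # 2 [1, 2]

One usual argument for the split is exactly that: the vocabulary is small and kept hot in memory, while the postings are large and streamed from storage.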
I have a pandas dataframe as follows, and I want to convert it to a dictionary format with 2-part keys as shown:

          id                                    name  energy  fibre
    0  11005                          4-Grain Flakes    1404   11.5
    1  35146             4-Grain Flakes, Gluten Free    1569    6.1
    2  32570  4-Grain Flakes, Riihikosken Vehnämylly    1443   11.2

I am expecting the result to be of the form:

    nutritionValues = {
        ('4-Grain Flakes', 'id'): 11005,
        ('4-Grain Flakes', 'energy'): 1404,
        ('4-Grain Flakes', 'fibre'): 11.5,
        ('4-Grain Flakes, Gluten Free', 'id'): 35146,
        ('4-Grain Flakes, Gluten Free', …
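A compact way to produce exactly that mapping, sketched under the assumption that name should become the first element of each key: set name as the index, stack the remaining columns into a (name, column) MultiIndex, and convert to a dict:

    import pandas as pd

    df = pd.DataFrame({'id': [11005, 35146, 32570],
                       'name': ['4-Grain Flakes',
                                '4-Grain Flakes, Gluten Free',
                                '4-Grain Flakes, Riihikosken Vehnämylly'],
                       'energy': [1404, 1569, 1443],
                       'fibre': [11.5, 6.1, 11.2]})

    # stack() turns the columns into a second index level, giving tuple keys.
    nutritionValues = df.set_index('name').stack().to_dict()

Note that stack() upcasts the mixed int/float columns to float in one Series; iterating with itertuples instead would preserve each column's type.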
I am trying to implement an Exponential Moving Average calculation on a DataFrame. The formula is

    EMA[t] = alpha * price[t] + (1 - alpha) * EMA[t-1]

An additional complication is that my table is grouped, and there is a unique bin number per group. This is what I tried:

    import numpy as np
    import numpy.random as rand
    import pandas as pd

    n = 5
    groups = np.array(['one', 'two', 'three'])
    data = pd.DataFrame({
        'price': rand.random(3 * n) * 10,
        'group': np.repeat(groups, n),
        'bin': np.tile(np.arange(n), 3)},
        index=np.arange(3 * n))
    print(data)

          price group  bin
    0  1.601310   one    0
    1  …
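For reference, a sketch of computing the EMA per group with pandas' built-in ewm, which implements the recursion above (alpha=0.5 is an arbitrary choice here; adjust=False gives the plain recursive form):

    data['ema'] = (data.groupby('group')['price']
                       .transform(lambda s: s.ewm(alpha=0.5, adjust=False).mean()))

Since bin is already ordered within each group in this construction, no extra sort is needed; otherwise, sort by bin first.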
I'm performing a binary classification in Keras and attempting to plot the ROC curves. When I try to compute the fpr and tpr metrics, I get the "too many indices for array" error. Here is my code:

    # declare the number of classes
    num_classes = 2
    # predicted labels
    y_pred = model.predict_generator(test_generator, nb_test_samples/batch_size, workers=1)
    # true labels
    Y_test = test_generator.classes
    # print the predicted and true labels
    print(y_pred)
    print(Y_test)
    '''y_pred float32 (624,2) array([[9.99e-01 2.59e-04],
                                     [9.97e-01 2.91e-03],...'''
    '''Y_test int32 (624,) array([0,0,0,...,1,1,1],dtype=int32)'''
    # reshape the predicted labels and convert type
    y_pred …
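For context, a sketch of the usual fix: y_pred has shape (624, 2) while Y_test has shape (624,), and "too many indices" typically means a 1-D array was indexed as if it were 2-D. roc_curve wants one score per sample, so slice out the positive-class column (that this is column 1 is an assumption about the class order):

    from sklearn.metrics import auc, roc_curve

    # One probability per sample: the positive class's column of y_pred.
    fpr, tpr, thresholds = roc_curve(Y_test, y_pred[:, 1])
    roc_auc = auc(fpr, tpr)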
I have a data frame with the following structure:

    df.columns
    Index(['first_post_date', 'followers_count', 'friends_count', 'last_post_date',
           'min_retweet', 'retweet_count', 'screen_name', 'tweet_count',
           'tweet_with_max_retweet', 'tweets', 'uid'],
          dtype='object')

Inside the tweets series, each cell is another data frame containing all the tweets of a user:

    df.tweets[0].columns
    Index(['created_at', 'id', 'retweet_count', 'text'], dtype='object')

I want to convert this data frame to a multi-index frame, essentially by breaking up the cells containing tweets. One index will be the uid, and the other will be the id inside each tweet. How can I do that? …
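A sketch of one way to do the flattening with pd.concat, which builds the outer uid level from the keys of a dict (this assumes every cell in tweets is a DataFrame with an id column, as described):

    import pandas as pd

    flat = pd.concat(
        {uid: tweets.set_index('id')                 # inner level: tweet id
         for uid, tweets in zip(df['uid'], df['tweets'])},
        names=['uid', 'id'])                         # outer level: user id

flat then has a (uid, id) MultiIndex with the per-tweet columns created_at, retweet_count, and text.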
I feel like this is a rudimentary question, but I'm very new to this and just haven't been able to crack it or find the answer. Ultimately, what I'm trying to do here is count unique values in a certain column and then determine which of those unique values have more than one unique value in a matching column. So for this data, what I am trying to determine is "who" has "more than one receipt" for all purchases, …
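A sketch of the usual pattern with groupby and nunique (the column names buyer and receipt_id are placeholders for whatever the real columns are called):

    import pandas as pd

    df = pd.DataFrame({'buyer': ['Ann', 'Ann', 'Bob', 'Cat'],
                       'receipt_id': [1, 2, 3, 3]})

    # Unique receipts per buyer, then keep only buyers with more than one.
    receipts_per_buyer = df.groupby('buyer')['receipt_id'].nunique()
    multi = receipts_per_buyer[receipts_per_buyer > 1]
    print(multi)   # Ann    2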
I'm looking for a spatial index that can efficiently find the most extreme n points in a certain direction, i.e. for a given w, find x[0:n] in the dataset where x0 gives the largest value of w·x, x1 the second-largest value of w·x, etc. Is there a name for this type of query? What would be an efficient data structure to use? x might have around 20 dimensions. Thank you!
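For a baseline to compare any index against, the brute-force version of the query is a single matrix-vector product plus a partial sort (this is just the query's definition, not an efficient index):

    import numpy as np

    def top_n_in_direction(X, w, n):
        # X: (num_points, d) data matrix, w: (d,) direction vector.
        scores = X @ w
        # argpartition isolates the n largest in O(num_points); sort just those.
        idx = np.argpartition(-scores, n)[:n]
        return idx[np.argsort(-scores[idx])]

    X = np.random.randn(10000, 20)
    w = np.random.randn(20)
    best = top_n_in_direction(X, w, 5)   # indices of the 5 most extreme points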