I would like to set up a server which could support a data science team in the following way: be a central point for storing, versioning, sharing and possibly also executing Jupyter notebooks. Some desired properties: Different users can access the server and open and execute notebooks that were stored by them or by other team members. The interesting question here is what the behavior would be if user X executes cells in a notebook authored by user Y. I …
I am new to machine learning and don't quite know what software is needed for paper writing. For example, this graph from the paper: https://aclanthology.org/2020.emnlp-main.676.pdf
Good Day, Microsoft offers their Azure Machine Learning Platform: https://azure.microsoft.com/en-ca/services/machine-learning/ Azure Machine Learning is designed for applied machine learning. Use best-in-class algorithms and a simple drag-and-drop interface—and go from idea to deployment in a matter of clicks. ... Use Azure Machine Learning to deploy your model into production as a web service in minutes—a web service that can be called from any device, anywhere, and that can use any data source. By their demo and their photos online it looks …
My dataframe contains numerical features alongside categorical features that each take two distinct values. I've converted these categorical values to 0 or 1. I will apply correlation-based elimination to the features after calculating correlation coefficients. Depending on the types of the features, the methods are as follows: Numeric - Numeric: Pearson; Numeric - Categoric: Cramer_V; Categoric - Categoric: Correlation Ratio. That is why I am unsure which type the converted categorical features should have: numerical or categorical? Another reason to …
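As a rough illustration of why the choice matters less for two-valued features, the sketch below computes Pearson's r and (uncorrected) Cramér's V, a measure usually applied to pairs of categorical variables, on two 0/1 columns. The function names and the toy columns `a` and `b` are made up for illustration and are not from the question. For two binary columns the absolute Pearson coefficient and Cramér's V coincide.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def cramers_v(x, y):
    """Cramér's V without bias correction.

    Assumes both columns take at least two distinct values.
    """
    n = len(x)
    row_totals, col_totals = Counter(x), Counter(y)
    observed = Counter(zip(x, y))
    chi2 = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            chi2 += (observed[(r, c)] - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))


# Hypothetical 0/1-encoded categorical columns.
a = [0, 0, 1, 1, 1, 0, 1, 0]
b = [0, 1, 1, 1, 0, 0, 1, 0]
print(abs(pearson(a, b)), cramers_v(a, b))
```

Pearson additionally keeps the sign of the association, which Cramér's V discards.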
We run a games platform with millions of users (about 150,000,000 gameplays per month). We want to find tools or set up a data stack to: collect basic metrics for a specific game, such as average gameplay time, 1-day return rate, 7-day return rate, ...; be able to segment these data by any dimension that we pass along (e.g. by country, by network speed, by ...); generate more advanced insights for a specific game, e.g. this is the distribution …
I know 'easy to use' is going to be subjective, so let me qualify the question a little. Is there a library or working scrap of code that I can essentially copy into my project, change the number of neurons in each layer, the number of layers, and the source of the inputs, and then click run? I've written at least 15 of these myself, and not a single one has worked properly. I need to see something working in …
I'm currently planning an eCommerce site. We will likely run WooCommerce and are looking to implement Algolia for our search features. We feel that, for our particular purposes, visual search would be a crucial feature to implement because of our product types. For the purpose of my question, I will use the example of sculptures and ceramics, with various forms both abstract and utilitarian, textures, colors, and so forth. The idea is a customer can upload a photo of their …
I am new to machine learning, so please bear with me. I'll try to keep this short and sweet. We are building a makeup simulation and recommendation system. My part is to recommend makeup that is personalized to the user and also on par with current makeup trends. I will be building a set of rules with the help of a beautician that will say which makeup is suitable for a particular set of features. The outputs will …
I have a problem that I'd like to solve without any programming, and I am looking for software to do this. I have a dataset, for example: (brand-id, brand-name, product-class-name;) 0, Audi, economy business premium; 1, Rolls Royce, luxury; 2, Seat, economy; 3, Tesla, business premium; I'd like to process this dataset automatically, creating an additional table that classifies the parameters in column 3, like: (product-class-id, product-class-name, brand-id;) 0, economy, 0 2; 1, business, 0 3; 2, premium, 0 …
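Although the question asks for a no-code tool, the transformation itself is small enough to sketch in a few lines of Python, which may help when judging whether a tool does the right thing. This is only a sketch under assumptions: the function name `invert_classes` is made up, class names are assumed to be separated by spaces, and class ids are assigned in order of first appearance.

```python
# Example rows copied from the question: (brand-id, brand-name, product-class-name).
rows = [
    (0, "Audi", "economy business premium"),
    (1, "Rolls Royce", "luxury"),
    (2, "Seat", "economy"),
    (3, "Tesla", "business premium"),
]


def invert_classes(rows):
    """Build (product-class-id, product-class-name, [brand-ids]) records.

    Assumption: class ids follow the order in which classes first appear.
    """
    class_to_brands = {}
    for brand_id, _brand_name, class_names in rows:
        for class_name in class_names.split():
            class_to_brands.setdefault(class_name, []).append(brand_id)
    return [
        (class_id, class_name, brand_ids)
        for class_id, (class_name, brand_ids) in enumerate(class_to_brands.items())
    ]


for record in invert_classes(rows):
    print(record)
```

The same inversion is what a spreadsheet or ETL tool would need to perform: split column 3 into one row per class, then group by class.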
In my class I often need to work with color map images. I show an image and try to make inferences or observations about different subjects. Oftentimes I need to actually quantify some aspects, but the result is always very approximate and somewhat vague because the images are provided "as is" and I do not necessarily know their content a priori. Let's imagine I'm working with two images (*). Is it possible to have the computer "learn" the color scale bar …
In all implementations of recommender systems I've seen so far, the train-test split is performed in this manner:
+------+------+--------+
| user | item | rating |
+------+------+--------+
| u1   | i1   | 2.3    |
| u2   | i2   | 5.3    |
| u1   | i4   | 1.0    |
| u3   | i5   | 1.6    |
| ...  | ...  | ...    |
+------+------+--------+
This is transformed into a rating matrix of the form:
+------+-------+-------+-------+-------+-------+-----+
| user | item1 | item2 …
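The step from the (user, item, rating) list to a rating matrix can be sketched as below. This is a minimal plain-Python illustration, not the method of any particular library: the function name `to_rating_matrix` is made up, only the four rows visible in the excerpt are used, and unobserved user-item pairs are represented as `None`.

```python
# The four visible rows of the long-format table from the question.
ratings = [
    ("u1", "i1", 2.3),
    ("u2", "i2", 5.3),
    ("u1", "i4", 1.0),
    ("u3", "i5", 1.6),
]


def to_rating_matrix(ratings):
    """Pivot (user, item, rating) triples into a user -> item -> rating mapping.

    Missing user-item pairs are stored as None (an assumption of this sketch).
    """
    users = sorted({user for user, _item, _rating in ratings})
    items = sorted({item for _user, item, _rating in ratings})
    matrix = {user: {item: None for item in items} for user in users}
    for user, item, rating in ratings:
        matrix[user][item] = rating
    return users, items, matrix


users, items, matrix = to_rating_matrix(ratings)
for user in users:
    print(user, [matrix[user][item] for item in items])
```

With pandas, a pivot over the same three columns would produce an equivalent matrix with NaN in place of `None`.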
I want to perform "reliable rule learning", i.e. mining a set of rules with a very low number of false negatives. I recently read the paper "Reliable agnostic learning" by Kalai et al. (https://doi.org/10.1016/j.jcss.2011.12.026), and they basically describe what I want: rules are determined to reliably classify data points, and the reliability is achieved in part by allowing "I don't know" as an additional answer. Sadly, their paper is purely theoretical and I could not find a corresponding implementation. Is there …
I'm looking for a Python library that can compute the confusion matrix for multi-label classification. (FYI: scikit-learn's confusion_matrix doesn't support multi-label.) What is the difference between a Multiclass and a Multilabel Problem?
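One common way to handle the multi-label case is to compute a separate 2x2 confusion matrix per label. scikit-learn 0.21 and later provide `sklearn.metrics.multilabel_confusion_matrix` for this; the plain-Python sketch below shows the same idea without dependencies. The function name `multilabel_confusion` and the toy arrays are made up for illustration, and inputs are assumed to be 0/1 indicator rows of equal length.

```python
def multilabel_confusion(y_true, y_pred):
    """Return one 2x2 matrix per label, laid out as [[TN, FP], [FN, TP]].

    y_true and y_pred are lists of rows; each row holds one 0/1 flag per label.
    """
    n_labels = len(y_true[0])
    matrices = [[[0, 0], [0, 0]] for _ in range(n_labels)]
    for true_row, pred_row in zip(y_true, y_pred):
        for label, (t, p) in enumerate(zip(true_row, pred_row)):
            # Row index is the true flag, column index is the predicted flag.
            matrices[label][t][p] += 1
    return matrices


# Hypothetical example: 3 samples, 3 labels.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1], [0, 1, 0]]
for label, matrix in enumerate(multilabel_confusion(y_true, y_pred)):
    print(label, matrix)
```

In a multiclass problem each sample has exactly one class, so a single KxK matrix suffices; in a multilabel problem each sample can carry several labels at once, which is why the per-label layout is used here.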
What are the equivalent or more advanced Python libraries for building recommendation systems, comparable to Mahout for collaborative filtering and content-based filtering? Also, is there a way to integrate Mahout with Python?
I hope this question is okay for the forum. I want to ask about your experience with Python editors. Currently, I use VS Code to work with Python. However, in RStudio I really appreciate that it holds data frames in memory and makes it easy to view and inspect data frames and other objects. I'm "closer to the data" in RStudio. Line-by-line or blockwise execution of code is also really helpful. So my question: Is there anything like RStudio for Python (preferably …
I want to create a chatbot which informs the user about traffic on the streets, though not in real time for the moment. I have created a small MySQL database which stores some traffic data, and I fetch the data with a PHP script whenever appropriate, depending on the user's interaction with the chatbot. I wonder how to deal with the case when the user asks variations of the same question which therefore can be …