Collaborative predictive modeling

How do you share modeling work among several programmers? Our team has split apart the work of writing the SQL code to create our dataset. However, we will soon need to build a machine learning model. My machine learning approach is a linear/iterative process. Read data in, then split data, then normalize, then try model A, model B, score the model, and go back to tweak various hyperparameters. Experiment. Outside of pair programming, how can you distribute/share this work?
Category: Data Science
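
One possible way to divide the workflow described above is to agree on small stage interfaces, so that splitting, normalization, and each candidate model can be written by different people and plugged into one shared scoring loop. The following is only a minimal sketch in plain Python under stated assumptions: every function name, the toy data, and the two stand-in models are hypothetical and not taken from the question.

```python
import random
import statistics

def split(rows, test_fraction=0.25, seed=0):
    """Shuffle deterministically and split into (train, test)."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]

def fit_normalizer(train):
    """Learn mean and standard deviation of x from the training rows only."""
    xs = [x for x, _ in train]
    mean = statistics.mean(xs)
    std = statistics.pstdev(xs) or 1.0  # avoid division by zero
    return lambda rows: [((x - mean) / std, y) for x, y in rows]

def fit_mean_model(train):
    """Model A (hypothetical baseline): always predict the training mean of y."""
    mean_y = statistics.mean([y for _, y in train])
    return lambda x: mean_y

def fit_linear_model(train):
    """Model B (hypothetical): least-squares line y = a * x + b."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in train) / sxx if sxx else 0.0
    b = my - a * mx
    return lambda x: a * x + b

def mse(model, rows):
    """Mean squared error of a fitted model on held-out rows."""
    return statistics.mean([(model(x) - y) ** 2 for x, y in rows])

def compare(rows, candidates):
    """Shared loop: split, normalize, fit every candidate, score on test data."""
    train, test = split(rows)
    normalize = fit_normalizer(train)
    train, test = normalize(train), normalize(test)
    return {name: mse(fit(train), test) for name, fit in candidates.items()}

# Toy data (assumption): y is an exact linear function of x.
data = [(float(x), 2.0 * x + 1.0) for x in range(20)]
scores = compare(data, {"mean": fit_mean_model, "linear": fit_linear_model})
```

With this layout, each teammate can own one `fit_*` function and add it to the `candidates` dictionary without touching the rest of the loop.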

How do you do data management?

I have a few million records (small CSV/JSON files) from different sources, with about 50k added every day, all on my local host. Until now, I have been using a simple file structure to manage them, but it's getting cumbersome. Ideally, I'd like to query files by their metadata (source, type, etc.) and pipe the results into my ML pipeline (TFX). I'd like to keep them local if possible. Does anyone have a good solution that you think will …
Category: Data Science
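
As one local-only option for querying files by metadata, a small SQLite index kept next to the files may be enough. The sketch below uses Python's built-in `sqlite3` module; the table layout, the column names, the helper functions, and the example paths are all assumptions made for illustration, and the returned paths would still have to be handed to whatever ingests the files.

```python
import sqlite3

ALLOWED_COLUMNS = {"source", "type", "added"}

def build_index(conn, records):
    """Create the index table if needed and upsert (path, source, type, added) rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        "path TEXT PRIMARY KEY, source TEXT, type TEXT, added TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?)", records)
    conn.commit()

def find_paths(conn, **filters):
    """Return sorted file paths whose metadata matches every keyword filter."""
    unknown = set(filters) - ALLOWED_COLUMNS
    if unknown:
        raise ValueError(f"unknown metadata columns: {sorted(unknown)}")
    # Column names come from the allow-list above; values are bound parameters.
    where = " AND ".join(f"{column} = ?" for column in filters) or "1"
    query = f"SELECT path FROM files WHERE {where} ORDER BY path"
    return [row[0] for row in conn.execute(query, tuple(filters.values()))]

# In-memory database for the example; pass a file name instead to persist the index.
conn = sqlite3.connect(":memory:")
build_index(conn, [
    ("data/vendor_a/001.csv", "vendor_a", "csv", "2020-01-01"),
    ("data/vendor_a/002.json", "vendor_a", "json", "2020-01-02"),
    ("data/vendor_b/001.csv", "vendor_b", "csv", "2020-01-02"),
])
csv_from_a = find_paths(conn, source="vendor_a", type="csv")
```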

Working with others: Tidy data vs Pretty data

When working with someone whose background and skill level in data work may not be strong, how do you best make the argument for tidy data over "pretty" data? There are notes of what I want to ask/discuss in this StackExchange post, but I want to ask more about the collaborative aspect. Part of my job is to collect data from many different organizations which are all members of a common group. The spreadsheets they send me are untidy in …
Category: Data Science
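
A small before-and-after example can sometimes help make the case for tidy data. The sketch below reshapes a "pretty" wide table (one column per year) into tidy rows (one observation per row). The spreadsheet layout, the column names, and the `to_tidy` helper are assumptions for illustration only, since the question does not show the actual files.

```python
def to_tidy(wide_rows, id_column, variable_name, value_name):
    """Turn each non-id column of a wide row into its own tidy row."""
    tidy = []
    for row in wide_rows:
        for column, value in row.items():
            if column == id_column:
                continue
            tidy.append({
                id_column: row[id_column],
                variable_name: column,
                value_name: value,
            })
    return tidy

# Hypothetical "pretty" layout: one row per organization, one column per year.
pretty = [
    {"organization": "Org A", "2019": 12, "2020": 15},
    {"organization": "Org B", "2019": 7, "2020": 9},
]
tidy = to_tidy(pretty, "organization", "year", "count")

# With tidy rows, a per-year total is a single pass over one column.
totals = {}
for row in tidy:
    totals[row["year"]] = totals.get(row["year"], 0) + row["count"]
```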

Adding sections in the 'Pages' post type

I'm wondering if there is a way to add titles or nested groups to the pages post type. One of our websites has an enormous number of pages, so we want to order the pages in (preferably) groups/categories without changing the permalink structure. Now: - Page 1 - Page 2 - Page 3 to: Normal pages: - Page 1 Landing pages: - Page 2 - Page 3 I can't find a plugin or a way to develop this into the …
Category: Web

Tools for project management Data Science

I'm in charge of a small data science team (3 data scientists, me included). We do our projects with at least one business person (PM) per project (we have 5 of these). We have managed everything with meetings and emails, but as the number of projects and people keeps increasing, I find it necessary to have a proper management tool. I would like to have something where we could, per project, add business needs (requirements). These requirements could translate into …
Category: Data Science

Multiple models in the same notebook

When working on data sets, we sometimes want to keep track of multiple models with different architectures that work on the same data set, to which we have applied some transformations and preprocessing. So I would like to know: what is an elegant way to work on multiple models that use the same data set? Having multiple models in the same notebook is cumbersome, and recreating the same data preprocessing and transformations on separate …
Category: Data Science
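
One way to avoid repeating the preprocessing for every model is to put it in a single cached function and keep the models in a registry that all read from it. The sketch below is only an illustration under assumptions: the toy data, the `register` decorator, and the two trivial "models" (simple summaries standing in for real training code) are hypothetical.

```python
from functools import lru_cache

RAW = (3.0, 1.0, 4.0, 1.0, 5.0, 9.0)  # hypothetical toy data set

@lru_cache(maxsize=None)
def preprocessed():
    """Run the shared transformation once; later calls reuse the cached tuple."""
    low, high = min(RAW), max(RAW)
    return tuple((x - low) / (high - low) for x in RAW)

MODELS = {}

def register(name):
    """Decorator that records a model function under a name."""
    def decorator(fn):
        MODELS[name] = fn
        return fn
    return decorator

@register("mean")
def mean_model(data):
    # Stand-in for one architecture: summarizes the data by its mean.
    return sum(data) / len(data)

@register("median")
def median_model(data):
    # Stand-in for another architecture: summarizes the data by its median.
    ordered = sorted(data)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

# Every registered model consumes the same cached, preprocessed data.
results = {name: model(preprocessed()) for name, model in MODELS.items()}
```

Each model could then live in its own notebook or module and import `preprocessed` from a shared file, so the transformations are defined in exactly one place.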

Do you think it's normal for data science projects to have some amount of "what should we do" time? Or is it mitigated by experience?

Do you think it's normal for data science projects to have some amount of "what should we do" time? Or is it mitigated by experience? By "what should we do" time I mean time spent reading about and experimenting with "possible ways to do things, when many alternatives exist". This kind of time has bothered me, because someone could think it's some kind of inefficiency or failure to concentrate. On the other hand I've rationalized that it …
Topic: management
Category: Data Science

What are the strategies to counter the 80/20 dilemma in Data Science projects?

Most of the time in Data Science projects is not spent on performing actual analytics but rather on other tasks, such as organizing data sources, collecting samples and preparing datasets, compiling and validating business rules in data, etc. This fact has been studied as the 80/20 dilemma in Data Science projects. In order to tackle this dilemma, I would like to ask which strategies are used to decrease the 80% of time spent in the other stages (organizing data sources, …
Category: Data Science

Best practices for scaling data science / engineering teams

I am trying to find best practices for scaling data science teams, i.e., an efficient workflow/methodology to divide work between Software Engineers and Researchers working on the same product. I’ll explain: both the SEs and the Researchers need the output produced by the others, but they don’t necessarily have the same constraints. - What’s important for an SE is: code maintainability, testing, CI/CD, refactoring the codebase for improved development velocity, as few branches as possible in the repository - What’s important …
Category: Data Science

Scan multiple websites for malware that are in same webhost root?

I have a bunch of WordPress sites that I host with the same hosting company. I manage them with the same account so they sit in the same root directory. I noticed that one of my sites was infected with malware. Is there a quicker way for me to check all my sites other than installing an anti-virus plugin in each of my websites and scanning that website? It's hosted with a webhosting company so I cannot install software on …
Category: Web

How to create a specific role to manage users

I have created a non-admin user role to manage users. I have given this role the following capabilities: Create User, Delete User, Edit User, List Users, List Roles. A member with this role CAN create a new user. However, when they list Users from the dashboard, they cannot edit any users. They do not get an edit button. I am using the "members" plugin to manage roles, although I see the same results when I set the capabilities programmatically. I …
Category: Web

Multisite - Looking for ideas to best manage a main site change

I run about 5 sites on a multisite installation, the main one being under my main domain. What I'm currently in the process of is creating a brand new site (with content changes etc) which will be on my main domain. So I set about creating it as a new site with a subdomain. It's now finished and ready to go live. So in essence, I want to move newsite.domain.com to domain.com and move the existing site at domain.com to …
Category: Web

How can we make managing lots of pages in WordPress Admin better?

WordPress obviously comes from a blogging background but can be used to serve sites with a lot of Pages. However, where it falls short for me is not in regard to performance but in the Admin area's handling of lots of Pages, child pages etc. It quickly becomes a chore to move through the list of Pages trying to find what you're looking for, especially without the ability to drill down into page hierarchies etc. What techniques / plugins do …
Category: Web

Advice on How to Back Up WordPress

I have my site hosted with Cloudways and am using ManageWP to manage the site. I am wondering about backup options. Cloudways backs up the server and has an option to restore a particular app on that server (so if one site goes down and the others are fine, I can restore just that site). ManageWP has automatic backups (either once a month free or more often for $1.00/month). I am considering the $1.00/month option. Then there are also plugins …
Category: Web

Order management including recurring orders on WooCommerce

I've been trying to find a good way to handle order management, as I have some recurring payments in my business. I need to set up a system that tells me how many items I should have in stock every week, based on the recurring orders and the average of the daily orders that I receive online and by telephone. I would really appreciate it if someone could help me with this. Also, is it OK to keep everything on WordPress or I …
Category: Web

WordPress Database Cleanup

How hard would it be to rebuild a WordPress database and change the directory structure for a website? I inherited the maintenance of two websites and I am pretty sure the directory structure is set up incorrectly and the database is a total mess (6 databases for two websites?). I am new to WordPress and suspect that at this point it would be easier just to rebuild the website using the existing one. Losing data is not a big deal …
Category: Web

Keep track of trainings, datasets, etc.

After searching on Google for quite some time, I could not find suitable software or a toolbox for managing neural network training runs. I am thinking of a program that combines visualization techniques, without the need to write code, with the ability to compare several training runs and store them easily. Does a program like this exist? Regards Lukas
Category: Data Science

How to build a data analysis pipeline procedure

I have a series of scripts. Some are in R, some in Python, and others in SAS. I have built them in such a way that one script outputs a .csv file that the next script reads, and that script in turn outputs a .csv file, and so on. I want to create a script that will automatically run each script in order so that the final output can be generated automatically. What method would be best for this and can …
Topic: management
Category: Data Science
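
One simple approach to the question above is a small driver script that runs each step as a separate process, in order, and stops at the first failure. The sketch below uses Python's standard `subprocess` module. The `run_pipeline` helper and the step names are hypothetical, and the stand-in commands only print a message; real steps would be commands such as `["Rscript", "clean.R"]` or whatever batch command the local SAS installation provides.

```python
import subprocess
import sys

def run_pipeline(steps):
    """Run each (name, command) pair in order; raise on the first failing step."""
    completed = []
    for name, command in steps:
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            raise RuntimeError(
                f"step {name!r} failed with exit code {result.returncode}: "
                f"{result.stderr.strip()}"
            )
        completed.append(name)
    return completed

# Stand-in steps (assumption): each one just runs a tiny Python command.
steps = [
    ("extract", [sys.executable, "-c", "print('extract done')"]),
    ("transform", [sys.executable, "-c", "print('transform done')"]),
]
finished = run_pipeline(steps)
```

Because each step only communicates through the .csv file it writes, the driver needs nothing more than the exit code to decide whether the next step may start.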

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.