What is the difference between Pachyderm and Git?

I learned that tools like Pachyderm version-control data, but I cannot see any difference between that tool and Git. I learned from this post that it: holds all your data in a central, accessible location; updates all dependent data sets when data is added to or changed in a data set; can run any transformation, as long as it runs in a Docker container, accepts a file as input, and outputs a file as its result; versions all …
Category: Data Science

Organizing datasets, dataset version control, MLOps and other questions

I am currently looking into structuring data and workflows for my end-to-end ML pipeline. I have multiple problems, and ideally I am looking for one platform that can do all of the following: visualize and organize multiple datasets, ideally with something like the Kaggle dataset web interface; do dataset exploration to quickly visualize errors in data, biases in annotations, etc.; annotate images and potentially point clouds; provide commenting functionality for all features; keep track of who annotated what on what date; dataset …
Category: Data Science

Updating WordPress core with zero downtime - I mean zero

I have a critical website which is under version control. When I update the core WordPress files I do so locally, then commit the changes. When I do that, the live site (obviously) does not even display a maintenance notice; it simply errors out for a few minutes as the core files are deployed, and presumably are either read partway through or are temporarily incompatible with one another. I could update from the live admin, and then deploy, …
Category: Web

Automatic updates and merging manual changes

I have WordPress running on my dedicated server, and I thought it would be a good idea to use Git so that I can track changes that I make over time. I have updated a few things, mostly just little changes to CSS or commenting out some HTML from PHP pages. Q1: Will WordPress notice these changes and merge them into new versions of the files it installs? A few days ago I was able to update translations and a …
Category: Web

Version control for code and output models

I have a question about version control for both code and the models it generates. We are developing ML models that often involve hyperparameters, so we might do many runs with different hyperparameter settings. We currently store the output models in cloud buckets, but we keep the code in GitHub or Bitbucket. This seems to invite things to get out of sync. I guess we could store both code and output models in the same place, but for code …
Category: Data Science

Suggested practices for documenting model and dataset versions

I want to steer my question towards the practical side of ML. As a practitioner, I find keeping different versions of models and datasets difficult. From time to time I need to revisit my data and model code to verify whether certain assumptions are ensured/implemented, which becomes difficult when the number of runs/experiments increases exponentially. Thus, I would like to hear some advice from senior practitioners about how you version your things (data/model code). I know you must tell …
Category: Data Science

Contact Form 7 - Replace database configured form template with a static file

TL;DR - Is there a way CF7 can pull its form markup and tags from a static file, instead of the database, so I can maintain that file in version control? We have a very complex form running under Contact Form 7. It has branching, logic and more, all handled by custom code using hooks and filters. However, we still have to copy and paste the thousands of lines of code that describe this form into the "form editor" in …
Category: Web

Is there a plugin for versioning files in the theme (style, .js and .php files)?

I checked the available WordPress plugins for versioning, but none seem to do what I need. I would like a way to perform some simple version control on the style sheet or .php files in the theme. Ideally it would be a simple way to make the history of each file available, so that if a file is changed I can easily return to a previous version if needed. WordPress does this by default for pages and posts, but I can't find …
Category: Web

Checking for a new version from WP Repos

I would like to know whether a new version of a specific plugin is available in the official WP repository. How can I check for this in JSON/XML format, or maybe some other format? Something like this: https://api.wordpress.org/core/version-check/1.7/ but for each single plugin in the WP repository.
Category: Web
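
One possible way to approach the plugin version check above is to query a per-plugin information endpoint and read its version field. The sketch below is only an illustration: it assumes the endpoint pattern `https://api.wordpress.org/plugins/info/1.0/<slug>.json` and a `version` key in the response, both of which should be verified against the current api.wordpress.org documentation. The helper names and the version comparison are hypothetical and deliberately simplistic.

```python
import json
import urllib.request

# Assumed endpoint pattern; verify against the current api.wordpress.org docs.
PLUGIN_INFO_URL = "https://api.wordpress.org/plugins/info/1.0/{slug}.json"


def plugin_info_url(slug):
    """Build the (assumed) plugin-information URL for a plugin slug."""
    return PLUGIN_INFO_URL.format(slug=slug)


def parse_latest_version(payload):
    """Extract the 'version' field from a JSON response body, or None if absent."""
    data = json.loads(payload)
    if isinstance(data, dict):
        return data.get("version")
    return None


def fetch_latest_version(slug, timeout=10):
    """Download the plugin info and return its latest version string."""
    with urllib.request.urlopen(plugin_info_url(slug), timeout=timeout) as resp:
        return parse_latest_version(resp.read().decode("utf-8"))


def is_update_available(installed, latest):
    """Naive comparison of dotted numeric versions; ignores non-numeric parts."""
    def as_tuple(version):
        return tuple(int(part) for part in version.split(".") if part.isdigit())
    return as_tuple(latest) > as_tuple(installed)
```

Calling `fetch_latest_version("akismet")` (the slug is only an example) would perform the network request; the parsing and comparison helpers can be exercised offline with a sample JSON string.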

How to version data science projects with large files

I am working on a project with large data files (~300 MB). I want to version my work along with the data files so that it is always available online. I tried using git-lfs, but it has a 1 GB/month bandwidth limit, beyond which you're blocked for a month. What versioning tools do data scientists use for projects with data files larger than 100 MB (both static and generated)?
Category: Data Science

At the end of a big DS project, should I make trained models available on GitHub?

I have almost completed two big personal Data Science projects based on Deep Learning. They are the fanciest models I've implemented up to now, and I'm pushing all my code to GitHub. Do you advise uploading the trained models too? Or should I let other users run the code and train their own? What do you do? What are the pros and cons?
Category: Data Science

Embedding git commit into the resulting data

Our pipeline works something like this: (1) collect a bunch of raw data (10-100 GB) from a microscope; (2) process the data using MATLAB scripts; (3) change a few parameters based on the raw data, and add new features to the scripts; (4) commit the scripts with the new features to a Git repo; (5) re-run the processing and save the results as figures and CSV data (1-10 MB). Are there tools available for seamlessly linking MATLAB figures or tabulated data with the code version (git commit hash) that was used to produce …
Category: Data Science
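
The linking step described in the pipeline above can be sketched without any dedicated tool: read the current commit hash and write it into each output file. The example below is a minimal Python illustration, not the asker's MATLAB setup; the function names and the `# git_commit:` comment-line convention are assumptions made for this sketch.

```python
import csv
import subprocess


def current_commit(repo_dir="."):
    """Return the current commit hash, or 'unknown' if git is unavailable."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir, capture_output=True, text=True, check=True,
        )
        return out.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return "unknown"


def write_stamped_csv(path, header, rows, commit):
    """Write rows to a CSV file with a leading comment line recording the commit."""
    with open(path, "w", newline="") as f:
        f.write(f"# git_commit: {commit}\n")
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


def read_commit_stamp(path):
    """Read back the commit hash from the first line, or None if unstamped."""
    with open(path) as f:
        first = f.readline().strip()
    prefix = "# git_commit: "
    return first[len(prefix):] if first.startswith(prefix) else None
```

A processing script would call `write_stamped_csv("results.csv", header, rows, current_commit())` after committing its changes. Tools that read the CSV need to skip the first comment line. The same idea carries over to MATLAB by calling git from the script and writing the hash into the output file or the figure's metadata.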

Using Subversion to deploy WordPress

I use Subversion with my websites. Up until now, this has meant creating a new repo for each of my sites. However, this is wholly inefficient, as it means lugging around the whole WordPress source for each of the sites. It has also meant that I have to copy plugins between repos and thus duplicate the code each time. So what I wanted to do was have a repo which only really contained my theme file (and possibly other …
Category: Web

How to deal with version control of large amounts of (binary) data

I am a PhD student in Geophysics and work with large amounts of image data (hundreds of GB, tens of thousands of files). I know svn and git fairly well and have come to value a project history, combined with the ability to easily work together and have protection against disk corruption. I also find git extremely helpful for having consistent backups, but I know that git cannot handle large amounts of binary data efficiently. In my master's studies I worked …
Category: Data Science

Distributing WordPress page creation work and then combining it - Localhost XAMPP

Is there a best practice for distributing page creation work among 4 people and then exporting and combining all their work? I was thinking of the following steps: (1) create a XAMPP setup on each system; (2) install a Duplicator backup; (3) let all 4 create their set of pages; (4) export the pages using the WordPress Export tool; (5) import all pages into the main localhost system's site. Will there be any issue while importing (since all are on localhost)? Will there be any conflict in page IDs (since all start from …
Category: Web

WordPress Health Tool reporting version control as a critical issue

I use git to deploy my production site, and as a result, the site's Health tool is reporting a critical issue because auto-updates do not work when a site is under version control. To be honest, it wouldn't be an issue to allow auto-updates and just periodically commit the production changes. Is there a way of forcing them, or of otherwise solving or suppressing this 'critical issue'?
Category: Web

How can I keep the content of my pages version controlled?

We have a WordPress-based website that provides documentation for our REST API. Since our API is constantly changing, so is the documentation. However, we would like to keep the documentation version controlled so it can be matched against API commits. Is there a way to have WordPress pages get their content from a remote repository (GitHub, for example)? Or is there a way to push content to WordPress from some repository?
Category: Web

Getting Started with Subversion, Git, or similar Version Control System to keep a History of my Files?

I realize this may be a broad question on the surface, but I'm looking for specific examples of setups/workflows that people use to keep a version history of edited files on a WordPress site. For instance, when developing a site (and even after it's live), I often make changes to CSS and PHP files, but I don't have a great way of reverting to older versions of those files. For my purposes, making changes on a local development installation and …
Category: Web

How to use one Git (GitHub) repository for version control for multiple themes

I have built and maintain many themes for various clients. I'd like to be able to put them all on GitHub for lovely version control. However, GitHub gets a bit expensive when you have over 20 private repositories. I'm going to have about 30+. The themes are not used together; each one is for a separate client. I know I can host my own git install, but then I lose the great diff tools and social aspects of GitHub. It's …
Category: Web

About

Geeks Mental is a community that publishes articles and tutorials about Web, Android, Data Science, new techniques and Linux security.