Can depth be used as a feature when predicting rock type from well log data?

I am trying to predict the lithofacies, i.e. the rock type, from well log data, a project very similar to the one described in this tutorial.

A well log can be seen as a 1D curve tracking how a given property (e.g. gamma radiation, electrical resistivity, etc.) varies as a function of depth. The idea is to use these 1D arrays as input features to train a machine learning model (e.g. an SVM or a Random Forest) that infers the facies at a given depth (a minimal code sketch of this setup follows the list below). For instance, in the image below:

  • the first 5 tracks (GR to PE) are the well logs used as features
  • while the last 2 tracks (Facies and Prediction) correspond to the true and predicted facies.
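For concreteness, here is a minimal sketch of that setup, assuming the logs have already been loaded into a pandas DataFrame with one row per depth sample; the file name and the column names (GR, ILD, DeltaPHI, PHIND, PE, Facies) are hypothetical and should be adapted to the actual data:

```python
# Minimal sketch: predict facies from well log curves with a Random Forest.
# File name and column names below are hypothetical; adapt them to your data.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

logs = pd.read_csv("well_logs.csv")                       # one row per depth sample
feature_cols = ["GR", "ILD", "DeltaPHI", "PHIND", "PE"]   # the 5 log tracks
X = logs[feature_cols].values
y = logs["Facies"].values                                  # true facies labels

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```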

One of my colleagues started using depth as a feature and obtained much higher scores than when working with the well logs only.

While this may make sense from a geological standpoint, since certain rock types are expected within a given depth range, I think that this will cause the model to overfit. [EDIT from June 1, 2022] More precisely, I am concerned that doing so would put too much constraint on the model.
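One way to probe this concern, assuming the dataset contains several wells (the "Well Name" and "Depth" columns below are hypothetical), is to compare cross-validation scores with and without depth while holding out whole wells per fold, so the improvement is only credited if it survives on wells the model has never seen:

```python
# Hedged sketch: compare scores with and without depth as a feature,
# using group-wise cross-validation that holds out entire wells.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, GroupKFold

logs = pd.read_csv("well_logs.csv")
base_cols = ["GR", "ILD", "DeltaPHI", "PHIND", "PE"]
y = logs["Facies"].values
groups = logs["Well Name"].values            # hypothetical well identifier

cv = GroupKFold(n_splits=5)                  # each fold leaves out whole wells
for label, cols in [("logs only", base_cols),
                    ("logs + depth", base_cols + ["Depth"])]:
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    scores = cross_val_score(clf, logs[cols].values, y,
                             groups=groups, cv=cv)
    print(label, "-> mean accuracy:", scores.mean())
```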

Is this explanation correct, or can depth (or position) be used as a feature to train an ML model?

Topic: training svm

Category: Data Science


I don't see any problem with using depth. Rather than "putting too much constraint" on the model, I would say it provides "extra information" or "predictive power", just like the well logs do. That is what a feature is for.

Think of it another way: if depth could harm the model (say, by causing overfitting), one could argue that any of the 5 well log tracks could do the same.

As a separate point, "putting extra constraint on a model" usually reduces overfitting, and it is often done deliberately; this technique is called regularization.
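As a small illustration (a sketch assuming scikit-learn, with arbitrary parameter values): shrinking an SVM's C parameter or limiting a random forest's tree depth both constrain the model, and tightening such constraints is a standard way to fight overfitting rather than cause it:

```python
# Sketch: constraining a model (regularization) to reduce overfitting.
# A smaller C for an SVM, or a shallower max_depth for a random forest,
# both restrict model capacity; the values here are arbitrary examples.
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

svm_regularized = SVC(C=0.1, kernel="rbf")            # small C = stronger regularization
rf_constrained = RandomForestClassifier(max_depth=5,  # shallow trees are more constrained
                                        n_estimators=200)
```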
