When to use Java for deep learning as opposed to Python

I have been asked to explore options for building deep learning based applications in Java, so I happened to browse a website called dl4j (https://deeplearning4j.org), which has implementations of different neural networks ranging from MLP to RNN/LSTM.

But I couldn't understand the rationale for using dl4j over a Python-based implementation. Could someone please clarify how the two compare on the following items?

  1. ETL
  2. Data pre-processing
  3. Making use of pre-trained models / transfer learning
  4. Distributed computing
  5. Processing large volumes of data (images, time series data, sensor data, etc.)
  6. Production implementation (batch or real time / online)
  7. Post production training (batch or real time / online)
  8. Mobile app development
  9. IoT edge device computing

Topic tensorflow java deep-learning python machine-learning

Category Data Science


When it comes to picking a language for a certain application, many factors may play a role:

  • Who will be building/maintaining the application and are they familiar with that language?
  • What other systems will the application need to communicate with and is one language easier to do that with than the other?
  • Is this project experimental or is it supposed to be a cornerstone for the next few years?
  • Are there existing implementations/frameworks in that language for what needs to be done? Do they fit the need or do they need to be altered?
  • Etc.

If we look at your list, here is my take:

  • ETL

ETL is largely language-agnostic; both Java and Python have good bindings to the tools needed for ETL.

  • Data pre-processing

Similar to ETL, although simple things may be easier to do in Python.

  • Making use of pre-trained models / transfer learning

Python is much more convenient, as most existing frameworks and pre-trained models are compatible with Python and are less likely to be compatible with Java.

  • Distributed computing

The Hadoop stack is very Java-friendly. It is worth noting that Python is becoming increasingly easy to use with Hadoop products (but Java is still running underneath).

  • Processing large volumes of data (images, time series data, sensor data, etc.)

Python has a reputation for being slow, but most of the Python libraries you'll use ultimately run C/C++ code anyway, so if you know what you're doing, speed shouldn't be a huge concern.
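To illustrate the point about C code running underneath, here is a minimal sketch using only the standard library. It assumes CPython, where the built-in `sum` is implemented in C; `loop_sum` and `readings` are hypothetical names made up for this example. Numerical libraries apply the same idea on a larger scale.

```python
import timeit


def loop_sum(values):
    """Sum with an explicit Python-level loop (interpreted step by step)."""
    total = 0
    for v in values:
        total += v
    return total


# Stand-in for a batch of sensor readings (made-up data).
readings = list(range(200_000))

# Both approaches compute the same result...
assert loop_sum(readings) == sum(readings)

# ...but on CPython the built-in iterates in C rather than in Python bytecode.
loop_time = timeit.timeit(lambda: loop_sum(readings), number=5)
builtin_time = timeit.timeit(lambda: sum(readings), number=5)
print(f"python loop: {loop_time:.4f}s  built-in sum: {builtin_time:.4f}s")
```

The exact timings depend on your machine and interpreter, so it is worth running the comparison on your own workload before drawing conclusions.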

  • Production implementation (batch or real time / online)

Java is statically typed, which offers a lot of advantages for production. It is becoming steadily easier to get similar guarantees in Python as well (e.g. pydantic, mypy, etc.).

  • Post production training (batch or real time / online)

This is independent of the language; it has more to do with your MLOps setup and the pipelines you have in place.
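As a rough sketch of what typed Python can look like, the example below uses only standard-library type hints and a dataclass. The names `PredictionRequest` and `mean_reading` are hypothetical. A checker such as mypy can verify the annotations statically, and a library such as pydantic would perform this kind of runtime validation for you; here it is written by hand to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PredictionRequest:
    """Hypothetical input record for a model-serving endpoint."""
    sensor_id: str
    readings: List[float]

    def __post_init__(self) -> None:
        # Type hints are not enforced at runtime, so validate explicitly.
        if not isinstance(self.sensor_id, str) or not self.sensor_id:
            raise TypeError("sensor_id must be a non-empty string")
        if not self.readings:
            raise ValueError("readings must not be empty")
        if not all(isinstance(r, (int, float)) for r in self.readings):
            raise TypeError("readings must be numeric")


def mean_reading(request: PredictionRequest) -> float:
    """Average of the readings; the annotations let a static checker flag misuse."""
    return sum(request.readings) / len(request.readings)


request = PredictionRequest(sensor_id="s1", readings=[1.0, 2.0, 3.0])
print(mean_reading(request))
```

Bad input such as `PredictionRequest("s1", [])` fails at construction time instead of deep inside the model code, which is a large part of what static typing gives you in Java.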

  • Mobile app development

Java definitely has the edge for on-device code, although in a world of microservices Python can still prove very helpful, for example by serving the model behind an API that the app calls.

  • IoT edge device computing

When it comes to device computing, Java probably has the edge over Python, although C/C++ are likely your best allies.
