How to compute the median of a Date-type column in Spark (Java)

I have extracted a column from a dataset that contains Date type of values:

+-------------------+
|  Created_datetime |
+-------------------+
|2019-10-12 17:09:18|
|2019-12-03 07:02:07|
|2020-01-16 23:10:08|
+-------------------+

The type of the column is StringType in Spark. I want to compute the median of these dates; in the example above it would be 2019-12-03 07:02:07, since it is the middle date of the three. How can I achieve that in Spark with Java? I tried

dataset.select(org.apache.spark.sql.functions.avg(dataset.col("Created_datetime").cast("timestamp"))).first().getDouble(0)

But, as the call to getDouble suggests, this returns a double value rather than a date. Thanks for the help.
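One possible approach (a sketch, not a definitive answer, assuming Spark 2.1+ where the `percentile_approx` SQL function is available): cast the string column to a timestamp, convert it to epoch seconds, take the approximate 0.5 percentile, and cast the result back to a timestamp. The dataset below just mirrors the sample rows from the question.

```java
import java.util.Arrays;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

import static org.apache.spark.sql.functions.expr;

public class MedianDate {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("median-date")
                .master("local[*]")
                .getOrCreate();

        // Sample data mirroring the question; the column is StringType.
        Dataset<Row> dataset = spark
                .createDataset(Arrays.asList(
                        "2019-10-12 17:09:18",
                        "2019-12-03 07:02:07",
                        "2020-01-16 23:10:08"),
                        Encoders.STRING())
                .toDF("Created_datetime");

        // 1. Cast the string to a timestamp, then to a long (epoch seconds),
        //    so a numeric percentile can be computed.
        // 2. percentile_approx(..., 0.5) returns an actual value from the
        //    column, so for an odd number of rows it is the exact median.
        // 3. Cast the long result back to a timestamp.
        Row medianRow = dataset.select(expr(
                "cast(percentile_approx("
                + "cast(cast(Created_datetime as timestamp) as long), 0.5)"
                + " as timestamp) as median_datetime"))
                .first();

        System.out.println(medianRow.get(0)); // middle date of the three
        spark.stop();
    }
}
```

Note that `percentile_approx` is an approximation on large datasets (its accuracy parameter can be raised if needed), whereas on a small set like this it returns the exact middle element.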

Topic java apache-spark

Category Data Science
