Can I access data on one Spark cluster from another?

What if I want to sample a Hive table that lives on Spark-cluster-1 while I'm logged in on Spark-cluster-2?

Connecting to jdbc:hive2://spark.cluster.1:10000/default;principal=hive/[email protected];ssl=true

The connection fails with "Error: Could not open client transport with JDBC Uri:" when I run the following from spark.cluster.2:

hive -e "select * FROM database.tablename where rand() <= 0.0001 order by rand() limit 10"
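
A side note on the sampling predicate, separate from the connection error: rand() returns a double in [0, 1), so comparing it for equality with a constant such as 0.0001 matches essentially no rows, while a predicate of the form rand() <= 0.0001 keeps each row with probability of about 0.0001. The small simulation below is only an illustration in plain Python, not Hive itself; the function name, row count, and seed are made up for the example.

```python
import random


def count_matches(n_rows, p, seed=0):
    """Count how many of n_rows random draws satisfy r == p and r <= p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    equal_hits = 0
    le_hits = 0
    for _ in range(n_rows):
        r = rng.random()  # uniform double in [0, 1), like Hive's rand()
        if r == p:
            equal_hits += 1
        if r <= p:
            le_hits += 1
    return equal_hits, le_hits


equal_hits, le_hits = count_matches(1_000_000, 0.0001)
print("rows kept with r == p:", equal_hits)
print("rows kept with r <= p:", le_hits)
```

With one million simulated rows, the second count should land near 100, which is the expected sample size for p = 0.0001.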

What are the limitations of doing this? I should be able to read a table even if I'm not logged in to the cluster where the Hive tables reside.

As it stands, the data cannot move easily between clusters. The current workaround is to copy the tables manually from one cluster to the other.

Is there a better way to do this?
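
One direction that might be worth trying, sketched here under assumptions: point beeline explicitly at the remote HiveServer2 with its -u (JDBC URL) and -e (query) options, instead of relying on whatever the local hive command is configured to use. The helper below only assembles the command line so the URL can be inspected before running it. The function name is hypothetical, and the principal shown is a placeholder, since the real one is not visible above. A "Could not open client transport" error can have several causes (for example no network route to port 10000, no valid Kerberos ticket, or SSL trust settings), so this is not a guaranteed fix.

```python
import shlex


def build_beeline_command(host, port, database, principal, query, ssl=True):
    """Return the argv list for a beeline call against a remote HiveServer2."""
    # Same URL shape as in the connection log: host, port, database,
    # then session parameters separated by semicolons.
    url = f"jdbc:hive2://{host}:{port}/{database};principal={principal}"
    if ssl:
        url += ";ssl=true"
    return ["beeline", "-u", url, "-e", query]


cmd = build_beeline_command(
    host="spark.cluster.1",
    port=10000,
    database="default",
    principal="hive/_HOST@EXAMPLE.REALM",  # placeholder, not the real principal
    query="select * from database.tablename limit 10",
)

# Print a shell-quoted version that can be pasted into a terminal on cluster 2.
print(shlex.join(cmd))
```

Printing the command rather than executing it keeps the sketch safe to run anywhere; on a Kerberized setup a ticket would typically be obtained with kinit before invoking the printed command.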

