Can I access data on one Spark cluster from another?
What if I want to sample a Hive table that lives on Spark-cluster-1 while I'm logged in on Spark-cluster-2?
Connecting to jdbc:hive2://spark.cluster.1:10000/default;principal=hive/[email protected];ssl=true
The connection fails with the error "Error: Could not open client transport with JDBC Uri:" when I run the following from spark.cluster.2:
hive -e "select * FROM database.tablename where rand() <= 0.0001 order by rand() limit 10"
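To make the intended remote call explicit, here is a minimal sketch of how a query could be pointed at the HiveServer2 on spark.cluster.1 through beeline's `-u` and `-e` options. The helper names `hive2_url` and `beeline_command` and the principal `hive/_HOST@EXAMPLE.COM` are hypothetical placeholders, not values from my setup, and the query is simplified. It only builds and prints the command. Actually running it assumes beeline is installed on spark.cluster.2, a valid Kerberos ticket is available, and port 10000 on spark.cluster.1 is reachable.

```python
import shlex


def hive2_url(host, port=10000, database="default", principal=None, ssl=True):
    """Build a HiveServer2 JDBC URL of the same shape as the one in the log line above."""
    parts = [f"jdbc:hive2://{host}:{port}/{database}"]
    if principal:
        parts.append(f"principal={principal}")
    if ssl:
        parts.append("ssl=true")
    # Session settings are separated by semicolons after the database name.
    return ";".join(parts)


def beeline_command(url, query):
    """Return the argument list for a beeline call against a remote HiveServer2."""
    return ["beeline", "-u", url, "-e", query]


# Hypothetical Kerberos principal; replace with the real service principal.
PRINCIPAL = "hive/_HOST@EXAMPLE.COM"

url = hive2_url("spark.cluster.1", principal=PRINCIPAL)
cmd = beeline_command(url, "select * from database.tablename limit 10")

# Print the shell-quoted command instead of executing it.
print(shlex.join(cmd))
```

Passing `cmd` to `subprocess.run` would execute it. Whether the connection then succeeds depends on the network and Kerberos configuration between the two clusters, which is exactly what I'm unsure about.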
What are the limitations on doing this? I should be able to read a table even if I'm not logged in to the cluster where the Hive tables reside.
As it stands, the data can't move freely between clusters. The current workaround is to copy the tables manually from one cluster to the other.
Is there a better way to do this?
Topic hive apache-spark
Category Data Science