Issue with IPython/Jupyter on Spark (Unrecognized alias)
I am setting up a set of VMs to experiment with Spark before I go out and spend money on building a cluster with real hardware. Quick note: I am an academic with a background in applied machine learning and I work quite a bit in data science. I use these tools for computing; rarely do I need to set them up.
I've created 3 VMs (1 master, 2 slaves) and installed Spark successfully. Everything appears to be working as it should. My problem lies in creating a Jupyter server that can be reached from a browser on a machine outside the cluster.
I've installed Jupyter notebook successfully, and it runs. I've also created a new IPython profile that connects to the remote Spark cluster.
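The profile's Spark wiring lives in a startup file; mine is roughly the sketch below, where SPARK_HOME and the py4j version are placeholders for my install:

# ~/.ipython/profile_pyspark/startup/00-pyspark-setup.py (sketch; paths are placeholders)
import os
import sys

spark_home = os.environ.get('SPARK_HOME', '/opt/spark')
sys.path.insert(0, os.path.join(spark_home, 'python'))
# the py4j zip name varies with the Spark release
sys.path.insert(0, os.path.join(spark_home, 'python/lib/py4j-0.8.2.1-src.zip'))

# Spark's shell bootstrap creates the SparkContext as `sc` (Python 2, hence execfile)
execfile(os.path.join(spark_home, 'python/pyspark/shell.py'))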
Now, the problem:
The command
$ ipython --profile=pyspark
runs fine and connects to the Spark cluster.
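By "connects" I mean a trivial job runs against the master from that shell; the master URL below is just illustrative of my setup:

In [1]: sc.master
Out[1]: 'spark://master:7077'
In [2]: sc.parallelize(range(100)).sum()
Out[2]: 4950

However,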
$ ipython notebook --profile=pyspark
prints
[stuff is here] Unrecognized alias: "profile=pyspark", it will probably have no effect.
and falls back to the default profile rather than the pyspark profile.
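A quick check from a notebook cell confirms this (just a diagnostic, not part of my setup):

# Run inside a notebook started with the command above
ip = get_ipython()
print(ip.profile)         # prints 'default', not 'pyspark'
print('sc' in globals())  # False: the pyspark startup file never ran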
My notebook config for pyspark has:
c = get_config()

# Listen on all interfaces so the server is reachable from outside the VM
c.NotebookApp.ip = '*'
# Headless server: don't try to open a browser on the VM
c.NotebookApp.open_browser = False
c.NotebookApp.port = 8880
# Load the ipyparallel notebook extension
c.NotebookApp.server_extensions.append('ipyparallel.nbextension')
# Hashed login password (actual value omitted)
c.NotebookApp.password = u'some password is here'
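For completeness, the profile itself is where I'd expect it, so the config file should be getting picked up (paths below are just from my VM):

$ ipython locate profile pyspark
/home/user/.ipython/profile_pyspark
$ ls /home/user/.ipython/profile_pyspark
ipython_config.py  ipython_notebook_config.py  startup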
Tags: ipython, pyspark, apache-spark, python