Deployment in AzureML for NLP with fastText

I am new to Azure ML. I am working on sentiment analysis of a small tweet dataset using fastText embeddings (the file 'wiki-news-300d-1M.vec', around 2.3 GB, which I downloaded into my working folder). Everything runs fine in the Jupyter notebook, but when I try to deploy the model in Azure ML and start the experiment:

run = exp.start_logging()
run.log("Experiment start time", str(datetime.datetime.now()))

I am getting the error message:

While attempting to take snapshot of .
Your total snapshot size exceeds the limit of 300.0 MB.

The folder containing my Jupyter files is close to 2.5 GB. Is there any way around this problem, or is it possible to write the NLP program without downloading the fastText embeddings? Any suggestions?


It seems the recommended option is to store the trained embeddings in Azure Blob Storage and add them as a Dataset to the Azure ML workspace, see here.
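For that first option, a minimal sketch with the v1 Python SDK (azureml-core) might look like the following; this assumes a workspace config.json is available, and the datastore name, target path, and dataset name are placeholders, not anything from the question:

```python
# Sketch only: assumes azureml-core is installed and a workspace
# config.json is present; names and paths below are placeholders.
from azureml.core import Workspace, Dataset, Datastore

ws = Workspace.from_config()
datastore = Datastore.get(ws, "workspaceblobstore")  # default blob datastore

# One-time upload of the embeddings file to blob storage
datastore.upload_files(
    files=["wiki-news-300d-1M.vec"],
    target_path="embeddings/",
    overwrite=False,
)

# Register it as a FileDataset so runs can mount or download it
# instead of shipping it inside the run snapshot
embeddings = Dataset.File.from_files(
    path=(datastore, "embeddings/wiki-news-300d-1M.vec")
)
embeddings.register(workspace=ws, name="fasttext-wiki-news-300d")
```

In the training script the registered dataset can then be mounted or downloaded (e.g. via `as_mount()` or `as_download()`) rather than read from the snapshot directory, so the 2.3 GB file never counts against the 300 MB snapshot limit.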

Another option is to keep the embeddings file out of the snapshot, see here:

To prevent unnecessary files from being included in the snapshot, make an ignore file (.gitignore or .amlignore) in the directory. Add the files and directories to exclude to this file. For more information on the syntax to use inside this file, see syntax and patterns for .gitignore. The .amlignore file uses the same syntax. If both files exist, the .amlignore file is used and the .gitignore file is unused.
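For example, an .amlignore placed in the notebook folder that excludes the embeddings file (the filename from the question) could look like this:

```
# .amlignore -- paths listed here are excluded from the run snapshot
wiki-news-300d-1M.vec
*.vec
```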
