Why is the GTZAN dataset so widely used without copyright permission?

Question

Ross Gardiner

February 3, 2022, 16:05

I am hoping to use the GTZAN music dataset to evaluate the performance of several noise-cancelling algorithms as part of an undergraduate project. I notice that GTZAN is widely used across the audio classification literature and is even exposed through the TensorFlow and PyTorch APIs.

Unfortunately, I cannot find any information about the copyright status of the data in GTZAN, other than on the Marsyas website itself, where it is stated that no permission to redistribute the data has been given.

http://marsyas.info/downloads/datasets.html

Quote:

the database was collected gradually and very early on in my research so I have no titles (and obviously no copyright permission etc). The files were collected in 2000-2001 from a variety of sources including personal CDs, radio, microphone recordings

Could someone help me understand how using and redistributing these data is legal? Even better, could someone based in the UK or EU tell me whether fair use still applies in my part of the world?

Topic audio-recognition deep-learning dataset machine-learning

Category Data Science
