June 28, 2018 at 10:36AM via https://www.kdnuggets.com
Announcing Microsoft Research Open Data, a cloud hosted platform for sharing datasets
Microsoft announces Microsoft Research Open Data, datasets representing many years of data curation and research efforts by Microsoft that were published as research outcomes.
Last week, Microsoft announced Microsoft Research Open Data: a new data repository in the cloud dedicated to facilitating collaboration across the global research community. Microsoft Research Open Data, in a single, convenient, Azure-hosted location, offers datasets representing many years of data curation and research efforts by Microsoft that were published as research outcomes.
“The goal of Microsoft Research Open Data is to provide a simple platform for Microsoft’s researchers and internal and external collaborators to share datasets and related research technologies and tools,” said Vani Mandava, Data Science Director, Microsoft Research, in a blog post detailing the release.
Microsoft Research Open Data is designed to simplify access to research datasets, facilitate collaboration between researchers using cloud-based resources, and enable reproducibility of research. It has been seeded with over 50 datasets, with more being added incrementally.
Any researcher or data scientist can freely download the datasets or, with a few clicks, copy them to an Azure-based virtual machine or Data Science Virtual Machine, which comes pre-loaded with several open source data science tools for seamless development in R or Python, along with a variety of popular open source deep learning frameworks.
Researchers at Microsoft and in academia are excited about the potential for seamless sharing of research datasets.
“This is a game changer for the big data community. Initiatives like Microsoft Research Open Data reduce barriers to data sharing and encourage reproducibility by leveraging the power of cloud computing,” said Sam Madden, Professor, MIT.
“I am often asked to share my research data, and the public sharing I have done in the past has been popular. Coordinating and cataloging these datasets in one place with Azure will be helpful for both internal and external researchers, giving them easy access, encouraging collaboration, and providing convenient cloud-based access to the wealth of MSR’s shared data,” said John Krumm, Principal Researcher, MSR AI.
See original blog post here.