Case Study: Unlocking Hidden Research Data for the Marine Biological Association
Project at a Glance
The Marine Biological Association (MBA), based in Plymouth, is one of the world’s leading marine science research organisations. As a research facility, it produces vast amounts of scientific data and research files. As data volumes increased over their 140 years of operation, they faced challenges such as duplicate files, legacy storage across multiple systems, and limited visibility into how data was being used.
To reorganise their valuable research data, the MBA reached out to 101 Data Solutions. Working together, the two organisations undertook a comprehensive data discovery and audit project to identify inefficiencies, reduce unnecessary storage costs, and create a solid foundation for future research and AI-driven initiatives.

Research Institution

140 Years Established

293,747 Samples Analysed
Continuous Plankton Recorder (CPR) Survey Data
The Challenge
As a research-led institution, the Marine Biological Association generates and stores massive amounts of data, including research datasets, imagery, documents, and historical records. The way data is recorded has changed tremendously over the 140 years since the MBA was established. As a result, their data exists in many different formats and is stored in multiple ways, which made it increasingly difficult to manage and to maintain visibility over.
As the amount of data grew, the MBA began to experience challenges around duplicate and redundant files, limited insight into who had access to critical data, and increasing storage demands. Legacy data had accumulated over the years without a clear retention policy, so it was uncertain which information remained valuable for research and which could be archived or removed. Without a clear view of their data landscape, the organisation faced growing pressure to manage risk, optimise storage infrastructure, and ensure their data environment was ready to support future research, analytics, and emerging technologies such as AI.
The Solution
The MBA recognised they needed to restructure their data infrastructure and reached out to 101 Data Solutions. To understand the current state of the MBA’s data environment, 101 Data Solutions carried out an in-depth data discovery and audit. The project focused on gaining a clear understanding of how data was stored, accessed, and used across the organisation.
The team examined the structure of MBA’s data and analysed the data environments to identify duplicate files, redundant data, and areas where governance and access controls needed to be improved. The analysis gave vital insights into how the information had grown over time and highlighted opportunities to streamline storage, improve visibility, and reduce unnecessary data.
With full visibility of their data landscape, the MBA was able to take a more strategic approach to managing their data. The organisation was able to understand what data should be retained for research purposes, what could be archived, and where governance and data management practices could be strengthened, laying the foundation for the sustainability of their research infrastructure.
Implementation
The project began with a deep-dive discovery phase in which 101 Data Solutions analysed the organisation’s file systems to understand the scale and structure of its data.
The analysis revealed patterns of duplicate data, unused files, and legacy content that had accumulated over many years. With this insight, the MBA was able to prioritise which data should be retained, archived, or removed.
The results of the audit provided a clear roadmap for improving storage efficiency while maintaining the integrity and accessibility of valuable research data.
Importantly, the process also helped the MBA implement stronger governance practices, ensuring that future data growth could be managed more effectively.
Key Results
- Identification of duplicate and redundant data
- Greater visibility and control over file data
- Improved data accessibility for researchers
- Reduced storage inefficiencies and infrastructure costs
- Stronger data governance and compliance practices
- A clearer foundation for future analytics and AI initiatives
