Applying FAIR Principles to Data Management with Benchling
Modern life science companies are leveraging biotechnology to improve human health, agricultural productivity, and environmental sustainability. To bring ideas to market, these organizations must manage and connect complex data and processes from initial discovery all the way to commercialization. In such a competitive market, speed and efficiency matter more than ever, and comprehensive digitalization of R&D is not merely helpful but crucial to winning the race for the next breakthrough. Legacy systems, including isolated and inflexible software, slow progress by limiting knowledge transfer, siloing data, and wasting precious time on data cleanup and management. By implementing a modern, unified platform that is designed for complex science and conforms to community standards, life science organizations can significantly increase productivity, improve collaboration, and streamline decision-making, enabling them to get their products to market before competitors do. FAIR data principles have become an international guideline for high-quality data stewardship on such a platform, and serve as helpful reference points for developing an effective data management strategy.
The acronym FAIR emphasizes that all digital objects should be “Findable, Accessible, Interoperable and Reusable,” both for machines and for people. FAIR principles were born from an urgent need to improve data management infrastructure, as incomplete, unusable and disorganized data was preventing organizations from extracting the maximum benefit from their investments. A 2018 PwC EU report estimated that the lack of FAIR data costs the European economy €10.2-26 bn annually.
Hence, FAIR data provides tremendous value by reducing R&D costs, promoting collaboration, enhancing data integrity, and enabling powerful technologies, such as artificial intelligence and machine learning, to solve complex problems. Benchling understands the significance of all these factors, and has built its entire platform on a foundation of FAIR data principles.
To support an organization's FAIR journey, Benchling works to ensure data is captured right from the start, resulting in a solid backbone that can be leveraged to drive more efficient research. We’ve assembled a solution brief that illustrates how Benchling supports companies in their adoption of FAIR data principles, and demonstrates the value of achieving this transformation.
If a scientist or data scientist is unable to find data, then it’s useless in answering a question or solving a problem. Therefore, data must be structured in a way that makes it easily discoverable and queryable by both humans and machines. This requires effective use of unique IDs and rich metadata tags.
That’s why Benchling assigns each registered entity (such as a cell line or protein) its own unique ID, with a URL link that serves as the unique and persistent identifier. Every entity can be associated with critical metadata, which not only describes it, but also provides useful context. For example, if the entity is an antibody, then the metadata could include the plasmid prep used, as well as the cell line info, author, date of production, and any analytical data associated with the protein.
Data and metadata are stored and indexed in Benchling’s searchable cloud application, and can be accessed and discovered through the UI, through REST APIs, or through the data warehouse.
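To make the idea concrete, here is a minimal sketch of a registry that mints a unique ID and a stable URL for each entity and lets callers search by metadata. It is not Benchling's implementation or API: the `Entity` and `Registry` classes, the ID format, the metadata keys, and the `example.invalid` base URL are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Entity:
    """A registered entity with a persistent ID, a URL, and descriptive metadata."""
    entity_id: str
    url: str
    kind: str
    metadata: dict = field(default_factory=dict)


class Registry:
    """Toy in-memory registry; every name here is hypothetical."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")
        self._entities = {}
        self._counter = count(1)

    def register(self, kind, **metadata):
        # Mint an ID that is never reused and derive a stable URL from it.
        entity_id = f"{kind.upper()}-{next(self._counter):04d}"
        entity = Entity(entity_id, f"{self.base_url}/{entity_id}", kind, metadata)
        self._entities[entity_id] = entity
        return entity

    def get(self, entity_id):
        return self._entities[entity_id]

    def search(self, **criteria):
        # Return every entity whose metadata matches all given key/value pairs.
        return [
            e for e in self._entities.values()
            if all(e.metadata.get(k) == v for k, v in criteria.items())
        ]


registry = Registry("https://example.invalid/registry")
antibody = registry.register(
    "antibody",
    author="J. Doe",
    cell_line="CHO-K1",
    plasmid_prep="PREP-17",
)
hits = registry.search(cell_line="CHO-K1")
```

Because the ID and URL are assigned once at registration and never recomputed, anything that references the entity later (a notebook entry, an API client, a warehouse query) can keep using the same identifier.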
For data to be accessible, it needs to be easily retrievable by humans and computational systems, without requiring extensive knowledge about its creator or date of creation. Benchling makes data more accessible through its SaaS-based structure, which enables entries to be accessed by authorized personnel using the unique identifier and its URL (a standard web link). Benchling also conforms to the OpenAPI standard, which allows computational systems to discover and understand the capabilities of the service without access to source code or documentation.
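As a rough illustration of what machine-discoverable capabilities means, the sketch below walks an OpenAPI 3 description and lists the operations it declares. The `SPEC` document is a tiny hypothetical example written for this sketch, not Benchling's actual API description; only the general layout (`paths`, then HTTP methods, then `operationId`) comes from the OpenAPI specification.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}
OK = {"200": {"description": "OK"}}

# A tiny, hypothetical OpenAPI 3 description (not Benchling's actual spec).
SPEC = {
    "openapi": "3.0.3",
    "info": {"title": "Example registry API", "version": "1.0.0"},
    "paths": {
        "/entities": {
            "get": {"operationId": "listEntities", "responses": OK},
            "post": {"operationId": "createEntity", "responses": OK},
        },
        "/entities/{entity_id}": {
            "parameters": [
                {"name": "entity_id", "in": "path", "required": True,
                 "schema": {"type": "string"}}
            ],
            "get": {"operationId": "getEntity", "responses": OK},
        },
    },
}


def list_operations(spec):
    """Return (METHOD, path, operationId) for every operation in the description."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            # Path items can also hold non-operation keys such as "parameters".
            if method in HTTP_METHODS:
                operations.append((method.upper(), path, operation.get("operationId")))
    return sorted(operations)


operations = list_operations(SPEC)
```

A client can build this kind of listing from the description alone, which is the sense in which the service can be understood without reading its source code.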
Furthermore, individual Benchling user accounts can be configured using a granular permission structure, which offers flexible user roles defined by the customer's admins, along with the ability to grant specific types of access to external users through the UI and API. The REST APIs communicate with clients over HTTPS, which protects credentials and data in transit.
“Adopting the Benchling platform has vastly increased the efficiency of our research, saving each of our scientists nearly an entire day per week while doubling the speed of collaboration.”
– Brian McNatt, IT Director / Research & Early Development, Sanofi
To achieve interoperability, an informatics system must use broadly applicable vocabulary and ontologies. Benchling’s configurable data model is built with standard vocabulary and ontologies that are agreed upon prior to implementation, and can map to any scientific process. While Benchling encourages the use of industry standards that follow FAIR principles, customers are free to use their own vocabularies, in order to conform to standards applicable to their industry.
In Benchling, meaningful “smart links” are created between data and metadata resources. These links provide qualified references between the (meta)data, enriching researchers’ contextual knowledge about each piece of information. Users can click on any single element and instantly view the whole picture: author, antibody chain info, number of antibody lots, plasmid prep info, and so on.
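One way to picture qualified references is as typed links of the form (subject, relation, object), which can be followed from any record to gather its context. The sketch below is an assumption-laden illustration, not how Benchling stores links: the `LinkGraph` class, the relation names, and the record IDs are all hypothetical.

```python
from collections import defaultdict


class LinkGraph:
    """Stores (subject, relation, object) links between record IDs."""

    def __init__(self):
        self._outgoing = defaultdict(list)

    def link(self, subject, relation, obj):
        # The relation name is what makes the reference "qualified".
        self._outgoing[subject].append((relation, obj))

    def context(self, start):
        """Collect every link reachable from `start` by following outgoing links."""
        seen, stack, found = {start}, [start], []
        while stack:
            current = stack.pop()
            for relation, target in self._outgoing.get(current, []):
                found.append((current, relation, target))
                if target not in seen:
                    seen.add(target)
                    stack.append(target)
        return found


graph = LinkGraph()
graph.link("antibody-1", "has_chain", "chain-H")
graph.link("antibody-1", "has_lot", "lot-7")
graph.link("chain-H", "expressed_from", "plasmid-prep-3")
context = graph.context("antibody-1")
```

Starting from the antibody, the traversal reaches the chain, the lot, and the plasmid prep behind the chain, which mirrors the "click one element, see the whole picture" behavior described above.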
When existing data can be leveraged to answer new questions, it becomes reusable. Benchling’s unified informatics platform makes data points reusable, by making it easy for organizations to link them to the context under which they were originally generated — such as the materials used, protocol used, date of generation, and experimental parameters.
Benchling has adopted the HELM standard to deal with many subtleties of biologics (antibodies, antibody fragments, oligos, etc.). This standard aligns with the frameworks of many scientific communities, allowing information to be easily shared among scientists and organizations across multiple disciplines.
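For readers unfamiliar with HELM, the sketch below parses a simple HELM string into its polymers and its raw connection section. It is deliberately limited and is not a full HELM parser: it assumes `$`-separated sections with polymers first and connections second, and it splits monomers on `.` without handling inline SMILES or other advanced notation. The function name `parse_simple_helm` and the sample peptide are chosen for illustration only.

```python
import re

# Matches one simple polymer such as PEPTIDE1{A.R.C}.
POLYMER = re.compile(r"(\w+)\{([^}]*)\}")


def parse_simple_helm(helm):
    """Split a simple HELM string into polymers and its raw connection section."""
    sections = helm.split("$")
    polymers = {
        polymer_id: monomers.split(".")
        for polymer_id, monomers in POLYMER.findall(sections[0])
    }
    connections = sections[1] if len(sections) > 1 else ""
    return polymers, connections


# A ten-residue peptide with a link between the cysteines at positions 3 and 8.
polymers, connections = parse_simple_helm(
    "PEPTIDE1{A.R.C.A.A.K.T.C.D.A}$PEPTIDE1,PEPTIDE1,8:R3-3:R3$$$"
)
```

Because the monomer sequence and the connections between monomers travel in one string, a molecule with cross-links or non-natural building blocks can be exchanged without losing structural detail that a plain sequence would drop.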
All teams surveyed for this analysis agreed that the adoption of FAIR data with Benchling delivered significant near-term value: 62% productivity gains, 71% improved cross-team collaboration, 72% higher data integrity, and roughly $3.4M in annual R&D cost savings for a 100-scientist organization.
For all these reasons, FAIR data is essential to solving the inefficiencies that have arisen from legacy data management practices, such as data silos, inconsistent terminologies, and lack of sufficient context on the data. Benchling is designed from the ground up to support FAIR data, and has an in-house professional services team who facilitate the smooth transition to FAIR principles throughout every stage of an organization’s journey. Adopting the Benchling platform will enable the creation of a centralized and unified data foundation, which is becoming increasingly necessary for leading-edge life science discoveries. Hundreds of customers report that Benchling’s cumulative value across all dimensions — costs, time, and satisfaction — has helped them reach milestones quickly, and get the right product into the market.