DTU Biosustain

How DTU Biosustain powers protein engineering with AI/ML

The Novo Nordisk Foundation Center for Biosustainability, more commonly known as DTU Biosustain, was initially funded to advance Denmark’s contributions to bioprocessing and bioengineering. The center aims to develop new knowledge and technologies that support the transition from conventional, and often unsustainable, industrial production methods to a bio-based industry.

An interdisciplinary research center, DTU Biosustain supports the production of bio-chemicals using microbial production hosts called cell factories. “Recent progress in our ability to read and write genetic programs, combined with advances in automation, analytics and data science, has opened up for the discovery of new solutions regarding sustainable production in biological systems,” shares Carlos Acevedo-Rocha, Sr. Researcher, head of the Computational Protein Engineering (CPE) group. With the center’s core outputs supporting the mission of offering sustainable products for sustainable lifestyles, research activities fall under three categories: Sustainable Chemicals, Natural Products, and Microbial Foods. 

To achieve the goal of sustainability across these domains, and to support continuous growth and innovation, DTU Biosustain first needed to establish a unified data infrastructure across scientific teams and research groups. Here’s a look at how DTU Biosustain and Benchling worked together to refine data management.

Navigating disconnected data

Prior to adopting Benchling, teams at DTU Biosustain faced several challenges in capturing, managing, and engaging with data. At the root of these challenges was the lack of a centralized data infrastructure. Scientists commonly entered data using different tools, and each individual or team recorded, labeled, and tracked data inconsistently.

As a consequence of this decentralized infrastructure, DTU Biosustain faced difficulties with data management: non-standardized, siloed, and unlinked data led to a lack of traceability. Searching for data posed a real problem, as it was hard to surface a specific protocol, sample, or result. Similarly, access to data did not extend across teams, preventing valuable interdepartmental collaboration. And because aggregation and sample management were manual, data compilation and sample traceability lagged behind.

Data sharing, analysis, and reporting at DTU Biosustain were equally time-consuming because they relied on manual processes. Rather than collaborating seamlessly, teams handed off work via PowerPoint decks, ad-hoc meetings, and emails. Because the data was fragmented and siloed, deriving insights was extremely challenging, as was synthesizing that data into actionable, shareable reports.

Standardizing and centralizing with Benchling

To mitigate these challenges and continue implementing an efficient data strategy, DTU Biosustain partnered with Benchling. Benchling is an intuitive, cloud-native platform, purpose-built for life sciences research and designed to match the flexibility and speed of R&D. By connecting Benchling to other infrastructure components and addressing the lack of centralized data infrastructure, teams at DTU Biosustain transformed inconsistent data capture into standardized protocols, workflows, and schemas, driving consistency across the entire organization. And with natively integrated applications, users no longer need to switch between modules or enter data multiple times, which saves time and increases scientist productivity.

With improved data infrastructure comes improved data management. Using a unified data model, users can standardize data capture and adopt common practices across research groups. Global search also lets users find data across groups, increasing data findability and usability.

Building on this interconnectedness, Benchling has helped DTU Biosustain improve cross-team collaboration. Hand-offs now take place directly on the platform, which ensures that the full experimental context is captured and integrates with the other pieces of the data infrastructure at DTU Biosustain. By implementing a cloud-native platform with a unified data model, system-wide search, and codeless configuration, DTU Biosustain is in the process of achieving a standardized and centralized data strategy.

Making AI/ML sustainable

AI can be overwhelming. For the CPE group, clarity on where to get started came from a catalyst itself: the enzyme. In sustainability research, enzymes are the catalysts for reactions, yet they are also highly complex, with multiple properties and dimensions. While a throughput of one million variants can be achieved in the wet lab, there is a prime opportunity to optimize this process by applying models.

To deliver on this AI focus, the CPE team first faced headwinds in finding people fluent in ML fundamentals. With that expertise secured, establishing a baseline of harmonized, clean, readily available data became the critical next step. Adhering closely to FAIR (findable, accessible, interoperable, reusable) data principles and defining a set of rules, conditions, and parameters enables success. Paired with an informatics solution that enables structured data capture, Benchling’s support for DTU Biosustain’s adoption of FAIR data principles will lay the foundation.

When it comes to ML, large language models are a given. The CPE group has a strong interest in the LLM-driven innovation now under way in enzyme engineering and design, which promises to increase the speed of science. As the team looks toward the future, measuring success is key to achieving their AI/ML goals. Specifically, the priorities are the ability to tailor and use models to improve the proteins and enzymes the CPE group needs, and to extend those efficiencies to the wet lab. At Benchling, we celebrate DTU Biosustain’s mission of creating a more sustainable world, and look forward to seeing this growth.
