Streamlining the automated discovery of porous organic cages

Annabel R Basford; Steven K Bennett; Muye Xiao; Lukas Turcani; Jasmine Allen; Kim E Jelfs; Rebecca L Greenaway

doi:10.1039/d3sc06133g

Chem Sci. 2024 Mar 13;15(17):6331-6348. doi: 10.1039/d3sc06133g. eCollection 2024 May 1.

Authors

Annabel R Basford¹, Steven K Bennett¹, Muye Xiao¹, Lukas Turcani¹, Jasmine Allen¹, Kim E Jelfs¹, Rebecca L Greenaway¹

Affiliation

¹ Department of Chemistry, Molecular Sciences Research Hub, Imperial College London, White City Campus, 82 Wood Lane, W12 0BZ, UK. k.jelfs@imperial.ac.uk; r.greenaway@imperial.ac.uk

Abstract

Self-assembly through dynamic covalent chemistry (DCC) can yield a range of multi-component organic assemblies. The reversibility and dynamic nature of DCC have made prediction of reaction outcomes particularly difficult, which slows the discovery rate of new organic materials. In addition, traditional experimental processes are time-consuming and often rely on serendipity. Here, we present a streamlined hybrid workflow that combines automated high-throughput experimentation, automated data analysis, and computational modelling to accelerate the discovery of one particular subclass of molecular organic materials, porous organic cages. We demonstrate how the design and implementation of this workflow aid in the identification of organic cages with desirable properties. The curation of a library of 55 tri- and di-topic aldehyde and amine precursors enabled 366 imine condensation reactions to be screened experimentally and 1464 hypothetical organic cage outcomes to be modelled computationally. From the screen, 225 cages were identified experimentally using mass spectrometry, 54 of which were cleanly formed as a single topology, as determined by both turbidity measurements and ¹H NMR spectroscopy. Integration of these characterisation methods into a fully automated Python pipeline, named cagey, led to a more than 350-fold decrease in the time required for data analysis. This work highlights the advantages of combining automated synthesis, characterisation, and analysis for large-scale data curation towards an accessible, data-driven approach to materials discovery.
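The headline counts in the abstract can be put on a common footing with a few lines of arithmetic. The sketch below is illustrative only: it uses no part of the cagey pipeline, the variable names and the `percent` helper are hypothetical, and the rates are derived here rather than reported by the authors. The ratio of modelled outcomes to screened reactions works out to exactly four, which would be consistent with four candidate outcomes being modelled per reaction, although the abstract does not state this.

```python
# Counts quoted in the abstract; all names below are hypothetical.
reactions_screened = 366        # imine condensation reactions screened
modelled_outcomes = 1464        # hypothetical cage outcomes modelled
cages_by_ms = 225               # cages identified by mass spectrometry
clean_single_topology = 54      # cleanly formed as a single topology


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Derived rates (not reported in the abstract itself).
ms_hit_rate = percent(cages_by_ms, reactions_screened)
clean_rate = percent(clean_single_topology, reactions_screened)
clean_given_ms = percent(clean_single_topology, cages_by_ms)

# Modelled outcomes per screened reaction; the interpretation as
# "candidate outcomes per reaction" is an assumption.
outcomes_per_reaction = modelled_outcomes / reactions_screened

print(f"MS-identified cages:   {ms_hit_rate}% of reactions")
print(f"Clean single topology: {clean_rate}% of reactions")
print(f"Clean among MS hits:   {clean_given_ms}%")
print(f"Modelled outcomes per reaction: {outcomes_per_reaction}")
```

On these figures, roughly three in five screened reactions gave a cage detectable by mass spectrometry, and about a quarter of those formed cleanly as a single topology.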