Streamlining the automated discovery of porous organic cages

Chem Sci. 2024 Mar 13;15(17):6331-6348. doi: 10.1039/d3sc06133g. eCollection 2024 May 1.

Abstract

Self-assembly through dynamic covalent chemistry (DCC) can yield a range of multi-component organic assemblies. The reversibility and dynamic nature of DCC have made reaction outcomes particularly difficult to predict, which slows the discovery of new organic materials. In addition, traditional experimental processes are time-consuming and often rely on serendipity. Here, we present a streamlined hybrid workflow that combines automated high-throughput experimentation, automated data analysis, and computational modelling to accelerate the discovery of one particular subclass of molecular organic materials, porous organic cages. We demonstrate how the design and implementation of this workflow aid in the identification of organic cages with desirable properties. The curation of a library of 55 tri- and di-topic aldehyde and amine precursors enabled 366 imine condensation reactions to be screened experimentally and 1464 hypothetical organic cage outcomes to be modelled computationally. From the screen, 225 cages were identified experimentally using mass spectrometry, 54 of which formed cleanly as a single topology, as determined by both turbidity measurements and ¹H NMR spectroscopy. Integration of these characterisation methods into a fully automated Python pipeline, named cagey, led to an over 350-fold decrease in the time required for data analysis. This work highlights the advantages of combining automated synthesis, characterisation, and analysis for large-scale data curation towards an accessible, data-driven approach to materials discovery.
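
The counts quoted above can be put in proportion with a few lines of arithmetic. The following is only an illustrative sketch and is not part of the cagey pipeline; the variable names are hypothetical. It assumes that each cage identified by mass spectrometry corresponds to one screened reaction. Reading the ratio 1464/366 as a fixed number of candidate outcomes modelled per reaction is likewise an assumption, since the abstract does not state it.

```python
# Counts quoted in the abstract (variable names are illustrative only).
n_reactions = 366   # imine condensation reactions screened experimentally
n_modelled = 1464   # hypothetical cage outcomes modelled computationally
n_cages_ms = 225    # cages identified by mass spectrometry
n_clean = 54        # cages formed cleanly as a single topology

# Ratio of modelled outcomes to screened reactions.
# Assumption: this reflects a fixed number of candidate outcomes per reaction.
modelled_per_reaction = n_modelled / n_reactions

# Assumption: each identified cage maps to exactly one screened reaction.
ms_hit_rate = n_cages_ms / n_reactions      # share of reactions giving a cage
clean_rate_of_screen = n_clean / n_reactions  # share of reactions giving a clean cage
clean_rate_of_hits = n_clean / n_cages_ms     # share of identified cages that are clean

print(f"Modelled outcomes per reaction: {modelled_per_reaction:.1f}")  # 4.0
print(f"Cage identified by MS:          {ms_hit_rate:.1%}")            # 61.5%
print(f"Clean cage (of all reactions):  {clean_rate_of_screen:.1%}")   # 14.8%
print(f"Clean cage (of MS hits):        {clean_rate_of_hits:.1%}")     # 24.0%
```

Under these assumptions, roughly three in five screened reactions gave a cage detectable by mass spectrometry, and about a quarter of those formed cleanly as a single topology.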