Knowledge discovery through chemical space networks: the case of organic electronics

J Mol Model. 2019 Mar 7;25(4):87. doi: 10.1007/s00894-019-3950-6.

Abstract

Modern materials discovery and design studies often rely on the computational screening of large databases. Complementing experimental databases, virtual databases are increasingly being established through first-principles calculations of microscopic quantities that are computationally inexpensive yet decisive for a given application. These so-called descriptors are calculated for vast numbers of candidate materials. In general, the sheer volume of data points generated in such studies precludes an in-depth human analysis. To address this, smart visualization techniques, based for example on so-called chemical space networks (CSNs), have been developed to extract general design rules connecting structural modifications to changes in the target functionality. In this work, we generate and visualize the CSN of possible crystalline organic semiconductors based on an in-house database of more than 64,000 molecular crystals that we extracted from the exhaustive Cambridge Structural Database and for which we computed prominent charge-mobility descriptors. Our CSN links clusters of molecular crystals based on the chemical similarity of the scaffolds of their molecular building blocks and thus groups communities of similar molecules. By including each cluster's median descriptor values, the CSN visualization not only reproduces known trends among good organic semiconductors but also allows us to extract general design rules for organic molecular scaffolds. Finally, the local environment of each scaffold in our visualization shows how thoroughly its local chemical space has already been explored synthetically. Of special interest here are those clusters with promising descriptor values, yet with few or no connections in the sampled chemical space, as these offer the most room for scaffold optimization.

Keywords: Chemical space networks; Materials design; Organic electronics.
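
The network construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the similarity measure, the edge threshold, or the descriptors used, so this example assumes Tanimoto similarity on hypothetical scaffold feature sets, an arbitrary threshold of 0.5, and a single made-up descriptor for which lower values are taken to be better. All names and numbers below are illustrative.

```python
from itertools import combinations
from statistics import median


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity of two feature sets (assumed measure)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def build_csn(clusters, threshold=0.5):
    """Build a toy chemical space network.

    Each cluster becomes a node annotated with the median of its
    descriptor values; two nodes are linked when the similarity of
    their scaffold features reaches the (assumed) threshold.
    """
    nodes = {
        name: {"median": median(c["descriptors"]), "neighbors": set()}
        for name, c in clusters.items()
    }
    for u, v in combinations(clusters, 2):
        if tanimoto(clusters[u]["features"], clusters[v]["features"]) >= threshold:
            nodes[u]["neighbors"].add(v)
            nodes[v]["neighbors"].add(u)
    return nodes


def underexplored(nodes, max_descriptor, max_degree=0):
    """Clusters with a promising median descriptor but few or no links."""
    return sorted(
        name
        for name, node in nodes.items()
        if node["median"] <= max_descriptor and len(node["neighbors"]) <= max_degree
    )


# Hypothetical toy data: feature labels and descriptor values are invented.
toy_clusters = {
    "A": {"features": {"ring6", "fused", "S"}, "descriptors": [0.20, 0.30, 0.25]},
    "B": {"features": {"ring6", "fused", "S", "N"}, "descriptors": [0.40, 0.50]},
    "C": {"features": {"ring5", "O"}, "descriptors": [0.10, 0.15, 0.12]},
}

nodes = build_csn(toy_clusters, threshold=0.5)
candidates = underexplored(nodes, max_descriptor=0.2)
```

With this toy data, clusters A and B share three of four features and are linked, while C has no neighbors and a low median descriptor, so it is the only cluster flagged as a promising but sparsely explored scaffold. A real workflow would derive the features from molecular scaffolds with a cheminformatics toolkit and would choose the threshold and descriptor cutoffs from the data.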