The Inverse Problem in Material Discovery: Given a Target Property, Find the Material
The need to find materials with specified functionalities: Society’s goals to deliver a material that is 30% more efficient at converting sunlight to electricity, or a battery with 5 times higher energy density, or a flat-panel display with a ten-fold increase in the conductivity of the transparent window rely on our ability to both discover and synthesize the next generation of functional semiconductor materials. Such modern technologies are increasingly based on specialized functionalities, which "live" in specific materials and no others. Yet, actual materials with such specific technology-enabling functionalities are often unknown: We understand the functionality needed, but often we do not have the materials that provide those functionalities.
“Materials” vs. “Material Properties”: When one refers to “materials”, one generally tends to define them by specifying their atomic constituents A (in essence, atomic numbers), the composition C (either integer stoichiometric composition or fractional alloy composition), and the structure S (crystallographic, nanoscopic or mesoscopic, or short-range/long-range order parameters), that is, the ACS. When we say “material properties”, we generally refer either to an “effect” (such as magnetism, semiconductivity, superconductivity, thermoelectricity, piezoelectricity, etc.) or to a “functionality” (such as a certain chemical or biological reactivity).
Generally, specific properties/effects live within specific materials: Despite the great fascination that generic “electrons-without-atoms” models (such as electron gases and Fermi liquids) have held throughout the history of solid-state theory, the natural habitat of electrons tends to be in the general neighborhood of ions. Indeed, many effects of interest in condensed matter physics and chemistry tend to “live” in certain specific materials, and no others. This plurality and diversity, the existence in Nature of an enormous number of chemical and structural species, each with its specific physical properties, establishes the ‘spice of life’ in Materials Science (and indeed, in biology, the essence of life itself). One could argue, in fact, that much of the research in inorganic solid-state chemistry, metallurgy, structural biology and semiconductor nanoscience is defined by the quest for understanding which ACS produces a given material property or functionality.
The ACS controls material properties: That material properties are affected by the identity of the atomic constituents is reflected in the way solid-state chemistry books are organized by chapters bearing the names of the main constituent elements in a family of compounds (“nitrides”, “oxides”, “manganites”), and indeed in the structure of the Periodic Table of elements itself. The specific way in which structure (e.g., atomic configuration at fixed atomic composition) is associated with specific physical properties is also clearly emerging from the study of ordered compounds vs superlattices vs nanostructures of the same chemical constitution. Indeed, the central dogma to emerge from nanoscience is that one can modify the physical properties (e.g., band gaps, absorptivity, chemical reactivity or magnetism) of materials not only by changing their chemical identity and composition, but also by changing the shape and size of the nanostructure (configuration). Perhaps the most celebrated example of configuration-property dependence in crystals involves the three forms of elemental carbon (diamond, graphite, and C60), all having the same chemical composition but different geometric configurations; one would be hard pressed to find any other system with a more diverse set of physical properties, such as conductivity, hardness, optical transparency, etc. The different structural polytypes of SiC (such as 3C zincblende and 2H wurtzite) have but a subtle difference in atomic configuration (starting with the 3rd nearest neighbor), yet a huge difference (almost 1 eV) in optical band gap. The two (left and right) enantiomers of ketamine have the same composition and bond connectivity, yet functionally, one is an anesthetic and the other is a hallucinogen.
Similarly staggering differences in properties associated with small but specific configurational differences are manifested by the different ferromagnetic Tc (ranging from 0 K to 380 K) of different atomic arrangements of Mn ions in the GaAs host crystal. The way that different geometrical ensembles of chemical dopants in insulators produce radically different excitonic emission lines has been known for almost half a century from the work of Thomas and Hopfield on nitrogen impurities in light-emitting GaP. The significantly different optical and transport properties of various stackings of atomic layers in semiconductor superlattices are the basis for “superlattice engineering”.
Many atomic configurations can be realized in the laboratory (even if they are not the ground state): What makes the association of various configurations with specific physical properties not only interesting but also relevant is the fact that, to some extent, one can now realize almost arbitrary atomic configurations in the laboratory. Locally (if not globally) stable and practically long-lived configurations can be manufactured even if they do not correspond to the thermodynamic ground state. Examples include the innumerable superlattice configurations that can be grown layer-by-layer from semiconductor, insulator, or metal components. The superlattice formation enthalpy is often positive (so the ground state corresponds to phase separation into the constituents, not the superstructure), yet the process of layer-by-layer growth at high T followed by cooling to room temperature generally creates a dynamically stable structure that can be used in optoelectronic devices with impunity, because the activation barrier for phase separation is often insurmountable at room temperature. In principle, each of the innumerably many superlattice configurations would have a distinct physical property such as band gap or optical response. But which one should one make to achieve a given property?
Similarly, ordered configurations of isovalent atomic components (such as GaP + InP) are not the thermodynamic ground state in the bulk (phase separation is). Yet, such ordered structures grow spontaneously to macroscopic dimensions at the free surface of a reconstructed semiconductor substrate, where the ordered alloy is the thermodynamic ground state. Once covered by subsequent incoming atomic layers, such ordered alloys are frozen in, even though they are no longer the ground state. Such spontaneously ordered alloys have distinct optical band gaps, interband transitions and effective masses compared with, say, the equivalent random alloys. Other examples of stabilized non-ground-state configurations include the innumerable artificial solids that can be made by dragging atoms with an STM tip to lattice locations that are unnatural, yet dynamically stable and protected from decay into the ground state by practically insurmountable activation barriers.
Material discovery has traditionally not taken full advantage of the evolving understanding of the material vs material property specificity, nor of the close dependence of properties on configuration. A few historic observations testify to this fact.
First, many important technology-enabling properties were discovered by trial and error, if not accidentally (including the semiconductivity of III-V’s, the hardness of diamond, the superconductivity of many oxides, or the giant magnetoresistivity of lanthanides). The shortcoming of accidental discovery is that a target property may be missed, and that the R&D period tends to be rather long, given that we are not really sure what we have.
Second, very few materials are currently being exploited in semiconductor-based “high” technology (including silicon in microelectronics, CdTe and CuInSe2 in thin-film solar cells, GaN in semiconductor lasers, or indium tin oxide in transparent conductors).
Third, many materials that could have been made are actually missing. For example, out of the 483 possible ABX 18-electron materials, only 83 had previously been made.
The idea underlying Inverse Design: The facts that the atomic configuration controls the material property, and that many atomic configurations can be realized in the laboratory, suggest that one might first articulate the needed functionality/property and then look for the material that has this property (see Figure 1). The target property could be a given band gap, or Curie temperature, or impurity level, or inverted order of bands akin to topological insulators, etc. Then, one would search quantum mechanically either for (i) the (non-equilibrium) spatial atomic configuration in artificial structures, or for (ii) the chemical compound in equilibrium (ground state structure) which has the desired “target property”. This approach of deriving a structure or material from the desired property is inverted relative to the time-honored approach of starting with a given structure (or symmetry) and then calculating or measuring its properties. It also differs from the traditional “model Hamiltonian” approach of describing an effect by way of postulated “effect-producing interactions”, regardless of the association of those interactions with a material ACS.
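One compact way to state this inversion is as an optimization problem. The notation below is an illustrative shorthand introduced here, not notation taken from the cited works: σ stands for a candidate ACS, Ω for the set of candidates under consideration, P(σ) for the property computed for σ, and P_target for the desired value.

```latex
\text{direct problem:}\quad \sigma \;\longmapsto\; P(\sigma)
\qquad\qquad
\text{inverse problem:}\quad \sigma^{*} \;=\; \arg\min_{\sigma \in \Omega}\, \bigl|\, P(\sigma) - P_{\mathrm{target}} \,\bigr|
```

In case (i) above, Ω would be the set of realizable spatial configurations of a fixed chemical system; in case (ii), it would be a set of candidate compounds in their ground state structures.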
Two approaches to the problem.
One is the data-driven discovery approach, in which researchers compute as many as possible of the properties/functionalities (electronic, structural, thermodynamic) of structures and configurations listed in compilations of known compounds, and then sift through the database in search of useful trends. This computational analogue of combinatorial chemistry is illustrated in the bottom part of Figure 2 by the blue squares representing existing compounds with atomic numbers ZA and ZB, whereas the question mark area represents chemically plausible but so far unreported (“missing”) compounds. This successful database approach is, however, not without limitations. For example, artificially grown heterostructures (superlattices and quantum wells) that are central to numerous energy technologies (PV, LED, Quantum Hall physics) are excluded because the astronomic number of possible layer combinations in such heterostructures is outside the capability of any finite database.
The alternative paradigm used here is functionality-driven discovery (or ‘Inverse Design’). It uses optimization and search methods (such as evolutionary algorithms) to directly follow the functionality surface (illustrated by the top part of Figure 2), thus directly identifying the structures and configurations whose functionality comes close to the desired target needed for a particular application. Because optimization methods learn the structure-functionality landscape, only a small fraction of the total number of configurations generally needs to be explored.
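A minimal sketch of such a functionality-driven search is given below, under loudly stated assumptions: the function toy_property (the number of A/B interfaces in a periodic layer sequence) is a hypothetical stand-in for an electronic-structure calculation, and the population size, mutation rate and target value are arbitrary illustrative choices rather than settings from the cited works.

```python
import random

def toy_property(config):
    """Hypothetical stand-in for an expensive property calculation:
    counts A/B interfaces in a periodic layer sequence (0 = A, 1 = B)."""
    n = len(config)
    return sum(config[i] != config[(i + 1) % n] for i in range(n))

def fitness(config, target):
    """Higher is better; zero means the target property is met exactly."""
    return -abs(toy_property(config) - target)

def genetic_search(n_layers=20, target=8, pop_size=30,
                   generations=60, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    pop = [tuple(rng.randint(0, 1) for _ in range(n_layers))
           for _ in range(pop_size)]
    evaluated = set(pop)  # distinct configurations "calculated" so far
    for _ in range(generations):
        pop.sort(key=lambda c: fitness(c, target), reverse=True)
        if fitness(pop[0], target) == 0:
            break  # target property reached
        parents = pop[: pop_size // 2]  # keep the fitter half (elitism)
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_layers)  # one-point crossover
            child = a[:cut] + b[cut:]
            # flip each layer A <-> B with probability mutation_rate
            child = tuple(1 - g if rng.random() < mutation_rate else g
                          for g in child)
            children.append(child)
        pop = parents + children
        evaluated.update(children)
    best = max(pop, key=lambda c: fitness(c, target))
    return best, toy_property(best), len(evaluated)

best, prop, n_evaluated = genetic_search()
print(best, prop, n_evaluated, "of", 2 ** 20, "possible sequences")
```

By construction the search evaluates at most 30 + 60 × 15 = 930 distinct sequences, a small fraction of the 2^20 possible 20-layer sequences, which is the point made in the text about exploring only part of the configuration space.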
The three modalities of Inverse Design: In Inverse Design of materials we generally declare first the functionality or property required, and then we search for the material or configuration that has this functionality. Different circumstances require different approaches to this Inverse Design problem, all requiring a theoretical/computational step upfront. I will refer to these as “Modalities of Inverse Design”, describe them in what follows and give examples.
Modality 1 applies to cases where a single material system (or a simple A/B chemical combination thereof) is considered, such as Si/Ge or GaAs/AlAs, but despite the chemical simplicity, the system can have a very large (in fact, astronomic) number of spatial configurations (equilibrium or non-equilibrium) that can in principle be realized. Possible examples include the various layer sequences and layer orientations possible within a simple, two-component A/B superlattice grown by, say, MBE, as well as numerous core-shell and other nanostructure configurations. Physical properties or functionalities sought are typically those that can currently be calculated in a deterministic fashion for a given configuration, such as electronic-structure-related quantities (band gaps, effective masses, ferromagnetic Curie temperature, properties of impurities in solids, strength of absorption [5, 6], valley splitting, or topological-insulator behavior). The strategy applied to Modality 1 problems often involves (a) calculating the final property “on the fly” for various assumed (equilibrium or non-equilibrium) configurations (or via a surrogate model such as cluster expansion), guided by (b) a genetic algorithm or simulated annealing to find the configuration with the target property, followed by (c) laboratory realization of the “best of class” case.
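The surrogate-model route mentioned in step (a) can be sketched as follows. This is a deliberately simplified, one-dimensional cluster expansion with only a constant, a point term and three pair terms; the coefficients TRUE_ECI and the function expensive_property are hypothetical placeholders for first-principles results, and the least-squares solver is a plain-Python illustration rather than the fitting procedure of the cited works.

```python
import itertools
import random

def correlations(spins, max_pair=3):
    """Cluster functions of a periodic +1/-1 layer sequence:
    constant, point average, and pair averages at separations 1..max_pair."""
    n = len(spins)
    feats = [1.0, sum(spins) / n]
    for k in range(1, max_pair + 1):
        feats.append(sum(spins[i] * spins[(i + k) % n] for i in range(n)) / n)
    return feats

TRUE_ECI = [0.5, -0.2, 0.8, -0.4, 0.1]  # hypothetical interaction coefficients

def expensive_property(spins):
    """Hypothetical placeholder for a direct first-principles calculation."""
    return sum(j * f for j, f in zip(TRUE_ECI, correlations(spins)))

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_eci(configs, values):
    """Least-squares fit of the coefficients via the normal equations."""
    X = [correlations(c) for c in configs]
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * v for row, v in zip(X, values)) for i in range(p)]
    return solve(xtx, xty)

# "Calculate" only 50 random configurations, then fit the surrogate.
rng = random.Random(0)
n_layers = 12
train = [tuple(rng.choice((-1, 1)) for _ in range(n_layers)) for _ in range(50)]
eci = fit_eci(train, [expensive_property(c) for c in train])

def predict(spins):
    """Cheap surrogate prediction from the fitted coefficients."""
    return sum(j * f for j, f in zip(eci, correlations(spins)))

# Survey all 2**12 = 4096 sequences with the surrogate alone.
best = max(itertools.product((-1, 1), repeat=n_layers), key=predict)
print(best, predict(best))
```

With these synthetic inputs the fitted coefficients reproduce TRUE_ECI to numerical precision, because the training data were generated by the same model. Real first-principles data would leave a residual, so the predictive accuracy of the expansion would need to be validated before it is used for a survey.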
Modality 2 applies to cases where numerous material systems need to be considered (such as the different ternary semiconductors documented in ICSD), but despite this chemical complexity/diversity, there is a relative structural simplicity, since in this case the materials are known, and so the (equilibrium) crystal structures are also generally documented and need not be predicted. This modality then deals with materials where the structure and composition are known from previous studies, but the properties are unknown. Physical properties or functionalities sought can include photovoltaic absorption, transparent conductivity [10, 11], photoelectrochemical water-splitting ability, or thermoelectricity. The strategy applied to Modality 2 involves (a) identification of a calculable metric that is simpler to compute than the final property sought, but can act as a marker/descriptor for it; examples include the PV absorption SLME metric or the TCO design principles. Then, (b) the marker is calculated, high-throughput style, for all the chemical compounds in the group at the known crystal structure, the results are sorted, and the “best of class” is identified; and (c) laboratory realization of the best of class is attempted.
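Steps (a) and (b) amount to computing a cheap descriptor for every known compound and sorting. The sketch below illustrates only that workflow: the descriptor is a hypothetical figure of merit (it is not the SLME metric), and the candidate names and gap values are placeholders, not calculated or measured data.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    name: str
    direct_gap_ev: float    # placeholder value, not real data
    indirect_gap_ev: float  # placeholder value, not real data

def descriptor(c, target_gap_ev=1.3):
    """Hypothetical figure of merit (higher is better): penalizes a direct gap
    far from the target and an indirect gap lying below the direct gap."""
    gap_penalty = abs(c.direct_gap_ev - target_gap_ev)
    indirect_penalty = max(0.0, c.direct_gap_ev - c.indirect_gap_ev)
    return -(gap_penalty + indirect_penalty)

def screen(candidates, top_n=2):
    """Rank all candidates by the descriptor and return the best of class."""
    return sorted(candidates, key=descriptor, reverse=True)[:top_n]

candidates = [
    Candidate("placeholder_A", 1.4, 1.4),
    Candidate("placeholder_B", 1.3, 0.9),
    Candidate("placeholder_C", 2.5, 2.5),
    Candidate("placeholder_D", 1.1, 1.0),
]

ranked = screen(candidates)
for c in ranked:
    print(c.name, round(descriptor(c), 3))
```

The design choice worth noting is that the descriptor is evaluated independently for each compound, so the screening parallelizes trivially over a database; the quality of the outcome rests entirely on how well the descriptor tracks the final property.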
Modality 3 applies to cases where numerous material systems need to be considered, but they were not previously known, so we do not know whether they are stable or what their stoichiometry is, and certainly we do not know their properties. Examples include compounds that can be hypothesized but are undocumented in ICSD, such as the “missing spinels” or “missing Half-Heusler alloys” [1, 13]. This modality then deals with materials where the structure and composition are unknown, and so are the properties. Physical properties or functionalities sought are identical to those in Modality 2. The strategy applied to Modality 3 involves (a) predicting the stable crystal structure (as well as a few nearest metastable ones) using either a “fixed list of candidate structure types” [1, 12, 13], or an accelerator for a fixed list such as data mining; a more general approach is the “Global Space Group Optimization”, which can start from zero knowledge of the lattice vectors and Wyckoff positions; and (b) examining the stability of the predicted stable crystal structure with respect to decomposition into any possible sub-system. Once a stable new material has been identified, it can be subjected to the strategy of Modality 1 or 2, i.e., the material is added to the group being searched for a target functionality.
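The stability test in step (b) can be illustrated with a convex-hull check. The sketch below is restricted to a hypothetical binary A(1-x)B(x) system for brevity, whereas the ternary compounds discussed here require the analogous construction over all competing binary and ternary phases; all formation energies are invented placeholder numbers in eV/atom.

```python
def hull_energy(x, phases):
    """Lowest formation energy reachable at composition x by a single competing
    phase at that composition or by a two-phase mixture bracketing it."""
    best = float("inf")
    for xa, ea in phases:
        for xb, eb in phases:
            if xa <= x <= xb:
                if xa == xb:
                    e = ea  # a competing phase at exactly this composition
                else:
                    # lever rule: linear interpolation between the two phases
                    e = ea + (eb - ea) * (x - xa) / (xb - xa)
                best = min(best, e)
    return best

def energy_above_hull(x, e_form, competing):
    """Negative or zero: the candidate is stable against decomposition
    into the competing phases. Positive: it would decompose."""
    return e_form - hull_energy(x, competing)

# Hypothetical competing phases: the two elements and one known compound.
competing = [(0.0, 0.0), (0.5, -0.40), (1.0, 0.0)]

stable_case = energy_above_hull(0.25, -0.25, competing)    # lies below the hull
unstable_case = energy_above_hull(0.75, -0.10, competing)  # lies above the hull
print(stable_case, unstable_case)
```

In this toy case the first candidate lies 0.05 eV/atom below the tie line between the element and the known compound, so it would be predicted stable, while the second lies 0.10 eV/atom above it and would be sorted into the "missing because unstable" group.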
Recent successes of Inverse Design. The Inverse Design protocol has only recently been applied and, as shown here, has had a number of striking successes (Boxes 1-3). These successes lay the groundwork for the present project. (1) We were able to identify a particular ‘magic sequence’ of Si-Ge layers in a heterostructure (here, out of 2^40 possibilities) that would attain the ‘holy grail’ of a direct band gap structure made of indirect gap building blocks (see Box 1 for optical selection rules by design). (2) An example of magnetism by design was illustrated by the theoretical identification of an ordered (201)-oriented superlattice of MnAs/GaAs (here, by directly calculating only ~100 out of 3 × 10^6 possible configurations), finding a predicted magnetic Curie temperature exceeding by ~100 K the current record of random GaMnAs alloys (a 50% increase). (3) Likewise, ferroelectric polarization by design was recently demonstrated in a heterostructure composed of PbTiO3 and SrTiO3 building blocks, where calculating the polarization of only ∼50 configurations sufficed to construct a robust cluster expansion that permitted an effortless survey of approximately 3 × 10^6 heterostructures, identifying one that yields the maximal ferroelectric polarization. While, to our knowledge, laboratory syntheses of such recently predicted designed superlattices have not yet been attempted, Inverse Design predictions of designed compounds were recently examined. (4) In the area of prediction of previously ‘missing’ functional compounds, we point to the recent successful synthesis of 17 of our predicted never-before-made stable half-Heusler compounds (Box 2), a nontrivial and selective accomplishment given that we predicted hundreds of other compounds to be missing because they are calculated to be unstable.
The emergence of one of these, TaIrGe (see Box 3), as a transparent conductor with a wide (~3 eV) band gap and a remarkably high hole mobility of ~2700 cm^2/Vs came as a total surprise (because one typically would not associate iridates with large-band-gap semiconductors) and has provided confidence in our approach (although Ir-bearing compounds are not candidates for practical low-cost devices). (5) An example of predicting unsuspected functionality in known compounds through our new techniques is our use of a theory-derived photovoltaic functionality metric, leading to the prediction and subsequent experimental confirmation of non-standard poly-anionic semiconductors, illustrated by CuSbS2, having a stronger absorption than the current champion thin-film PV material.
Our VISION is to transform our experience with Inverse Design into a suite of codes that will enable a new level of collaboration between condensed matter theorists and experimentalists, in search of specific materials with exciting and novel target properties.
[1] Gautier, R. et al. Prediction and accelerated laboratory discovery of previously unknown 18-electron ABX compounds. Nature Chemistry 7, 308–316 (2015).
[2] Franceschetti, A. & Zunger, A. The inverse band-structure problem of finding an atomic configuration with given electronic properties. Nature 402, 60–63 (1999).
[3] Franceschetti, A. et al. First-Principles Combinatorial Design of Transition Temperatures in Multicomponent Systems: The Case of Mn in GaAs. Phys. Rev. Lett. 97, 047202 (2006).
[4] Dudiy, S. V. & Zunger, A. Searching for Alloy Configurations with Target Physical Properties: Impurity Design via a Genetic Algorithm Inverse Band Structure Approach. Phys. Rev. Lett. 97, 046401 (2006).
[5] d’Avezac, M., Luo, J.-W., Chanier, T. & Zunger, A. Genetic-Algorithm Discovery of a Direct-Gap and Optically Allowed Superstructure from Indirect-Gap Si and Ge Semiconductors. Phys. Rev. Lett. 108, 027401 (2012).
[6] Zhang, L., d’Avezac, M., Luo, J.-W. & Zunger, A. Genomic Design of Strong Direct-Gap Optical Transition in Si/Ge Core/Multishell Nanowires. Nano Lett. 12, 984–991 (2012).
[7] Zhang, L., Luo, J.-W., Saraiva, A., Koiller, B. & Zunger, A. Genetic design of enhanced valley splitting towards a spin qubit in silicon. Nature Communications 4, 2396 (2013).
[8] ICSD, Inorganic Crystal Structure Database; Fachinformationszentrum Karlsruhe: Karlsruhe, Germany, 2006.
[9] Yu, L. & Zunger, A. Identification of Potential Photovoltaic Absorbers Based on First-Principles Spectroscopic Screening of Materials. Phys. Rev. Lett. 108, 068701 (2012).
[10] Trimarchi, G. et al. Using design principles to systematically plan the synthesis of hole-conducting transparent oxides: Cu3VO4 and Ag3VO4 as a case study. Phys. Rev. B 84, 165116 (2011).
[11] Yan, F. et al. Design of TaIrGe: a ternary half-Heusler transparent hole conductor. Nature Communications 6, 7308 (2015).
[12] Zhang, X., Stevanovic, V., d’Avezac, M., Lany, S. & Zunger, A. Prediction of A2BX4 metal-chalcogenide compounds via first-principles thermodynamics. Phys. Rev. B 86, 014109 (2012).
[13] Zhang, X., Yu, L., Zakutayev, A. & Zunger, A. Sorting Stable versus Unstable Hypothetical Compounds: The Case of Multi-Functional ABX Half-Heusler Filled Tetrahedral Structures. Adv. Funct. Mater. 22, 1425–1435 (2012).
[14] Fischer, C. C., Tibbetts, K. J., Morgan, D. & Ceder, G. Predicting crystal structure by merging data mining with quantum mechanics. Nature Materials 5, 641–646 (2006).
[15] Trimarchi, G. & Zunger, A. Global space-group optimization problem: Finding the stablest crystal structure without constraints. Phys. Rev. B 75, 104113 (2007).
[16] Deng, J., Zunger, A. & Liu, J. Z. Cation ordering induced polarization enhancement for PbTiO3-SrTiO3 ferroelectric-dielectric superlattices. Phys. Rev. B 91, 081301 (2015).
[17] Yu, L., Kokenyesi, R. S., Keszler, D. A. & Zunger, A. Inverse Design of High Absorption Thin-Film Photovoltaic Materials. Adv. Energy Mater. 3, 43–48 (2013).