# 2018, Vol.91, No.6

Two unresolved issues in molecular self-assembly are discussed. Firstly, a novel method for the investigation of molecular self-assembly processes (QASAP: quantitative analysis of self-assembly process) is introduced and recent progress in the understanding of coordination self-assembly processes revealed by QASAP is described. Secondly, the challenge of constructing discrete molecular self-assemblies that are formed with the aid of weak, nondirectional molecular interactions (such as van der Waals interactions) and the hydrophobic effect is discussed. In the course of the development of hexameric cube-shaped molecular self-assemblies (nanocubes) from gear-shaped amphiphiles (GSAs) in water, a design principle of hydrophobic surface engineering and a novel strategy for the construction of thermally stable discrete assemblies, molecular ‘Hozo’, are presented.

Molecular self-assembly is one of the fundamental, ubiquitous phenomena in nature.1 The organization working in self-assembly is based on the mutual interaction or communication between the constituents in the system. In molecular self-assembly, the final assembly is determined by the balance of various factors such as the directionality of the chemical bonds connecting the components, attractive and repulsive interactions between the components, and the solvation of the components. We see the importance of molecular self-assembly in life by the fact that many biologically essential structures such as double-stranded DNAs, protein assemblies, and cell membranes are constructed with the aid of molecular self-assembly.2 The most striking advantage of molecular self-assembly is its quite high efficiency and accuracy for the formation of well-defined structures from small pieces of components in a spontaneous manner. Thus molecular self-assembly has also been recognized as one of the best strategies for the fabrication of nm- to µm-sized structures in materials sciences. Thanks to a vast number of contributions in a variety of scientific fields, the concept of molecular self-assembly has already been established and the field of molecular self-assembly has matured. Nevertheless, there are several issues that remain to be solved in molecular self-assembly. In this account, two main topics our group is currently focusing on are discussed, presenting our recent progress. The first is to unveil how molecular self-assembly takes place on the molecular level and to understand how the components assemble along the reaction pathway(s). A novel method for the investigation of molecular self-assemblies (QASAP: quantitative analysis of self-assembly process) is firstly presented and the self-assembly processes of several coordination self-assemblies revealed by QASAP are discussed. 
The other issue is the challenge of constructing discrete, well-defined self-assembled structures using only weak molecular interactions that are non-directional or less directional than covalent bonds, in a hydrophobic environment. For the rational design of molecular building blocks in a hydrophobic self-assembly, a design principle of molecular surface engineering is presented, and discrete, thermally stable cube-shaped assemblies (nanocubes), designed and synthesized on the basis of the concept of molecular ‘Hozo’, are described.

Understanding natural phenomena in detail has been a natural desire of human beings since their emergence, so unveiling the mysteries of the natural world has attracted much attention. If we look back on the history of chemistry, it is clear that the mechanistic understanding of chemical phenomena at the molecular level has taken this science to the level where one can control the system at will. For this reason, mechanistic investigations of chemical reactions have been appreciated even though they are pure fundamental research and are not always directly related to applications.

The most important point in obtaining structurally well-defined self-assembled entities is molecular design: designing the chemical structure of the components and choosing the molecular interaction(s) that define the spatial arrangement of the components in the final structure. Molecular modeling studies enable the rational design of the components. During the last three decades, a huge number of artificial molecular self-assemblies have been developed using organic and inorganic molecules and bioconjugates as constituents.1 Many aspects of these processes, such as why one of the possible assemblies is selected in equilibrium under thermodynamic control, have often been discussed by comparing structures of conceivable assemblies, but the understanding of how the molecular components interact with each other to form transient intermediates that are finally converted into the thermodynamically most stable assembly remains very limited. This is partly because the final structures are always produced regardless of the assembling pathways as long as the self-assembly proceeds under thermodynamic control. According to Lindsay's classification of molecular self-assembly,1a such a perfectly reversible self-assembly system (Class I) satisfies the strict definition of self-assembly. However, many molecular self-assemblies found in nature are produced under partially reversible or irreversible conditions.1a In other words, these assemblies are metastable and kinetically selected on an energy landscape, so in such systems self-assembly depends on the pathway. Thus, in order to create metastable self-assemblies, which biological systems very often use, it is essential to approach molecular design on the basis of the self-assembly pathway(s) and mechanism(s) of these systems.

Another reason why molecular self-assembly processes have scarcely been investigated is due to the difficulty in the detection of the intermediates produced during the self-assembly. In addition, even if all the intermediates can be observed, because a huge number of intermediates consisting of the same components are produced during the self-assembly, it is impossible to distinguish all the species. It is true that the direct detection of all the intermediates is the best and most straightforward way to reveal molecular self-assembly processes, but this approach is impractical and almost impossible. In some cases, certain intermediates can be detected and characterized mainly due to their long lifetime. Some are converted to the final self-assemblies and others are not because the energy barriers of their conversion are significantly high. Such kinetically trapped species (KTSs) provide valuable information on the self-assembly process. But as is discussed below, self-assembly often takes place not along a single pathway but through several pathways similar to protein folding. In such cases, the final product is not completely assembled through KTSs that are finally transformed into the final assembly but part of the product is formed faster not through KTSs.

As mentioned above, the investigation of molecular self-assembly processes is difficult because of the unique circumstances that are not seen in other chemical reactions. To settle this issue, we have recently proposed a method to investigate self-assembly processes (QASAP: quantitative analysis of self-assembly process).3 QASAP has been applied to discrete coordination self-assemblies formed from transition metal ions and multitopic ligands.4 The concept of QASAP is simple. Contrary to the intermediates, substrates and products can almost always be detected and quantified, so in QASAP the information about all the intermediates is indirectly obtained as the average composition of all the intermediates. Let us consider the self-assembly of a M2L4 coordination cage from MX4 and L (eq (1) and Figure 1).

$$4\cdot \text{L} + 2\cdot \text{MX}_{4} \rightleftarrows \text{M}_{2}\text{L}_{4} + 8\cdot \text{X}$$
(1)
where M, L, and X are the metal ion possessing four binding sites, the ditopic ligand, and the leaving ligand, respectively. The elementary reaction in the coordination self-assembly is only the ligand exchange between the ligands (L and X). A variety of intermediates consisting of M, L, and X, are produced and are generally expressed as MaLbXc (a, b, and c are positive integers or 0). Here, the two parameters indicating the character of the intermediates, n and k, are defined as follows:
$$n = \frac{4a - c}{b}$$
(2)
$$k = \frac{a}{b}$$
(3)
The n value indicates the average number of metal ions bound to a single ditopic ligand L, while the k value represents the ratio between the metal ion and the ditopic ligand.
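As a quick illustration of eqs (2) and (3), the following minimal Python sketch evaluates (n, k) for a single species MaLbXc. The function name `n_k` and the `sites` parameter (the number of binding sites per metal ion, 4 in eq (2)) are assumptions introduced here for illustration only.

```python
from fractions import Fraction


def n_k(a, b, c, sites=4):
    """Return (n, k) of eqs (2) and (3) for a species M_a L_b X_c.

    `sites` is the number of binding sites per metal ion (4 in eq (2)).
    The name and signature are illustrative assumptions.
    """
    if b == 0:
        raise ValueError("n and k are undefined for species without L (b = 0)")
    n = Fraction(sites * a - c, b)  # metal-ligand bonds per ligand L
    k = Fraction(a, b)              # metal-to-ligand ratio
    return n, k


# Completed M2L4 cage: every ditopic ligand binds two metal ions.
print(n_k(2, 4, 0))  # (Fraction(2, 1), Fraction(1, 2))
# M2L4X2: two binding sites still carry the leaving ligand X.
print(n_k(2, 4, 2))  # (Fraction(3, 2), Fraction(1, 2))
```

Exact fractions are used so that species with nearby (n, k) values can be compared without rounding noise.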

Now we assume that no intermediates can be observed. In this case the species that can be quantified are the substrates (MX4 and L) and the products (M2L4 and X) in eq (1), from which the amounts of M, L, and X contained in the intermediates are calculated to lead to the average composition of all the intermediates, MaLbXc. Thus, the n and k values for the average composition of all the intermediates are indicated by the following equations:

$$\langle n \rangle = \frac{4\langle a \rangle -\langle c \rangle}{\langle b \rangle}$$
(4)
$$\langle k \rangle = \frac{\langle a \rangle}{\langle b \rangle}$$
(5)
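The mass balance behind eqs (4) and (5) can be sketched as follows. This is one reading of the procedure, assuming that every M, L, and X not found in the quantified substrates and products of eq (1) resides in the intermediates; the function name, argument names, and example amounts are hypothetical.

```python
def average_nk(m0, l0, mx4, free_l, cage, free_x):
    """Estimate (<n>, <k>) of eqs (4) and (5) from quantified species.

    m0, l0                     : initial amounts of MX4 and L
    mx4, free_l, cage, free_x  : amounts of MX4, L, M2L4, and X observed
    All amounts share one unit (e.g. mmol); the reaction is eq (1).
    """
    a = m0 - mx4 - 2 * cage        # M held in the intermediates
    b = l0 - free_l - 4 * cage     # L held in the intermediates
    c = 4 * m0 - 4 * mx4 - free_x  # X still bound in the intermediates
    if b <= 0:
        raise ValueError("no ligand in the intermediates; <n>, <k> undefined")
    return (4 * a - c) / b, a / b


# Hypothetical snapshot: all substrates consumed, 0.5 of cage formed,
# 6.5 of the 8.0 available X released.
print(average_nk(m0=2.0, l0=4.0, mx4=0.0, free_l=0.0, cage=0.5, free_x=6.5))
# (1.25, 0.5)
```

Only ratios enter the result, so the absolute unit of the amounts does not matter as long as it is the same for all species.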
The 〈n〉 and 〈k〉 values can also be indicated in a different way. When the molecular formula of an intermediate i is expressed as $$\text{Pd}_{a_{i}}\text{L}_{b_{i}}\text{X}_{c_{i}}$$, the average molecular formula of all the intermediates is expressed as $$\text{Pd}_{{\bar{a}}}\text{L}_{{\bar{b}}}\text{X}_{{\bar{c}}}$$, where $$\bar{a}$$, $$\bar{b}$$, and $$\bar{c}$$ are defined as follows:
$$\bar{a} = \sum\nolimits_{i}^{\textit{all}}m_{i} a_{i}/\sum\nolimits_{i}^{\textit{all}}m_{i}$$
(6)
$$\bar{b} = \sum\nolimits_{i}^{\textit{all}}m_{i} b_{i}/\sum\nolimits_{i}^{\textit{all}}m_{i}$$
(7)
$$\bar{c} = \sum\nolimits_{i}^{\textit{all}}m_{i} c_{i}/\sum\nolimits_{i}^{\textit{all}}m_{i}$$
(8)
where mi is the mole number of species i. The average (n, k) value of all the intermediates can also be expressed as follows:
$$\langle n \rangle = \frac{4\bar{a} - \bar{c}}{{\bar{b}}}$$
(9)
$$\langle k \rangle = \frac{{\bar{a}}}{{\bar{b}}}$$
(10)
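A small numerical sketch of eqs (6)–(10), using a hypothetical two-species mixture, shows that 〈n〉 is weighted by the ligand content of each species and is therefore not the simple mean of the individual n values. The function names and the example mixture are assumptions for illustration.

```python
def average_composition(species):
    """Mole-weighted average formula, eqs (6)-(8).

    species: list of (m_i, a_i, b_i, c_i) tuples, where m_i is the mole
    number of species i with formula M_a L_b X_c.
    """
    total = sum(m for m, _, _, _ in species)
    a_bar = sum(m * a for m, a, _, _ in species) / total
    b_bar = sum(m * b for m, _, b, _ in species) / total
    c_bar = sum(m * c for m, _, _, c in species) / total
    return a_bar, b_bar, c_bar


def average_nk_from_species(species, sites=4):
    """(<n>, <k>) from the average formula, eqs (9) and (10)."""
    a_bar, b_bar, c_bar = average_composition(species)
    return (sites * a_bar - c_bar) / b_bar, a_bar / b_bar


# Hypothetical 1:1 mixture of ML2X2 (n = 1.0) and M2L4X2 (n = 1.5).
mixture = [(1.0, 1, 2, 2), (1.0, 2, 4, 2)]
n_avg, k_avg = average_nk_from_species(mixture)
print(round(n_avg, 3), k_avg)  # 1.333 0.5
```

The simple mean of the two n values would be 1.25; the larger species contributes more ligands and pulls 〈n〉 to about 1.33.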

As is often the case with molecular self-assembly, most of the intermediates cannot be detected by any spectroscopic or other measurement methods. Even in such a case, QASAP enables us to analyze the self-assembly process from the changes in the existence ratios of the substrates and the products and from the (〈n〉, 〈k〉) values of the intermediates. This is a great advantage of QASAP. As the experimentally obtained (〈n〉, 〈k〉) values contain the information about the average composition of all the intermediates, the analysis using the (〈n〉, 〈k〉) values should carefully be carried out. If only a single intermediate exists in the reaction mixture, the (n, k) value of this intermediate is the same as the (〈n〉, 〈k〉) value determined by experiment. Thus in this special case, the intermediate can easily be determined. But unfortunately, in most cases, several intermediates coexist in the reaction mixture, so the (〈n〉, 〈k〉) value contains the information of a mixture of intermediates. Thus the determination of the intermediates only by the (〈n〉, 〈k〉) value is difficult with some exceptions. However, even in such a situation, it is possible to discuss the self-assembly process by the time variation of the (〈n〉, 〈k〉) value. Let's consider several prototypical examples we have experimentally observed in the following section.

## (1) Intra- or Intermolecular Ligand Exchanges in the Intermediates.

If the 〈n〉 value increases while the 〈k〉 value shows no significant change during a period in which neither the multitopic ligand nor the metal ion is consumed, the leaving ligand (X) must be released from the intermediates by ligand exchanges with L in the intermediates. This indicates that intermolecular reactions between the intermediates ($$\text{M}_{a_{i}}\text{L}_{b_{i}}\text{X}_{c_{i}}$$ and $$\text{M}_{a_{j}}\text{L}_{b_{j}}\text{X}_{c_{j}}$$) to produce a larger intermediate (eq (11) and Figure 2a) and/or intramolecular ligand exchanges in the intermediates to lead to a closed structure (eq (12) and Figure 2b) take place.

$$\text{M}_{a_{i}}\text{L}_{b_{i}}\text{X}_{c_{i}} + \text{M}_{a_{j}}\text{L}_{b_{j}}\text{X}_{c_{j}} \to \text{M}_{a_{i}+a_{j}}\text{L}_{b_{i}+b_{j}}\text{X}_{c_{i}+c_{j}-x} + x\cdot \text{X}$$
(11)
$$\text{M}_{a_{i}}\text{L}_{b_{i}}\text{X}_{c_{i}} \to \text{M}_{a_{i}}\text{L}_{b_{i}}\text{X}_{c_{i}-x} + x\cdot \text{X}$$
(12)

## (2) Incorporation of the Multitopic Ligands or the Metal Ions in the Intermediates.

The decrease in the 〈k〉 value with a slight increase in the 〈n〉 value suggests the incorporation of the ditopic ligand in the intermediate with the release of X (eq (13) and Figure 2c). This is confirmed by the decrease in the existence ratio of L and the increase in the existence ratio of X. Conversely, an increase in the 〈k〉 value with a slight decrease in the 〈n〉 value indicates the incorporation of the metal ion(s) in the intermediates.

$$\text{M}_{a_{i}}\text{L}_{b_{i}}\text{X}_{c_{i}} + \text{L} \to \text{M}_{a_{i}}\text{L}_{b_{i} + 1}\text{X}_{c_{i}-x} + x\cdot \text{X}$$
(13)

## (3) Production of Larger Intermediates that Contain More Components than the Final Assembly.

When the final assembly is produced from intermediates containing not more components than in the final assembly, at least one molecule of X should be released to achieve the formation of a single molecule of the product. Thus the case where the number of X released for the formation of one molecule of the product is less than one indicates that the final assembly is produced through large intermediates that contain more components than in the final assembly. In the case of the self-assembly of a M2L4 cage in eq (1), as the average molecular formula of all the intermediates is expressed by $$\text{M}_{{\bar{a}}}\text{L}_{{\bar{b}}}\text{X}_{{\bar{c}}}$$ (eqs (6)–(8)), the average molecular formula of all the intermediates after the release of a single molecule of the cage is expressed as $$\text{M}_{\bar{a} - 2}\text{L}_{\bar{b} - 4}\text{X}_{\bar{c} - x}$$ (eq (14)).

$$\text{M}_{\bar{a}}\text{L}_{\bar{b}}\text{X}_{\bar{c}} \to \text{M}_{2}\text{L}_{4} + \text{M}_{\bar{a} - 2}\text{L}_{\bar{b} - 4}\text{X}_{\bar{c}-x} + x\cdot \text{X}$$
(14)
The difference in the 〈n〉 value between the two intermediates, Δ〈n〉 (= 〈n1〉 − 〈n2〉), is expressed as follows:
$$\Delta \langle n \rangle = \frac{4(\langle n_{2} \rangle - 2) - x}{\bar{b} - 4}$$
(15)
where 〈n1〉 and 〈n2〉 are the 〈n〉 values for $$\text{M}_{\bar{a} - 2}\text{L}_{\bar{b} - 4}\text{X}_{\bar{c} - x}$$ and $$\text{M}_{{\bar{a}}}\text{L}_{{\bar{b}}}\text{X}_{{\bar{c}}}$$, respectively. When the cage is produced from a large intermediate $$\text{M}_{{\bar{a}}}\text{L}_{{\bar{b}}}\text{X}_{{\bar{c}}}$$ ($$\bar{b} > 4$$), the denominator in eq (15) is positive. In the case of a M2L4 cage consisting of ditopic ligands, the 〈n〉 value is never larger than 2, so the numerator in eq (15) is always negative. Consequently, when $$\bar{b} > 4$$, Δ〈n〉 is negative regardless of x, whether x is larger or smaller than 1. Thus when the final assembly is produced from large intermediates containing more components than in the product, the 〈n〉 value decreases with time.

In the following sections, the self-assembly processes of cages,3b–3d capsules,3e,3k rings,3f–3h a tetrahedron,3i and a sphere3j and the chiral self-sorting process in the self-assembly of homochiral cages3l are discussed.

## 4.1 Self-Assembly Process of Pd2L4 Cages.

Pd2L4 cages are among the simplest coordination assemblies, being composed of only six components. Various Pd2L4 cages have been reported using a variety of ditopic organic ligands.5 Firstly, the self-assembly process of the Pd214 cage5a (Figure 3) was investigated by QASAP.3b In QASAP, the quantification of both the substrates and the products is essential. NMR spectroscopy is the best way to easily quantify all the substrates and the products by a single measurement with high precision, although the time scale of NMR is slower than that of other spectroscopic methods. As both the free ligand and the final assembly are detected by 1H NMR spectroscopy, if the metal ion source (MX4) and the leaving ligand (X) can be detected by 1H NMR, all four species in eq (1) can be quantified through the integration of each signal. To do so, 1H NMR-detectable X should be chosen. In addition, the coordination ability of X should be tuned so that the equilibrium in eq (1) shifts towards the right side and that X is not replaced with a solvent molecule or a counter anion. Thus, the coordination ability of X should be stronger than that of the solvent and the counter anion and weaker than that of the coordination sites in the multitopic ligand (L). Using 3-chloropyridine (Py*) as X and CD3NO2 as solvent fulfills these requirements, so in QASAP the self-assemblies have mostly been carried out using [PdPy*4]2+ as the Pd(II) ion source in CD3NO2. To make the self-assembly slow enough to be monitored by 1H NMR spectroscopy and to prevent the formation of coordinatively unsaturated Pd(II) centers during the self-assembly, the reaction is carried out at 298 K.

According to n-k analysis of the self-assembly of the Pd214 cage from [PdPy*4]2+ and 1 (Figure 4a), the 〈n〉 value increased from 5 to 15 min and the 〈k〉 value slightly decreased, which suggests that the incorporation of the free ligands in the intermediates and the inter- and/or intramolecular ligand exchanges in/between intermediates (eqs (11) and (12)) took place until 15 min. After 15 min, the (〈n〉, 〈k〉) value stayed around (1.65, 0.5), though the Pd214 cage was produced from the intermediates (45% of the cage was produced after 15 min). In other words, the (〈n〉, 〈k〉) value of the intermediates does not change after the release of the cage from the intermediates. These results suggest that Pd214Py*2 and Pd214Py* are the main intermediates and their composition ratio remains constant. Signals for these species were detected by ESI-TOF mass spectrometry of the reaction mixture after 15 min. Pd214Py*2 has six structural isomers and four of the six can be transformed into Pd214Py* through only one intramolecular ligand exchange. DFT calculations of the four structural isomers of Pd214Py*2 suggest that one of the isomers (I in Figure 3a) is more stable than the others by 0.4–3.4 kcal mol−1. In addition, the activation energy of the intramolecular ligand exchange in I is 2–3 kcal mol−1 lower than that of the other isomers. Thus, it is most probable that I is the main species with the formula Pd214Py*2.
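As an illustration of how a plateau value of 〈n〉 relates to the two proposed intermediates, the sketch below computes n for Pd214Py*2 and Pd214Py* from eq (2) and estimates the mole fraction of Pd214Py* that would reproduce 〈n〉 = 1.65. It assumes that only these two species are present; the resulting fraction is an arithmetic consequence of that assumption, not a reported experimental value, and the function name is hypothetical.

```python
def fraction_of_second(n_obs, n_first, n_second):
    """Mole fraction of the second species in a two-species mixture.

    Valid only when both species contain the same number of ligands L,
    so that <n> is a linear mole-weighted mean of the two n values.
    """
    return (n_obs - n_first) / (n_second - n_first)


n_py2 = (4 * 2 - 2) / 4  # Pd2L4Py*2: a = 2, b = 4, c = 2 -> 1.5
n_py1 = (4 * 2 - 1) / 4  # Pd2L4Py*:  a = 2, b = 4, c = 1 -> 1.75

f = fraction_of_second(1.65, n_py2, n_py1)
print(round(f, 2))  # 0.6
```

Both species have k = 2/4 = 0.5, which matches the observed 〈k〉 for any mixing ratio, so only 〈n〉 carries information about the ratio here.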

The energy barriers for the intramolecular ligand exchanges for Pd214Py*2 and Pd214Py* were determined by the time variation of the formation ratio of the cage and the 〈n〉 value.3b The activation energies for Pd214Py*2 and Pd214Py* are comparable (22.3 and 21.9 kcal mol−1, respectively), indicating that the intramolecular ligand exchanges in Pd214Py*2 and Pd214Py* are the rate-determining steps in the self-assembly of the Pd214 cage. The ligand exchange on Pd(II) centers takes place by the associative mechanism, in which the incoming ligand firstly coordinates to a square-planar Pd(II) center to form a square-pyramidal intermediate, which is then converted into a trigonal bipyramidal transition state (Figure 3c).6 The activation energies for the two steps determined by DFT calculations are also similar to each other (17.5 and 17.7 kcal mol−1) (Figure 3b) and are higher than that of the intermolecular ligand exchange of PdPy4 and 1 (PdPy4 + 1 → Pd1Py3 + Py (Py indicates pyridine): 10.7 kcal mol−1). Higher energy barriers of the intramolecular ligand exchanges for Pd214Py*2 and Pd214Py* arise from the distortion of the five-coordinate geometry in their transition states (Figure 3c). The Nin-Pd-Nout angles in the transition state for the intramolecular ligand exchanges in Pd214Py2 and Pd214Py are narrower than that in the model reaction (PdPy4 + Py → PdPy4 + Py). This result suggests that in the case of the Pd(II)-linked coordination self-assemblies consisting of rigid multitopic ligands, the late stages of the self-assembly tend to be the rate-determining steps. This tendency is also seen in the self-assembly of octahedron-shaped Pd6L8 capsules (section 4.2.1).
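To give a feeling for what these barrier differences mean kinetically, the sketch below converts a difference in activation energy into a ratio of rate constants at 298 K. It assumes a simple Arrhenius-type exponential with identical pre-exponential factors for the two steps, which is a simplification introduced here and not part of the original analysis; `rate_ratio` is a hypothetical helper.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def rate_ratio(ea_slow, ea_fast, temperature=298.0):
    """Return k_fast / k_slow for two barriers given in kcal/mol.

    Assumes equal pre-exponential factors (illustrative simplification).
    """
    return math.exp((ea_slow - ea_fast) / (R_KCAL * temperature))


# Experimental barriers for the two intramolecular exchanges (22.3 vs 21.9):
print(rate_ratio(22.3, 21.9))
# DFT barriers: intramolecular (17.5) vs intermolecular model exchange (10.7):
print(rate_ratio(17.5, 10.7))
```

Under these assumptions the two intramolecular steps differ in rate by only about a factor of two, consistent with both being rate-determining, whereas the intermolecular model exchange would be faster by several orders of magnitude.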

As the time variation of the existence ratios of the substrates and the products is obtained in QASAP, one might expect that the self-assembly process can be analyzed by a conventional kinetic analysis. In the case of the Pd214 cage, the rate constants for the rate-determining steps were previously experimentally determined, so we tried to estimate the rate constants for the other faster steps under the assumption that the cage formation takes place through a simple single pathway. However, a set of rate constants that well fit the experimental data could not be found. This indicates that the self-assembly process is much more complicated than the model process we considered in the kinetic analysis. Furthermore the self-assembly process cannot be analyzed by a usual rate equation approach, because the self-assembly takes place through multiple pathways in a network where many intermediates are connected by simple reactions (ligand exchanges in coordination self-assemblies). Recently, the coordination self-assembly processes of octahedron-shaped Pd6L8 capsules were analyzed by a novel master equation approach,7 where all the possible intermediates smaller than the capsule were considered, and the speciation of the intermediates that is impossible by QASAP was realized (section 4.2.1).

In general, coordination self-assemblies are carried out using metal sources possessing very weak leaving ligands such as counter anions and solvent molecules at higher temperature to efficiently lead to the thermodynamically most stable assembled structures. If the self-assembly process is not altered by the nature of the leaving ligand, the self-assembly should be faster using a metal source with weaker leaving ligands. To investigate the effect of the leaving ligand on coordination self-assembly processes, the self-assembly of the Pd214 cage from [Pd(CH3CN)4]2+ was carried out at 298 K. It was found that though the coordination ability of CH3CN is much weaker than that of Py*, the rate of the cage formation from [Pd(CH3CN)4]2+ is significantly slower than from [PdPy*4]2+ and that the yield of the cage at 298 K dramatically dropped (40%) to produce a lot of KTSs, which were finally converted into the cage through heating. This indicates that the leaving ligand influences the self-assembly process of the cage and that Py* tends to produce a cage passing through the pathways with low energy barriers. The employment of such leaving ligands would help to obtain assemblies with a higher efficiency under mild conditions.

The effect of the chemical structure of the ditopic ligands on the self-assembly process was then investigated. At first, the Pd224 cage assembled from [PdPy*4]2+ and ditopic ligand 2, where two pyridyl groups are connected to a central benzene ring by ether linkages to endow a higher flexibility, was investigated (Figure 5).3c As expected, the rate of the formation of the Pd224 cage is slower than that of the Pd214 cage. QASAP indicates that the change in the (〈n〉, 〈k〉) value with time is totally different between the two cages (Figure 4a and 4b). In the self-assembly of Pd224, from 5 to 30 min, the 〈n〉 value increased and the 〈k〉 value decreased, suggesting the incorporation of the free ligand 2 in the intermediates with the release of Py*. After 30 min, the 〈n〉 value increased with no significant change in the 〈k〉 value. Dynamic light scattering (DLS) measurements of the reaction mixture indicated that 200-nm-sized species were produced at 30 min and the size of the large species slightly decreased, so the increase in the 〈n〉 value after 30 min arises from the intramolecular ligand exchanges in the large intermediates (eq (12)). At 2 h, the 〈n〉 value reached 1.96, almost equal to its maximum value, 2 (Figure 4b). After 2 h, the 〈n〉 value decreased, suggesting that the Pd224 cage is produced from the large intermediates containing more components than the cage, which is consistent with the observation of submicrometer-sized species by DLS. Through scanning transmission electron microscopy (STEM) measurements, it was found that these large intermediates had a sheet-like structure. Molecular modeling studies showed that sheet structures are possible, forming molecular grids in which Pd(II) ions are linearly connected by the extended conformation of 2 with a separation of 1.58 nm between the neighboring Pd(II) ions (Figure 5b). The (〈n〉, 〈k〉) value of (1.96 ± 0.03, 0.59 ± 0.01) at 2 h indicates that the sheet structures possess defects where Py*s coordinate to the Pd(II) centers placed in the core of the sheet, creating free pyridyl groups of 2.

The conversion mechanism of the sheet to the cage was also investigated. It was found that when 10% of free 2 was added, the cage formation was dramatically accelerated. On the other hand, when [PdPy*4]2+ was added to cap the pyridyl groups of free 2 that remained in the reaction mixture, the cage formation stopped. These results indicate that the coordination of the free ditopic ligands to the Pd(II) centers in the large sheet structures triggers the release of the cages from them through intramolecular ligand exchanges.

Finally, the effect of interaction between the neighboring ditopic ligands on the self-assembly process of Pd2L4 was investigated (Figure 6).3d The ditopic ligands 1 and 3 are geometrically similar but 3 contains two anthracene moieties, so in Pd234 the anthracene panels are in close vicinity and form an inner space surrounded by π-surfaces, while the Pd214 cage has an open structure. The self-assembly of Pd2345d is slower than that of Pd214, suggesting that the steric interactions between the anthracene panels in the intermediates decelerate the capsule formation. It was found that the self-assembly of the Pd234 capsule takes place mainly through two pathways. 46% of the capsules are firstly assembled from the primary intermediates smaller than the capsule (IntP: $$\text{Pd}_{{\bar{a}}}3_{{\bar{b}}}\text{Py}_{{\bar{c}}}^{*}$$, $$\bar{a} \leq 2$$). Some of IntP are converted to large intermediates that contain more Pd(II) ions than the capsule (IntL: $$\text{Pd}_{{\bar{a}}}3_{{\bar{b}}}\text{Py}_{{\bar{c}}}^{*}$$, $$\bar{a} \geq 3$$). The formation of the large intermediates was confirmed by the fact that the release ratio of Py* per single molecule of the capsule is less than 1 and that the 〈n〉 value decreased after 1 h (Figure 4c) (section 3). As the DLS measurements of the reaction mixture only showed a peak for species as large as the capsule, we have concluded that extremely large intermediates such as those observed for the flexible Pd224 cage are not produced during the self-assembly of Pd234. 35% of Pd234 was produced from IntL and the rest (19% based on the ditopic ligand) remained as KTSs. The 〈n〉 value of the KTSs is less than 2, indicating that the KTSs contain free pyridyl groups of 3, but only 11% of the free pyridyl groups were capped with Pd(II) ions. This suggests that the coordination ability of most of the free pyridyl groups in the KTSs was weakened, probably due to steric hindrance, which is the reason why the KTSs could not be converted into the capsule.

When a small amount of partially deuterated 3 (3-d) was added to a mixture of the KTSs, the conversion of the KTSs to the capsule was significantly accelerated. About half of 3-d was incorporated in the capsule right after the addition of 3-d, but the intake of 3-d soon dropped and then the capsules were mainly composed of 3. This indicates that the coordination of 3-d to the Pd(II) centers in the KTSs is the trigger of the conversion of the KTSs to the cage, that 3-d is incorporated into the cage only right after the addition of 3-d, and that after the initiation the cages are released from where 3-d was not incorporated. In the self-assembly of Pd234, the steric interactions between the neighboring ditopic ligands decelerate the formation of Pd234 by the production of large intermediates (IntL), some of which finally lead to the KTSs. However, this trend is not always the case. In the self-assembly of octahedron-shaped Pd6L8 capsules, the molecular meshing between the neighboring tritopic ligands accelerates the self-assembly (section 4.2.1).

## 4.2 Self-Assembly Process of Pd6L8 Capsules.

4.2.1 QASAP of Pd6L8 Capsules:

The self-assembly processes of octahedron-shaped Pd6L8 capsules composed of Pd(II) ions and tritopic ligands in which three 3-pyridyl groups are attached to a central hexaphenylbenzene core (4 or 5) were investigated by QASAP (Figure 7a).3e As the tritopic ligands are gear-shaped hexaphenylbenzene derivatives, the neighboring tritopic ligands in the Pd6L8 capsules mesh with each other, stabilizing the capsule structure as a whole. Thus, QASAP for the capsules is possible using pyridine (Py), which has stronger coordination ability than Py*, as the leaving ligand and CD3CN, which is more coordinative than CD3NO2, as solvent. In both studied cases (Pd648 and Pd658), the 〈n〉 value increased, while the 〈k〉 value decreased (Figure 7b), suggesting the incorporation of the free ligand in the intermediates and the inter- and/or intramolecular ligand exchanges in the intermediates until 20 min. Then the (〈n〉, 〈k〉) value stayed at around (2.88, 0.75) for Pd648 and (2.75, 0.75) for Pd658 after 20 min. The fact that the capsule formation continued with a constant (〈n〉, 〈k〉) value after 20 min indicates that the intermediate(s) whose (n, k) value is equal to the (〈n〉, 〈k〉) value exist(s) as the main intermediate(s). Thus Pd648Py and Pd658Py2 predominantly exist as the main intermediate for each self-assembly and the intramolecular ligand exchanges for these species are the rate-determining steps (Figure 7a), which arises from the associative ligand exchange mechanism on Pd(II) centers.6 That the intramolecular ligand exchanges in the late stages of the self-assembly of systems formed from rigid multitopic ligands are the rate-determining steps was also observed in the formation of the Pd214 cage.
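The assignment of Pd648Py and Pd658Py2 can be checked numerically with eqs (2) and (3). The sketch below scans candidate species Pd6L8Pyc and returns the number of remaining Py ligands whose (n, k) lies closest to an observed (〈n〉, 〈k〉). It assumes a single dominant intermediate with the full Pd6L8 framework; the function name and search range are hypothetical.

```python
def closest_species(n_obs, k_obs, a=6, b=8, sites=4, max_c=8):
    """Return c of Pd_a L_b Py_c whose (n, k) is closest to the observed pair.

    Only the number of remaining leaving ligands c is varied; a and b are
    fixed to the composition of the final capsule (an assumption).
    """
    best = None
    for c in range(max_c + 1):
        n = (sites * a - c) / b
        k = a / b
        dist = (n - n_obs) ** 2 + (k - k_obs) ** 2
        if best is None or dist < best[0]:
            best = (dist, c)
    return best[1]


print(closest_species(2.88, 0.75))  # 1 -> Pd6L8Py,  n = 23/8 = 2.875
print(closest_species(2.75, 0.75))  # 2 -> Pd6L8Py2, n = 22/8 = 2.75
```

A mixture of several intermediates could give the same average, so such a match supports an assignment only together with the observation that the (〈n〉, 〈k〉) value stays constant while the capsule continues to form.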

Considering the changes in the existence ratios for the substrates and the products between the Pd648 and Pd658 capsules, it is found that in the beginning of the self-assembly the rates of consumption of the free ligand (4 or 5) and [PdPy4]2+ and of the release of Py are similar, that those consumption rates for Pd648 are faster than those for Pd658 in the middle and late stages of the self-assembly, and that the rates of the formation of the Pd648 and Pd658 capsules are similar throughout the self-assembly. These results indicate that the coordination abilities of the pyridyl groups of 4 and 5 are similar, as expected from their similar chemical structures. It can also be assumed that the intramolecular ligand exchanges in the middle stage of the self-assembly of the Pd648 capsule were accelerated due to higher molecular meshing between the neighboring ligands. Finally, we can conclude that the energy barrier of the rate-determining step for the Pd648 capsule is higher than that for the Pd658 capsule (Figure 7c) because of the more rigid structure of the partial capsule, Pd648Py. This increased rigidity causes a higher distortion of the trigonal bipyramidal Pd(II) center in the transition state of the intramolecular ligand exchange.

4.2.2 Theoretical Investigation of the Self-Assembly Process of Pd6L8 Capsules:

In the theoretical investigation of self-assembly processes, molecular orbital calculations (ab initio or DFT) and molecular dynamics (MD) simulations have serious disadvantages. It is practically impossible to trace all the possible reaction pathways in molecular self-assemblies by these approaches. In molecular orbital calculations, the energies for the ground states of all the possible species and for the transition states connecting all the possible ground states have to be determined. Thus, it is better to utilize this molecular orbital approach only for specific reaction step(s) in the self-assembly such as the rate-determining step. In a previous section (4.1), we used DFT calculations to discuss the reason why the intramolecular ligand exchanges in the final stages of the self-assembly of the Pd214 cage are the rate-determining steps. MD simulations can follow the reaction only over a very short time range (ps to ns), which is much shorter than the time scale of molecular self-assemblies. Thus in most cases, to overcome this problem, unrealistic potentials are introduced. In addition, even if molecular self-assembly processes can be followed by MD simulations, it is uncertain whether the trajectory found by MD simulations is the major pathway of the self-assembly or not. For that reason, several trajectories are usually obtained starting from different initial configurations. But the possibility that the final product might be produced through another pathway still remains.

To solve these difficulties, H. Sato et al. recently developed a novel master equation approach.7 In contrast to molecular orbital approaches and MD simulations, the master equation approach can investigate molecular self-assembly processes by considering all the possible intermediates at low calculation costs. In the master equation approach, calculations are carried out based on experimental data such as the existence ratios of the species in the reaction mixtures and the (〈n〉, 〈k〉) values determined by QASAP. The advantage of the master equation approach is that it enables the speciation of intermediates that cannot be characterized by QASAP. This characteristic renders QASAP and the master equation approach mutually complementary techniques.
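To make the idea of a master equation concrete, the following is a minimal sketch, not the model of ref. 7. It assumes a hypothetical four-state network (substrate S, intermediate I, kinetic trap T, product P) with made-up first-order rate constants, and propagates the populations with a fourth-order Runge-Kutta step. The species names, the network, and all numerical values are illustrative assumptions.

```python
# Toy master-equation sketch. The network and rate constants are hypothetical
# and are NOT taken from the master equation model of ref. 7.
SPECIES = ["S", "I", "T", "P"]

# (source, target, first-order rate constant in s^-1); all values are made up.
REACTIONS = [
    ("S", "I", 1.0e-2),  # substrate -> intermediate
    ("I", "T", 5.0e-3),  # intermediate -> kinetic trap
    ("T", "I", 5.0e-4),  # slow escape from the trap
    ("I", "P", 2.0e-3),  # intermediate -> product
]


def derivative(pop):
    """Return d(pop)/dt as gain minus loss for every species."""
    d = {s: 0.0 for s in SPECIES}
    for src, dst, k in REACTIONS:
        flux = k * pop[src]
        d[src] -= flux
        d[dst] += flux
    return d


def rk4_step(pop, dt):
    """Advance the populations by one fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return {s: base[s] + h * slope[s] for s in SPECIES}

    k1 = derivative(pop)
    k2 = derivative(shifted(pop, k1, 0.5 * dt))
    k3 = derivative(shifted(pop, k2, 0.5 * dt))
    k4 = derivative(shifted(pop, k3, dt))
    return {
        s: pop[s] + dt / 6.0 * (k1[s] + 2.0 * k2[s] + 2.0 * k3[s] + k4[s])
        for s in SPECIES
    }


def integrate(pop0, dt, n_steps):
    """Propagate the initial populations pop0 for n_steps steps of length dt."""
    pop = dict(pop0)
    for _ in range(n_steps):
        pop = rk4_step(pop, dt)
    return pop


if __name__ == "__main__":
    final = integrate({"S": 1.0, "I": 0.0, "T": 0.0, "P": 0.0}, dt=1.0, n_steps=2000)
    for name in SPECIES:
        print(name, round(final[name], 4))
```

In a realistic treatment the list of reactions would contain every ligand exchange connecting the considered intermediates, including bimolecular steps, and the rate constants would be adjusted so that the computed existence ratios reproduce the QASAP data.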

The self-assembly processes of the octahedron-shaped capsules (Pd648 and Pd658) were analyzed by the master equation approach.7 In this case, all 153 intermediates that contain no more components (L and Pd(II)) than the capsule were considered. Although QASAP revealed that the final step (Pd648Py → Pd648 + Py) is the rate-determining step in the self-assembly of Pd648, the pathways to Pd648Py remained unclear. The speciation of the intermediates produced before the formation of Pd648Py was therefore carried out by the master equation approach. In the beginning of the self-assembly, nine species containing 1–3 Pd(II) ions are mainly produced (Figure 7d). In these nine species, none of the sides of the octahedron have been formed yet. These intermediates then grow into mainly five species, in which several sides have formed (Figure 7d). Finally, these species are converted to Pd648Py.

The master equation approach suggests that Pd648Py has two structural isomers that have different activation energies for conversion to the capsule. The crystal structure of a closely related Hg648 capsule8 indicates that the capsule is composed of enantiomeric isomers of 4 with helicity arising from the tilt of the six benzene rings in the hexaphenylbenzene core (Figure 7e), that the helical isomers of 4 are alternately arranged, and that the 3-pyridyl groups of 4 coordinating to the Pd(II) centers tilt in the same direction on all the vertices of the octahedron (Figure 7f). Thus, Pd648Py has two diastereomeric isomers (Figure 7g). Because the six tritopic ligands tightly mesh with each other in Pd648Py, the interconversion between the two isomers is difficult under the mild conditions of QASAP. Molecular orbital calculations indicate that the activation energies for the conversion of the two isomers to the capsule are considerably different and that the isomers (PP and MM) play the role of kinetic traps in the final stage of the self-assembly of the capsule.9

## 4.3 Self-Assembly Process of a Pd4L8 Tetrahedron.

The self-assembly of rigid ditopic ligands (6) and Pd(II) ions leads to a Pd468 tetrahedron-shaped structure (Tet),10 in which two of the six sides are built by two ditopic ligands, while the rest are formed from a single ligand (Figure 8). Thus the Pd468 tetrahedron contains two chemically nonequivalent ditopic ligands in a 1:1 ratio. The self-assembly process of the tetrahedron from 6 and [PdPy*4]2+ in CD3NO2 at 298 K was investigated by QASAP.3j It was found that a Pd366 double-walled triangle (DWT) was kinetically produced faster than Tet and that DWT was finally converted to Tet. Metastable DWT was isolated and characterized by various NMR spectroscopies and ESI-TOF mass spectrometry. DLS measurements indicate that submicrometer-sized large intermediates (IntL) were also produced in the beginning of the self-assembly.

These results suggest that the three pathways to DWT, Tet, and IntL branch off from primitive intermediates, which are smaller than DWT (stage I). In the next stage (stage II), 45% of Tet is produced from DWT, IntL, and Py*. Firstly, Py* coordinates to one of the Pd(II) centers of DWT to form a partially broken DWT (Pd366Py*) having a free pyridyl group, which next coordinates to the Pd(II) center of IntL. Finally, Tet is released from this adduct through intramolecular ligand exchanges. After the consumption of DWT (stage III), Tet is produced from IntL, initiated by Py* that coordinates to the Pd(II) centers in IntL. Therefore, Py*, which is not a component of Tet, plays a catalytic role in the self-assembly of Tet. When Py* was removed from the reaction mixture, the formation of Tet stopped, indicating that Py* is essential for the conversion of DWT and IntL to Tet.

## 4.4 Self-Assembly Process of Coordination Rings.

4.4.1 Self-Assembly Process of Pt(II)-Linked Single-Walled Macrocycles:

Macrocycles have widely been investigated as molecular hosts11 and constituents of molecular machines,12 and therefore constitute an important class of synthetic targets. Because the intramolecular macrocyclization of chain molecules often competes with oligomerizations, several strategies (high dilution conditions13 and use of templates14) are applied to prevent undesired intermolecular reactions. As for self-assembled macrocycles,4a,4b,4f4i,15 thanks to the reversibility of the chemical bonds that connect the components, even if oligomers containing more components than the macrocycle are produced, these species are finally converted into the desired macrocycle under thermodynamic control. Indeed, the yield of self-assembled macrocycles is generally much higher than that of macrocycles formed by covalent bonds, which represents the great advantage of self-assembly. However, it is uncertain whether such longer oligomers are truly produced during the self-assembly.

The chain-like oligomers that would be produced during the self-assembly of the MnLn macrocycles from ditopic MX2 (X is the leaving ligand) and L are classified into three types depending on the terminals (Figure 9a). The (n, k) values of these types of intermediates are plotted on different lines (Figure 9b), so the type and lengths of the chain-like intermediates can be analyzed by n-k analysis.
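Why the three chain types fall on different lines can be illustrated with a short sketch. It assumes the QASAP definitions n = (N·a − c)/b and k = a/b for a species MaLbXc, where N is the number of binding sites per metal unit (N = 2 for ditopic MX2). These definitions are not restated in this section, so they are treated here as an assumption; they do reproduce the value (1.0, 0.5) quoted below for Pt72. The labels "LL", "LX", and "XX" for the chain terminals are hypothetical names introduced for this example.

```python
from fractions import Fraction


def nk(a, b, c, sites=2):
    """(n, k) of a species M_a L_b X_c.

    Assumed definitions: n = (sites * a - c) / b and k = a / b,
    with `sites` binding sites on each metal unit M.
    """
    return Fraction(sites * a - c, b), Fraction(a, b)


def chain_nk(m, terminals):
    """(n, k) of a linear chain containing m ditopic metal units.

    terminals: "LL" (ligand at both ends), "LX" (ligand at one end and a
    leaving ligand at the other), or "XX" (leaving ligands at both ends).
    These labels are hypothetical names used only in this sketch.
    """
    if terminals == "LL":
        return nk(m, m + 1, 0)
    if terminals == "LX":
        return nk(m, m, 1)
    if terminals == "XX":
        if m < 2:
            raise ValueError("an XX chain needs at least two metal units")
        return nk(m, m - 1, 2)
    raise ValueError("terminals must be 'LL', 'LX', or 'XX'")


if __name__ == "__main__":
    for kind in ("LL", "LX", "XX"):
        points = [chain_nk(m, kind) for m in range(2, 6)]
        print(kind, [(str(n), str(k)) for n, k in points])
```

Under these assumptions, chains with ligands at both ends lie on the line k = n/2, chains with one ligand and one leaving ligand at the ends lie on k = 1, and chains with leaving ligands at both ends lie on n = 2, while the closed ring MnLn sits at (2, 1).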

The self-assembly processes of three Pt(II)-linked cyclic hexagons shown in Figure 10 (rings A,16 B,16 and C17) were investigated by QASAP.3f,3g As to the self-assembly of ring A (Pt676, Pt indicates one of the dinuclear Pt(II) complexes shown in Figure 10) in CD2Cl2, QASAP indicates that the species whose (n, k) value is (1.0, 0.5), which corresponds to Pt72, was predominantly produced as the intermediate throughout the self-assembly (Figure 9c). In this case, this small fragmentary species was observed by 1H NMR and ESI-TOF mass measurements. The exclusive formation of a type I intermediate suggests high allosteric cooperativity on the two Pt(II) centers in Pt connected by a 1,4-phenylene spacer. The fact that Pt72 exists as the main intermediate indicates that the intermolecular reaction(s) of Pt72 is/are the rate-determining step(s) of the self-assembly of ring A.

In the case of the self-assembly of ring B from bent ditopic dinuclear Pt(II) complexes (Pt) and 7 in CD2Cl2, the growth of type I oligomers was observed by n-k analysis (Figure 9d). The selective formation of type I species suggests that the ligand exchanges on the Pt(II) centers in Pt take place with a high allosteric cooperativity even though the two Pt(II) centers are connected through a longer benzophenone spacer. The n-k plot for the self-assembly of ring B indicates that longer oligomers that contain more components than ring B were not formed.

Various types of oligomers were produced in the self-assembly of ring C from 4,4′-bipyridine (bpy) and PtPy*2 in CD2Cl2 (Figure 10), which was indicated by the change in the (〈n〉, 〈k〉) value (Figure 9e). Firstly, type I intermediates were mainly produced and then types II and III intermediates increased with a decrease in type I intermediates. Judging from the (〈n〉, 〈k〉) value at the end of the self-assembly, no chain-like oligomers containing more components than ring C were produced. As the 1H NMR signals of the intermediates for the assembly of ring C were observed, it was possible to create the n-k plot by the direct quantification of their 1H NMR signals. The n-k plot from the 1H NMR signals of the intermediates (red open circles in Figure 9e) and that from signals of the substrates and the products (the usual approach, represented by green open circles in Figure 9e) show good consistency, which demonstrates the reliability of QASAP. The growth of the oligomers was also monitored by time-dependent 1H DOSY measurements of the signals for the oligomers. The change in the type of the oligomers produced during the self-assembly of ring C can be explained by the negative allosteric cooperativity of the two binding sites in bpy as the pKa values of the conjugate acids of bpy suggest (pKa1 = 4.82 and pKa2 = 3.19). In the beginning of the self-assembly, the high positive allosteric cooperativity of the two Pt(II) centers in Pt selectively produces Ptbpy2 (type I). Because the coordination ability of the free pyridyl groups in Ptbpy2 is weak enough to compete with Py*, the pKa value of whose conjugate acid is 2.84, types II and III intermediates tend to be produced as the self-assembly proceeds.

In every case, oligomers containing more components than the macrocycles were not produced. This is partly because of the high rigidity of the components (dinuclear complexes and ditopic ligands). However, considering that the components are connected by Pt–N single bonds, chain-like oligomers can adopt various conformations other than the C-shape, which is suitable for the intramolecular macrocyclization. Thus the high rigidity of the components should not be enough to suppress the formation of longer oligomers. Accordingly, other factors such as electrostatic and/or steric interactions between the components should also contribute to the efficient cyclization.

4.4.2 Self-Assembly Process of Pd(II)-Linked Double-Walled Macrocycle:

The self-assembly process of a double-walled square (DWS), Pd488,18 in which two ditopic ligands occupy each side, assembled from 8 and [PdPy*4]2+ in CD3NO2 at 298 K was investigated by QASAP (Figure 11).3h Monitoring the self-assembly by 1H NMR spectroscopy suggested the formation of a kinetically trapped species (KTS) whose symmetry is as high as that of DWS. The ESI-TOF mass spectrometry and 1H DOSY measurements indicate that the KTS is the Pd386 double-walled triangle (DWT). In the very early stage of the self-assembly, 1H NMR signals for the primitive intermediates (IntP) were observed and then disappeared partly due to the formation of submicrometer-sized intermediates (Int), which were confirmed by DLS. 30% of DWS were directly produced from IntP, which is the fastest pathway to DWS. Then the coordination of Py*s to the Pd(II) centers in the submicrometer-sized intermediates (Int) initiated the formation of DWSs from Int to lead to other submicrometer-sized intermediates (Int′), which then reacted with DWT, which was produced from IntP in the early stage of the self-assembly, with the aid of Py* to produce DWS. Finally, about 10% of Int′ (based on the ditopic ligand) remained as a kinetic trap, which was detected by DLS. The conversion of Int, Int′, and DWT was prevented by removing Py* from the reaction mixture and the conversion restarted by the addition of Py*. The n-k analysis indicates that even though Int and Int′ contain free pyridyl groups of 8, they cannot be converted to DWS without Py*, which is probably because the free pyridyl groups in Int and Int′ are sterically covered.

## 4.5 Self-Assembly Process of a Pd12L24 Sphere.

Pd12L24 coordination spheres19 are large, discrete structures assembled from 36 components. The self-assembly process of a Pd12924 sphere19c from 9 and [PdPy*4]2+ in CD3NO2 at 298 K was investigated (Figure 12).3j 1H NMR, 1H DOSY, and ESI-TOF mass measurements indicate that closed structures smaller than the Pd12924 sphere, Pdm92m (m = 6, 8, and 9), were kinetically produced, which is consistent with the previous experimental19e and theoretical20 investigations of a different Pd12L24 sphere. It is true that these kinetically trapped species were converted into the Pd12924 sphere, but QASAP revealed that the formation of the Pd12924 sphere from the intermediates not observed by 1H NMR (Int), which were formed in 44% yield based on the ditopic ligand, is faster than the conversion of Pdm92m (m = 6, 8, and 9) to the sphere. It was also found that the conversion of Pdm92m (m = 6, 8, and 9) takes place through the coordination of the free pyridyl groups in Int to the Pd(II) centers in Pdm92m (m = 6, 8, and 9). When the free pyridyl groups in Int were capped with Pd(II) ions, the self-assembly of the Pd12924 sphere and the conversion of the Pdm92m (m = 6, 8, and 9) stopped. On the other hand, when a small amount of free ditopic ligand was added, the conversion of Int to the Pd12924 sphere was dramatically accelerated. These results indicate that the free pyridyl groups play an important role in the self-assembly of the Pd12924 sphere. This is reasonable considering the associative ligand exchange mechanism on Pd(II) centers.6 On the other hand, free Py* is not involved in the self-assembly process, contrary to the self-assemblies of Pd468 Tet and Pd488 DWS, indicating that the leaving ligand does not always intervene in the self-assembly process.

## 4.6 The Effect of Solvent and the Leaving Ligand on the Self-Assembly Process.

Considering that chemical reactions are often affected by environmental factors such as the solvent,21 it is natural to expect that various species not contained in the final product should affect the self-assembly process. In coordination self-assemblies, solvent molecules, leaving ligands, and counter anions are those that may affect the self-assembly. We have already seen in the previous sections that Py* sometimes plays a key role in the conversion of the intermediates and the kinetically trapped species. Here we discuss the effect of the coordination ability of the solvent and the leaving ligand on the self-assembly process of the Pd648 capsule and Pt(II)-linked rings.3g,3k

4.6.1 The Effect of the Solvent:

The key role of the solvent in chemical reactions in solution is the solvation of the involved species. Solvent molecules that contact the solutes alter the stabilities of the substrates, of the transition state, and/or of the products in different ways depending on their polarities, causing an increase or decrease of the activation energies of interest. This solvation effect is most important when the solutes are charged molecules, as in the coordination assemblies seen in the previous sections, so it is natural to expect that the solvent affects coordination self-assembly processes. Another important role of the solvent is seen in the solvophobic effect, which is relevant in less polar solutes. As the solvophobic effect (especially the hydrophobic effect) is the dominant factor which initiates host-guest complexations, protein folding, and the assembly of hydrophobic and amphiphilic molecules, the solvophobic effect may partly affect the coordination self-assembly processes. We will discuss the efficient use of the hydrophobic effect to construct a discrete molecular self-assembly in water in section 5.

As described above, the ligand exchanges on square-planar Pd(II) and Pt(II) ions take place through the formation of five-coordinate intermediates and transition states (Figure 3c),6 which are produced by the coordination of the incoming ligand, so solvents and counter anions with coordination ability can affect the ligand exchanges. Indeed, the rate of the ligand exchange of one of the pyridines (Py) in PdPy4·(OTf)2 with free Py in CD3CN at 298 K determined by 1H NMR saturation transfer experiment (2.0 × 10−2 s−1) is about six times faster than that in CD3NO2 (0.34 × 10−2 s−1), indicating that coordinative solvents accelerate the ligand exchange.3k Therefore, if the self-assembly pathway(s) is/are not altered by the solvent at all, the self-assembly should become faster in coordinatively stronger solvents. However, when the self-assembly of the Pd648 capsule was carried out in a less coordinative solvent (CD3NO2 and CD2Cl2, 4:1 (v/v)), the yield of the capsule dropped dramatically, suggesting that the self-assembly pathway was altered by the solvent and that less coordinative solvents tend to produce more kinetically trapped species. It was found that the yield of the capsule did not improve even when CD3NO2 was replaced with CD3CN at the very early stages of the self-assembly (5 min) and that the opposite is also true; the yield of the capsule did not decrease when CD3CN was replaced with CD3NO2 at 5 min. These results suggest that whether the intermediates are finally led to the capsule or to the kinetic traps is determined at the beginning of the self-assembly. As has often been seen in molecular self-assemblies, the kinetic traps should efficiently be converted into the correct assemblies through heating. However, the conversion of the kinetically trapped species produced in CD3NO2 at 298 K into the capsule was not possible even by heating in CD3NO2 and in CD3CN. In DMSO-d6, however, the KTSs were effectively converted to the thermodynamic product, indicating that a coordinatively stronger solvent such as DMSO is required for the correction.
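As a quick check of the "about six times" figure, the two quoted exchange rates can be compared directly. The mean lifetimes τ = 1/k of a coordinated Py given below assume that the quoted rates can be read as first-order rate constants k; the symbols k and τ are introduced here only for illustration.

$$\frac{k_{\mathrm{CD_3CN}}}{k_{\mathrm{CD_3NO_2}}} = \frac{2.0 \times 10^{-2}\ \mathrm{s^{-1}}}{0.34 \times 10^{-2}\ \mathrm{s^{-1}}} \approx 5.9, \qquad \tau_{\mathrm{CD_3CN}} = \frac{1}{k_{\mathrm{CD_3CN}}} = 50\ \mathrm{s}, \qquad \tau_{\mathrm{CD_3NO_2}} = \frac{1}{k_{\mathrm{CD_3NO_2}}} \approx 290\ \mathrm{s}$$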

The effect of solvent polarity on the self-assembly process of Pt(II)-linked hexagons was investigated.3i In the case of ring A, Pt72 is the dominant intermediate both in CD3NO2 and in CD2Cl2 with no change in the self-assembly mechanism. On the other hand, for ring C, Pt′bpy2 is the dominant intermediate in CD3NO2, while the growth of oligomers was observed in CD2Cl2. These results indicate that the solvents in which these reactions take place constitute a very important factor that affects the self-assembly pathways.

4.6.2 The Effect of the Leaving Ligand:

The effect of the leaving ligand with different coordination ability (CH3CN, Py, Py*, and Py4Me (4-methylpyridine)) on the self-assembly process of the Pd648 capsule was investigated.3k DFT calculations indicate that the energy barrier of the ligand exchange becomes lower with weaker leaving ligands, suggesting that the self-assembly takes place faster with a metal source with weaker leaving ligands under the assumption that the self-assembly process is the same regardless of the leaving ligand. However, in the self-assembly of the Pd648 capsule from [PdX4]2+ (X: CH3CN, Py, Py*, and Py4Me) and 4 in CD3NO2 and CD2Cl2 (4:1 (v/v)) at 298 K, the self-assembly took place the fastest when the leaving ligand was Py4Me, which has the strongest coordination ability among the four. In addition, it was found that the self-assembly using the leaving ligand with weaker coordination ability tends to produce more kinetically trapped species. These results suggest that the leaving ligand affects the self-assembly so strongly that the self-assembly pathway itself is altered. Indeed, n-k analysis showed that the change in the (〈n〉, 〈k〉) value is totally different for each leaving ligand, demonstrating the formation of different intermediates. Therefore, the role of the leaving ligand in the self-assembly process is quite complicated, though the consequence in equilibrium can easily be deduced. For the full understanding of the effect of the leaving ligand, a more comprehensive investigation is required, but it should be emphasized that it would be possible to find a suitable leaving ligand that enables the smooth production of the thermodynamically most stable assembly or a metastable one under mild conditions by choosing a proper self-assembly pathway.

## 4.7 Chiral Self-Sorting Process in the Formation of Homochiral Pd2L4 Cages.

Chiral self-sorting, which takes place through the interplay between enantiomeric components to bias a system toward homochiral or heterochiral assembly, is one of the fundamental phenomena related to the mystery of how evolution has selected (resolved) and amplified specific chiral isomers over others.22 On top of the ubiquity in natural systems, chiral self-sorting has also been reported in artificial systems.23 However, the understanding of chiral self-sorting processes has been elusive. To overcome this deficiency, the chiral self-sorting in the self-assembly of homochiral Pd2104 cages24 was investigated (Figure 13).3l Firstly, the self-assembly process of Pd210S4 from 10S and [PdPy*4]2+ in CD3NO2 and CD2Cl2 (4:1 (v/v)) at 298 K was revealed by QASAP (Figure 13). In the beginning of the self-assembly, three mononuclear complexes (cis and trans isomers of Pd102Py*2 and Pd10Py*3) are mainly produced as primary intermediates (IntP). Then the intermolecular reactions between these three species lead to dinuclear complexes (Pd2104Py*3 and Pd2103Py*4). One of the isomers of Pd2104Py*3, A, produced from bimolecular reactions of cis Pd102Py*2 and of trans Pd102Py*2 can be converted into the cage by only releasing Py*s through intramolecular ligand exchanges. On the other hand, the other isomer of Pd2104Py*3, B, formed from the reactions between the cis and trans isomers of Pd102Py*2 leads to Pd2103Py*2 through intramolecular ligand exchanges by kicking out Py* and L. Pd2103Py*2 is also produced by the bimolecular reactions between Pd102Py*2 and Pd10Py*3 and the subsequent intramolecular ligand exchanges. The Pd2103Py*2 thus formed reacts with free 10 or IntP to lead to the cage. The cage formation through the intermediates (A) is the major pathway.

The chiral self-sorting process from a racemic mixture of 10 (10S and 10R) and [PdPy*4]2+ was investigated by 1H NMR and ESI-TOF mass measurements.3l The self-assembly of the homochiral cages from a racemic mixture of 10 is considerably slower than that from a single enantiomer (10S), suggesting that heterochiral intermediates and/or cages were produced and that the correction of the chirality in the heterochiral species retarded the self-assembly. The homo- and heterochiral cages were characterized by 1H NMR and thus their formation ratio could be determined by the integrals of their signals. The self-assembly of the homochiral cages from a 1:1 mixture of 10S and 10R-d6 was monitored by mass spectrometry. As Py*s coordinating to the Pd(II) centers leave during the ionization process, the mass signals of [Pd2104]4+ are derived from Pd2104Py*c (c = 0–3), so the intensities of the signals for the diastereomeric isomers of [Pd2104]4+ contain information about the cages and the intermediates.

To indicate the degree of chiral self-sorting, the following parameter, X, is defined.

$$\text{X} = 1 - \frac{\sum_{i = 1}^{n}a_{i}}{\sum _{i = 1}^{n}s_{i}}$$
(16)
When the homo- and heterochiral isomers exist in the statistical ratio, X = 0. A positive value of X indicates homochiral self-sorting, while a negative X indicates heterochiral sorting. X = 1 indicates perfect homochiral self-sorting only producing homochiral species. To investigate the chiral self-sorting process, the changes in the X value determined by 1H NMR (XNMR) and that by ESI-TOF mass (XMS) were monitored (Figure 13c). At 15 min, XNMR is 0.71, while XMS is negative. This result indicates that in the beginning of the self-assembly the intermediates of the cage (Pd2104Py*c (c = 1–3)) are biased toward heterochirality, though only the homochiral cages were produced. After 15 min, XNMR decreased and XMS increased until 30 min and then XNMR and XMS increased and finally reached 1. The complicated changes in XNMR and XMS indicate the following chiral self-sorting process (Figure 13b). In the bimolecular reactions of IntP, the heterochiral Pd2104Py*3 are preferred. The intramolecular ligand exchanges in the homochiral species (Pd210S4Py*c and Pd210R4Py*c (c = 1–3)) take place much faster than those in the heterochiral intermediates, so the homochiral cages are exclusively produced at first, which is the reason why XNMR = 0.71 at 15 min. After the consumption of the homochiral intermediates, the heterochiral cages ((S,S,S,R) and (S,S,R,R) and their enantiomers) are produced from the heterochiral intermediates, which decreases XNMR. After 30 min, XNMR increases again, which partly arises from the correction of the heterochiral cages to lead to the homochiral cages. As the intermediates remain throughout the self-assembly, the increase in XMS until 12 h indicates that the correction of the chirality in the intermediates (from hetero to homo) also took place until 12 h. After 12 h, the homochiral intermediates are mainly converted into the homochiral cages. These results indicate that the chiral self-sorting process of the homochiral Pd2104 cages is complicated and that the heterochiral intermediates produced by the bimolecular reactions of IntPs are kinetically favored even though the homochiral cages are the thermodynamically most stable.
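The behavior of X at its limiting values can be checked with a short sketch. The symbols a_i and s_i of eq. 16 are not defined in this section; the code assumes that they are the actual and statistical existence ratios of the heterochiral isomers, respectively, which is an interpretation chosen because it reproduces X = 0 for the statistical mixture and X = 1 for perfect homochiral sorting. Species are grouped only by the number of R ligands, so geometrical isomers of the same composition are not distinguished, and the function names are hypothetical.

```python
from math import comb


def statistical_fractions(n_ligands=4):
    """Binomial fractions of species containing j R-ligands (j = 0..n).

    Assumes a 1:1 racemic mixture and no chiral preference.
    """
    return [comb(n_ligands, j) / 2 ** n_ligands for j in range(n_ligands + 1)]


def sorting_parameter(observed):
    """X = 1 - (sum of a_i) / (sum of s_i).

    observed[j] is the normalized fraction of species with j R-ligands.
    Assumed interpretation: a_i and s_i are the observed and statistical
    fractions of the heterochiral species (0 < j < n).
    """
    n = len(observed) - 1
    statistical = statistical_fractions(n)
    hetero_observed = sum(observed[1:-1])
    hetero_statistical = sum(statistical[1:-1])
    return 1.0 - hetero_observed / hetero_statistical


if __name__ == "__main__":
    print(sorting_parameter(statistical_fractions(4)))      # statistical mixture
    print(sorting_parameter([0.5, 0.0, 0.0, 0.0, 0.5]))     # only homochiral species
    print(sorting_parameter([0.0, 0.0, 1.0, 0.0, 0.0]))     # only (S,S,R,R) species
```

For a Pd2L4 cage the statistical heterochiral fraction is 14/16, so under this interpretation XNMR = 0.71 would correspond to roughly a quarter of the cages being heterochiral.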

For the formation of discrete assembled structures, it is necessary to uniquely determine the relative positions of the components. To do so, reversible chemical bonds with high directionality such as hydrogen and coordination bonds have almost always been utilized. In other words, discrete molecular self-assemblies formed by only utilizing less directional or nondirectional bonds constitute a synthetic challenge in molecular self-assembly. van der Waals (vdW) interaction, which is a function of r−6 (r is the separation distance), is a less directional, attractive force always working between molecules that are very close to each other. It is true that vdW interaction is the weakest intermolecular interaction, but nature efficiently utilizes this interaction as seen in the feet of geckos and in protein folding and assembly.25 However, vdW interaction has scarcely been utilized for the construction of artificial discrete molecular self-assemblies. If a general design principle for molecular self-assembly based on vdW interaction can be established, this would not only contribute to the understanding of the role of vdW interactions in biological systems at a molecular level,26 but would also enable us to create novel materials assembled by vdW interactions.

## 5.1 Molecular Hozo: Toward Discrete Assemblies in Water.

Let us consider the molecular self-assembly in water, where hydrophobic molecules are assembled by the hydrophobic effect. It is known that the stabilization free energies of hydrophobic assemblies are proportional to the desolvation surface areas formed upon self-assembly,27 so in order to make a stable assembly in water, components with a large desolvation surface area should be designed. In addition, vdW interaction must work between the neighboring components in the assembly. With these in mind, it is expected that the components possessing complementary large hydrophobic surfaces are self-assembled into a stable structure only utilizing vdW interaction and the hydrophobic effect. Although these two factors do not have sufficient directionality to precisely control the relative positions of the components, the complementary hydrophobic surfaces of the components compensate for their loss of directionality. In other words, the information about the directionality between the components is included in the shape of the hydrophobic surfaces. This idea is similar to a traditional artisanal method used to hold together furniture without employing nails or glue (‘mortise and tenon’ or Japanese ‘Hozo’) (Figure 14a). Following the design principles previously proposed, components possessing an indented hydrophobic molecular surface, gear-shaped amphiphiles (GSAs), which are hexaphenylbenzene derivatives with hydrophilic and hydrophobic substituents, were designed. As a matter of fact, GSAs self-assemble into cube-shaped assemblies, nanocubes, in a hydrophobic environment.

## 5.2 1st Generation Nanocube.

Firstly, the aggregation behavior of a GSA, 4, was investigated in aqueous methanol because of the insolubility of 4 in water.28 Upon the addition of D2O to a solution of 4 in CD3OD, new 1H NMR signals appeared, suggesting the solvophobic aggregation of 4. In CD3OD and D2O (3:1, v/v) only the new signals were found. A 2-nm-sized cube-shaped hexameric aggregate, 46, was characterized by 1H NMR, H-H COSY, and 1H DOSY spectroscopies, ESI-TOF mass spectrometry, and X-ray analysis of a single crystal. Although the monomer of 4 is a C3 symmetric molecule, each GSA in 46 is desymmetrized to C1. 46, the 1st generation nanocube, has one S6 (C3) axis and a center of symmetry and thus belongs to the S6 point group, which indicates that all six GSAs in the nanocube are chemically equivalent and that the three GSAs around the S6 axis in the northern hemisphere (Figure 14e) and those in the southern hemisphere are an enantiomeric pair.

The thermodynamic parameters for the self-assembly of 46 were determined by dilution isothermal titration calorimetry (ITC) measurements.29 It was found that the formation of the nanocube is enthalpically favored (ΔH298 = −216 kJ mol−1) but entropically disfavored (ΔS298 = −354 J mol−1 K−1), suggesting the great contribution of vdW interactions between the GSAs to the stability of the nanocube. When the three methyl groups in 4 are replaced with hydrogen atoms (411), no aggregation behavior was observed, suggesting that the vdW interactions around the methyl groups contribute heavily to the stability of the nanocube. The importance of vdW interactions between the GSAs in the nanocube was demonstrated by the ab initio and DFT calculations that consider vdW interactions.30 The importance of vdW interactions for the stabilization of the nanocubes in water will be discussed in later sections.
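The balance between the two terms can be made explicit by combining the quoted values into a free energy at 298 K. This is a worked estimate that assumes both parameters refer to the same assembly process and the same reference state; the value of ΔG298 is not quoted in the text.

$$\Delta G_{298} = \Delta H_{298} - T\Delta S_{298} = -216\ \mathrm{kJ\,mol^{-1}} - (298\ \mathrm{K})(-0.354\ \mathrm{kJ\,mol^{-1}\,K^{-1}}) \approx -216 + 105.5 = -110.5\ \mathrm{kJ\,mol^{-1}}$$

Roughly half of the enthalpic gain is thus offset by the entropic penalty at this temperature.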

The nanocube has a 1-nm-sized hydrophobic inner space, which can be utilized for the encapsulation of hydrophobic molecules. For example, a couple of 1,3,5-tribromomesitylenes (G) were entrapped and aligned in a face-to-face fashion in the nanocube. When a spherical hydrophobic molecule smaller than the inner space of the nanocube such as adamantane was used as a guest, the nanocube was converted into a 44 tetrahedron by the induced-fit effect.29

## 5.3 2nd Generation Nanocube.

The disadvantage of the 1st generation nanocube, 46, is its insolubility in water. To improve the solubility of the nanocube, some 3-pyridyl rings in 4 were replaced with N-methyl-3-pyridinium groups (Figure 14c). The crystal structure of 46 shows that three 3-pyridyl rings derived from different GSAs are stacked, so when two of the three 3-pyridyl rings in 4 are replaced with N-methyl-3-pyridinium groups (PM GSA in Figure 14c), a neutral 3-pyridyl ring should be placed in between the two N-methyl-3-pyridinium groups so as to minimize the electrostatic repulsion between the two pyridinium groups in a PM nanocube. Indeed, all the GSAs derived from 4 by N-methylation (12+, PM, and 133+) are assembled into nanocubes. However, [126]6+ is not soluble in pure water and [136]18+ is less stable than PM nanocube.31 The single-crystal X-ray analysis of PM nanocube indicates that all six 3-pyridyl rings in the nanocube are placed in between the pyridinium groups. The thermodynamic parameters for PM nanocube could not be determined by dilution ITC experiments, because no detectable heat change was observed by ITC upon the dilution of a concentrated aqueous solution of PM nanocube, suggesting that PM nanocube is highly stable. The thermal stability of PM nanocube was then investigated by variable temperature (VT) 1H NMR spectroscopy. The disassembly temperature (T1/2), at which half of the nanocubes are disassembled into the monomers, for PM nanocube ([PM monomer]total = 1.0 mM) is 385 K, which is higher than the boiling point of water.32

The differential scanning calorimetry (DSC) measurement of PM nanocube showed a peak around 390 K, which is consistent with T1/2 determined by the VT 1H NMR measurements, but the thermodynamic parameters could not be determined due to the temperature limit of the instrument.32 The detailed thermodynamic nature of PM and other nanocubes will be discussed in section 5.4.

## 5.4 3rd Generation Nanocube.

The higher thermal stability of PM nanocube than 46, which is structurally quite similar to PM nanocube, arises mainly from the hydrophobic effect but the cation-π interactions between the 3-pyridyl and the pyridinium rings in the triple π-stackings should also contribute to the overall stability. If this is true, stronger cation-π interactions in the triple π-stackings should enhance the thermal stability of the nanocube. Because the cation-π interactions are mainly electrostatic interactions, aromatic rings whose electrostatic potential surface is more negative should strengthen the cation-π interactions. Hence, the nanocube assembled from BM GSAs, in which a phenyl group is introduced on the periphery of the hexaphenylbenzene core instead of 3-pyridyl ring, is expected to show higher thermal stability than PM nanocube. BM GSA was synthesized using the halogen dance reaction of pentabrominated hexaphenylbenzene derivatives,33 which enables us to synthesize C2v symmetric hexaphenylbenzene derivatives, as the key reaction. T1/2 of BM nanocube is 403 K, which is higher than that of the PM nanocube by 18 K. Although the difference in the electrostatic potential surface between benzene and pyridine is small, BM nanocube shows strong improvement in T1/2 (18 K), indicating that the multiple cation-π interactions are also important for the stability of the nanocubes.

## 5.5 Extremely High Thermal Stability of Nanocubes.

To investigate the effect of the triple π-stacking and the hydrophobic substituents on the periphery of the hexaphenylbenzene core on the thermal stability of the nanocube, HM and BD nanocubes were designed and synthesized. BM nanocube is the most stable among the four nanocubes that are soluble in water. When the phenyl group of BM GSA is replaced with a hydrogen atom (BM to HM), T1/2 dropped dramatically (T1/2 = 313 K), indicating that the cation-π interactions working in the triple stackings are essential for the high thermal stability. The replacement of the three p-tolyl methyl groups in BM GSA with deuterium atoms (BM to BD) also decreased T1/2 by 65 K (T1/2 of the BD nanocube is 338 K). This result indicates that vdW interactions working around the methyl groups also play a significant role in the thermal stability of BM nanocube.

The factors that mainly contribute to the stability of the nanocubes are the hydrophobic effect and vdW and cation-π interactions, all of which are utilized in the stabilization of folded proteins and their assemblies. Although many proteins are denatured by gentle heating, proteins in hyperthermophiles, which can thrive above 80 °C, are stable at very high temperatures exceeding the boiling point of water. As the hyperthermophilic proteins are mainly composed of general amino acids, their sequence is the key to the extremely high thermal stability they are renowned for. The molecular mechanism of their stability is not yet fully understood. T1/2 of BM nanocube is higher than that of most hyperthermophilic proteins. This result indicates that it should be possible to construct molecular self-assemblies that are as thermally stable as the hyperthermophilic proteins using only weak molecular interactions and the hydrophobic effect.

Under hydrophobic conditions, the heat capacity change (ΔCP) is not negligible (ΔCP ≠ 0), so the free energy change for the disassembly, ΔGd(T), can be expressed by the following equation.34

\begin{align} \Delta G_{\text{d}}( T ) &= \Delta H_{\text{d}}(T_{(\Delta G = 0)})\left(1 - \frac{T}{T_{(\Delta G = 0)}}\right) \\ &\quad + \Delta C_{\text{P}}\left(T - T_{(\Delta G = 0)} - T \ln\left(\frac{T}{T_{(\Delta G = 0)}}\right)\right) \end{align}
(17)
where $$T_{(\Delta G = 0)}$$ is the temperature at which ΔGd = 0 and ΔHd is the enthalpy change for the disassembly. Equation (17) describes a parabola-shaped curve (the stability curve), which enables us to discuss the nature of the nanocubes (Figure 14f and 14g). The curvature of the parabola is determined by ΔCP; a larger ΔCP value makes the stability curve narrower, which means that when ΔCP is large, ΔGd changes steeply with temperature. A high ΔHd value at $$T_{(\Delta G = 0)}$$, ΔHd($$T_{(\Delta G = 0)}$$), increases the maximum of the parabola. The stability curves based on eq 17 indicate that the stability of the assemblies decreases at low temperatures, which is contrary to common intuition but is supported by the fact that some proteins are truly destabilized at low temperature.35
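The shape of the stability curve can be explored numerically with a minimal sketch of eq 17. The parameter values below (T0, dH0, dCp) are hypothetical and chosen only for illustration; they are not the values determined for any nanocube. Setting the temperature derivative of eq 17 to zero gives the temperature of maximum stability, T_max = T0 exp(−ΔHd(T0)/(T0 ΔCP)), which the code uses.

```python
import math


def delta_g_disassembly(T, T0, dH0, dCp):
    """Eq 17: free energy of disassembly at temperature T (K).
    T0 is the temperature at which dG = 0, dH0 the disassembly enthalpy
    at T0, and dCp the (temperature-independent) heat capacity change."""
    return dH0 * (1.0 - T / T0) + dCp * (T - T0 - T * math.log(T / T0))


def temperature_of_max_stability(T0, dH0, dCp):
    """Temperature at which eq 17 is maximal (d(dG)/dT = 0)."""
    return T0 * math.exp(-dH0 / (T0 * dCp))


# Hypothetical illustrative parameters (kJ/mol and kJ/(mol K)); NOT from the paper.
T0, dH0, dCp = 400.0, 400.0, 4.0

T_max = temperature_of_max_stability(T0, dH0, dCp)
dG_max = delta_g_disassembly(T_max, T0, dH0, dCp)

for T in (T_max - 40.0, T_max, T_max + 40.0, T0):
    print(f"T = {T:6.1f} K  dG_d = {delta_g_disassembly(T, T0, dH0, dCp):7.2f} kJ/mol")
```

With these illustrative inputs the maximum lies near 312 K, well below T0, and ΔGd falls off on both sides of it. The decrease on the low-temperature side corresponds to the destabilization at low temperature noted in the text.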

It is known that ΔCP linearly correlates with the desolvation surface area (ΔSAS).36 ΔSAS of the nanocubes ranges from 3400 to 4200 Å2, which is as large as that of small proteins. In general, ΔCP is determined by VT ITC measurements, but as described above, the extremely high stability of PM and BM nanocubes prevented the determination of their thermodynamic parameters by ITC and DSC. Therefore, the ΔCP values for the nanocubes were estimated by an empirical equation.36 The shapes of the stability curves of BM and PM nanocubes are similar, which is due to similar ΔCP values (Figure 14f). The top of the parabola for BM nanocube is slightly higher than that for PM nanocube, indicating that BM nanocube is enthalpically stabilized, which arises from the stronger cation-π interactions in BM nanocube. The broader stability curves for HM and BD nanocubes are due to their smaller ΔSAS, so the change in ΔGd with temperature is smaller. The tops of the parabolas for HM and BD nanocubes are lower than those for BM and PM nanocubes, indicating a smaller enthalpic contribution to the stability of HM and BD nanocubes.

The thermal stability of the four nanocubes correlates well with the density (partial specific volume) of the nanocubes, which suggests that the GSAs in more stable nanocubes mesh more tightly with each other to form a densely packed assembled structure. Partial specific volumes of the nanocubes are higher than those of proteins, indicating that the packing of GSAs in the nanocubes is not as dense as in proteins, which arises from the 1-nm-sized inner void space of the nanocubes. It is known that most of the hyperthermophilic proteins have no large void spaces, as these spaces destabilize the proteins. Considering this fact, it is surprising that BM nanocube shows extremely high thermal stability even though a large void space exists in it. The stability of the nanocubes is improved by filling the inner space with hydrophobic molecules. When a couple of Gs are encapsulated in BM nanocube, T1/2 of G2@BM is over 423 K, which is higher than the decomposition temperature of the most stable hyperthermophilic protein (PhCutA1). The stability curves for the nanocubes after the encapsulation of G are similar (Figure 14g), suggesting that the shapes of the nanocubes become similar to each other by the induced-fit effect, which was confirmed by similar partial specific volumes of the G2@nanocubes.

The desolvation surface areas for the four nanocubes are similar but their thermal stability is quite different. This indicates that though the hydrophobic effect is the dominant factor of the self-assembly of the nanocubes, the high thermal stability can be realized only if vdW and cation-π interactions work efficiently.

## 5.6 Semiquantitative Analysis of Molecular Meshing in Assemblies.

The high thermal stability of BM and PM nanocubes indicates the importance of vdW and cation-π interactions in the hydrophobic self-assemblies besides the hydrophobic effect. As described above, the free energy of the self-assemblies or host-guest complexes in water is a linear function of ΔSAS (eq (18)).27i

$$\log K_{\text{a}} = 0.011 \times \Delta \text{SAS}/2\ (\text{Å}^{2})$$
(18)
This relationship is valid in a wide range of complexes from small host-guest complexes (ΔSAS/2 ≈ 200 Å2) to protein-antibody complexes (ΔSAS/2 ≈ 800 Å2). It was found that the logKa values for BM and PM nanocubes are situated much higher than the line represented by eq (18) (Figure 15a), which indicates that BM and PM nanocubes are significantly stabilized by factors other than the hydrophobic effect (the vdW and cation-π interactions). Distinguishing the contribution of vdW interactions from that of the hydrophobic effect has been a difficult problem, as both contributions similarly correlate with the desolvation surface area.37 However, to be more precise, the hydrophobic effect depends on the number of water molecules that are released upon the self-assembly (and thus on the desolvation surface area, ΔSAS), while vdW interactions depend on the separation between the contact surfaces.
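To give a feeling for the magnitudes implied by eq (18), the following minimal sketch evaluates it at the two ends of the range quoted above and converts log Ka into a standard free energy via ΔG° = −RT ln(10) log Ka. The temperature of 298.15 K and the function names are assumptions made for this illustration; they are not specified in the text.

```python
import math

R_KJ = 8.314462618e-3          # gas constant in kJ/(mol K)


def log_ka_from_half_sas(half_delta_sas):
    """Eq (18): log Ka predicted from the desolvation surface area.
    half_delta_sas is dSAS/2 in square angstroms."""
    return 0.011 * half_delta_sas


def delta_g_from_log_ka(log_ka, temperature=298.15):
    """Standard free energy of association in kJ/mol (assumed temperature)."""
    return -R_KJ * temperature * math.log(10.0) * log_ka


for half_sas in (200.0, 800.0):            # range quoted in the text
    log_ka = log_ka_from_half_sas(half_sas)
    dg = delta_g_from_log_ka(log_ka)
    print(f"dSAS/2 = {half_sas:5.0f} A^2  log Ka = {log_ka:.1f}  dG = {dg:.1f} kJ/mol")
```

The two limits correspond to log Ka of about 2.2 and 8.8, so a complex whose measured log Ka lies well above the value returned by `log_ka_from_half_sas` for its ΔSAS/2 is stabilized by contributions beyond those captured by this correlation.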

As a simple example, we consider dimers of U-shaped hydrophobic molecules with different degrees of molecular meshing (Figure 15b). From the hydrophobic point of view, the ΔSAS values of the A2 and B2 dimers are very similar, which indicates that the contribution of the hydrophobic effect is comparable in the two dimers. The contact surface areas for the A2 and B2 dimers (red lines) are also the same, but the separations between these surfaces (interstices) in the B2 dimer are larger, indicating that the vdW contribution in B2 is lower than that in A2. Thus, the difference in the contributions from the hydrophobic effect and from vdW interaction arises from the separation between the contact surfaces and is intuitively evaluated as the difference in the molecular meshing between the two U-shaped molecules.

For the semi-quantitative analysis of the molecular meshing, a novel method (surface analysis with varying probe radii: SAVPR) was developed.38 In SAVPR, the distribution of the contact surface area is plotted versus the contact distance (D) (Figure 15c) in order to visualize the molecular meshing. A large surface area with short contact distances indicates a tight meshing between the components in the complex. SAVPR can be applied to various molecular complexes as long as a 3D structure, either a crystal structure or an energy-minimized structure, is available. SAVPR for the four nanocubes shown in Figure 15c indicates that the six GSAs mesh with each other more tightly in BM and PM nanocubes than in BD and HM nanocubes. As the SAVPR profiles for BM and PM nanocubes are quite similar, the difference in the thermal stability between the two nanocubes should be due to the difference in the cation-π interactions in the triple stacking. SAVPR of the nanocubes indicates that molecular meshing, which is closely related to vdW interaction, contributes greatly to the stability of the molecular complexes in water, even though vdW interaction is the weakest attractive interaction and works only if the molecular surfaces are in contact at very short separations (less than 1 Å, which is much shorter than the diameter of a water molecule, 2.8 Å). It also demonstrates that it is possible to construct thermally very stable assemblies using only weakly directional or nondirectional weak molecular interactions, as long as complementary hydrophobic molecular surfaces can be designed.
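The idea of plotting contact surface area against contact distance can be illustrated with a toy calculation. The sketch below is not the published SAVPR algorithm (which is based on varying probe radii); it is a simplified stand-in that assigns each sampled surface point of one component a fixed area, finds its nearest distance to the surface points of the partner, and bins the area by that distance. The function name, the flat test surfaces, and the bin settings are all hypothetical.

```python
import math


def contact_area_histogram(points_a, points_b, area_per_point,
                           bin_width, max_distance):
    """Bin the surface area of component A by nearest distance to component B.
    Returns a list whose i-th entry is the area with distance in
    [i * bin_width, (i + 1) * bin_width)."""
    n_bins = int(math.ceil(max_distance / bin_width))
    hist = [0.0] * n_bins
    for p in points_a:
        d = min(math.dist(p, q) for q in points_b)   # nearest contact distance
        if d < max_distance:
            index = min(int(d // bin_width), n_bins - 1)
            hist[index] += area_per_point
    return hist


def flat_surface(z, size=5):
    """Hypothetical test surface: size x size grid of points, 1 A spacing."""
    return [(float(x), float(y), z) for x in range(size) for y in range(size)]


surface_a = flat_surface(0.0)
tight = contact_area_histogram(surface_a, flat_surface(0.5), 1.0, 1.0, 4.0)
loose = contact_area_histogram(surface_a, flat_surface(2.5), 1.0, 1.0, 4.0)
print("tight meshing:", tight)
print("loose meshing:", loose)
```

In this toy case both pairs have the same total contact area (25 Å2), but the tightly meshed pair has all of it in the shortest-distance bin, whereas the loosely meshed pair has it shifted to longer distances. This mirrors the qualitative reading of the profiles described above: more area at short D indicates tighter meshing.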

We discussed two unresolved issues in molecular self-assembly. In contrast to the beauty of the self-assembled structures, molecular self-assembly is a complicated phenomenon where the components are assembled through several pathways, some of which involve the formation of kinetically trapped species. The pathway complexity in molecular self-assembly can be explained by considering that molecular self-assembly takes place on an energy landscape and not on a simple reaction coordinate, which is quite similar to protein folding processes on a folding funnel (Figure 16).39 In both cases, the reaction (folding or assembly) proceeds from disordered states to a well-ordered state. The understanding of the general principles that underlie molecular self-assembly and protein folding will deepen our understanding of the contrivances organizing life systems and also open up the possibility of novel molecular self-assemblies that could not be realized without mechanistic understanding of the formation process. In the course of the investigation of the coordination self-assembly processes, metastable species have sometimes been isolated. If these species are mixed with some other components under kinetic conditions, complicated metastable assembled structures can be produced through a pathway on a new energy landscape.

The study of thermally stable nanocubes indicates that the hydrophobic effect is the dominant factor that triggers the self-assembly in water, but vdW interaction cannot be neglected even though it is the weakest molecular interaction and has the lowest directionality. In order to make the best use of vdW interactions, it is necessary to precisely design complementary hydrophobic molecular surfaces, which in practice correspond to chemical bonds with high directionality. This is the design principle of hydrophobic surface engineering, which allows discrete assemblies to be obtained in water, and the molecular Hozo is the strategy to make the best use of both the hydrophobic effect and vdW interaction. SAVPR of the nanocubes suggests that hydrophobic surfaces with separations shorter than 1 Å significantly contribute to the stability of the assemblies. SAVPR can be utilized for the semi-quantitative analysis of the contribution of vdW interaction in artificial and biological molecular assemblies and for the rational design of complementary molecular surfaces that lead to discrete molecular self-assemblies in water.

This research was supported by JSPS Grants-in-Aid for Scientific Research on Innovative Areas “Dynamical Ordering of Biomolecular Systems for Creation of Integrated Functions” (25102001 and 25102005), The Asahi Glass Foundation, The Mitsubishi Foundation, and Sekisui Integrated Research.

Shuichi Hiraoka

Shuichi Hiraoka obtained his Ph.D. from Tokyo Institute of Technology in 1998. After postdoctoral research at the Institute for Molecular Science with Makoto Fujita, he joined the Department of Applied Chemistry, Kanagawa University as an Assistant Professor in 1999. In 2000, he joined the Department of Chemistry, the University of Tokyo as an Assistant Professor and was promoted to Associate Professor in 2007. He has been a Professor in the Department of Basic Science, the University of Tokyo since 2010.