Persistent homology to quantify the quality of surface-supported covalent networks.

Covalent networks formed by on-surface synthesis usually suffer from the presence of a large number of defects. We report on a methodology to characterize such two-dimensional networks from their experimental images obtained by scanning probe microscopy. The computation is based on a persistent homology approach and provides a quantitative score indicative of the network homogeneity. We compare our scoring method with results previously obtained using minimal spanning tree analyses and we apply it to some molecular systems appearing in the existing literature.


Introduction
On-surface synthesis is emerging as a very efficient technique to create organic functional surfaces. [1][2][3][4] In a bottom-up approach, well-designed molecular precursors are deposited on a surface and activated to produce original chemical compounds or covalently linked networks.
The most successful realization of on-surface synthesis is the creation of graphene nanoribbons (GNR) with perfectly defined atomic structure. [5][6][7] While such a one-dimensional (1D) approach remains rather easily accessible and produces polymeric wires up to the micrometer range, [8][9][10][11][12] the quest for producing two-dimensional (2D) covalent structures, or 2D polymers, [13] is highly challenging. The formation of surface-supported 2D covalent structures can be seen as an extension of graphene growth, whereby the extraordinary properties of this material [14] are improved further by controlling the atomic scale structure and the introduction of heteroatoms. Robust surfaces with exceptional properties are thus expected. [15][16][17] A large variety of surface-supported 2D networks were fabricated in ultrahigh vacuum (UHV) environment [18][19][20][21][22][23][24][25][26][27][28][29][30][31][32][33][34][35][36] or at the liquid-solid interface. [37][38][39][40][41][42][43][44] In the latter case thermodynamic equilibrium conditions can be reached and extended well-ordered networks can be formed. In contrast, in UHV the growth process is irreversible, which prohibits self-healing mechanisms and generates a large number of defects. Indeed, most 2D networks to date are limited to a few tens of nm in size and are poorly ordered. The formation of defects in the networks can have several origins: 1) incomplete reactions, leading to the formation of oligomeric portions with free ends; 2) flexibility of the molecular backbone: some distortions in the network can arise from slight variations in the bond geometries as compared to the ideal cases, for example in honeycomb networks, where pentagonal and heptagonal pores, or even tetragonal and octagonal pores, are sometimes observed alongside the ideal hexagonal pores [20,21,45,46,47]; 3) poor selectivity in the reaction mechanism, i.e. the introduction of different products with poorly adapted configuration.
A wide set of tunable parameters is usually available to direct the network growth, such as the reaction temperature, the substrate nature and its crystallographic orientation, or the addition of metal adatoms. [4] However, the complexity level in on-surface synthesis is high and it is still hardly possible to extract any general tendency for the reaction mechanisms. [4,48,49] Nevertheless, the optimization of the growth process and the quest for networks of ideal or well-controlled quality is of prime importance.
Surface-supported networks are usually characterized by scanning probe microscopy (scanning tunneling microscopy, STM, or atomic force microscopy, AFM). An easy and straightforward tool to quantify these real-space images is therefore required.
In fact, the assessment of the network quality has barely been addressed, and a successful network formation is usually described subjectively as being of "good" or "poor" quality.
The different defects observed can be assigned from the STM images and counted individually to get an estimation of the overall quality. Only the extension of the networks (domain size) can be properly quantified. The global morphology of the networks can be assessed qualitatively by using Monte Carlo simulations and their comparison with the STM images. [50] A statistical analysis approach based on the minimal spanning tree (MST) method [51] to provide a quantitative and comparative estimation of the quality of boronic acid based covalent networks has been proposed. [45] This approach represents, to our knowledge, the one and only attempt to treat the problem of the quantification of the network quality.
Persistent homology has recently been proposed as an efficient topological method to analyze the structure of materials. For example, it was used for the analysis of pore configuration of granular materials, [52] for qualitative discrimination of amorphous metals [53] or for an analysis of similarity between nanoporous materials. [54] In this work we used an approach based on persistent homology to provide a simple and straightforward tool assessing the network quality in a quantitative way. Persistence diagrams were produced from STM topographic images and analyzed to provide a numerical score. We show that our method is consistent with the previously published MST approach and is robust against the image quality. The method was further used with a few additional results from the literature to highlight its general usability.

Results and discussion
Persistent homology has been used in previous applications in material science and interested readers can refer to Ref. [55] as an introduction to the methodology. A persistence diagram was computed from the network represented in the experimental image. Persistence diagrams represent the evolution of the holes (the pores) while the line width inside the network is artificially varied. Each hole in the structure is given a birth time when the enclosing circle first appears and a death time when the hole is completely filled. Each point in the persistence diagram thus corresponds to the pair (birth, death) for a specific hole (see Fig. 1). Details of the method are provided in supplementary information. The code used is available for download. [56] In the persistence diagram the columns at birth time ≤ 0 correspond to the pore size distribution of the image.
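As an illustration of the death time only, the sketch below assumes a binary image ('#' for network pixels, '.' for empty pixels) and models the artificial thickening of the lines as a step-by-step dilation with 4-neighbour connectivity. Under these assumptions, a pore that is already closed in the image is completely filled at a step equal to the largest distance from any of its pixels to the network. This is not the code of Ref. [56]: the function name pore_death_times and the image encoding are hypothetical, birth times are not computed, and pores that would split into several holes during thickening are ignored.

```python
from collections import deque

def pore_death_times(image):
    """Death step of each closed pore under a 4-neighbour dilation (assumed model).

    image: list of equal-length strings, '#' = network, '.' = empty.
    """
    rows, cols = len(image), len(image[0])
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))

    # Multi-source breadth-first search: distance of every pixel
    # to the nearest network pixel, in dilation steps.
    dist = [[0 if image[r][c] == '#' else None for c in range(cols)]
            for r in range(rows)]
    queue = deque((r, c) for r in range(rows) for c in range(cols)
                  if image[r][c] == '#')
    while queue:
        r, c = queue.popleft()
        for dr, dc in steps:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))

    # Flood fill each empty region; regions touching the image border
    # are open and are not counted as pores.
    seen = [[False] * cols for _ in range(rows)]
    deaths = []
    for r0 in range(rows):
        for c0 in range(cols):
            if image[r0][c0] == '#' or seen[r0][c0]:
                continue
            seen[r0][c0] = True
            stack, closed, deepest = [(r0, c0)], True, 0
            while stack:
                r, c = stack.pop()
                if r in (0, rows - 1) or c in (0, cols - 1):
                    closed = False
                deepest = max(deepest, dist[r][c] or 0)
                for dr, dc in steps:
                    nr, nc = r + dr, c + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and image[nr][nc] != '#' and not seen[nr][nc]):
                        seen[nr][nc] = True
                        stack.append((nr, nc))
            if closed:
                deaths.append(deepest)
    return sorted(deaths)

# Toy image: one 5x5 pore, one single-pixel pore, one open region.
image = [
    "#########",
    "#.....#.#",
    "#.....###",
    "#.....#..",
    "#.....#..",
    "#.....#..",
    "#######..",
]
print(pore_death_times(image))  # [1, 3]
```

In this toy image the large pore survives three dilation steps and the small one a single step, which is the sense in which the death values reflect the pore sizes.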
Contrary to most applications of persistent homology, we are not concentrating on the holes with the largest lifespan. Instead we want to evaluate how regular the network is. Therefore we are interested in how concentrated the persistence diagram is. A score is computed for each point by considering the number of similar points within a certain window, divided by the total number of points in the diagram. The score of the whole diagram (called in the following the PH score) is the maximal such score that can be obtained across the points in the diagram.
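The scoring rule described above can be sketched in a few lines. The shape and size of the similarity window are not specified here, so the sketch assumes a square window of half-width `window` applied to both the birth and the death coordinates; the function name ph_score and the example diagrams are hypothetical and do not come from the code of Ref. [56].

```python
def ph_score(diagram, window):
    """Return the PH score of a persistence diagram.

    diagram: list of (birth, death) pairs, one per hole.
    window:  assumed half-width of the square similarity window.
    """
    if not diagram:
        return 0.0
    best_count = 0
    for b0, d0 in diagram:
        # Count the points (the point itself included) that fall
        # inside the window centred on (b0, d0).
        count = sum(1 for b, d in diagram
                    if abs(b - b0) <= window and abs(d - d0) <= window)
        best_count = max(best_count, count)
    # Proportion of holes similar to the most representative hole.
    return best_count / len(diagram)

# A perfectly regular diagram: all holes share the same (birth, death).
regular = [(-2.0, 4.0)] * 6
# A mixed diagram: five similar holes and three outliers.
mixed = [(-3.0, 5.0), (-3.0, 5.0), (-3.0, 5.5), (-2.5, 5.0),
         (-3.0, 4.5), (-1.0, 2.0), (-6.0, 9.0), (0.0, 1.0)]

print(ph_score(regular, 0.5))  # 1.0
print(ph_score(mixed, 0.5))    # 0.625
```

With this choice of window the score depends on the window size, so any comparison between images presupposes that the same window is used throughout.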
Intuitively, the PH score gives the proportion of holes similar to the most representative hole. Therefore, the higher the score, the more regular the network. A score of 1 would be achieved for a perfectly regular network, independently of the form of the network, provided that only one type of cell exists and that these cells are convex. The method is robust with respect to boundary effects. Holes formed along the boundary do occur in certain cases, but their number is of a different order of magnitude than the total number of holes. Therefore they only marginally modify the score. For this reason, PH scoring is only weakly affected by the domain sizes, the surface coverage, or the molecular concentration. The method is also robust with respect to the image size (see Supplementary Information Fig. S1), and images taken at different scales can be comparatively analyzed.
First we focused on the work by Ourdjini et al. [45] that provided a systematic analysis of the quality of the networks obtained from the self-condensation of 1,4-benzenediboronic acid (BDBA) on coinage metal surfaces. The different growth conditions were classified according to the outcome of the MST analysis. In brief, the centers of all pores were connected by straight segments, and the standard deviation of the segment lengths (MST σ) is plotted versus the mean segment length (MST m). The results from the two methods (MST and PH) could be directly compared and are reproduced in Fig. 2. We thus found that the PH score, which considers the inner pore sizes, is consistent with the MST result, which considers the interpore distances. The highest PH score obtained with BDBA, on the Ag(111) surface, was 0.34.
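For comparison, the MST statistics mentioned above can be sketched as follows, assuming the pore centers are already available as (x, y) coordinates. The sketch builds the minimal spanning tree with Prim's algorithm and returns the mean and the population standard deviation of the segment lengths. Any normalization applied in Refs. [45,51] is not reproduced, and the function names and the test lattice are hypothetical.

```python
import math
import statistics

def mst_edge_lengths(points):
    """Segment lengths of the Euclidean minimal spanning tree (Prim's algorithm)."""
    n = len(points)
    if n < 2:
        return []
    in_tree = [False] * n
    best = [math.inf] * n   # shortest known link from the tree to each point
    best[0] = 0.0
    lengths = []
    for step in range(n):
        # Pick the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if step > 0:
            lengths.append(best[u])
        # Update the shortest links using the newly added point.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return lengths

def mst_statistics(points):
    """Return (mean, standard deviation) of the MST segment lengths.

    Requires at least two points.
    """
    lengths = mst_edge_lengths(points)
    return statistics.fmean(lengths), statistics.pstdev(lengths)

# Ideal 3x3 lattice of pore centers with spacing 2 (arbitrary units).
ideal = [(2.0 * i, 2.0 * j) for i in range(3) for j in range(3)]
# Same lattice with the central pore displaced, mimicking a defect.
distorted = [p if p != (2.0, 2.0) else (2.0, 3.0) for p in ideal]

print(mst_statistics(ideal))      # (2.0, 0.0)
print(mst_statistics(distorted))  # nonzero standard deviation
```

A perfectly regular arrangement gives a vanishing standard deviation, while any distortion of the pore positions increases it, which is the quantity plotted against the mean length in the MST analysis.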
An improvement of the network quality was proposed in Ref. [46] by using a sequential growth strategy [4] with the modified precursor p-bromobenzene boronic acid (BBBA) molecule on Au(111). The MST analysis is reported in Fig. 2. The higher quality gained with this precursor was confirmed by the PH score of 0.53.
All the above-mentioned networks were grown in UHV conditions, in which the self-condensation reaction is a strongly irreversible process, thus leading to an overall rather low quality of the networks. Equilibrium growth conditions were reached in atmospheric conditions on a HOPG surface, while the network was formed by annealing the surface in the presence of water pressure confined in a closed reactor. [38,41] In this case (see Fig. 2), a nearly perfect honeycomb network was formed with the larger 4,4'-biphenyldiboronic acid (BPDA) precursor, [38] for which we calculated a PH score of 0.88. Comparatively, in similar conditions but without the presence of H2O in the reactor, [38] a lower score of 0.20 is measured (Fig. 2).
Recently the smallest network based on boronic acid chemistry was achieved using the most basic diboronic acid precursor, namely tetrahydroxydiboron (THDB, see Fig. 2). [24] However, several defects are formed in this network, and our PH analysis gave a rather low quality score of 0.18. All the results obtained with boronic acids are summarized in Table 1. In fact a large pore wall length (i.e. a longer diboronic acid precursor) can provide some geometrical flexibility inducing the formation of defective pores with non-ideal shapes (deviating from the hexagonal symmetry), even if the reaction is complete and all the boronic acid groups have reacted. It was thus shown in Ref. [46] that, for networks made from BBBA precursors and comprising biphenyl groups as pore walls, polygonal pore shapes ranging from tetragons to octagons were routinely formed, whereas only pentagons to heptagons were observed using BDBA precursors, for which the pore wall length is limited to one phenyl unit. [35] This observation is however not a general rule, as tetragonal pores were also observed with THDB. [24] We can conclude from the comparative analysis of the PH scores for these different networks in UHV that the network quality is not obviously affected by the geometrical flexibility of the covalent links.
Networks #1 to #6 are obtained with BDBA precursor, see Table 1 and Ref. [45] for experimental details. For networks obtained with BBBA, BPDA and THDB, see Refs. [46], [38], [24], respectively, for experimental details.
Adapted with permission from [45]. Copyright 2011 by the American Physical Society.
Adapted with permission from [46]. Copyright 2012 American Chemical Society. Republished with permission of Royal Society of Chemistry, from [24] and [38]; permission conveyed through Copyright Clearance Center, Inc.
Now we consider the formation of so-called porous graphene networks obtained from the hexaiodo-substituted macrocyclic cyclohexa-m-phenylene (I6-CHP) precursor. [34,50] A comparative study of the network growth on the Cu(111), Ag(111) and Au(111) surfaces was proposed in Ref. [50]. The STM images were compared with Monte Carlo simulations whereby the relevant parameter P was shown to be the ratio between the reactivity (coupling probability) and the diffusivity. A high P value was proposed to be valid for the growth on Cu(111), an intermediate P for Au(111) and a low P for Ag(111) (see Fig. 3). For these three surfaces we found PH scores of 0.22, 0.35 and 0.48, respectively, thus confirming Ag(111) as the most suitable surface for an ideal network growth. Note that for this specific system the molecular precursor is imaged with a donut shape exhibiting an internal pore. The latter contributes to the final score and in particular to the non-negligible value of 0.22 for the Cu(111) case. Nevertheless, the increase in quality and the extension of the porous graphene network are well reproduced in the PH score evolution.
(b) Republished with permission of Royal Society of Chemistry, from [21] ; permission conveyed through Copyright Clearance Center, Inc. (c) Reprinted from [19] , CC BY 4.0. (e,f) Republished with permission of Royal Society of Chemistry, from [26] ; permission conveyed through Copyright Clearance Center, Inc.
The honeycomb network formed by the precursor 1,3,5-tris(4-bromophenyl)benzene (TBPB, Fig. 4a) has been studied by different groups. On Au(111), a nicely ordered extended network was obtained, [21] for which we found a PH score of 0.53 (see Fig. 4b). Note that here the score is lowered due to the presence of filled pores that are thus wrongly assigned as defects by the calculation. On a Au-enriched Pd(111) surface, after annealing to 400°C a disordered network was obtained, for which we calculated a PH score of 0.19. [19] Further annealing up to 510°C improved the network quality and the PH score rose to 0.30 (Fig. 4c). The same honeycomb network could be obtained using a sequential coupling strategy [4] with a precursor bearing two different halogen types (4,1':4',DCTP,Fig. 4d). [26] The low quality network obtained on Au(111) with the help of codeposited Cu adatom catalysts delivered a PH score of 0.16 (Fig. 4e). Similarly, on Cu(111) a score of 0.14 was measured (see Fig. 4f).
[22] Copyright 2010. (b) Adapted with permission from [9]. Copyright 2014 American Chemical Society.
Finally, we present two additional systems to show the generality of our method. Most of the covalent networks reported in the literature have a honeycomb-like structure, but an extended network with an ideally square symmetry was reported with the molecule {5,10,15,20-tetrakis(4-bromophenyl)porphyrinato}nickel(II) (NiTBrPP) on Au(111) (see Fig. 5a). [22] Our PH scoring method could be applied in the same way on this network and delivered a score of 0.24. To end with, we show the example of a very low quality network obtained with the molecule 4,4'-diethynyl-1,1':4',1''-terphenyl on Ag(111). [9] In this case, in addition to threefold linkages, twofold linkages between the linear precursors are possible with similar occurrence probability, which obviously prevents the formation of an ordered network. Nevertheless, here the 2D character of the network is preserved (Fig. 5b). Due to the high number of defects, such a network can be considered as a network with very low ordering. Indeed, a PH score of 0.07 was calculated for it.

Conclusion
We propose a new method based on persistent homology to assess the quality of surface-supported covalent networks from STM images. We show that our method is consistent with the previously published tool based on the minimal spanning tree (MST) approach. [45,46] The PH score provides quantitative insights on various types of networks, an issue that remains of prime importance for developing a rational optimization of the growth conditions in on-surface synthesis.
The scoring for the regularity of the networks relies on using a new approach to persistent homology, potentially discarding high lifespan topological features to concentrate on the density of the diagram. We believe that this approach could also be useful for the quantification of a wide range of other systems organized in networks and characterized by 2D imaging, such as block copolymers, [57,58] but also 2D soap froth, [59] biological cells, [60] Bénard−Marangoni convection cells, [61] or even geological structures. [62] Also, there is no theoretical objection to using it for higher dimensional structures. For example, given a set of slices representing a 3D system, it could be possible to analyze the regularity of the structure using the cavities instead of the holes.