Introduction
Zeolites play pivotal roles in research and industry owing to their extraordinary properties and broad applicability. Their high thermal stability, large surface area, and molecular shape selectivity allow them to be applied as catalysts (Bae et al., 2021; Kadja, Azhari, Mardiana, et al., 2021; Li et al., 2023; J. Wang et al., 2022; H. Zhang et al., 2022), adsorbents (Fischer, 2020; Hewitt et al., 2022; Mai et al., 2022; Mguni et al., 2022; Pérez-Botella et al., 2019; Shobuke et al., 2022), and ion exchangers (Aragaw & Ayalew, 2019; Campanile et al., 2022; Chen et al., 2016). Moreover, zeolites perform remarkably well in the refinery industry (Primo & Garcia, 2014; L. Zhang et al., 2022, 2023), biomass conversion (Bornes et al., 2023; Mardiana et al., 2022; Perego et al., 2017), CO2 capture (Karka et al., 2019; Murge et al., 2019; Thakkar et al., 2016), and water purification (Mahmoodi & Saffar-Dastgerdi, 2019; Tankersley et al., 2020).
Zeolites are composed of repeating TO4 tetrahedra (T represents a Si or Al atom) that form an aluminosilicate framework with different channels, cages, and pore morphologies. This unique structure creates well-defined pores. According to the International Zeolite Association, 255 zeolite frameworks have been discovered to date (Gandhi & Hasan, 2022). The synthesis of zeolite was first introduced by Richard Barrer, who reacted natural minerals with an alkaline solution at high temperatures, i.e., 170-270 ℃ (Cundy & Cox, 2003). Generally, zeolites are synthesized using the hydrothermal method at a temperature of 60-200 ℃ for 1 to 20 days (Rahmah et al., 2023).
A large number of studies on zeolite synthesis focus on property improvement. In this sense, research on zeolite synthesis aims to control the crystal size, morphology, and pore size in order to overcome diffusion limitations, in particular through hierarchical zeolite synthesis (Al-Ani et al., 2020; Graça et al., 2018; Jia et al., 2019; Kadja, Suprianti, et al., 2020; Khan et al., 2019; L. Wang et al., 2015). Currently, innovation in zeolite synthesis research is shifting toward strategies that reduce production cost and environmental pollution. For instance, various strategies have been used for the green synthesis of zeolite, i.e., synthesis without solvent, known as solvent-free synthesis (Al-Nahari et al., 2023; Kadja, Rukmana, et al., 2021; Q. Wu et al., 2018), Organic Structure Directing Agent (OSDA)-free synthesis (Kadja, Kadir, et al., 2020; Tomita et al., 2022), combined solvent-free and OSDA-free synthesis (Kadja, Azhari, Mukti, et al., 2021), the use of natural precursors, e.g., rice husk as a silica source (Kadja et al., 2017; Mohamed et al., 2015), and zeolite synthesis at low temperature (Kadja et al., 2016).
As mentioned above, zeolite structures, which are built from different TO4 configurations, can adopt different topologies. In this regard, different pore morphologies and compositions can also be fabricated (Gandhi & Hasan, 2022). This is one reason why researchers put much effort into synthesizing new zeolite frameworks. Even so, the zeolite synthesis process is still poorly understood. It is affected by many factors, i.e., various synthesis conditions and numerous choices of source materials. Furthermore, understanding the complex link between zeolite structures and properties remains a challenge. For these reasons, researchers have begun to develop and/or utilize tools to predict the synthesis conditions that yield the desired properties.

Figure 1 Number of publications on zeolite synthesis, machine learning, and zeolite combined with machine learning indexed by Scopus (April 2025).
The theoretical simulation of zeolites began in the 1970s, while the atomic simulation of zeolite systems started in the 1990s (S. Ma & Liu, 2022b). Scientists have predicted several million zeolite structures, but currently only hundreds of them can be synthesized. Moreover, density functional theory (DFT) calculations are computationally expensive and require long simulation times. Thus, new approaches are required to overcome the bottleneck in zeolite synthesis.
Considering that the parameters of zeolite synthesis are complex and still poorly understood, a faster method that can optimize these intricate parameters is expected to improve the understanding of zeolite synthesis (Moliner et al., 2019). Recently, the rapid development of machine learning has become a potential breakthrough in chemical research. As a part of artificial intelligence, machine learning is capable of learning the relationships between numerous variables and accurately predicting plausible outcomes. Unlike traditional computational techniques, machine learning can learn from a large amount of data without being explicitly programmed. Furthermore, machine learning can produce an output extremely fast compared to conventional techniques, such as DFT. To date, machine learning has been widely applied to propose and identify new materials (Z. Yang et al., 2023). In addition, machine learning has proven to be an impressive tool for studying complex relationships in several applications (D. Ma et al., 2022). Hence, machine learning is projected to unveil the black box in the zeolite synthesis system.
Figure 1 shows that the number of publications related to zeolite synthesis has been growing year by year, along with research on machine learning, which has increased extensively. The trend shows a growing interest in the use of machine learning in zeolite research. This mini review focuses on the potential of machine learning in revealing the black box issue in zeolite synthesis. It starts by discussing the black box issue in conventional zeolite synthesis. Subsequently, the critical innovations in zeolite synthesis employing machine learning techniques in recent years are highlighted. At the end of the review, the future potential of machine learning in assisting chemists in the rational design and synthesis of zeolites is pointed out.
Several reviews have been published concerning the involvement of machine learning techniques in zeolite synthesis, as presented in Table 1 (Gandhi & Hasan, 2022; Kwak et al., 2021; S. Ma & Liu, 2022b, 2022a; Moliner et al., 2019; M. Wu et al., 2025). Despite the growing interest in employing machine learning in zeolite synthesis, there remains a lack of comprehensive reviews focusing on the latest advances in the potential of machine learning to reveal the black box issue in zeolite synthesis. This gap highlights the significance of this mini review, which provides a focused discussion of the latest advances and of unexplored avenues.
| Title | Focus | Year | Ref |
|---|---|---|---|
| Machine learning applied to zeolite synthesis: the missing link for realizing high-throughput discovery | Machine learning techniques to rationalize zeolite synthesis | 2019 | (Moliner et al., 2019) |
| The role of zeolite framework in zeolite stability and catalysis from recent simulation | The theoretical insights of zeolite framework's role in the stability and functionality of zeolite | 2021 | (S. Ma & Liu, 2022b) |
| Recent progress on Al distribution over zeolite frameworks: linking theories and experiments | Al distribution in zeolite framework | 2021 | (Kwak et al., 2021) |
| Machine learning for the design and discovery of zeolites and porous crystalline materials | The discovery of zeolites and similar crystalline materials using ML-based design | 2021 | (Gandhi & Hasan, 2022) |
| Machine learning potential era of zeolite simulation | The zeolite stability and the mechanism in catalytic reaction | 2022 | (S. Ma & Liu, 2022a) |
| AI-empowered digital design of zeolites: Progress, challenges, and perspectives | AI-empowered design of zeolites in properties prediction, simulation, design, and synthesis zeolite | 2025 | (M. Wu et al., 2025) |
| Deeper insight into the rational design and synthesis of zeolites revealed by machine learning: a mini review | The latest advancement of machine learning potential in revealing the black box issue in zeolite synthesis | 2025 | This work |
Table 1 Comparison with existing reviews of machine learning in zeolite synthesis.
"Blackbox" in the Conventional Zeolite Synthesis
Generally, zeolites are synthesized using a hydrothermal method within a certain time. The hydrothermal method in the zeolite synthesis process consists of several stages, including (Cundy & Cox, 2005):
- 1. Reactants consisting of amorphous silica and alumina are mixed with a cation source in an alkaline liquid medium, e.g., NaOH or KOH solution.
- 2. The mixture is heated in a stainless-steel autoclave, usually above 100 ℃.
- 3. The mixture undergoes an induction period, remaining amorphous for some time after heating begins.
- 4. Crystalline zeolite then forms, and eventually all amorphous reactants are converted to zeolite.
The mixture of reactants in the zeolite synthesis process can turn into a solid gel or colloidal suspension, depending on the type of reactants and mixing conditions. Zeolite synthesis is a complex reaction with many influencing variables (Yu, 2007), such as gel composition (Si/Al ratio (Hernando et al., 2018), alkalinity (Salwa Mohd Nazir et al., 2019), solubility (White et al., 2011), inorganic cations (Sasidharan & Kumar, 1997)), reactant sources (Keawkumay et al., 2025; Khaleque et al., 2020), OSDA (Jensen et al., 2021), temperature (Hui & Chao, 2006; Sumari et al., 2019), time (Meftah et al., 2017), aging period (Ahlers et al., 2020), stirring (Hanif et al., 2000), and addition of seeds (Jain & Rimer, 2020). The mechanism of zeolite crystallization consists of three main steps. The first step is the supersaturation stage, which drives the crystallization process through the formation of a supersaturated medium. The second step is nucleation, where the reactant molecules undergo rearrangement to become nuclei. The last step is crystal growth.
Apart from the solid-phase pathway, in which the structure rearranges, zeolite formation can also proceed through a solution-mediated mechanism. In a solution-mediated mechanism, the amorphous phase first dissolves, then nuclei form, and crystal growth occurs. Furthermore, because the resulting zeolite is often a metastable species, precise and reproducible synthesis conditions are urgently needed (Cundy & Cox, 2005; Grand et al., 2016). It is also important to bear in mind that, because zeolite is a metastable phase, its formation is likely to be controlled by kinetic parameters. Nevertheless, the combination of complex parameters, complex crystallization kinetics, and desired properties means that zeolite synthesis still relies on trial-and-error approaches, which gives minimal control over the product, e.g., amorphous material or even zeolites with a different phase and pore size (Moliner et al., 2019). Moreover, synthesis conditions that comprise even ten parameters make this a daunting task because of the resulting high-dimensional search space (D. Ma et al., 2022).
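To give a sense of scale for such a high-dimensional search space, the short Python sketch below counts the experiments in a full factorial grid. The choice of five levels per parameter and a throughput of 20 experiments per day are illustrative assumptions, not values from the cited studies.

```python
from math import prod

def grid_size(levels_per_parameter):
    """Number of experiments in a full factorial grid over all parameters."""
    return prod(levels_per_parameter)

def years_to_exhaust(n_experiments, experiments_per_day):
    """Time needed to run every experiment at a fixed daily throughput."""
    return n_experiments / experiments_per_day / 365.0

# Assumption: ten synthesis parameters, each screened at five levels.
n_runs = grid_size([5] * 10)
print(n_runs)  # 9765625
print(round(years_to_exhaust(n_runs, experiments_per_day=20)))  # 1338
```

Even this coarse grid is far beyond what trial-and-error experimentation can cover, which is the motivation for data-driven selection of synthesis conditions.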
It was reported previously that a specific zeolite synthesis method could generate a crystallized zeolite with Si/Al < 5 due to the presence of alkali. Furthermore, replacing the inorganic alkali with a structure directing agent (SDA) made zeolites with a high Si/Al ratio possible, even pure silica zeolite, known as silicalite-1 (X. Yang et al., 2022; C. Zhang et al., 2022). Besides, the synthesis of zeolite is also affected by the mixing order of reactants (Prodinger & Derewinski, 2020). Oleksiak et al. investigated the nucleation of FAU and LTA zeolites using a combination of microscopy, scattering, and diffraction techniques (Oleksiak et al., 2016). The results showed that, owing to the effect of confinement, the exterior surface of the particle is energetically preferred over its interior as the nucleation site.
Kumar et al. investigated LTA zeolite synthesis using atomic force microscopy (AFM) at 35 and 45 ℃. In this work, LTA crystallization proceeded through the formation of gel-like islands from aluminosilicate molecules under supersaturated conditions. Beyond the evolution and assembly of these three-dimensional islands, other pathways could also occur, i.e., nearly oriented attachment. Moreover, a layering mechanism involving two-dimensional nucleation and layer spreading could also operate under lower supersaturation conditions. Hence, this work highlighted nonclassical crystallization, which may also be relevant to other zeolite types (Kumar et al., 2018).
Furthermore, beyond the synthesis parameters mentioned above, zeolite synthesis research is converging on the view that sustainable and environmentally friendly synthesis, together with an improved mechanistic understanding, is urgently needed. Kadja et al. thoroughly investigated ZSM-5 zeolite synthesis (Kadja, Azhari, Mukti, et al., 2021). That work used rice husk silica in a seed-assisted, SDA-free, solvent-free method. Crystalline ZSM-5 was generated in 10 h at 180 ℃ with a yield of ≥ 95%. Moreover, Avrami equation analysis showed that nucleation occurred instantaneously and was the rate-determining step, with an activation energy of 137 kJ mol⁻¹. Hence, revealing the black box of zeolite synthesis remains elusive because many parameters still need to be examined.
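As an illustration of how an Avrami analysis is typically carried out, the minimal sketch below assumes the common form X(t) = 1 − exp(−k tⁿ) and fits k and n by linear regression of ln(−ln(1 − X)) against ln t. The exact form used in the cited study is not stated here, and the data points are synthetic with arbitrary parameter values; they are not measurements from Kadja et al.

```python
import math

def avrami_fraction(t, k, n):
    """Crystallized fraction X(t) = 1 - exp(-k * t**n)."""
    return 1.0 - math.exp(-k * t ** n)

def fit_avrami(times, fractions):
    """Least-squares fit of ln(-ln(1 - X)) = ln(k) + n * ln(t); returns (k, n)."""
    xs = [math.log(t) for t in times]
    ys = [math.log(-math.log(1.0 - x)) for x in fractions]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    n = sxy / sxx                      # slope = Avrami exponent
    k = math.exp(mean_y - n * mean_x)  # intercept = ln(k)
    return k, n

# Synthetic crystallization curve (hypothetical values, hours as time unit).
times_h = [2.0, 4.0, 6.0, 8.0, 10.0]
fractions = [avrami_fraction(t, k=0.01, n=2.5) for t in times_h]

k_fit, n_fit = fit_avrami(times_h, fractions)
print(round(k_fit, 4), round(n_fit, 2))
```

Repeating such a fit at several temperatures and plotting ln k against 1/T is the usual route to an apparent activation energy.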
Machine Learning Prospect for Rational Design of Zeolite Synthesis
As mentioned above, the zeolite synthesis process is characterized by complex parameters, a trial-and-error approach, and an elusive understanding of the crystallization mechanism, which has given rise to machine learning as a promising solution. In principle, machine learning feeds suitable training data to an appropriate algorithm in order to learn the relationships among several variables (Butler et al., 2018; Jordan & Mitchell, 2015). The data in machine learning is processed in four steps, i.e., data collection, feature generation and selection, algorithm selection, and validation and prediction (Mai et al., 2022). Owing to its predictive ability, machine learning can also minimize the number of experiments (Louie et al., 2021; Xu et al., 2023). Figure 2 shows the uses of machine learning in zeolite synthesis, including property and structure prediction, identification of the best synthesis conditions, OSDA determination, crystallinity and yield analysis, topology classification, and even the discovery of new zeolite topologies.
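The four steps listed above can be sketched end to end in a few lines of Python. Everything in this example is hypothetical: the gel compositions and labels are made up, and the nearest-centroid classifier is a deliberately simple stand-in for the algorithms discussed later in this review.

```python
from math import dist

# Step 1, data collection: hypothetical gel compositions (molar amounts)
# with a made-up label (1 = crystalline product, 0 = amorphous product).
train = [
    ({"SiO2": 1.0, "Al2O3": 0.0500, "NaOH": 0.30}, 1),
    ({"SiO2": 1.0, "Al2O3": 0.0400, "NaOH": 0.35}, 1),
    ({"SiO2": 1.0, "Al2O3": 0.0625, "NaOH": 0.25}, 1),
    ({"SiO2": 1.0, "Al2O3": 0.0100, "NaOH": 0.05}, 0),
    ({"SiO2": 1.0, "Al2O3": 0.0125, "NaOH": 0.10}, 0),
    ({"SiO2": 1.0, "Al2O3": 0.0080, "NaOH": 0.08}, 0),
]
test = [
    ({"SiO2": 1.0, "Al2O3": 0.045, "NaOH": 0.30}, 1),
    ({"SiO2": 1.0, "Al2O3": 0.011, "NaOH": 0.06}, 0),
]

def features(gel):
    """Step 2, feature generation: ratio descriptors from raw amounts."""
    si_al = gel["SiO2"] / (2.0 * gel["Al2O3"])  # atomic Si/Al ratio
    oh_si = gel["NaOH"] / gel["SiO2"]
    return (si_al, oh_si)

def fit_centroids(samples):
    """Step 3, algorithm: nearest-centroid classifier (mean feature vector per class)."""
    grouped = {}
    for gel, label in samples:
        grouped.setdefault(label, []).append(features(gel))
    return {label: tuple(sum(col) / len(col) for col in zip(*rows))
            for label, rows in grouped.items()}

def predict(centroids, gel):
    """Assign the class whose centroid is closest in feature space."""
    x = features(gel)
    return min(centroids, key=lambda label: dist(x, centroids[label]))

# Step 4, validation and prediction on held-out records.
centroids = fit_centroids(train)
accuracy = sum(predict(centroids, gel) == label for gel, label in test) / len(test)
print(accuracy)
```

In practice the features would be scaled and the hold-out split replaced by cross-validation; both are omitted here to keep the sketch short.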
Figure 2 The role of machine learning in zeolite synthesis.
Classification and Discovery of Zeolite Topology
The development of machine learning in the zeolite field has significantly impacted zeolite prediction. Using machine learning, Raman predicted extra-large pore structures in zeolites (Raman, 2023b). This work started by collecting a dataset on extra-large pore zeolites from 70-80 papers from several publishers, a process that took 80-90% of the total research time. The data comprise 15 features, so they were simplified using t-SNE (t-distributed stochastic neighbor embedding). The independent variable (input) was the molar composition of the synthesis gel, while the dependent variable (output) was the number of T atoms in the zeolite pore ring. Zeolites with fewer than 12 T atoms and zeolites with more than 12 T atoms were labelled 0 and 1, respectively.
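The binary target described above can be written as a one-line rule. In the sketch below, the function name is hypothetical, and assigning rings of exactly 12 T atoms to class 0 is an assumption, since the description only covers rings smaller or larger than 12. The ring sizes are the largest pore openings of a few well-known framework types.

```python
def label_extra_large_pore(largest_ring, threshold=12):
    """Return 1 if the largest pore ring has more than `threshold` T atoms, else 0."""
    return 1 if largest_ring > threshold else 0

# Largest ring size (number of T atoms) of some known framework types.
largest_ring_by_framework = {"CHA": 8, "MFI": 10, "FAU": 12, "UTL": 14, "VFI": 18}

labels = {name: label_extra_large_pore(ring)
          for name, ring in largest_ring_by_framework.items()}
print(labels)  # {'CHA': 0, 'MFI': 0, 'FAU': 0, 'UTL': 1, 'VFI': 1}
```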
Moreover, to obtain more transparent and interpretable results, SHAP (SHapley Additive exPlanations), a game-theoretic approach, was applied. The results showed that zeolites with very large pores were produced from compositions with a lower F/T ratio (the ratio of fluoride ions to T atoms). The fluoride ion, in this case, plays a role in converting the synthesis material into a more mobile solution. During zeolite synthesis, fluoride ions, like OH− ions, act as mineralizers. In addition, a high carbon-to-nitrogen (C/N) ratio of the OSDA is also required for the formation of zeolites with large pore sizes. It should be noted that the SHAP value of a feature is influenced not only by that feature but also by the other features. Hence, SHAP is promising for interpreting the black box in both zeolite synthesis and machine learning. Furthermore, using the XGBoost algorithm on the Heroku cloud platform, this work achieved an accuracy of up to 86.57%. Other researchers can apply this result to obtain their desired extra-large pore zeolites as well as to discover new ones.
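The point that a feature's SHAP value also depends on the other features can be made concrete with a small example. The sketch below computes exact Shapley values by brute-force enumeration of feature subsets for a hypothetical three-feature model; it is not the `shap` package, which relies on faster approximations, and replacing absent features by a fixed baseline is a simplifying assumption.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, instance, baseline):
    """Exact Shapley values by enumerating every feature subset."""
    n = len(instance)

    def value(subset):
        # Features in `subset` take the instance value; the others stay at the baseline.
        return model([instance[i] if i in subset else baseline[i] for i in range(n)])

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                phi += weight * (value(members | {i}) - value(members))
        phis.append(phi)
    return phis

def toy_model(x):
    """Hypothetical score: one additive term plus one interaction term."""
    return 3.0 * x[0] + 2.0 * x[1] * x[2]

phis = shapley_values(toy_model, instance=[1.0, 2.0, 3.0], baseline=[0.0, 0.0, 0.0])
print([round(p, 6) for p in phis])  # [3.0, 6.0, 6.0]
```

The interaction term is shared equally between the second and third features, so the attribution of either one changes whenever the value of the other changes, while the attributions always sum to the difference between the prediction and the baseline prediction.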
Much earlier, Yang et al. reported ZSP, the zeolite-structure predictor, for classifying zeolite structures according to their framework types (S. Yang et al., 2008). Among 1436 zeolite crystals from the Inorganic Crystal Structure Database (ICSD) that were analyzed, 179 zeolite frameworks were approved, and 96 framework types were represented in the zeolite data set. The topological descriptors were generated using the Delaunay tessellation approach in four steps, i.e., generation of the zeolite unit cell using the computational crystallography toolbox, periodic replication of the unit cell, identification of the framework T-atoms as active points, and application of the Qhull algorithm to tessellate each sphere of T-atoms. Moreover, this work used the random forest algorithm, which exhibited an accuracy of 97.5%.
In another case, Kim and Min exploited Bayesian active learning, a combination of machine learning and Bayesian optimization, to accelerate the discovery of zeolite structures from numerous hypothetical candidates, as depicted in Figure 3(a) (Kim & Min, 2021). The procedure started with calculating the mechanical properties of the zeolite structures in the IZA database. The collected data were then used to train machine learning models to predict the bulk and shear moduli. Furthermore, the elastic properties of the zeolite structures in the predicted crystallography open database (PCOD) were predicted using the regression models. Fifty regression models were obtained and used for the Bayesian optimization. These procedures were repeated 20 times for validation. The result was a new database comprising up to 871 labeled zeolite structures, including 23 new zeolite structures with a higher shear modulus than those in the IZA database (5.61-48.16 GPa). Figure 3(b) shows the standard deviation of the predictive model, including the unlabeled data, which indicates that the predictive model has converged. It should be mentioned that the proposed platform may also be used for the discovery of other new materials.

Figure 3 (a) The Bayesian active learning platform. (b) Standard deviation of data prediction for shear and bulk modulus. Reproduced with permission from Ref. (Kim & Min, 2021). Copyright 2021 ACS.
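The selection step of such an active learning loop can be sketched as follows. The sketch assumes that each unlabeled candidate has one prediction per model in an ensemble and that the next structure to label is chosen with an upper-confidence-bound rule (ensemble mean plus a multiple of the ensemble standard deviation). The acquisition rule, the candidate names, and the predicted values are all assumptions made for illustration and are not taken from Kim and Min.

```python
from statistics import mean, pstdev

def ucb_scores(ensemble_predictions, kappa=1.0):
    """Upper confidence bound per candidate: mean + kappa * standard deviation."""
    return {name: mean(preds) + kappa * pstdev(preds)
            for name, preds in ensemble_predictions.items()}

def select_next(ensemble_predictions, kappa=1.0):
    """Pick the candidate with the highest acquisition score."""
    scores = ucb_scores(ensemble_predictions, kappa)
    return max(scores, key=scores.get)

# Hypothetical shear-modulus predictions (GPa) from a three-model ensemble.
predictions = {
    "hypo_A": [31.0, 32.0, 30.0],  # high mean, models agree
    "hypo_B": [25.0, 35.0, 30.0],  # slightly lower mean, models disagree
    "hypo_C": [20.0, 21.0, 19.0],  # low mean, models agree
}

print(select_next(predictions, kappa=0.0))  # hypo_A (pure exploitation)
print(select_next(predictions, kappa=1.0))  # hypo_B (uncertainty is rewarded)
```

After the selected structure is labeled by simulation, it is added to the training set, the ensemble is retrained, and the loop repeats until the predictive standard deviation stops decreasing.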
Furthermore, Xing and Blaisten-Barojas developed a cloud-based computing system that allows users to access the ZSP through a Web browser (Xing & Blaisten-Barojas, 2013). In this work, a public cloud application, i.e., Structure-Adaptive-Materials-Prediction (SAMP), was presented, which was developed for the Windows Azure platform. As a result, 41 framework types of zeolites were automatically obtained. Finally, it is worth noting that this work provides an automated system consisting of supercell, visualization, descriptor, and ZSP components.
Mechanical and Structural Properties
Some research on the use of machine learning in zeolite synthesis has focused on the mechanical and structural properties of zeolites. Evans and Coudert predicted the mechanical properties of zeolite frameworks, particularly the elastic properties, i.e., the bulk (K) and shear (G) moduli, using a gradient boosting regressor (GBR) (Evans & Coudert, 2017). The descriptors in this research consist of space group, crystal density, unit cell volume, Si-O distributions, surface area, pore volume, pore size, and dimensionality. Furthermore, this work pointed out that high-density zeolites have lower K and G than low-density ones. For the prediction of 590448 hypothetical zeolites, the obtained accuracy for log (K) and log (G) was 0.102 ± 0.034 and 0.0947 ± 0.022, respectively. In addition, this model provides cheaper large-scale prediction by excluding the relative energy calculation, which was shown not to improve the accuracy of K and G.
Sours and Kulkarni utilized a deep neural network to predict the structural properties of pure silica zeolites, as depicted in Figure 4(a) (Sours & Kulkarni, 2022). They used 219 pure silica zeolite topologies for DFT calculations. The structural properties were analyzed for 187 topologies using both machine learning and DFT optimizations for comparison. The results showed that the machine learning potentials and DFT both performed excellently. However, the machine learning model was up to 1000 times faster than DFT. Moreover, the accuracy of the deep neural network was validated on 32 test topologies.
Also aiming to predict zeolite topology and structure from the synthesis conditions, another group, i.e., Jensen et al., created a data pipeline that automatically extracts zeolite synthesis data from 70000 papers. These data are plotted in Figure 4(c). The data were collected using NLP (natural language processing) techniques: the relevant synthesis data were parsed from HTML and XML, and the compositional ratios were located and extracted using regular expressions (regex) (Figure 4(b)). In contrast with the previous group, which focused on pure silica zeolites, this work applied the pipeline data with a random forest algorithm to predict high and low framework densities of germanium (Ge)-containing zeolites with a root mean squared error (RMSE) of 0.98 T/1000 Å³. Moreover, this work also pointed out the relationship of fluoride ions and Ge with hydrothermal stability (Jensen et al., 2019).
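The regex step of such a pipeline can be illustrated with a minimal sketch. The patterns, the function name, and the example sentence below are hypothetical and far simpler than a production pipeline; they only handle compositions written as colon-separated "amount formula" terms.

```python
import re

# One "amount formula" term, e.g. "0.05 Al2O3".
TERM = re.compile(r"(\d+(?:\.\d+)?)\s*([A-Z][A-Za-z0-9]*)")
# Two or more terms separated by colons, e.g. "1.0 SiO2 : 0.05 Al2O3 : 20 H2O".
COMPOSITION = re.compile(
    r"\d+(?:\.\d+)?\s*[A-Z][A-Za-z0-9]*(?:\s*:\s*\d+(?:\.\d+)?\s*[A-Z][A-Za-z0-9]*)+"
)

def parse_gel_composition(sentence):
    """Return {species: molar amount} for the first composition found, else {}."""
    match = COMPOSITION.search(sentence)
    if match is None:
        return {}
    return {species: float(amount) for amount, species in TERM.findall(match.group(0))}

# Made-up sentence in the style of an experimental section.
sentence = ("The gel with molar composition 1.0 SiO2 : 0.05 Al2O3 : 0.3 NaOH : 20 H2O "
            "was heated at 160 C for 48 h.")
gel = parse_gel_composition(sentence)
print(gel)  # {'SiO2': 1.0, 'Al2O3': 0.05, 'NaOH': 0.3, 'H2O': 20.0}

# Derived compositional ratios used as model inputs.
si_al = gel["SiO2"] / (2.0 * gel["Al2O3"])
h2o_si = gel["H2O"] / gel["SiO2"]
```

Matching the whole colon-separated run first keeps unrelated numbers in the sentence, such as the temperature and time, out of the extracted composition.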
Optimization of Synthesis Condition
In light of the plentiful current research on low-cost zeolite synthesis, Ma et al. exploited machine learning to distill seed-assisted zeolite synthesis (D. Ma et al., 2022). This work used 385 historical records of seed-assisted zeolite synthesis carried out in the laboratory. These data involve nine synthesis parameters and Na-borosilicate gels containing OTMAC (octyltrimethylammonium chloride), with EUO, MWW, MTT, MFI, ERI, SFE, TON, IWF, and *MRE as seeds. Moreover, this work was intended to reduce the production cost of zeolite synthesis by using a seed-assisted method instead of OSDA. Six machine learning models were used in this work, i.e., Logistic Regression (LR), Decision Tree (DT), Support Vector Machine (SVM), extreme Gradient Boosting (XGBoost), Adaptive Boosting (AdaBoost), and Random Forest (RF).

Figure 4 (a) Schematic procedure of the deep potential model. Reproduced with permission from Ref. (Sours & Kulkarni, 2022). Copyright 2023 ACS. (b) Schematic overview of the machine learning approach in zeolite synthesis. (c) The resulting data from automatic literature data extraction. The color intensity represents the data frequency. Reproduced with permission from Ref. (Jensen et al., 2019). Copyright 2019 ACS.
The highest accuracy was obtained by the AdaBoost model, followed by RF, XGBoost, DT, SVM, and LR, respectively. Moreover, the OTMAC, NaOH, and B2O3 contents of the gel, the framework density, and the crystallization time have significant importance scores. It is worth noting that metastable zeolite formation was affected by NaOH and the seed's framework density. A seed with a higher framework density tends not to form metastable phases. In this regard, MFI, with a higher framework density of 17.9, exhibited no transformation, whereas the IWV seed, with a framework density of 15.7, transformed to MWW and EUO with increasing crystallization time.
To obtain general guidelines for zeolite classification in relation to synthesis conditions, Liu and coworkers used machine learning to define the global potential energy surface (PES) of 12 T atom zeolite systems, i.e., the CHA, ATO, ATV, and ATS frameworks (S. Ma et al., 2020). The results showed that zeolite becomes a thermodynamically stable product when a proper SDA is used. Moreover, zeolite synthesis is sensitive to pH. In this sense, Si-O-Al bonding is favored under basic conditions rather than acidic or neutral conditions.

Figure 5 The decision tree resulting from XGBoost analysis. Reproduced with permission from Ref. (Muraoka et al., 2019). Copyright 2019 Nature.
Muraoka et al. used machine learning to connect synthesis and zeolite structure descriptors. In this research, XGBoost and random forest models showed a high accuracy of around 75-80% (Muraoka et al., 2019). Of these two methods, XGBoost was chosen because its hyperparameters are more suitable for this study. Moreover, it can predict the results of the synthesis as well as the possibility of zeolite formation under certain synthesis conditions. The XGBoost analysis revealed that the most influential descriptors of zeolite synthesis are SiO2, Al2O3, MOH (M = Li, Na, K), and H2O (Figure 5).
Recently, Pan et al. extended SHAP to all building units through an aggregation approach. They used ZeoSyn, a dataset of 23961 zeolite synthesis routes covering 921 OSDAs and 223 zeolite frameworks (Pan et al., 2024). The SHAP values were then calculated to determine the influence of each parameter on the probability of zeolite formation. Using a random forest model trained on the ZeoSyn dataset, the reported accuracy was 0.73.
The Role of OSDA
Besides investigating synthesis conditions and the resulting crystals, the relationship between OSDAs and zeolite topology has also attracted scientists' attention as a target for machine learning. Jensen et al. investigated the relationship between OSDAs and zeolite types using weighted holistic invariant molecular (WHIM) descriptors (Jensen et al., 2021). To reduce the high dimensionality of WHIM, PCA (principal component analysis) was used to sample the potential and known OSDAs of five zeolite types, i.e., LEV, CHA, AEI, LTA, and AFX. This work collected a database of 5663 synthesis routes using natural language processing. Moreover, this work successfully predicted 408 potential OSDAs for the CHA zeolite type. The generated OSDAs were also tested on a different zeolite, i.e., the SFW framework, and molecular simulations showed that the generated OSDAs are compatible with SFW.
Schwalbe-Koda et al. proposed binding metrics for controlling template selectivity in zeolite synthesis (Schwalbe-Koda, Kwon, et al., 2021). This work collected 549 OSDAs from the literature and benchmarked their phase competition, using OSDAs with both strong and weak binding affinity to several zeolite frameworks. The literature extraction yielded 1122 OSDA-zeolite pairs, which were plotted by templating energy to assess the agreement between the literature data and the binding metric. The results showed that the binding energy metrics are inaccurate for OSDAs that direct several frameworks and limited for zeolites designed with dual OSDAs. Notably, the templating energy explained the literature better (approximately 70%) than plain binding energies. Nevertheless, the templating energy inaccurately predicted the best host for a certain OSDA. Furthermore, this work successfully identified the energetic, electrostatic, and geometric features of OSDAs and designed OSDAs with favorable properties for zeolite CHA.
In another work, Schwalbe-Koda et al. went on to design bi-selective OSDAs for intergrowth zeolite synthesis by combining data mining and computational simulations (Schwalbe-Koda, Corma, et al., 2021). This work started from known OSDAs as references and designed new OSDAs through shape metrics and phase competition. Moreover, the binding energies were used as energy references, i.e., to calculate the second-best zeolite-OSDA pairs. The results showed that this method is capable of realizing hypothetical intergrowths such as AEI/SAV and many others.
More recently, the Rafael Gómez-Bombarelli group applied binding energies to the synthesis of up to 100 zeolites with known OSDAs (Schwalbe-Koda et al., 2022). The resulting data indicate that small- and medium-pore zeolites can be constructed with higher selectivity than large-pore zeolites with respect to the opening size. Moreover, this work emphasizes that the presence of inorganic cations together with the OSDA affects the competing phases. Nonetheless, new computational methods are still needed to account for electrostatic effects in zeolite templating. Finally, this method successfully proposed an alternative OSDA for KFI zeolite, i.e., tetraethylammonium.
Most recently, Gómez-Bombarelli and co-workers reported the efficient synthesis of the CHA/AEI zeolite intergrowth by designing a unique OSDA via an a priori methodology. This method relies on quantifying zeolite phase competition via high-throughput simulations (Bello-Jurado et al., 2022). Generally, CHA and AEI zeolites are synthesized using N,N,N-trimethyladamantylammonium (TMAda) and N-ethyl-N-methyl-2,2,6,6-tetramethylpiperidinium, respectively. In this work, they proposed N-ethyl-N-isopropyl-N-methylpropan-2-ammonium and 1-ethyl-1-isopropylpyrrolidin-1-ium for the synthesis of CHA/AEI intergrowth zeolites. Moreover, the resulting CHA/AEI intergrowth exhibits outstanding performance in NH3-SCR (selective catalytic reduction of NOx with ammonia) compared to the pure CHA catalyst. This shows that machine learning can help design and synthesize zeolites more efficiently and is applicable to industry.
Crystallinity and Yield Analysis
Moreover, machine learning has been applied not only to the zeolite synthesis process and its parameters but also to crystallization outcomes. Nikiforov et al. reported the prediction of MFI zeolite crystallinity by analyzing 650 papers using several machine learning algorithms, i.e., random forest, gradient boosting, and decision trees. Twelve synthesis parameters were collected, namely the Na/Si, Al/Si, and H2O/Si molar ratios, aging time, aging temperature, time and temperature of the first hydrothermal step, time and temperature of the second hydrothermal step, the template/Si molar ratio, and the obtained degree of crystallinity. Among these algorithms, gradient boosting showed the best result, with an MAE (mean absolute error) and MSE (mean squared error) of 204.4 and 10.3, respectively (Nikiforov et al., 2022).
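For readers less familiar with these error metrics, the short sketch below shows how MAE and MSE are computed. The crystallinity values are made up for illustration. The last line checks a general property: for errors expressed in the same units, the MAE can never exceed the square root of the MSE, which is a useful sanity check when comparing reported metrics.

```python
import math

def mean_absolute_error(y_true, y_pred):
    """Average magnitude of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_squared_error(y_true, y_pred):
    """Average of the squared prediction errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical measured and predicted degrees of crystallinity (%).
measured = [90.0, 75.0, 60.0, 100.0]
predicted = [80.0, 85.0, 65.0, 95.0]

mae = mean_absolute_error(measured, predicted)
mse = mean_squared_error(measured, predicted)
print(mae, mse)               # 7.5 62.5
print(mae <= math.sqrt(mse))  # True
```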
Conroy et al. reported the use of machine learning for zeolite LTA synthesis, with the aim of increasing zeolite yield and performance. The results identified reaction time as one of the synthesis parameters that affect LTA synthesis. The gel composition used had SiO2/Al2O3, H2O/Na2O, and Na2O/SiO2 ratios of 2, 30, and 2.5, respectively. In the initial period of zeolite synthesis, the product was still amorphous. However, the amount of crystalline LTA zeolite increased with reaction time. This result was in agreement with SEM images. Moreover, the SEM images and quantitative XRD, analyzed with the Rietveld method, revealed optimal crystalline contents of 71 and 72% at 3 and 4 hours, respectively. In addition, qualitative XRD, analyzed by comparing relative peak intensities, gave an LTA crystallinity of 99 wt% at both 3- and 4-hour synthesis times. Therefore, it is worth noting that hybrid XRD analysis, a combination of quantitative and qualitative XRD, plays a role in distinguishing amorphous from crystalline material (Conroy et al., 2022). Furthermore, among the algorithms applied, i.e., ridge regression, linear regression, regression tree, XGBoost, artificial neural networks, and random forest, the artificial neural network (ANN) model achieved the highest accuracy, with R² = 0.84. To validate the hybrid data approach, the ANN model was evaluated on qualitative, quantitative, and hybrid 10, 20, and 40 datasets (the number refers to the crystalline zeolite content in wt%). The results showed that the model based on qualitative XRD had the highest accuracy. However, qualitative XRD has difficulty in interpreting the success of most zeolite syntheses. Hence, the "hybrid 10" model was in fact the most accurate. Overall, this work pointed out that machine learning can be successfully utilized in zeolite LTA synthesis.
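The coefficient of determination used to rank these models can be computed as in the sketch below, which assumes the standard definition R² = 1 − SS_res/SS_tot. The measured and predicted values are made up and are not data from Conroy et al.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical measured and predicted crystalline contents (wt%).
measured_wt = [10.0, 20.0, 30.0, 40.0]
predicted_wt = [12.0, 18.0, 33.0, 37.0]

r2 = r_squared(measured_wt, predicted_wt)
print(round(r2, 3))  # 0.948
```

A value of 1 means the predictions reproduce the measurements exactly, while a value of 0 means the model does no better than always predicting the mean.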
Table 2 Summary of machine learning applications in zeolite synthesis
| Aim | Dataset collection | Algorithm | Platform | Highlighted Result | Ref |
|---|---|---|---|---|---|
| Classification and discovery of zeolites | | | | | |
| Extra-large pore zeolite prediction | Manually processed 70-80 papers from several publishers | XGBoost | Heroku cloud | Accuracy up to 86.57% | (Raman, 2023a) |
| Framework type predictor | Inorganic Crystal Structure Database (ICSD) | Random forest | - | Accuracy of 83% | (S. Yang et al., 2009) |
| Zeolite structures with superior mechanical properties | IZA database | DT, BTE, SVM, GPR, LightGBM | Bayesian active learning | 23 new zeolite structures | (Kim & Min, 2021) |
| Zeolite structure predictor | Inorganic Crystal Structure Database (ICSD) | Random forest | Windows Azure | Accuracy of 0.98 | (Xing & Blaisten-Barojas, 2013) |
| Mechanical and structural properties | | | | | |
| Mechanical properties prediction | Structure file in CIF format | Gradient boosting regressor | - | Accuracy for log (K) and log (G) = 0.102 ± 0.034 and 0.0947 ± 0.022 | (Evans & Coudert, 2017) |
| Structural properties prediction | IZA database | Deep Potential | - | RMSE of bulk moduli of 8.6 GPa | (Sours & Kulkarni, 2022) |
| Zeolite topology structural | Zeolite journal articles using NLP | Random forest | - | RMSE of 0.98 T/1000 Å³ | (Jensen et al., 2019) |
| Optimization of synthesis conditions | | | | | |
| Seed-assisted zeolite synthesis | Archived laboratory records | LR, DT, SVM, XGBoost, AdaBoost, Random Forest | - | Accuracy up to >0.9 | (D. Ma et al., 2022) |
| Hydrothermal parameters | ZeoSyn | Random Forest | - | Accuracy of 0.73 | (Pan et al., 2024) |
| Thermodynamic rules for zeolite formation | Enhanced stochastic surface walking | Global neural network | - | General guidelines for zeolite classification related to stability and synthesis conditions | (S. Ma et al., 2020) |
| Linking the synthesis and zeolite structure descriptor | Experimental records in the literature | XGBoost, Random Forest | - | Accuracy of around 0.75-0.8 | (Muraoka et al., 2019) |
| The role of OSDA | | | | | |
| Relationship between OSDAs and zeolites | Scientific literature published from 1966 to 2020, extracted using NLP | Generative neural network | - | 408 potential OSDAs for CHA zeolite | (Jensen et al., 2021) |
| Designing biselective OSDAs for intergrowth zeolite synthesis | A dataset of zeolite-OSDA pairs from the literature | - | - | Potential OSDAs for disordered frameworks | (Schwalbe-Koda, Corma, et al., 2021) |
| Repurposing templates | Zeolite synthesis data from the literature | Voronoi and Monte Carlo docking algorithm | - | Examples of alternative OSDAs for zeolites | (Schwalbe-Koda et al., 2022) |
| Crystallinity and yield analysis | | | | | |
| Degree of crystallinity prediction for MFI zeolite | Manually processed 650 papers about MFI zeolites synthesis | DT, Random Forest, and gradient boosting | - | MAE of 10.3 and MSE of 204.4 | (Nikiforov et al., 2022) |
| Increase the yield and performance of LTA zeolite | Synthesis data from the literature | Linear regression, artificial neural network, regression tree, XGBoost, random forest, ridge regression | - | Accuracy up to 0.84 | (Conroy et al., 2022) |
Summary and Outlook
Zeolites, which are applied across the chemical industry, are attracting growing research attention. Here, we have reviewed conventional zeolite synthesis, for which a deep understanding of the kinetics and thermodynamics of crystallization remains elusive. Explaining zeolite crystallization is laborious because of the many influential parameters, such as the molar composition of the reactants and the synthesis conditions. At the same time, machine learning, with its capability to predict and discover materials, is also in the spotlight. Hence, machine learning is a promising way to describe the synthesis process, predict the best synthesis conditions, and even discover new zeolite topologies.
In light of the foregoing, the studies reviewed here provide evidence of the potential of machine learning in zeolite research, particularly for designing zeolite synthesis parameters, predicting zeolite structural properties, and even predicting quantitative outputs of zeolite synthesis. Table 2 summarizes the application of machine learning in zeolite synthesis. Generally, there are two main challenges in using machine learning to reveal the zeolite synthesis process: (i) determining data collection techniques that yield a sufficiently large database for training, testing, and validation, and (ii) selecting suitable learning algorithms to reach the research target faster. Moreover, the feature importance provided by machine learning models makes it convenient to analyze the large number of parameters that affect the synthesis process. With these points addressed, machine learning can effectively improve the synthesis process.
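As an illustration of how feature importance can rank synthesis parameters, the following minimal sketch implements permutation importance in plain Python. This is a simple model-agnostic technique chosen for brevity; it is not the method of any study cited here. The function `toy_model`, the two features (synthesis time and stirring rate), and all values are hypothetical stand-ins for a trained regressor and real synthesis records.

```python
import random


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def toy_model(row):
    # Hypothetical stand-in for a trained regressor: the predicted
    # crystallinity depends on synthesis time (feature 0) but not on
    # stirring rate (feature 1).
    return 20.0 * row[0] + 0.0 * row[1]


def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean increase in MSE after shuffling each feature column."""
    rng = random.Random(seed)
    baseline = mse(y, [model(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild the dataset with only column j shuffled.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(y, [model(r) for r in X_perm]) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical records: [synthesis time in h, stirring rate in rpm].
X = [[1.0, 100.0], [2.0, 300.0], [3.0, 200.0], [4.0, 500.0], [5.0, 400.0]]
y = [toy_model(row) for row in X]

importances = permutation_importance(toy_model, X, y)
print(importances)
```

Shuffling a feature that the model relies on degrades the predictions, whereas shuffling an irrelevant feature leaves the error unchanged, so the second importance in this sketch is exactly zero.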
Finally, among the machine learning approaches to zeolite synthesis, several issues require attention to improve future prospects. First, the algorithm used should be able to link the synthesis process with the desired properties and with zeolite performance. Second, machine learning approaches need not only to make the synthesis process more effective through an understanding of the mechanism but also to prescribe the synthesis process at large scale. In addition, data generation and feature extraction should rely on simple and efficient methods for capturing information from the large body of literature, given that most of the available data come from patents and journals. This matters because high-quality datasets are required for machine learning to produce accurate results. It is also worth noting that the large number of zeolite structures must be taken into account for the database to generalize. Moreover, interpreting the kinetic and thermodynamic mechanisms of zeolite synthesis requires a proper approach to ensure the reproducibility and quality of machine learning. In this case, SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) can be considered, because they allow model predictions to be interpreted before they are tested in simulation and/or experiment.
On a final note, given the complexity of both zeolites and machine learning, cross-disciplinary teams would achieve the best results. In this sense, shared databases and/or standard methods could accelerate the process. Nevertheless, cross-validation, external validation, and experimental tests are still needed to confirm machine learning predictions before they are applied in the laboratory and in industry. If all of these conditions are fulfilled, machine learning models may successfully guide the realization of zeolites and even the development of more zeolite frameworks, which is expected to benefit the chemical industry as well as human life.
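The validation steps mentioned above can be sketched as follows: an external set is held out first, and k-fold cross-validation is then run on the remaining records. This is a minimal plain-Python illustration; the record counts and the helper `k_fold_indices` are hypothetical, and in practice the records would typically be shuffled before splitting.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


# Hypothetical dataset of 12 synthesis records.
n_records = 12

# External validation: the last 2 records are never used for model fitting.
external_idx = list(range(10, n_records))
n_development = n_records - len(external_idx)

# 5-fold cross-validation on the 10 development records.
folds = list(k_fold_indices(n_development, 5))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Each development record appears in exactly one test fold, while the external records remain untouched until the final check, which is then followed by experimental testing.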
Acknowledgement
This work is supported by Hibah Riset Unggulan ITB 2025. SM gratefully acknowledges Institut Teknologi Bandung for tuition provided through the GTA scholarship and the Kadja Lab, led by Dr. Ir. Grandprix T.M. Kadja, for a monthly stipend.
Compliance with ethics guidelines
The authors declare they have no conflict of interest or financial conflicts to disclose.
This article contains no studies with human or animal subjects performed by the authors.
List of Abbreviations
| Abbreviation | Full Term |
|---|---|
| AdaBoost | Adaptive Boosting |
| AFM | Atomic Force Microscopy |
| ANN | Artificial Neural Networks |
| BTE | Boosting Tree Ensemble |
| DFT | Density functional theory |
| DT | Decision tree |
| GBR | Gradient Boosting Regressor |
| GPR | Gaussian Process Regressor |
| LightGBM | Light Gradient Boosting Machine |
| LIME | Local Interpretable Model-agnostic Explanations |
| LR | Logistic Regression |
| MAE | Mean Absolute Error |
| MSE | Mean Squared Error |
| OSDA | Organic Structure Directing Agent |
| PCOD | Predicted Crystallography Open Database |
| PES | Potential Energy Surface |
| RF | Random forest |
| SAMP | Structure-Adaptive-Materials-Prediction |
| SDA | Structure directing agent |
| SHAP | SHapley Additive exPlanations |
| SVM | Support Vector Machine |
| t-SNE | t-distributed stochastic neighbor embedding |
| WHIM | Weighted Holistic Invariant Molecular |
| XGBoost | eXtreme Gradient Boosting |
