Introduction
Machine learning (ML) has a long history of development and application spanning several decades. It is a rapidly growing field of modern science and endows computers with the ability to learn and make predictions from data without explicit programming. It falls under the umbrella of artificial intelligence (AI) and is closely related to statistical inference and pattern recognition. Recently, ML technologies have experienced a revival and gained popularity—particularly after AlphaGo from DeepMind defeated the human champion in the game of Go. This resurgence can be attributed to the advancement of algorithms, the increasing availability of powerful computational hardware such as graphics processing units (GPUs), and the abundance of data.
Nuclear physics seeks to understand the nature of nuclear matter, including its fundamental constituents and collective behavior under different conditions, as well as the fundamental interactions that govern them. Traditional nuclear physics—particularly for energies below approximately 1 GeV/nucleon—focuses on nuclear structure and reactions, where the relevant degree of freedom is the nucleon. However, in high-energy nuclear physics (HENP), the degrees of freedom include, and are often dominated by, quarks and gluons. Theoretical calculations and experiments or observations with large scientific infrastructures play a leading role but are reaching unprecedented complexity and scale. In the context of HENP—particularly nuclear collisions—researchers are already at the forefront of Big Data analysis. The detectors at high-energy nuclear collision facilities, such as the Relativistic Heavy Ion Collider (RHIC) and the Large Hadron Collider (LHC), can easily produce petabytes of raw data per year. A major challenge is to make sense of the vast amounts of data generated in experiments or simulated according to theory. These data are often highly complex and difficult to interpret, and analyzing such a volume of data with traditional methods of physics research is a daunting task. Therefore, efficient computational methods are urgently needed to facilitate physics exploration in these computation- and data-intensive research areas.
One of the primary physics goals of HENP is to understand quantum chromodynamics (QCD) matter under extreme conditions. It is expected that at extremely high temperatures and/or densities, nuclear matter, which is governed by the QCD-dictated strong interaction, will turn into a deconfined quark–gluon plasma (QGP) state, in which the elementary particles—quarks and gluons—become the basic degrees of freedom. The formation and properties of this new state of matter, as well as its transition to normal nuclear matter, are widely studied but remain open questions in HENP. This deconfined QGP state is believed to have existed in the early universe, a few microseconds after the Big Bang. Another way to study the QGP state is in terms of neutron stars (or binary neutron star mergers). A neutron star is a compact astrophysical object whose interior serves as a cosmic laboratory for cold and dense QCD matter. Increasing astronomical observations—particularly those arising from the progress of gravitational wave analysis—will provide constraints on the extreme properties of QCD matter in this cold and dense regime, for which effective techniques for dealing with the associated inverse problem will be essential. Theoretically, first-principles lattice QCD calculations at vanishing and small baryon chemical potentials predict a smooth crossover transition from a dilute hadronic resonance gas to the deconfined QGP state. However, in the high-baryon-density regime, direct lattice QCD simulations are currently hampered by the fermionic sign problem. On Earth, this new state of QGP matter can only be studied through heavy-ion collision (HIC) programs, in which two heavy nuclei are accelerated and smashed together to deposit the collision energy in the overlapping region, heating and/or compressing normal nuclear matter into an excited state to achieve the extreme conditions.
A significant challenge associated with HICs is that the collision of heavy nuclei is a highly dynamic, complex, and rapidly evolving process: although the deconfined QGP state may indeed be formed during the collision, it undergoes rapid expansion and cooling, and at some point its degrees of freedom are reconfined into color-neutral hadrons, which continue to interact and decay until the detector in the experiment receives their signals. The collision process is too short-lived and too small in extent to be resolved directly. Experimentally, we have no direct access to the potentially formed early QGP fireball but only indirect measurements of the finally emitted hadrons or their decay products. Furthermore, the theoretical description of the collision dynamics involves many uncertain physical factors that are not yet fully constrained by theory or experiment. These uncertainties can interfere with the final physical observables in the experiment. Thus, a reliable extraction of the physics of the produced extreme QCD matter from the limited and contaminated (i.e., heavily influenced by many uncertain factors) measurements is non-trivial and challenging. This severely hampers the extraction of physical knowledge in the HIC programs.
As a modern computational paradigm, ML has become increasingly promising in recent years for applications at the forefront of HENP research. ML algorithms can be used to automatically identify patterns and correlations in data, allowing knowledge to be extracted from data computationally and automatically. It can thus help to extract meaningful information about the underlying physics or fundamental driving laws from the available data. In contrast to the traditional focus of ML, which is usually predictions based on pattern recognition from the collected data, the intersection of HENP and ML is concerned with the underlying patterns and causality for the purposes of uncertainty assessment and physical interpretation, which lead to discoveries. A collection of datasets from different areas of fundamental physics, including high-energy particle physics and nuclear physics, used for supervised ML studies was recently presented in Ref. [1].
For the purpose of physics identification, the intersection of HENP and ML goes beyond the mere application of existing learning algorithms to the dataset accessible in the physics problem. Paying special attention to the physical constraints or required fundamental laws or symmetries of the systems would increase the efficiency of ML in solving the specific physics problem. For example, when regressive or generative models are used to study quantum many-body systems or general quantum field theory (QFT), implementing the symmetries of the system can significantly reduce the amount of training data needed and improve the recognition performance [2]. ML has been applied in various studies at low- and intermediate-energy HICs [3-11]; a recent mini-review was presented in Ref. [12]. It has also been applied in hadron physics [13-15].
In addition, ML can be applied in the context of simulations, which play a key role in fundamental physics research as well as in a wide range of other scientific fields such as biology, chemistry, robotics, and climate modeling. In HENP, for both experimental and theoretical studies, simulation is an important tool, starting from the understanding of the fundamental interactions involved, e.g., in HIC dynamics and detector simulation, as well as in lattice QFT simulation. Simulations are used to model the behavior of nuclear matter and its constituents and the interactions that occur between them, which are typically highly complex, with detailed use of many involved physical laws and equations or empirical phenomenological models. Simulations of HICs and the associated detectors in HENP consume large amounts of computational resources because of the high statistics and high resolutions. A collision dynamics simulation with extensive synthetic data is required to accurately interpret experimental measurements, which is enormously computationally and memory-intensive. ML can be used to improve the efficiency and descriptive power of these simulations to facilitate the physics discovery process. For example, researchers have proposed using ML to accelerate the simulation of hydrodynamics, to optimize the parameters involved in the model simulation, to make the model more robust to uncertainties, and to solve many-body problems directly by augmenting the conventional Monte Carlo simulation method.
In brief, ML is an effective tool that can be employed to address many challenges in HENP. It can assist in analyzing large amounts of data from HENP, linking nuclear experiments to physics theory exploration, optimizing simulations and calibrating models more efficiently, as well as developing new empirical and theoretical models. It is undeniable that ML technologies have the potential to make a significant impact, even transforming the field of HENP. Therefore, it is essential to acknowledge and recognize the importance of this new paradigm in advancing the field.
In the present review, focusing on HIC-related studies within HENP, we first provide a brief overview of the methodology in Sect. 2. Then, we discuss the applications of ML to HIC physics with regard to the following aspects: initial condition inference in Sect. 3, decoding bulk matter properties in Sect. 4, in-medium effects in Sect. 5, hard probe sector in Sect. 6, and searching for different observables in Sect. 7. We summarize our review in Sect. 8.
Methodology
Taxonomy of Machine Learning
ML can be classified in several ways. One way is to classify it by its function, i.e., into classification, regression, generation, and dimensionality reduction. Another way is to classify ML by the type of training data, i.e., into supervised learning, unsupervised learning, semi-supervised learning, self-supervised learning, active learning, and reinforcement learning. For example, supervised learning requires data to be labeled in such a way that the model can be trained to build a mapping between the input and the labels. Unsupervised learning does not need labeled data; it can learn patterns from data, for example by requiring the machine to make self-consistent predictions on data that are perturbed or slightly augmented. Semi-supervised learning requires a small amount of labeled data along with a large amount of unlabeled data. Self-supervised learning works with specific data such as natural language or images that are sequential; it allows the machine to predict one part of the sequence from another part. Active learning is a type of semi-supervised learning that employs two pools of data: a small pool of labeled data and a large pool of unlabeled data. The machine is trained on the labeled data and validated on the unlabeled data. The performance of the simply trained machine differs for different samples from the unlabeled data pool. For example, the machine may be uncertain on one sample, predicting that the label of the sample is A with 51% probability and B with 49% probability. This sample is assumed to be more difficult and more important for the trained machine than simple samples for which the machine's predictions are certain. For efficiency, this sample is labeled and moved from the unlabeled pool to the labeled pool for further training. Reinforcement learning uses data generated by interactions with the environment.
According to the previous description, the loss function for supervised learning in a regression task can be expressed as the mean squared error between predictions and labels,
\[ \ell(\theta) = \frac{1}{n}\sum_{i=1}^{n} \left\| f_\theta(x_i) - y_i \right\|^2 , \]
where $f_\theta$ is the network, $x_i$ are the inputs, and $y_i$ are the labels.
The cross-entropy loss is widely used for classification. It is defined as
\[ \ell(\theta) = -\frac{1}{n}\sum_{i=1}^{n}\sum_{c} y_{i,c} \log \hat{y}_{i,c} , \]
where $y_{i,c}$ is the one-hot label and $\hat{y}_{i,c}$ is the predicted probability of class $c$ for sample $i$.
In binary classification, the cross entropy reduces to
\[ \ell(\theta) = -\frac{1}{n}\sum_{i=1}^{n}\left[ y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i) \right] . \]
For multi-categorical classification, the loss function is the cross-entropy loss, with the activation function in the last layer replaced by the softmax function,
\[ \mathrm{softmax}(z)_c = \frac{e^{z_c}}{\sum_{c'} e^{z_{c'}}} , \]
which normalizes the network outputs $z$ into a probability distribution over classes.
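As a concrete illustration, the softmax normalization and the cross-entropy loss described above can be sketched in a few lines of NumPy; the logits and one-hot labels below are arbitrary toy values:

```python
import numpy as np

def softmax(z):
    # Subtract the row maximum for numerical stability before exponentiating
    e = np.exp(z - np.max(z, axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(probs, labels):
    # Mean negative log-likelihood over samples; labels are one-hot vectors
    return -np.mean(np.sum(labels * np.log(probs + 1e-12), axis=-1))

logits = np.array([[2.0, 0.5, -1.0]])   # raw network outputs for one sample
labels = np.array([[1.0, 0.0, 0.0]])    # true class is the first one
p = softmax(logits)
loss = cross_entropy(p, labels)
```

The softmax outputs sum to one by construction, and the loss decreases as the probability assigned to the true class grows.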
For semi-supervised learning, the loss function is a combination of the supervised and unsupervised losses,
\[ \ell = \ell_{\mathrm{sup}} + \lambda\, \ell_{\mathrm{unsup}} , \]
where $\lambda$ balances the two contributions.
In active learning, the loss function is essentially the same as that in supervised learning. The difference is that the trained network ranks samples from the unlabeled pool for annotation. Thus, the key is to rank the samples. There are two main methods for this. One is to rank the samples according to the entropy of the predictions made by the pretrained network,
\[ H = -\sum_{c} p_c \log p_c , \]
where $p_c$ is the predicted probability of class $c$; samples with larger entropy are annotated first.
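The entropy-based ranking used in active learning can be sketched as follows; the three class-probability vectors are hypothetical predictions on an unlabeled pool, including the 51%/49% case mentioned above:

```python
import numpy as np

def prediction_entropy(probs):
    # Shannon entropy of each predicted class distribution
    return -np.sum(probs * np.log(probs + 1e-12), axis=-1)

# Hypothetical network predictions on three samples from the unlabeled pool
pool = np.array([
    [0.51, 0.49],   # uncertain: nearly uniform, like the 51%/49% example
    [0.99, 0.01],   # confident prediction
    [0.70, 0.30],
])
ranking = np.argsort(prediction_entropy(pool))[::-1]  # most uncertain first
```

The near-uniform prediction is ranked first for annotation, while the confident one is ranked last.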
For reinforcement learning, the data are generated by subsequent interactions between the network policy and the environment. The network receives an observation ot from the environment at time t, makes a decision, and performs an action at on the environment. The environment returns a new observation ot+1, an immediate reward rt+1, and a done signal. The data are thus a sequence of transitions
\[ \left\{ \left( o_t,\, a_t,\, r_{t+1},\, o_{t+1} \right) \right\}_{t = 0, 1, \dots} . \]
Optimization
The goal of ML is to minimize the loss for the prediction of new data not used for training. In gradient-based models, this is achieved simply via stochastic gradient descent (SGD) and its variants,
\[ \theta \leftarrow \theta - \eta \nabla_\theta \ell(\theta) , \]
where $\eta$ is the learning rate and the gradient is estimated on a minibatch of training data.
The possible values of θ form a space called the parameter space. The initial value of θ is usually a random number. Updating θ using SGD is analogous to walking around the parameter space in search of the minimum of the loss function. The loss function can be thought of as a potential surface whose negative gradient gives the direction of acceleration.
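A minimal sketch of SGD walking through a parameter space, assuming a toy linear regression problem y = 2x + 1 with a mean-squared-error loss; the learning rate, batch size, and step count are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy regression data: y = 2x + 1, to be fitted with parameters (w, b)
x = rng.uniform(-1.0, 1.0, size=(256, 1))
y = 2.0 * x + 1.0

w, b, lr = 0.0, 0.0, 0.1
for step in range(500):
    idx = rng.integers(0, len(x), size=32)     # random minibatch
    xb, yb = x[idx], y[idx]
    pred = w * xb + b
    grad_w = 2.0 * np.mean((pred - yb) * xb)   # d(MSE)/dw on the minibatch
    grad_b = 2.0 * np.mean(pred - yb)          # d(MSE)/db on the minibatch
    w -= lr * grad_w                           # step against the gradient
    b -= lr * grad_b
```

After a few hundred noisy steps the parameters settle near the true values w = 2 and b = 1.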
In reinforcement learning, the goal is to maximize the accumulated rewards. The optimization method is stochastic gradient ascent. In the popular policy gradient method, the parameters of the policy network are updated as
\[ \theta \leftarrow \theta + \eta \nabla_\theta \log \pi_\theta(a_t \mid o_t)\, R_t , \]
where $\pi_\theta$ is the policy and $R_t$ is the return following time $t$.
Automatic Differentiation
The number of trainable parameters in a DNN is large. To learn from the data, one must compute the negative gradients of the loss with respect to each of the millions or even trillions of model parameters, $-\partial \ell / \partial \theta_i$, which is accomplished efficiently by automatic differentiation (AD).
AD has a forward mode and a backward mode. If the DNN is a composition of simple elementary functions, the chain rule decomposes the gradient of the loss into products of the local derivatives of these functions, which AD evaluates step by step.
In the forward mode, AD is implemented by introducing a dual number for each variable,
\[ x \rightarrow x + \dot{x}\,\epsilon , \qquad \epsilon^2 = 0 , \]
so that every elementary operation propagates the derivative $\dot{x}$ alongside the value $x$.
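A minimal sketch of forward-mode AD with dual numbers, assuming the toy function f(x) = x sin(x) + x; the `Dual` class and its operator overloads are illustrative, not a full AD library:

```python
import math

class Dual:
    """Dual number a + b*eps with eps**2 = 0; the dot part carries d/dx."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__
    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # The product rule emerges from (a + b*eps)(c + d*eps) with eps**2 = 0
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)
    __rmul__ = __mul__

def sin(x):
    # Elementary functions propagate the derivative of their argument
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

x = Dual(2.0, 1.0)        # seed dx/dx = 1
f = x * sin(x) + x        # f'(x) = sin(x) + x*cos(x) + 1
```

After the evaluation, `f.val` holds the function value at x = 2 and `f.dot` holds the exact derivative, with no finite-difference error.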
Because of the universal approximation capability of DNNs and efficient, accurate automatic differentiation, DNNs are widely used to represent solutions of ordinary differential equations (ODEs) and partial differential equations (PDEs) that require gradients. Thus, many physical problems are translated into optimization problems. This method is commonly referred to as physics-informed neural networks (PINNs). Compared with traditional numerical solvers, PINNs are mesh-free, scale to very high dimensions, and are easy to implement—particularly for multi-scale and multi-physics problems.
Convolutional Neural Networks
Convolutional neural networks (CNNs) are distinguished from other neural networks by their superior performance for image, speech, and audio signal inputs. A naive CNN consists of three main types of layers, i.e., convolutional layers, pooling layers, and fully connected layers, as shown in Fig. 1.
[Fig. 1]
The convolutional layer is the core building block of a CNN. The term convolution refers to the convolution operation between the input features and the filters (or kernels). In the mathematical view, a convolution operation is a special type of linear operation in which two functions are multiplied to produce a third function that expresses how the shape of one function is modified by the other. In the ML view, the convolutional layer uses the filters to extract features from the input data and combines the extracted features as the output. In a well-trained convolutional layer, a filter is sensitive to only one specific type of feature. Usually, a convolutional layer contains many filters so that the variety of input features can be captured. After the convolution operation, a rectified linear unit (ReLU) activation function is typically applied, which introduces nonlinearity into the neural network.
After the convolutional layer, a pooling layer is applied to reduce the number of parameters, which is also known as downsampling. There are two main types of pooling: max pooling and average pooling. Max pooling selects the maximum value to be the output, and average pooling uses the average of the pixels covered by the pooling kernel. The fully connected layer is used to map the features extracted by the previous layers to the final output.
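The convolution, ReLU, and max-pooling operations described above can be sketched directly in NumPy; the 4 × 4 input image and the two-element edge filter are toy values:

```python
import numpy as np

def conv2d(img, kernel):
    # "Valid" 2D cross-correlation, which ML frameworks call convolution
    kh, kw = kernel.shape
    out = np.zeros((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(x, size=2):
    # Downsampling: keep the maximum of each size x size patch
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

img = np.arange(16.0).reshape(4, 4)        # toy 4x4 input "image"
edge = np.array([[-1.0, 1.0]])             # horizontal gradient filter
feat = np.maximum(conv2d(img, edge), 0.0)  # convolution followed by ReLU
pooled = max_pool(feat, 2)
```

Because the toy image increases by one from pixel to pixel along each row, the gradient filter produces a constant feature map, which pooling then downsamples.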
The convolutional layers can be stacked to make the neural network deeper. Earlier layers break down the complex features from the input data into individual simple features. As the features pass through the subsequent convolutional layers, the filters begin to capture larger elements or shapes. Owing to its ability to extract complex features, the CNN architecture became a foundation of modern computer vision.
However, when neural networks are deep, the vanishing gradient problem is severe. To overcome this problem in CNN architectures, many complex neural networks have been developed, such as AlexNet, VGGNet, InceptionNet, GoogLeNet, and ResNet.
Recurrent Neural Networks
Recurrent neural networks (RNNs) are distinguished from other neural networks by their superior performance for sequence or time-series data.
Fig. 2 shows the structure of a basic RNN, where U denotes the weights for the connection of the input layer to the hidden layer, V denotes the weights for the connection of the hidden layer to the hidden layer, and W denotes the weights for the connection of the hidden layer to the output layer. Using the self-connection with weights V, the RNN takes information from previous inputs to influence the current input and output. This feature, often referred to as "memory," makes the RNN good at processing sequential data. The loss function is typically the sum of the per-step losses over the whole sequence, $\ell = \sum_t \ell_t(\hat{y}_t, y_t)$.
[Fig. 2]
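A minimal forward pass of the basic RNN in Fig. 2 can be sketched as follows, with U, V, and W playing the roles described above; the layer sizes and random weights are arbitrary toy choices. Perturbing the first element of the sequence changes the later outputs, illustrating the "memory" carried by the hidden state:

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hid, n_out = 3, 5, 2
U = rng.normal(scale=0.1, size=(n_hid, n_in))   # input  -> hidden
V = rng.normal(scale=0.1, size=(n_hid, n_hid))  # hidden -> hidden (self-connection)
W = rng.normal(scale=0.1, size=(n_out, n_hid))  # hidden -> output

def rnn_forward(xs):
    h = np.zeros(n_hid)             # hidden state, the network's "memory"
    ys = []
    for x in xs:                    # one step per sequence element
        h = np.tanh(U @ x + V @ h)  # current input mixed with past state
        ys.append(W @ h)
    return np.array(ys)

seq = rng.normal(size=(4, n_in))    # a length-4 input sequence
out = rnn_forward(seq)

# Perturbing the first element changes the last output: past inputs persist
seq2 = seq.copy()
seq2[0] += 1.0
out2 = rnn_forward(seq2)
```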
Point Cloud Network
The final-state particles of HICs form a point cloud in momentum space. These data must be manipulated before CNNs or RNNs can be used, as those networks were originally designed for images and natural language. For example, to use a CNN, density estimation (a histogram) is typically used to convert the particle cloud into images. However, this does not work well for a small number of particles in three-dimensional (3D) space, because the particles are dilute and the resolution is poor. To use an RNN, the particle cloud must be sorted into one dimension, which preserves only local information along that dimension. The point cloud network is designed to preserve the permutation symmetry of a set of particles.
Fig. 3 shows a simple demonstration of a point cloud network. The input to the network is a set of particles in momentum space, including their 4-momenta, masses, and other quantum numbers. A fully connected neural network or multilayer perceptron (MLP) is applied to each particle to transform its m input features into 128 features in a high-dimensional latent space. The MLP is shared by all the particles in the cloud and is also called a 1D CNN. This step preserves the permutation symmetry among the particles. Then, global max pooling (GMP) or global average pooling (GAP) is applied to these latent features to extract the global information of the particle cloud. GMP and GAP extract the boundaries of the input particle cloud in the high-dimensional latent space, which encode the multi-particle correlations used for the final decision. This extracted global information (128 features) is fed to another MLP for the final decision. The output neuron has a value in the range (0, 1), with 0.5 as the decision boundary.
[Fig. 3]
The network shown in Fig. 3 has been used to classify nuclear phase transitions [18]. Some point cloud networks apply Euclidean rotations to the point cloud to preserve rotational symmetry, i.e., the network should make self-consistent predictions if the point cloud is rotated globally [19]. Other variants use the k-nearest neighbors in the spatial or momentum space to extract the high-dimensional latent features of each particle, thereby retaining more local correlations. The k-nearest neighbors of each particle can also be computed in the feature space to capture long-range multi-particle correlations, because particles that are close in the feature space may be far apart in the spatial or momentum space. This technique is called dynamical edge convolution and has been used to search for self-similarity between particles in momentum space, which is associated with critical phenomena that may occur in HICs [20]. The dynamical edge convolutional neural network is a type of message-passing neural network, also called a graph neural network.
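The shared-MLP-plus-global-max-pooling pipeline described above can be sketched as follows; the layer sizes are toy values, and a single ReLU layer stands in for the full per-particle MLP. Because the pooling acts over the particle axis, shuffling the particles leaves the prediction unchanged:

```python
import numpy as np

rng = np.random.default_rng(2)
W1 = rng.normal(scale=0.5, size=(8, 4))  # shared per-particle "MLP" (one layer)
W2 = rng.normal(scale=0.5, size=(8,))    # decision head

def point_cloud_net(cloud):
    # cloud: (n_particles, 4) array of per-particle features, e.g., 4-momenta
    latent = np.maximum(cloud @ W1.T, 0.0)     # same weights for every particle
    pooled = latent.max(axis=0)                # global max pooling: order-free
    return 1.0 / (1.0 + np.exp(-W2 @ pooled))  # sigmoid output in (0, 1)

cloud = rng.normal(size=(10, 4))
shuffled = cloud[rng.permutation(10)]          # same particles, different order
```

Evaluating the network on `cloud` and `shuffled` gives identical outputs, which is exactly the permutation symmetry the architecture is built to respect.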
Generative Modeling
In unsupervised learning, generative modeling is a class of techniques related to probability distribution learning. With regard to tasks, ML can generally be categorized into discriminative modeling and generative modeling. From a probabilistic perspective, discriminative modeling, such as pattern recognition, aims at learning a conditional probability p(y|x), which can be used to predict, for a given input object x, its associated properties or class identities y, whereas the goal of generative modeling is to capture the joint distribution p(x,y), from which one can generate new data points following the same statistics as the training set. Generative modeling has achieved considerable success in numerous applications, including image synthesis, inpainting, super-resolution, text-to-image translation, speech generation, and chatbots. Many generative models were developed with profound influence from and on physics. Generative modeling also has numerous direct applications in science, e.g., computational fluid simulation, drug molecule design, anomaly detection, many-body physics, and lattice field configuration generation for QCD.
The central purpose of generative modeling is to sample new data x ∼ p(x) that follow the same distribution as the training data.
In the following, we briefly review several representative and popular deep generative models, including the variational autoencoder (VAE), generative adversarial networks (GANs), autoregressive modeling, and normalizing flows (NF).
The VAE [21] introduces a latent variable z to facilitate the generation process; it constructs a trainable conditional probability pθ(x|z) as the decoder, and the model is trained by maximizing a variational lower bound (the ELBO) on the data likelihood.
It was proven mathematically that the adversarial training of a GAN is equivalent to minimizing the Jensen–Shannon divergence between the data distribution and the generator distribution,
\[ \mathrm{JS}\!\left(p_{\mathrm{data}} \,\|\, p_g\right) = \tfrac{1}{2}\,\mathrm{KL}\!\left(p_{\mathrm{data}} \,\|\, m\right) + \tfrac{1}{2}\,\mathrm{KL}\!\left(p_g \,\|\, m\right) , \qquad m = \tfrac{1}{2}\left(p_{\mathrm{data}} + p_g\right) . \]
Autoregressive model
There are also explicit generative models based on maximum likelihood estimation (MLE), which are closely related to statistical physics. Among them, the simplest is the autoregressive model [30], which invokes the probability chain rule to decompose the full probability into a product of conditionals:
\[ p(x) = \prod_{i} p\!\left(x_i \mid x_1, \dots, x_{i-1}\right) . \]
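A minimal sketch of this chain-rule decomposition, using a hypothetical lookup table of conditionals p(x_i = 1 | x_<i) over binary sequences of length 3; by construction, the product of conditionals defines a normalized joint distribution:

```python
# Hypothetical conditionals p(x_i = 1 | x_<i) for binary sequences of length 3,
# stored as a lookup table keyed by the prefix
cond = {(): 0.6, (0,): 0.3, (1,): 0.7,
        (0, 0): 0.2, (0, 1): 0.5, (1, 0): 0.4, (1, 1): 0.8}

def prob(x):
    # Chain rule: p(x) = prod_i p(x_i | x_1, ..., x_{i-1})
    p = 1.0
    for i, xi in enumerate(x):
        p1 = cond[tuple(x[:i])]
        p *= p1 if xi == 1 else 1.0 - p1
    return p

# Summing over all 2**3 sequences confirms the distribution is normalized
total = sum(prob([a, b, c]) for a in (0, 1) for b in (0, 1) for c in (0, 1))
```

In a real autoregressive model, the lookup table is replaced by a neural network that outputs each conditional from the prefix.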
Normalizing flow
The NF [35-37] combines the latent variable model and explicit MLE. It introduces bijective affine transformations that map a simple latent-space variable z to a sample x = g(z) on the complex data manifold. Bijectivity requires the transformation to have the same dimensionality at the input and output. This allows the change-of-variables theorem to be used to estimate the likelihood explicitly:
\[ p(x) = p_z\!\left(g^{-1}(x)\right) \left| \det \frac{\partial g^{-1}(x)}{\partial x} \right| . \]
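A minimal sketch of the change-of-variables computation, assuming a single affine transformation x = g(z) = a z + b applied to a standard normal latent variable; the Jacobian term is what keeps the transformed density normalized:

```python
import numpy as np

a, b = 2.0, 1.0   # affine flow x = g(z) = a*z + b, so x ~ N(b, a**2)

def log_prob(x):
    # Change of variables: log p(x) = log p_z(g^{-1}(x)) + log |det dg^{-1}/dx|
    z = (x - b) / a                                   # invert the flow
    log_pz = -0.5 * z**2 - 0.5 * np.log(2.0 * np.pi)  # standard normal latent
    return log_pz - np.log(abs(a))                    # Jacobian of the inverse

# Numerically verify that the transformed density integrates to one
xs = np.linspace(-15.0, 17.0, 64001)
total = np.sum(np.exp(log_prob(xs))) * (xs[1] - xs[0])
```

A practical NF stacks many such invertible transformations, each with a tractable Jacobian determinant, and trains their parameters by maximizing this explicit likelihood.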
Principal Component Analysis
In ML, principal component analysis (PCA) is a statistical technique that transforms a set of correlated variables into linearly uncorrelated variables through orthogonal transformations. The principal components, which are associated with the leading eigenvectors (or non-negligible singular values), reveal the most representative configurations of the data. As an unsupervised learning technique, PCA implements singular value decomposition (SVD) on a real matrix [42]:
\[ M = U \Sigma V^{T} , \]
where the columns of U and V are orthonormal and Σ is a diagonal matrix of non-negative singular values.
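A minimal PCA-via-SVD sketch in NumPy, assuming toy 2D data correlated along the (1, 1) direction; the first right-singular vector recovers that direction and carries almost all of the variance:

```python
import numpy as np

rng = np.random.default_rng(3)
# Toy 2D data strongly correlated along the (1, 1) direction
t = rng.normal(size=500)
data = np.column_stack([t + 0.1 * rng.normal(size=500),
                        t + 0.1 * rng.normal(size=500)])

X = data - data.mean(axis=0)            # center the data before decomposing
U, S, Vt = np.linalg.svd(X, full_matrices=False)
leading = Vt[0]                          # first principal component direction
explained = S**2 / np.sum(S**2)          # variance fraction per component
```

Projecting the data onto `leading` gives a one-dimensional representation that retains nearly all of the variance, which is the dimensionality-reduction use of PCA.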
Initial Condition
In the traditional view, nuclear structure manifests its significance only at low energies, because high-energy nucleus–nucleus collisions are violent processes in which the whole nucleus is disassembled. However, recent findings have indicated that the initial nuclear structure information is very important for understanding the final observables in high-energy HICs. One example is collective flow, e.g., elliptic and triangular flows, for which the initial participant shape and nucleon density distribution, as well as their initial-state fluctuations, are relevant. In particular, the collision geometry, neutron skin, deformation, and α-clustering structure significantly affect the final observables. A mini-review can be found in a chapter of the Handbook of Nuclear Physics authored by Ma and Zhang [43]. ML is a powerful tool for discriminating such initial structure information. In this section, we discuss such applications.
Impact parameter estimation
The impact parameter b describes the distance between the centers of the two colliding nuclei in the classical view and is a crucial quantity determining the initial geometry of a collision. In experiments, the impact parameter is not directly measurable and is usually estimated from the multiplicity of final-state particles in track detectors or the energy deposited in calorimeters. ML approaches have been proposed to determine the impact parameter from the final-state particles and exhibit better performance than conventional methods. Ref. [44] proposed the use of a DNN and a CNN to reconstruct the impact parameter from the energy spectra of final-state charged hadrons of HICs at
A model-independent Bayesian inference method for reconstructing the impact parameter distribution was proposed in Ref. [46], where the distribution is inferred directly from measured observables. This method is based on Bayes' theorem,
\[ P(b \mid \mathrm{obs}) = \frac{P(\mathrm{obs} \mid b)\, P(b)}{P(\mathrm{obs})} , \]
where P(obs|b) is the likelihood of the observables for a given impact parameter b and P(b) is the prior distribution.
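The Bayesian update can be sketched on a discretized impact-parameter grid; note that the linear multiplicity model, the Gaussian likelihood width, and the geometric prior below are all hypothetical toy choices, not the actual ingredients of Ref. [46]:

```python
import numpy as np

# Discretized impact-parameter grid with a geometric prior ~ b db (hypothetical)
b_grid = np.linspace(0.0, 16.0, 161)
prior = b_grid / b_grid.sum()

def likelihood(mult, b):
    # Hypothetical detector model: mean multiplicity falls linearly with b,
    # smeared by Gaussian fluctuations of width 30
    mean = 400.0 * (1.0 - b / 16.0)
    return np.exp(-0.5 * ((mult - mean) / 30.0) ** 2)

mult_obs = 300.0                                  # one observed multiplicity
posterior = likelihood(mult_obs, b_grid) * prior
posterior /= posterior.sum()                      # Bayes' theorem, normalized

b_map = b_grid[np.argmax(posterior)]              # most probable impact parameter
```

For this toy model, the observed multiplicity corresponds on average to b = 4 fm, and the geometric prior pulls the posterior peak slightly above that value.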
End-to-end centrality estimation for CBM
The Compressed Baryonic Matter (CBM) detector is currently under construction at the Facility for Antiproton and Ion Research (FAIR) at the Gesellschaft für Schwerionenforschung (GSI); it will study the properties of strongly compressed nuclear matter via HICs with beam energies ranging from 2 to 10 AGeV. A characteristic of the CBM experiment is its very high event and trigger rate, which will produce a large amount of raw data per second in real time and pose a challenge for online event characterization and storage. To address online event characterization, it is essential to be able to work on the direct output of the detector, which has an inherent point cloud structure—a collection of points as an unordered list with the attributes of particles or tracks recorded. An important property of a point cloud is that, as a whole, it should be invariant under permutations of its points. The PointNet structure [47] was specially developed to respect this order invariance. Accordingly, for HICs, PointNet-based models can perform real-time physics analysis directly on the detector output.
Refs. [48, 19] proposed the use of PointNet-based models for event-by-event impact parameter determination in the CBM experiment using the direct output from the detector, where the trained model serves as an end-to-end centrality estimator. A supervised learning strategy is used for this regression task, with the training data prepared from UrQMD followed by CBMRoot detector simulation to obtain the detector output, i.e., the hits or tracks of the particles. A PointNet-based model is constructed and trained to capture the inverse mapping between the detector output and the impact parameter. It was shown that PointNet-based models can perform accurate event-by-event impact parameter determination using the hits of charged particles in different detector planes and/or the tracks reconstructed from these hits. With regard to both precision and accuracy, these models outperformed a baseline model that uses the charged-track multiplicity as the input to a polynomial fit. While the baseline model had a resolution (relative precision) similar to that of the PointNet-based models in the semi-central collision region, it had a lower accuracy and larger fluctuations in the accuracy for impact parameters ranging from 3 to 16 fm, as indicated by the mean of the prediction error for the impact parameter. This trend was more evident for a realistic event distribution (i.e., ∼ b db), as shown in Fig. 4 for the mean prediction error. Given their natural parallelizability and high speed, PointNet-based models pave the way for real-time end-to-end event characterization in HIC studies.
[Fig. 4]
Nuclear deformation estimation
The momentum distribution of final-state hadrons is sensitive to nuclear shape deformation. For example, owing to the different collision geometries, the elliptic flow as a function of charged multiplicity differs significantly between Pb+Pb and U+U collisions. As shown in Fig. 5, 208Pb is a doubly magic nucleus with an almost perfectly spherical shape, whose collision patterns depend only on the impact parameter b. In contrast, the shape of 238U is similar to a watermelon, and the corresponding collision patterns are far more complex than those of Pb+Pb collisions; for example, U+U collisions include body–body aligned, body–body crossed, tip–tip, and tip–body configurations. Different collision patterns correspond to different charged multiplicities and elliptic flows. Both fully overlapped body–body aligned and central tip–tip collisions correspond to most-central collisions with high charged multiplicity, but their elliptic flows differ significantly. This type of difference leads to a far larger variance of the elliptic flow in most-central U+U collisions than in high-multiplicity Pb+Pb collisions. In principle, the complex collision patterns lead to many differences in the elliptic-flow-versus-charged-multiplicity diagram. Deep learning can be used to identify these differences and predict the nuclear shape deformation parameters from these patterns.
[Fig. 5]
It was demonstrated that, using nuclei with different deformation parameters β2 and β4, high-energy HICs can be simulated with the TRENTo Monte Carlo model to obtain the event-by-event total initial entropy (which is proportional to the final charged multiplicity) and the corresponding geometric eccentricity (which is approximately proportional to the elliptic flow). A deep residual neural network was trained to predict β2 and β4 from the two-dimensional (2D) images of total entropy vs. eccentricity [49]. The network accurately predicted the absolute values of β2 and β4 but failed to predict their signs from the information provided. Using the class activation map (CAM) method to map the last convolutional layer onto the input image, the authors found two regions in the image that are important for decision-making. One is the most-central collision region, which is the region most sensitive to the variance of the elliptic flow.
Recently, Bayesian inference with a Gaussian process (GP) emulator was used to reconstruct the nuclear structure, including deformation parameters, from HIC measurements [50]. As a first-step exploratory study, the collision observables (charged multiplicities Nch, elliptic flow v2, triangular flow v3, and mean transverse momentum ⟨pT⟩) were used to constrain the structure parameters.
α-clustering structure
The clustering structure is an exotic phenomenon in nuclei, and it usually occurs in light nuclei [52]. In nuclear collisions between light clustering nuclei and heavy ions, the clustering structure can make the final-state particles anisotropically distributed [43, 53, 54]. It is crucial to extract quantitative information about the clustering from the final observables. In the 12C / 16O + 197Au collisions at relativistic energies, an ML method was used to obtain evidence of the cluster structures from the azimuthal angle and transverse momentum distributions of charged pions [55]. In this study, a Bayesian convolutional neural network (BCNN) was used. In addition to the input and output layers, there were hidden layers consisting of four convolutional layers and three fully connected layers. The parameters of the three fully connected layers were sampled from distributions learned via Bayesian inference. A 2D histogram of azimuthal angle vs. transverse momentum was used as the input. Considering the detection efficiency in the experiments, charged pions with rapidity ranging from −1 to 1 and transverse momentum ranging from 0 to 2 GeV/c were selected. The dataset consisted of 1.6 × 10^6 histograms with 64 × 64 bins (pixels), with different labels indicating different configurations.
The typical spectra of 4000 merged events are shown in Fig. 6. Even with merging, the samples of different configurations are barely distinguishable to the naked eye. The number of merged events is denoted as NEvent, which is taken to be 1000, 2000, and 4000.
[Fig. 6]
The learning curves are shown in Fig. 7. As more events were merged, the event-by-event fluctuations were reduced, and the network was able to learn the features of the final state for predicting the initial configuration. For 12C with NEvent = 4000 and 16O with NEvent = 2000, the validation accuracy reached 95% and 97%, respectively, and for 16O with NEvent = 4000, it reached 99%.
[Fig. 7]
For the clustering phenomenon, it is extremely difficult to extract signals from the final particles, because fluctuations play such an important role in relativistic HICs. By averaging over multiple events, the BCNN model can learn the features with good performance.
Neutron skin estimation
The distribution of neutrons is important for determining the thickness of the neutron skin, the symmetry energy of the nucleus, the QCD equation of state (EoS) of dense nuclear matter, and astrophysical observables such as the mass–radius relationship of neutron stars and the gravitational waves emitted during neutron star mergers. However, extracting the distribution of neutrons inside the nucleus is extremely difficult. The neutron distribution differs from the proton distribution, and the proton distribution is far easier to measure because it is equivalent to the charge distribution, whereas the neutron distribution is associated with the weak charge distribution. The neutron skin, defined as the difference between the root-mean-square radii of neutrons and protons, can be used to constrain the neutron (weak charge) distribution in the nucleus. PREX-2 measured the parity-violating asymmetry in the scattering of longitudinally polarized electrons off 208Pb to obtain a neutron skin thickness of approximately
There have been many attempts to determine the neutron skin thickness and the symmetry energy at low energy [59], e.g., by investigating the charge-exchange spin–dipole excitation [60], the supernova neutrinos [61], nuclear fragmentation reactions [62], and parity-violating electron scattering [56, 63].
For high-energy HICs, it was proposed that the isobar ratios of the charge multiplicities of the mean transverse momentum and the net charge multiplicities between
A large amount of data has already been collected from high-energy HICs. There may be a data-driven way to reuse these data to determine the neutron distribution and neutron skin thickness. It was shown in Ref. [67] that nucleons sampled from nuclei with different neutron skin types can be classified with reasonable accuracy using deep CNNs and point cloud networks. However, once the nucleus is involved in HICs, it is almost impossible to distinguish the neutron skin type of the colliding nucleus using the momentum distribution of the final-state hadrons. For this task, the signal is weak in minimum-bias collisions, and DNNs fail to solve this difficult inverse problem. A new ML method is needed to search for weak signals in data with large statistical fluctuations.
Bulk Matter
Shear and bulk viscosities
The shear and bulk viscosities are important properties that significantly affect the dynamical expansion of QGP and the momentum distribution of final-state hadrons, as indicated by relativistic fluid dynamics simulations [68-71]. In solving the inverse problem of HICs, it was found that the effects of viscosity are entangled with the initial thermalization time, the EoS of QGP, and the phase transition between QGP and HRG. Thus, determining the shear and bulk viscosities of hot nuclear matter is a notoriously difficult problem. For the nucleonic degree of freedom, the shear viscosity has attracted considerable attention because it is related to the nuclear EoS, phase changes, and the strong interaction [72-75], and the behavior of η/s(T) there exhibits features similar to those of the QGP viscosity. Bayesian analysis plays an important role in determining the temperature dependence of the shear viscosity to entropy density ratio η/s(T) as well as the bulk viscosity to entropy density ratio ζ/s(T) [76-78].
Suppose that all the parameters in the theoretical model of HICs form a set
To estimate the temperature dependence of the shear and bulk viscosities, two parameterized functions based on physical priors are required. In a Nature Physics paper [77], the shear and bulk viscosities were parameterized as follows:
Without considering other parameters, these six parameters form a six-dimensional parameter space. The above Bayes formulae are used to traverse this space, with the trajectories forming a set of parameter combinations. This is equivalent to importance sampling using the posterior distribution of the six parameters. Density estimation indicates that the distribution of (η / s)min is approximately normal, whose mean and variance give a quantitative estimate of
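The traversal of the parameter space can be sketched with a random-walk Metropolis sampler, whose accepted states are draws from the posterior; the Gaussian "posterior" below is only a stand-in for the real likelihood evaluated against heavy-ion data, and six parameters mimic the six-dimensional space described above.

```python
import numpy as np

rng = np.random.default_rng(2)

def log_posterior(theta):
    # stand-in for the (unnormalized) Bayesian posterior over the six
    # viscosity parameters; a real analysis evaluates a model-vs-data
    # likelihood, typically through a Gaussian-process emulator
    return -0.5 * np.sum(theta**2)

theta = np.zeros(6)
chain = []
for _ in range(20000):
    proposal = theta + rng.normal(0.0, 0.5, size=6)     # random-walk proposal
    if np.log(rng.random()) < log_posterior(proposal) - log_posterior(theta):
        theta = proposal                                # accept, else keep theta
    chain.append(theta)
chain = np.array(chain)[5000:]                          # drop burn-in

# density estimation of one marginal, e.g. (eta/s)_min: its mean and
# standard deviation give the quantitative estimate and its uncertainty
est_mean, est_std = chain[:, 0].mean(), chain[:, 0].std()
```

Each retained chain entry is one sampled parameter combination, so histograms of the chain directly approximate the marginal posteriors.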
Crossover or first-order phase transition
In general, as mentioned in the Introduction, the challenge faced by high-energy nuclear collision studies can essentially be viewed as an inverse problem. Assuming that all related physical factors (e.g., initial condition/fluctuations, QGP bulk properties, transport coefficients, freeze-out parameter, hadronic interactions) are given, well-established theoretical models (e.g., relativistic viscous hydrodynamics with hadronic transport simulation) can be adopted to simulate the HIC process to give their final-state observables, and such a forward process is well understood. However, given instead only limited measurements of the final state of HICs, it is unclear how to disentangle those different influencing physical factors for decoding the corresponding early time dynamics. For high-energy HICs, there are two strategies for solving this inverse problem using statistical methods and ML: one is Bayesian inference with the task of parameter estimation for calibrating the chosen model (e.g., in Ref. [79]), and the other is supervised ML for directly capturing the inverse mapping from the final state to the corresponding physics of interest.
Ref. [80] proposed the use of a deep CNN to capture the direct inverse mapping from the final-state information to the type of QCD transition that occurred at early times. This is inspired by the success of image recognition in computer vision. Although the inverse mapping may be very implicit, DNNs can be used to decode it and represent it in the sense of Big Data in a supervised manner. The required training data can be prepared through well-established model simulations of HICs, e.g., using state-of-the-art 3+1-dimensional viscous hydrodynamics [81-84], where diversity can be introduced by varying different physical factors (i.e., parameters in the simulation). As an exploratory study, a binary classification task was targeted, where the deep CNN was trained to identify the QCD transition type embedded in the collision dynamics as crossover or first-order solely according to the final pion spectra
[Fig. 8]
Fig. 9 shows the space–time evolution histories of QGP expansion in relativistic hydrodynamic simulations using CLVisc, starting from the same initial condition model with different fluctuations. Because the expansion of QGP is mainly driven by the pressure gradient, and for EOSQ with a first-order phase transition the pressure gradient vanishes in the mixed phase, the acceleration is zero there and multiple ridge structures are formed. The expansion histories differ significantly when the shear viscosity is nonzero. Different evolution histories lead to different final-state particle spectra in momentum space.
[Fig. 9]
To verify the robustness of the trained deep CNN in this QCD EoS recognition task, the test set was simulated from a different hydrodynamics package or with different initial fluctuating conditions (IP-Glasma or MC-Glauber) and different
Later, this strategy was deepened in a series of studies for more realistic scenarios, e.g., to take into account the afterburner hadronic cascade by incorporating UrQMD following the hydrodynamic evolution [85, 86]; to consider the influence of non-equilibrium phase-transition dynamics, e.g., spinodal decomposition [18, 87] or Langevin dynamics [88]; to include more realistic experimental detector effects through detector simulation with hits or tracks as the input [48, 89]; to perform unsupervised outlier detection for HICs [90]; and to determine the nuclear symmetry energy [91]. Specifically, in Ref. [89] it was shown that, using the detector output directly, PointNet models can classify collision events simulated with an EoS containing a first-order phase transition versus events simulated with a crossover EoS. The PointNet models take as input the tracks reconstructed from the CBM detector (simulated with CBMRoot) for hybrid UrQMD events. They achieved a binary classification accuracy of approximately 96% when trained on collision events with impact parameters ranging from 0 to 7 fm. When the training set was restricted to the mid-central region with b = 0-3 fm, the accuracy increased to approximately 99%. Combining training sets from both peripheral and mid-central collisions yielded a classifier able to identify the phase transition type across different centralities without compromising the accuracy in the central region.
Active learning for QCD EoS
First-principles calculations using lattice QCD provide the EoS of hot nuclear matter at high temperatures and zero baryon chemical potential. Because of the fermionic sign problem, lattice QCD currently cannot compute the nuclear EoS at finite μB. Using a Taylor expansion, the nuclear EoS can be obtained approximately at small μB close to zero. The BEST collaboration formulated a nuclear EoS with a critical endpoint by mapping the 3D Ising model onto the Taylor expansion result. However, the model contains four free parameters whose values determine the size and location of the critical endpoint. Some combinations of these parameters lead to an unphysical, e.g., acausal or unstable, EoS.
Supervised learning can help to map out the unphysical regions of the parameter space. However, labeling is computationally expensive for this task. For thermodynamic stability, one must check the positivity of the energy density, pressure, entropy density, baryon density, second-order baryon susceptibility
Active learning was used to find the most informative parameter combinations before labeling them [92]. In active learning, the network is first trained using a small amount of labeled data. The trained network is then employed to make predictions on all samples from a large unlabeled pool. If the network is uncertain about a parameter combination, e.g., it predicts a 51% probability that the combination leads to an unphysical EoS, the sample lies near the decision boundary and should be informative and important for the network. Labeling this sample will improve the performance of the network more than labeling easy samples. The newly labeled sample is moved out of the pool and is subsequently used in supervised learning.
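A minimal uncertainty-sampling loop illustrating this selection rule; the logistic model is a stand-in for the trained network, and the four-dimensional pool mimics the four EoS parameters.

```python
import numpy as np

rng = np.random.default_rng(3)

def predict_proba(w, X):
    """Probability that each parameter combination yields an unphysical EoS
    (a stand-in logistic model playing the role of the trained network)."""
    return 1.0 / (1.0 + np.exp(-X @ w))

# large unlabeled pool of parameter combinations (e.g. the four BEST
# EoS parameters) and a model "trained" on a small labeled set
pool = rng.normal(size=(1000, 4))
w = rng.normal(size=4)

# uncertainty sampling: pick the pool points the model is least sure about,
# i.e. predicted probability closest to 0.5 (the decision boundary)
p = predict_proba(w, pool)
query_idx = np.argsort(np.abs(p - 0.5))[:10]   # 10 most informative samples
# these would now be labeled (the expensive physics checks), moved out of
# the pool, and added to the training set; the loop then repeats
```

The queried samples concentrate near the physical/unphysical boundary, which is exactly where labels are most valuable.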
Accelerated relativistic hydrodynamic simulation via deep learning
Relativistic hydrodynamics is a powerful tool for simulating the QGP expansion and studying the flow observables in relativistic HICs at RHIC and LHC energies [95-100]. For ideal hydrodynamics with zero net charge densities, it solves the transport equations of the energy momentum tensor, ∂_μ T^{μν} = 0.
Recently [93, 94], a DNN called stacked U-net (sU-net) was designed and trained to learn the initial-to-final-state mapping of the nonlinear hydrodynamic evolution. The constructed sU-net has an encoder–decoder architecture containing four U-net blocks with residual connections between them. Each U-net block has three convolutional and deconvolutional layers, with Leaky ReLU and softplus activation functions employed for the inner and output layers, respectively. By concatenating feature maps along the channel dimension, the outputs of the first two convolutional layers are fed to the last two deconvolutional layers. For details, please refer to [93, 94].
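The role of the skip and residual connections can be shown schematically in numpy: mean pooling and nearest-neighbour upsampling stand in for the strided (de)convolutions, and a random channel-mixing matrix plays the role of learned kernels. This is a structural sketch under those assumptions, not the sU-net of Refs. [93, 94].

```python
import numpy as np

rng = np.random.default_rng(0)

def down(x):
    """2x2 mean pooling per channel (the encoder step)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def up(x):
    """Nearest-neighbour upsampling (the decoder step)."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def unet_block(x, mix):
    """One encoder-decoder block: decoder features are concatenated with the
    encoder input along the channel axis (the skip connection), mixed back to
    the input channel count, and a residual connection links blocks."""
    h = up(down(x))
    h = np.concatenate([h, x], axis=0)       # skip connection (channel concat)
    out = np.tensordot(mix, h, axes=1)       # 1x1 "conv" mixing channels
    return out + x                           # residual connection

x = rng.random((2, 16, 16))                  # (channels, height, width)
mix = rng.normal(size=(2, 4))                # learned weights in a real network
y = unet_block(x, mix)
```

The block maps an image-like profile to another of the same shape, which is why a stack of such blocks can represent the initial-to-final energy-momentum-tensor mapping.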
The training and test data (the profiles of the initial and final energy momentum tensor
Compared with the 10-20 minute simulation time of VISH2+1 on a traditional CPU, sU-net took only several seconds to directly generate the final profiles for different types of initial conditions on one P40 GPU, significantly accelerating the traditional hydrodynamic simulations. However, the sU-net model designed and trained in Refs. [93, 94] mainly focuses on mimicking 2+1-dimensional hydrodynamic evolution with a fixed evolution time. For a more realistic implementation, it is important to explore the possibility of mapping the initial profiles to the final profiles of the particles emitted on the freeze-out surface of relativistic HICs.
In-medium Effects
Spectral function reconstruction
Accessing real-time properties of QCD (or a many-body system in general) remains a notoriously difficult problem, because the non-perturbative computations, such as lattice field simulations or functional methods, usually operate in Euclidean space–time (after a Wick rotation
The associated ill-posed problem can be cast as a Fredholm equation of the first kind, which here takes the form D(τ) = ∫ K(τ, ω) ρ(ω) dω, with the Euclidean correlator D(τ) as the (noisy, discretely sampled) data, a known kernel K, and the spectral function ρ(ω) as the unknown:
Recently, deep learning-based strategies have also been explored for spectral reconstruction; they can be categorized into two main schemes: data-driven supervised learning approaches and unsupervised learning-based approaches. The first application of domain-knowledge-free deep learning to this ill-conditioned spectral reconstruction (also called analytic continuation) was reported in Ref. [111] in the context of general quantum many-body physics. The results indicated good performance of DNNs with supervised training in the cases of a Mott–Hubbard insulator and a metallic spectrum. In particular, a CNN was found to achieve better reconstruction than a fully connected network, with performance superior to that of the maximum entropy method (MEM), one of the most widely used conventional methods. In Ref. [112], the authors adopted a similar strategy but also introduced PCA to reduce the dimensionality of the quantum Monte Carlo (QMC)-simulated imaginary-time correlation function of the position operator for a harmonic oscillator linearly coupled to an ideal heat bath.
The authors of Ref. [113] also adopted a data-driven perspective. They used a strategy similar to spectral function reconstruction in the QFT context and considered the Källén–Lehmann spectral representation of the accessible propagator, i.e.,
As another type of non-parametric representation, GPs were used in the reconstruction of the 2+1 flavor QCD ghost and gluon spectral function in Ref. [115]. In general, the GP can define a probability distribution over families of functions, which is typically characterized by the chosen kernel function. In Ref. [115] the GP was assumed to describe the spectral function:
In Refs. [114, 118, 119], the authors developed an unsupervised approach based on DNN representation for the spectral function together with automatic differentiation (AD) to reconstruct the spectral function, which does not need training data preparation for supervision (a similar DNN-based inverse problem solving strategy within the AD framework was used for reconstructing the neutron-star EoS from astrophysical observables [120, 121] and inferring the parton distribution function of pions in lattice QCD studies [122]). The introduced DNN representation can preserve the smoothness of the spectral function automatically, helping to regularize the degeneracy issue in this inverse problem. This is because, as analyzed in Ref. [119], the degeneracy is related to the null modes of the investigated kernel function, which usually induce oscillation for the reconstructed spectral function. Specifically, the DNN-represented spectral function, i.e.,
For the DNN representation of the spectral function, two different schemes were investigated in this work: one uses the multiple outputs of an L-layer neural network to represent the spectral function in list format (denoted as NN), and the other directly uses a feedforward neural network to parameterize the spectral function as a function of frequency, i.e., ρ(ω) (denoted as NN-P2P). For the training, the Adam optimizer was adopted, and L2 regularization was applied in the warm-up stage under an annealing strategy until the regularization strength became sufficiently small (set as < 10^-8 in the calculation). This relaxes the regularization to obtain hyperparameter-independent inference results. For the direct NN list representation, a quenched implementation of smoothness condition
[Fig. 11]
In addition to Gaussian-like and Lorentzian-like spectral reconstruction tests, the newly devised framework presented in Refs. [114, 118] was validated through two physics-motivated tests. One was non-positive-definite spectral reconstruction, which is beyond the scope of classical MEM applicability but is often encountered for spectral functions related to the confinement phenomenon (e.g., of gluons and ghosts) or to thermal excitations with long-range correlations in strongly coupled systems. The other was the hadron spectral function encoded in the temperature-dependent thermal correlator with noise at the level of lattice QCD data. For both physical cases, the proposed DNN- and AD-based method with the NN representation works well, whereas traditional MEM-based methods lose the peak information or fail to resolve the non-positiveness.
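The list-representation optimization can be mimicked in numpy under simplifying assumptions: the discretized spectral values are the trainable parameters, and the gradient of the χ2-plus-smoothness loss, which an AD framework would produce automatically, is written out by hand for this quadratic case. The kernel and mock data are illustrative.

```python
import numpy as np

# discretized setup: correlator D = K @ rho, with rho unknown
omega = np.linspace(0.01, 8.0, 100)
tau = np.linspace(0.0, 4.0, 30)
K = np.exp(-np.outer(tau, omega)) * (omega[1] - omega[0])
rho_true = np.exp(-0.5 * ((omega - 2.0) / 0.4) ** 2)
D_obs = K @ rho_true

# "list representation": every rho(omega_i) is a trainable parameter,
# optimized by gradient descent on chi^2 plus a smoothness penalty
rho = np.zeros_like(omega)
lam, lr = 1e-4, 0.2
losses = []
for step in range(5000):
    resid = K @ rho - D_obs
    smooth = np.diff(rho)
    losses.append(resid @ resid + lam * smooth @ smooth)
    grad = 2.0 * K.T @ resid
    grad[:-1] -= 2.0 * lam * smooth    # d/d rho_i of (rho_{i+1} - rho_i)^2
    grad[1:] += 2.0 * lam * smooth
    rho -= lr * grad
```

The smoothness penalty plays the role that the DNN's implicit bias plays in the actual framework, suppressing the oscillatory null-mode contributions discussed above.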
The spectral function can also be reconstructed from finite correlation data by implementing the radial basis function network (RBFN), which is an MLP model based on the RBF [127, 128]. The RBFN has been widely used in feature extraction, classification, regression, etc. [129-132]. In Ref. [123], the spectral function
To calculate these parameters in Eq. (45), a neural network called the RBFN was constructed in Ref. [123]: a three-layer feedforward neural network with RBF activations in the hidden layer. After discretization of the spectral function, Eq. (45) is converted to matrix form:
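A toy numpy version of this reconstruction: with Gaussian RBFs, the discretized problem is linear in the RBF weights, which can be fitted by least squares. The kernel, basis widths, and mock spectral function below are assumptions for illustration.

```python
import numpy as np

# Gaussian RBF basis: rho(omega) = sum_j w_j exp(-(omega - c_j)^2 / (2 s^2))
omega = np.linspace(0.0, 10.0, 200)
centers = np.linspace(0.0, 10.0, 30)
s = 0.4
Phi = np.exp(-0.5 * ((omega[:, None] - centers[None, :]) / s) ** 2)  # (200, 30)

# forward problem: correlator = kernel @ spectral function (illustrative kernel)
tau = np.linspace(0.0, 5.0, 40)
K = np.exp(-np.outer(tau, omega)) * (omega[1] - omega[0])

# mock spectral function (a Breit-Wigner-like peak) and its correlator
rho_true = 1.0 / ((omega - 3.0) ** 2 + 0.25)
D = K @ rho_true

# after discretization the problem is linear in the RBF weights:
# D = (K @ Phi) w, solved here by least squares (the fitting step)
w, *_ = np.linalg.lstsq(K @ Phi, D, rcond=1e-8)
rho_rec = Phi @ w
```

The smooth RBF basis itself acts as the regularizer: the reconstructed spectral function is forced to be a smooth superposition of localized bumps, which is what suppresses the low-frequency oscillations discussed below.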
Fig. 12 shows a comparison of the spectral functions reconstructed using RBFN, TSVD, Tikhonov, and MEM, using the correlation data generated by a mock SPF. The mock SPF was obtained by mixing two Breit–Wigner distributions:
[Fig. 12]
Compared with the results of the traditional methods, the RBFN provided a better description of the spectral functions, particularly in the low-frequency part. It almost reproduced the first peak of the mock SPF using correlation data with a small amount of noise, ϵ = 0.00001. In contrast, Tikhonov, TSVD, and MEM exhibited oscillatory behavior at low frequency. For the task of extracting transport coefficients from the Kubo relation, an improved reconstruction of the spectral function at low frequency is important. Although the RBFN failed to reconstruct the second peak of the mock SPF, it was the only method among those tested that suppressed the low-frequency oscillations. In Ref. [123], the Gaussian and MQ RBFs used in the network were compared, and the Gaussian RBF was found to provide a better reconstruction of the SPF, including the location and width of the peak. Additionally, with mock data generated from the spectral function of the energy momentum tensor, it was demonstrated that the RBFN method allows precise and stable extraction of the transport coefficients.
In-medium heavy quark potential
As an important probe of the QGP created in HICs, heavy quarkonium (the bound state of a heavy quark and its antiquark) has been intensively measured in experiments and analyzed in theoretical studies [135, 136], and its investigation requires an understanding of the in-medium heavy quark interaction. Heavy quarkonium provides a calibrated QCD force: in vacuum, the simple Cornell potential reproduces the quarkonium spectroscopy well. When the bound state is placed in a QCD medium, color screening naturally occurs and weakens the interaction between the heavy quarks; beyond this, a non-vanishing imaginary part, manifested as a thermal width, is argued to appear according to both one-loop hard thermal loop (HTL) perturbative QCD calculations [137, 138] and recent effective field theory (EFT) studies, e.g., in pNRQCD [139, 140]. However, a non-perturbative treatment such as lattice QCD is necessary, because it is difficult to obtain a satisfactory description of the strong interaction dictating in-medium heavy quarkonium solely from perturbative calculations. These EFT studies suggested that a potential-based picture can provide a good approximation of the quarkonium, under which the Schrödinger equation can be employed to study the spectroscopy of the bound state. Recent lattice QCD studies quantified the in-medium mass shift and thermal widths of bottomonium (
In Ref. [124], the authors developed a model-independent DNN-based method for reconstructing the temperature- and inter-quark-distance-dependent in-medium heavy quark potential from the aforementioned lattice QCD results for bottomonium. Inspired by the universal approximation theorem, the authors introduced a DNN to parameterize the potential in an unbiased yet flexible manner (termed the potential-DNN). The DNN-represented heavy quark potential is coupled to the Schrödinger equation solver to obtain complex-valued energy eigenvalues En, which are related to the in-medium bound-state mass and thermal width through Re[En] = mn - 2mb and Im[En] = -Γn. Through comparison with the lattice QCD "measurements", the corresponding χ2 provides the loss function for optimizing the parameters of the potential-DNN:
[Fig. 13]
Deep learning for quasi-particle mass
The EoS of hadron resonance gas in the QCD phase diagram can be calculated using a simple statistical formula with the following partition function:
Hard Probe
Energetic partons lose energy as they pass through the hot QGP. This process is quantified by the jet transport coefficient
Deep learning has been widely used in high-energy particle physics to analyze the substructures of jets and to classify jets using the momentum of final-state hadrons in jets [153, 154]. In HICs, deep learning is used not only to classify quark and gluon jets but also to study the jet energy loss, the medium response, and the initial jet production positions [155-157].
Constraining the initial jet production positions will allow more detailed and differential studies of jet quenching. For example, one task in the field of HICs is to search for Mach cones in QGP produced by the supersonic parton jets. The difficulty is that the jets are produced at different locations in the initial state and travel in different directions in the QGP. Consequently, the shape of the Mach cone depends on the path length and is distorted by the local radial flow and temperature gradient. Predicting jet production positions using deep learning will help to select jet events whose Mach cones have similar shapes, enhancing the signal of the Mach cones in the final-state hadron distribution.
In these studies, the training data are usually generated by jet transport models [158, 159]; e.g., in the linear Boltzmann transport model (LBT), the jet parton loses energy through elastic scattering with thermal partons in QGP and inelastic gluon radiation. This process is described by a linearized Boltzmann equation:
The lost energy is deposited in QGP as represented by source terms of the relativistic hydrodynamic equations:
The initial jet production positions are sampled from the distribution of hard scatterings, which is proportional to the distribution of binary collisions. The initial entropy density distribution is provided by the TRENTo Monte Carlo model, from which the initial Tμν can be calculated. Simultaneously solving Eqs. 53 and 54 provides both the jet energy loss and the medium response in each simulation. Typically, 10,000-100,000 jet events are needed to predict the initial jet production positions. Of course, a larger amount of training data is better, provided that sufficient computational resources are available.
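The position sampling can be sketched with rejection sampling from the binary-collision density; the Gaussian thickness function below is a stand-in for a realistic Woods–Saxon-based nuclear thickness, and the impact parameter and box size are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)

def thickness(x, y, R=6.38):
    # illustrative 2D Gaussian stand-in for the nuclear thickness function;
    # a realistic setup integrates a Woods-Saxon density along the beam axis
    return np.exp(-(x**2 + y**2) / (2.0 * (R / 2) ** 2))

def sample_jet_positions(n, b=7.0, half_width=10.0):
    """Rejection-sample jet production points (x, y) from the binary-collision
    density T_A(x + b/2, y) * T_B(x - b/2, y) at impact parameter b."""
    out = []
    while len(out) < n:
        x, y = rng.uniform(-half_width, half_width, size=2)
        density = thickness(x + b / 2, y) * thickness(x - b / 2, y)
        if rng.random() < density:          # density is bounded by 1 here
            out.append((x, y))
    return np.array(out)

positions = sample_jet_positions(1000)
```

Each sampled point would seed one LBT simulation, so the network's regression targets are exactly these (x, y) pairs.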
One may ask whether there is a type of DNN that is best suited to studying jet energy loss and predicting jet production positions. In practice, CNNs, point cloud neural networks, and graph neural networks have been used in different projects. Typically, the performance of different neural-network architectures is tested, and the one that works best for the specific task is selected. The simplest yet most powerful CNN should be the first to be tested in jet shape and jet energy loss studies. To capture the full information in jets, a point cloud network and a message-passing neural network can be used.
Observables in HICs
PCA for flow analysis
In relativistic HICs, the collective flow provides important information about the properties of the QGP and its initial state fluctuations [95-100]. The flow observables are generally defined by a Fourier decomposition of the produced particle distribution in the momentum space, such as
Recently, an ML technique called PCA, based on SVD, has been used to study the collective flow in relativistic HICs. For two-particle correlations with the Fourier expansion [166-169], the event-by-event flow fluctuations have been investigated via PCA, revealing the substructures of the flow fluctuations [166-168]. Using PCA, VnΔ(pT1, pT2) can be expressed as [167]
The aforementioned PCA studies on collective flow [166-171] were all based on correlation data obtained with a Fourier expansion. Recently, PCA has been applied directly to single-particle distributions dN/dφ without prior treatment with a Fourier transform, to explore whether it can discover flow directly without human guidance [165]. Specifically, with PCA matrix multiplication, the ith row of a particle distribution matrix with N events generated from VISH2+1 hydrodynamics can be expressed as
Figs. 14 and 15 show the first 12 eigenvectors zj and the first 20 singular values σj of the PCA in descending order for the final-state matrix constructed from 2000 dN/dφ distributions with the azimuthal angle [-π,π] equally divided into 50 bins. Such dN/dφ distributions are generated from the VISH2+1 hydrodynamics with event-by-event fluctuating TRENTo initial conditions for 2.76 A TeV Pb+Pb collisions at 10%–20% centrality. Fig. 14 shows that the PCA eigenvectors are similar to the traditional Fourier bases. For example, the 1st and 2nd eigenvectors are close to sin(2φ) and cos(2φ), and the 3rd and 4th eigenvectors are close to sin(3φ) and cos(3φ). The corresponding singular values in Fig. 15 are arranged in pairs, which correspond to the real and imaginary parts of the anisotropic flow. It was found that for n≤6, the values of these PCA flow harmonics were very close to those of the traditional event-averaged flow harmonics obtained from the Fourier expansion but not exactly the same. Fig. 16 presents a comparison of the event-by-event flow harmonics obtained from PCA and from the traditional Fourier expansion. As shown, the elliptic flow with n=2 and the triangular flow with n=3 from the two methods agreed well. However, for higher flow harmonics with
[Fig. 14]
[Fig. 15]
[Fig. 16]
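The SVD-based PCA procedure can be sketched on mock dN/dφ events carrying only v2 and v3 harmonics (mock amplitudes and event counts are assumptions); the leading singular vectors then emerge in (sin, cos) pairs, mirroring the behavior described above.

```python
import numpy as np

rng = np.random.default_rng(6)

# mock event-by-event dN/dphi distributions with fluctuating v2 and v3
n_events, n_bins = 2000, 50
phi = np.linspace(-np.pi, np.pi, n_bins, endpoint=False)
v2 = rng.normal(0.08, 0.02, size=(n_events, 1))
psi2 = rng.uniform(0, np.pi, size=(n_events, 1))
v3 = rng.normal(0.03, 0.01, size=(n_events, 1))
psi3 = rng.uniform(0, 2 * np.pi / 3, size=(n_events, 1))
dNdphi = (1.0 + 2 * v2 * np.cos(2 * (phi - psi2))
              + 2 * v3 * np.cos(3 * (phi - psi3)))

# PCA via SVD of the mean-subtracted event-by-bin matrix
M = dNdphi - dNdphi.mean(axis=0)
U, sigma, Vt = np.linalg.svd(M, full_matrices=False)
# Vt[0], Vt[1] span the n=2 harmonic and Vt[2], Vt[3] the n=3 harmonic,
# with paired singular values for the sine and cosine components
```

No Fourier basis is supplied anywhere; the harmonic structure is discovered by the decomposition itself.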
CME detection
In the presence of a magnetic field, the chiral magnetic effect (CME) can occur when the system has a chiral imbalance, i.e., when the numbers of left- and right-handed particles differ. Essentially, a current of electric charge (the chiral magnetic current) is induced to flow along the direction of the magnetic field. The use of the CME to reveal the vacuum structure of QCD has been proposed. In HICs, a strong magnetic field can be created by the motion of the colliding ions, and it is predicted that in the formed hot and dense QGP, topological fluctuations of the gluon fields may cause a chiral imbalance for quarks. Accordingly, the CME may occur, manifesting as a separation of electric charge along the magnetic-field direction. However, several challenges hinder the detection of the CME in HICs; the chief difficulty is disentangling the CME signal from other possible sources of charge separation (CS), e.g., elliptic flow, global polarization, and other backgrounds, although multiple observables have been proposed.
Despite the challenges, there is long-term and continuing interest in the search for the CME in HICs because of its general importance to QCD. Recently, Ref. [172] proposed the use of deep learning to construct an end-to-end CME-meter that can efficiently analyze the final-state hadronic spectrum as a whole, in the sense of Big Data, with a deep CNN to reveal the fingerprints of the CME. For supervised learning, the training set was prepared from the string melting AMPT model with the CME implemented via a global CS scheme. Essentially, the CME events are generated by switching the y-components of the momenta of a fraction of the downward-moving light quarks with those of the corresponding upward-moving anti-quarks. This fraction defines the CS fraction f, which separates the events into the "no CS" class (labeled "0") for f = 0% and the "CS" class (labeled "1") for f > 0%. Each event is represented as the 2D transverse momentum and azimuthal angle spectrum of charged pions in the final state, i.e.,
[Fig. 17]
As shown in Fig. 17, the output of the network has two nodes, each naturally interpreted as the probability that the network recognizes a given input spectrum as a CME (P1) or non-CME (P0 = 1 - P1) event. The training set contains multiple collision beam energies and centralities for diversity. The pion spectrum is obtained by averaging over 100 events with the same collision condition to reduce the fluctuations, which also reduces the backgrounds and should thus be considered a prerequisite for realistic application in experiments. For the training, different levels of the CS fraction were used, and the validation accuracy was found to be lower when the training events had a smaller CS fraction. This indicates, as expected, that a larger CS fraction can be identified more easily. Despite the different levels of discernibility, the trained deep CNNs all exhibited robust performance against varying collision centrality and energy. One can conclude that, at least at the level of AMPT modeling, the CS signals can survive into the final state of the collision dynamics under different collision conditions and can be recognized by the deep CNN-based CME-meter.
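The averaged-event input can be sketched as a 2D histogram; the bin count, pT range, and mock pion distributions below are assumptions for illustration, not the binning used in Ref. [172].

```python
import numpy as np

rng = np.random.default_rng(7)

def event_spectrum(phi, pt, bins=24):
    """2D histogram of charged pions, azimuthal angle (rows) vs transverse
    momentum (columns): the image-like input fed to the CNN classifier."""
    h, _, _ = np.histogram2d(
        phi, pt, bins=bins, range=[[-np.pi, np.pi], [0.0, 2.0]]
    )
    return h

# mock "averaged event": pool pions from 100 events with the same collision
# condition to suppress event-by-event fluctuations before classification
n_pions = 100 * 300
phi = rng.uniform(-np.pi, np.pi, n_pions)
pt = rng.exponential(0.4, n_pions).clip(0.0, 2.0)
spectrum = event_spectrum(phi, pt) / 100.0     # per-event average
```

In the actual pipeline a CS signal would imprint a charge-dependent asymmetry on such spectra, which is what the trained network learns to detect.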
Note that the network was trained only on Au+Au collision systems, while its extrapolation to other collision systems was validated. Specifically, the obtained CME-meter was applied to isobaric collisions of 96Ru+96Ru and 96Zr+96Zr, yielding the values of Riso listed below for different centrality classes:
| Centrality | 0-10% | 10-20% | 20-30% | 30-40% | 40-50% | 50-60% |
|---|---|---|---|---|---|---|
| Riso | 9.95% | 12.99% | 8.13% | 13.84% | 19.67% | 10.47% |
The CME-meter was also validated through a different model simulation, i.e., anomalous-viscous fluid dynamics (AVFD). P1 exhibited a consistent positive correlation with N5/S, which controls the CME strength, while contamination from local charge conservation (LCC) of up to 30% did not degrade the performance of the CME-meter on the AVFD testing events. In Ref. [172], to reveal what the trained CME-meter actually relies on, the network output P1 was compared with the γ-correlator. The γ-correlator—a conventional CME probe—measures the event-by-event two-particle azimuthal correlation of charged hadrons. It was shown that, for averaged events, both the CME signal and the background contained in δγ (the difference between the correlations of same-charge pairs and opposite-charge pairs) are suppressed. In contrast, the CME-meter output P1 still classifies the CS and no-CS classes well on the averaged events.
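For orientation, the conventional γ-correlator mentioned above can be written as γ = ⟨cos(φ_α + φ_β − 2Ψ_RP)⟩, with δγ the opposite-sign minus same-sign difference. A minimal numerical sketch (toy angles, not AVFD output) shows that a charge-separated event gives a clearly positive δγ:

```python
import numpy as np
from itertools import combinations

def gamma_correlator(phis, psi_rp=0.0):
    """Two-particle correlator gamma = <cos(phi_a + phi_b - 2 Psi_RP)>,
    averaged over all distinct pairs in one event."""
    return np.mean([np.cos(a + b - 2.0 * psi_rp)
                    for a, b in combinations(phis, 2)])

def delta_gamma(phi_plus, phi_minus, psi_rp=0.0):
    """delta gamma = gamma(opposite-sign pairs) - gamma(same-sign pairs)."""
    g_os = np.mean(np.cos(np.add.outer(phi_plus, phi_minus) - 2.0 * psi_rp))
    g_ss = 0.5 * (gamma_correlator(phi_plus, psi_rp)
                  + gamma_correlator(phi_minus, psi_rp))
    return g_os - g_ss

# toy charge-separated event: positives clustered around +y, negatives around -y
rng = np.random.default_rng(2)
phi_p = rng.normal(np.pi / 2, 0.5, 200)
phi_m = rng.normal(-np.pi / 2, 0.5, 200)
dg = delta_gamma(phi_p, phi_m)  # positive for charge separation along y
```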
The direct implementation of this trained CME-meter in real experiments would require reconstructing the reaction plane of each collision event to form the averaged events as input for the meter. In general, the reaction plane can be reconstructed by measuring correlations of final-state particles, but this inevitably suffers from finite resolution and background effects. It was shown that, even with a restricted event-plane reconstruction, the trained CME-meter can still recognize the CS signals. For the deployment of the trained CME-meter on single-event measurements, Ref. [172] proposed a hypothesis-test perspective.
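A standard way to reconstruct the event plane from final-state particles is via the n-th order Q-vector; the toy sketch below, with an assumed elliptic-flow modulation, illustrates that a finite multiplicity yields a close but not exact estimate of the true plane:

```python
import numpy as np

def event_plane(phis, n=2):
    """n-th order event plane from the Q-vector:
    Psi_n = atan2(sum sin(n*phi), sum cos(n*phi)) / n."""
    return np.arctan2(np.sin(n * phis).sum(), np.cos(n * phis).sum()) / n

# toy event sample with elliptic flow v2 = 0.1 about a known plane
rng = np.random.default_rng(3)
true_psi = 0.4
phis = rng.uniform(-np.pi, np.pi, 20000)
# accept-reject sampling of dN/dphi ~ 1 + 2 v2 cos(2(phi - Psi))
accept = rng.uniform(size=phis.size) < 0.5 * (
    1 + 2 * 0.1 * np.cos(2 * (phis - true_psi)))
psi_est = event_plane(phis[accept])
# finite multiplicity -> finite resolution: psi_est is close to true_psi
```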
Another way to interpret the trained deep-learning algorithm is the DeepDream method, which was used in Ref. [172] to reconstruct the input pion spectrum to which the network responds most strongly, manifesting the “CME pattern” that the CNN-based CME-meter essentially captures for its CME signal recognition. The key idea is to perform variational tuning of the input pion spectrum with the trained and frozen network so as to maximize its output (i.e., pushing P1 toward 1); the resulting spectrum is shown in Fig. 18.
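The DeepDream idea can be illustrated on a toy differentiable model (a logistic "network" standing in for the trained CNN): the weights stay frozen while gradient ascent is performed on the input itself to push the output probability toward 1.

```python
import numpy as np

def frozen_net(x, w):
    """Toy frozen 'network': logistic regression on a flattened spectrum."""
    return 1.0 / (1.0 + np.exp(-np.dot(w, x)))

def dream(x0, w, lr=0.5, steps=200):
    """DeepDream-style variational tuning: gradient-ascend the INPUT x
    (the weights w stay frozen) to maximize the network output."""
    x = x0.copy()
    for _ in range(steps):
        p = frozen_net(x, w)
        grad = p * (1.0 - p) * w   # dp/dx for the logistic model
        x += lr * grad             # move the input uphill in output probability
    return x

rng = np.random.default_rng(4)
w = rng.normal(size=32)   # frozen, "pretrained" weights (toy)
x0 = np.zeros(32)         # featureless starting input
x_dream = dream(x0, w)
# the tuned input excites the frozen network far more than the starting one
```

For the real CME-meter the same loop runs over the 2D pion spectrum with backpropagation through the trained CNN; the maximally-exciting input is then read off as the learned "CME pattern".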
Summary and Outlook
Summary
As a modern computational paradigm, AI, particularly machine- and deep-learning techniques, has introduced a wealth of applications and new possibilities into scientific research. Owing to their ability to recognize patterns and structures hidden in complex data, these learning-based strategies make physics exploration with a Big Data or smart-computation mindset feasible. In the context of HENP, which revolves around HIC programs to understand the properties of nuclear matter under different conditions, various research fields have benefited from the incorporation of these techniques.
In this mini-review, we presented the recent progress in the field of HICs, including initial state physics inference, QCD matter transport and bulk properties, thermal medium modifications for partons or hadrons, and recognition of physical observables in HICs.
We first reviewed the different loss functions l used in supervised learning, unsupervised learning, semi-supervised learning, self-supervised learning, and active learning. During training, the negative gradients of the loss function with respect to the model parameters are followed to iteratively update the network weights.
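As a minimal reminder of what following the negative gradients of a loss means in practice, the sketch below trains a linear model by gradient descent on a mean-squared-error loss (all names and data are illustrative):

```python
import numpy as np

def mse_loss(w, X, y):
    """Supervised mean-squared-error loss l(w) for a linear model."""
    r = X @ w - y
    return 0.5 * np.mean(r ** 2)

def grad(w, X, y):
    """Gradient of the loss with respect to the parameters w."""
    return X.T @ (X @ w - y) / len(y)

# training loop: parameters move along the NEGATIVE gradient of the loss
rng = np.random.default_rng(5)
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true                      # noiseless toy targets
w = np.zeros(3)
for _ in range(500):
    w -= 0.1 * grad(w, X, y)        # w <- w - eta * dl/dw
# w converges to w_true as the loss is driven toward its minimum
```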
For the initial condition, ML has been widely used to determine centrality classes and impact parameters from the final-state hadrons in momentum space, and to extract initial nuclear structures such as the nuclear deformation, the α clustering, and the neutron skin. According to the current literature, the nuclear deformation is generally easier to extract than the α clustering and the neutron skin.
For bulk matter, Bayesian parameter estimation has been successfully used to determine the temperature-dependent shear and bulk viscosities of the QGP. An unsupervised autoencoder was used to reconstruct the charged multiplicity distributions, which helps to determine the source temperature and the temperature of the nuclear liquid-gas phase transition. Deep CNNs, point cloud networks, and event-averaging techniques have been employed to classify the crossover and first-order phase transition regions in the QCD phase diagram, using data generated by relativistic hydrodynamic models and hadronic transport models. Active learning has been used to map out thermodynamically unstable regions near the critical endpoint in the QCD phase diagram. For hydrodynamic evolution, a well-designed network called sU-net can capture the nonlinear mapping between the initial and final profiles with sufficient precision while being far faster than traditional hydrodynamic simulations.
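The logic of Bayesian parameter estimation used in such analyses can be condensed into a toy one-parameter example: posterior ∝ likelihood × prior, evaluated on a grid for a hypothetical viscosity-like parameter. The "model" here is an invented linear emulator, not a hydrodynamic calculation.

```python
import numpy as np

def posterior_grid(theta_grid, model, data, sigma, prior):
    """Grid-based Bayesian parameter estimation:
    posterior(theta) ~ exp(-chi2/2) * prior(theta), normalized on the grid."""
    chi2 = np.array([np.sum((model(t) - data) ** 2) / sigma ** 2
                     for t in theta_grid])
    logp = -0.5 * chi2 + np.log(prior(theta_grid))
    p = np.exp(logp - logp.max())   # stabilize before normalizing
    return p / p.sum()

# toy 'emulator': two observables depend linearly on the parameter
model = lambda t: np.array([2.0 * t, 3.0 * t])
truth = 0.16                                   # hypothetical eta/s-like value
rng = np.random.default_rng(6)
data = model(truth) + rng.normal(0, 0.01, 2)   # mock measurement with noise
grid = np.linspace(0.0, 0.4, 401)
post = posterior_grid(grid, model, data, 0.01, lambda t: np.ones_like(t))
theta_map = grid[np.argmax(post)]              # posterior maximum near truth
```

Real analyses replace the grid with MCMC sampling and the linear emulator with a Gaussian-process surrogate of the hydrodynamic model, but the posterior construction is the same.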
For QGP in-medium effects, we first reported some recently proposed ML-based methods for spectral function reconstruction, which is a notoriously ill-posed inverse problem. Both supervised and unsupervised methods were discussed for inferring spectral functions from Euclidean correlator measurements obtained in Monte Carlo simulations (e.g., lattice studies). Then, in-medium heavy-quark interaction inference based on in-medium heavy quarkonium spectroscopy was introduced. A novel DNN representation integrated into the forward problem-solving pipeline with automatic differentiation (AD) was proposed. This strategy has also been used to construct an in-medium quasi-particle effective model from the lattice QCD EoS.
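To make the ill-posedness concrete: the Euclidean correlator is related to the spectral function through G(τ) = ∫ dω K(τ, ω) ρ(ω). The sketch below uses a toy exponential kernel and Tikhonov regularization (a generic stabilizer, not the DNN-based method discussed above) to invert the discretized relation:

```python
import numpy as np

# discretized correlator: G(tau) = sum_j K(tau, omega_j) rho(omega_j) domega
tau = np.linspace(0.05, 1.0, 20)
omega = np.linspace(0.1, 10.0, 100)
domega = omega[1] - omega[0]
K = np.exp(-np.outer(tau, omega)) * domega   # toy zero-temperature-like kernel

# ground-truth spectral function: a single Gaussian peak
rho_true = np.exp(-0.5 * ((omega - 3.0) / 0.5) ** 2)
G = K @ rho_true                             # mock "lattice" correlator data

def tikhonov_invert(K, G, lam):
    """Regularized least squares: argmin ||K rho - G||^2 + lam ||rho||^2.
    The lam term tames the ill-conditioning of the naive inversion."""
    A = K.T @ K + lam * np.eye(K.shape[1])
    return np.linalg.solve(A, K.T @ G)

rho_rec = tikhonov_invert(K, G, 1e-6)
# rho_rec reproduces the correlator, but many rho's fit G almost equally well:
# that degeneracy is exactly the ill-posedness the ML methods try to overcome
```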
For hard probes, Bayesian analysis is widely used to extract the temperature-dependent jet (or heavy-quark) transport coefficient q̂ from comparisons with experimental measurements.
For the observables, PCA has been implemented to study the collective flow in relativistic HICs. It revealed the substructures of the flow fluctuations, which can potentially be used to extract the subleading flow modes with joint efforts from the experimental and theoretical sides. When applied directly to the single-particle distributions, PCA can directly discover flow with a basis similar to the Fourier one, which significantly reduces the mode coupling between different flow harmonics.
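A PCA of event-by-event flow fluctuations can be sketched with an SVD: rows are events, columns are (e.g., p_T) bins of the flow vector, and the leading principal components expose the dominant and subleading fluctuation modes. The mode shapes below are invented for illustration.

```python
import numpy as np

def pca_modes(Q, k=2):
    """Principal component analysis of event-by-event flow vectors:
    rows of Q are events, columns are p_T bins of the flow vector.
    Returns the leading k modes and the variance fraction they carry."""
    Qc = Q - Q.mean(axis=0)                      # center over events
    U, s, Vt = np.linalg.svd(Qc, full_matrices=False)
    var = s ** 2 / np.sum(s ** 2)
    return Vt[:k], var[:k]

rng = np.random.default_rng(7)
n_events, n_bins = 2000, 10
leading = np.linspace(1.0, 2.0, n_bins)          # toy leading flow mode
sub = np.sin(np.linspace(0, np.pi, n_bins))      # toy subleading mode
Q = (rng.normal(1, 0.3, (n_events, 1)) * leading
     + rng.normal(0, 0.05, (n_events, 1)) * sub
     + rng.normal(0, 0.01, (n_events, n_bins)))  # small bin-by-bin noise
modes, var = pca_modes(Q)
# the first principal component dominates and aligns with the leading mode
```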
Outlook
Despite the impressive progress, the interplay between HENP and ML is still evolving rapidly. Many questions and challenges remain and deserve further exploration. In addition to the aforementioned applications of ML in the field of HICs, several other topics can be explored with ML, e.g., the search for the critical endpoint in the eRHIC and Electron-Ion Collider (EIC) regime [173], spin polarization studies, the upcoming FAIR program, the Nuclotron-based Ion Collider facility (NICA) experiments, nuclear structure inference, and the High Intensity Heavy-ion Accelerator Facility (HIAF) experiments in China. Regarding the future prospects of applying ML techniques to HIC physics research, because this field is rapidly evolving, we present questions that we consider worthy of future investigation:
• Can ML provide more efficient “observables” to pin down the desired physics?
• Can the algorithms provide new physical knowledge to advance our understanding of nuclear matter?
• How can ML algorithms be confronted with realistic experiments? Is online analysis possible? How can experimental raw data be accessed to test neural networks pretrained with model simulations?
• Is it possible to accelerate HIC dynamical simulations for performing high statistics measurements or Bayesian inference?
• How can Bayesian inference be combined with ML to advance our field and better connect experiment to theory?
• How can symmetries be fully incorporated into the analysis using ML, e.g., Lorentz Group Equivariant Autoencoders [174]? How can dimensionality analysis (constraints) be incorporated into the ML methods properly and consistently?
It is also important to consider how we can adopt potentially useful approaches from other fields, e.g., particle physics, condensed-matter physics, and astrophysics, and how the community can better organize with joint efforts, e.g., for maximizing the potential of these novel computational techniques to advance the field of HENP.
Shared Data and Algorithms for Deep Learning in Fundamental Physics
. Comput. Softw. Big Sci. 6, 9 (2022). doi: 10.1007/s41781-022-00082-6Lattice Gauge Equivariant Convolutional Neural Networks
. Phys. Rev. Lett. 128, 032003 (2022). doi: 10.1103/PhysRevLett.128.032003Machine learning the nuclear mass
. Nucl. Sci. Tech. 32, 109 (2021). doi: 10.1007/s41365-021-00956-1Enhanced search sensitivity to the double beta decay of 136xe to excited states with topological signatures
. Science China Physics, Mechanics Astronomy 64, 261011 (2021). doi: 10.1007/s11433-020-1693-6Nuclear mass based on the multi-task learning neural network method
. Nucl. Sci. Tech. 33, 48 (2022). doi: 10.1007/s41365-022-01031-zMulti-task learning on nuclear masses and separation energies with the kernel ridge regression
. Physics Letters B 834, 137394 (2022). doi: 10.1016/j.physletb.2022.137394Fast nuclide identification based on a sequential bayesian method
. Nucl. Sci. Tech. 32, 143 (2021).Application of machine learning to study the effects of quadrupole deformation on the nucleus in heavy-ion collisions at intermediate energies
. Science China Physics, Mechanics Astronomy 52, 252010-(2022). doi: 10.1360/SSPMA-2021-0308β-decay half-lives studied using neural network method
. Science China Physics, Mechanics Astronomy 52, 252006-(2022). doi: 10.1360/SSPMA-2021-0299Nuclear liquid-gas phase transition with machine learning
. Phys. Rev. Research 2, 043202 (2020). doi: 10.1103/PhysRevResearch.2.043202Determining temperature in heavy ion collisions with multiplicity distribution
. Phys. Lett. B 814, 136084 (2021). doi: 10.1016/j.physletb.2021.136084Machine learning in nuclear physics at low and intermediate energies.
(2023). arXiv:2301.06396Deep learning exotic hadrons
. Phys. Rev. D 105, L091501 (2022). doi: 10.1103/PhysRevD.105.L091501Approach the Gell-Mann-Okubo formula with machine learning
. Chin. Phys. Lett. 39, 111201 (2022). doi: 10.1088/0256-307X/39/11/111201Symmetry discovery with deep learning
. Phys. Rev. D 105, 096031 (2022). doi: 10.1103/PhysRevD.105.096031A machine learning study to identify spinodal clumping in high energy nuclear collisions
. JHEP 12, 122 (2019). doi: 10.1007/JHEP12(2019)122Deep Learning Based Impact Parameter Determination for the CBM Experiment
. Particles 4, 47-52 (2021). doi: 10.3390/particles4010006Probing criticality with deep learning in relativistic heavy-ion collisions.
(2021). arXiv:2107.11828Auto-Encoding Variational Bayes
. (2014). arXiv:1312.6114Reconstruction of three-dimensional porous media using generative adversarial neural networks
. Phys. Rev. E 96, 043309 (2017). doi: 10.1103/PhysRevE.96.043309Deep neural networks for direct, featureless learning through observation: The case of two-dimensional spin models
. Phys. Rev. E 97, 032119 (2018). doi: 10.1103/PhysRevE.97.032119Learning Particle Physics by Example: Location-Aware Generative Adversarial Networks for Physics Synthesis
. Comput. Softw. Big Sci. 1, 4 (2017). doi: 10.1007/s41781-017-0004-6Accelerating Science with Generative Adversarial Networks: An Application to 3D Particle Showers in Multilayer Calorimeters
. Phys. Rev. Lett. 120, 042003 (2018). doi: 10.1103/PhysRevLett.120.042003Enabling Dark Energy Science with Deep Generative Models of Galaxy Images.
(2017). arXiv:1609.05796CosmoGAN: creating high-fidelity weak lensing convergence maps using Generative Adversarial Networks
. Computational Astrophysics and Cosmology 6, 1 (2019). doi: 10.1186/s40668-019-0029-9Regressive and generative neural networks for scalar field theory
. Phys. Rev. D 100, 011501 (2019). doi: 10.1103/PhysRevD.100.011501Reducing Autocorrelation Times in Lattice Simulations with Generative Adversarial Networks
. Mach. Learn. Sci. Tech. 1, 045011 (2020). doi: 10.1088/2632-2153/abae73MADE: Masked Autoencoder for Distribution Estimation
. arXiv e-prints arXiv:1502.03509 (2015).Proceedings of the 30th International Conference on Neural Information Processing Systems, Conditional image generation with pixelcnn decoders
. NIPS’16, (Proceedings of the 33rd International Conference on International Conference on Machine Learning - Volume 48, Pixel recurrent neural networks
. ICML’16, (JMLR.org, 2016), p. 1747-1756Wavenet: A generative model for raw audio.
. (2016). arXiv:1609.03499Continuous-Mixture Autoregressive Networks Learning the Kosterlitz-Thouless Transition
. Chin. Phys. Lett. 39, 120502 (2022). doi: 10.1088/0256-307X/39/12/120502NICE: Non-linear Independent Components Estimation
. arXiv e-prints arXiv:1410.8516 (2014).Variational Inference with Normalizing Flows
. arXiv e-prints arXiv:1505.05770 (2015).Density estimation using Real NVP
. arXiv e-prints arXiv:1605.08803 (2016).Flow-based generative models for Markov chain Monte Carlo in lattice field theory
. Phys. Rev. D 100, 034515 (2019). doi: 10.1103/PhysRevD.100.034515Equivariant flow-based sampling for lattice gauge theory
. Phys. Rev. Lett. 125, 121601 (2020). doi: 10.1103/PhysRevLett.125.121601Sampling using SU(N) gauge equivariant flows
. Phys. Rev. D 103, 074504 (2021). doi: 10.1103/PhysRevD.103.074504Fourier-Flow model generating Feynman paths
. (2022). arXiv:2211.03470A Tutorial on Principal Component Analysis
. (2014). arXiv:1404.1100Determination of the impact parameter in high-energy heavy-ion collisions via deep learning
. Chin. Phys. C 46, 074110 (2022). doi: 10.1088/1674-1137/ac6490Application of artificial intelligence in the determination of impact parameter in heavy-ion collisions at intermediate energies
. J. Phys. G 47, 115104 (2020). doi: 10.1088/1361-6471/abb1f9Fluctuation mechanism and reconstruction of impact parameter distributions with two-observables for intermediate energy heavy ion collisions
. (2022). arXiv:2201.12586Pointnet: Deep learning on point sets for 3d classification and segmentation
. 2017, pp. 77–85. doi: 10.1109/CVPR.2017.16A fast centrality-meter for heavy-ion collisions at the CBM experiment
. Phys. Lett. B 811, 135872 (2020). doi: 10.1016/j.physletb.2020.135872Interpretable deep learning for nuclear deformation in heavy ion collisions
. (2019). arXiv:1906.06429How does Bayesian analysis infer the nucleon distributions in isobar collisions? (2023)
. arXiv:2301.03910Evidence of Quadrupole and Octupole Deformations in Zr96+Zr96 and Ru96+Ru96 Collisions at Ultrarelativistic Energies
. Phys. Rev. Lett. 128, 022301 (2022). doi: 10.1103/PhysRevLett.128.022301Giant dipole resonance as a fingerprint of clustering configurations in 12c and 16o
. Phys. Rev. Lett. 113, 032506 (2014). doi: 10.1103/PhysRevLett.113.032506α-clustering effect on flows of direct photons in heavy-ion collisions
. Nucl. Sci. Tech. 32, 66 (2021). doi: 10.1007/s41365-021-00897-9Nuclear cluster structure effect on elliptic and triangular flows in heavy-ion collisions
. Phys. Rev. C 95, 064904 (2017). doi: 10.1103/PhysRevC.95.064904Machine-learning-based identification for initial clustering structure in relativistic heavy-ion collisions
. Phys. Rev. C 104, 044902 (2021). doi: 10.1103/PhysRevC.104.044902Accurate Determination of the Neutron Skin Thickness of 208Pb through Parity-Violation in Electron Scattering
. Phys. Rev. Lett. 126, 172502 (2021). doi: 10.1103/PhysRevLett.126.172502Bayesian inference of nucleus resonance and neutron skin
. (2023). arXiv:2301.07884Bayesian inference of nuclear symmetry energy from measured and imagined neutron skin thickness in 116,118,120,122,124,130,132Sn, 208Pb, and 48Ca
. Phys. Rev. C 102, 044316 (2020). doi: 10.1103/PhysRevC.102.044316Constraints on the symmetry energy and neutron skins from experiments and theory
. Phys. Rev. C 86, 015803 (2012). doi: 10.1103/PhysRevC.86.015803Neutron skin thickness of 90Zr and symmetry energy constrained by charge exchange spin-dipole excitations
. Chin. Phys. C 47, 024102 (2023). doi: 10.1088/1674-1137/aca38eSupernova neutrinos as a precise probe of nuclear neutron skin
. Phys. Rev. D 106, 123034 (2022). doi: 10.1103/PhysRevD.106.123034Nuclear fragmentation reactions as a probe of neutron skins in nuclei
. Eur. Phys. J. A 58, 205 (2022). doi: 10.1140/epja/s10050-022-00849-wDetermination of the 27Al Neutron Distribution Radius from a Parity-Violating Electron Scattering Measurement
. Phys. Rev. Lett. 128, 132501 (2022). doi: 10.1103/PhysRevLett.128.132501Probing neutron skin and symmetry energy with relativistic isobar collisions
. 2023. arXiv:2301.08303Peeling away neutron skin in ultracentral collisions of relativistic nuclei
. Eur. Phys. J. A 58, 184 (2022). doi: 10.1140/epja/s10050-022-00832-5Probing neutron-skin thickness with free spectator neutrons in ultracentral high-energy isobaric collisions
. Phys. Lett. B 834, 137441 (2022). doi: 10.1016/j.physletb.2022.137441Determining the neutron skin types using deep learning and nuclear collisions: An attempt
. Science China Physics, Mechanics Astronomy 52, 252011-(2022). doi: 10.1360/SSPMA-2021-0318Dissipative hydrodynamics for viscous relativistic fluids
. Phys. Rev. C 73, 034904 (2006). doi: 10.1103/PhysRevC.73.034904Viscosity Information from Relativistic Nuclear Collisions: How Perfect is the Fluid Observed at RHIC?
Phys. Rev. Lett. 99, 172301 (2007). doi: 10.1103/PhysRevLett.99.172301The Effects of viscosity on spectra, elliptic flow, and HBT radii
. Phys. Rev. C 68, 034913 (2003). doi: 10.1103/PhysRevC.68.034913Phenomenological study of the anisotropic quark matter in the two-flavor Nambu-Jona-Lasinio model
. Nucl. Sci. Tech. 33, 150 (2022). doi: 10.1007/s41365-022-01129-4Thermodynamic properties and shear viscosity over entropy-density ratio of the nuclear fireball in a quantum-molecular dynamics model
. Phys. Rev. C 88, 024604 (2013). doi: 10.1103/PhysRevC.88.024604Shear viscosity of hot nuclear matter by the mean free path method
. Phys. Rev. C 89, 047601 (2014). doi: 10.1103/PhysRevC.89.047601Impact of fragment formation on shear viscosity in the nuclear liquid-gas phase transition region
. Phys. Rev. C 105, 064613 (2022). doi: 10.1103/PhysRevC.105.064613Applying Bayesian parameter estimation to relativistic heavy-ion collisions: simultaneous characterization of the initial state and quark-gluon plasma medium
. Phys. Rev. C 94, 024907 (2016). doi: 10.1103/PhysRevC.94.024907Bayesian estimation of the specific shear and bulk viscosity of quark–gluon plasma
. Nature Phys. 15, 1113-1117 (2019). doi: 10.1038/s41567-019-0611-8Bayesian Inference of the Specific Shear and Bulk Viscosities of the Quark-Gluon Plasma at Crossover from φ and Ω Observables
. (2022). arXiv:2207.13534The QCD EoS of dense nuclear matter from Bayesian analysis of heavy ion collision data
. (2022). arXiv:2211.11670An equation-of-state-meter of quantum chromodynamics transition from deep learning
. Nature Commun. 9, 210 (2018). doi: 10.1038/s41467-017-02726-3Effects of initial flow velocity fluctuation in event-by-event (3+1)D hydrodynamics
. Phys. Rev. C 86, 024911 (2012). doi: 10.1103/PhysRevC.86.024911The iEBE-VISHNU code package for relativistic heavy-ion collisions
. Comput. Phys. Commun. 199, 61-85 (2016). doi: 10.1016/j.cpc.2015.08.0393d structure of jet-induced diffusion wake in an expanding quark-gluon plasma
. Phys. Rev. Lett. 130, 052301 (2023). doi: 10.1103/PhysRevLett.130.0523013d wakes on the femtometer scale by supersonic jets
. Nucl. Sci. Tech. 34, 22 (2023). doi: 10.1007/s41365-023-01182-7Identifying the nature of the QCD transition in relativistic collision of heavy nuclei with deep learning
. Eur. Phys. J. C 80, 516 (2020). doi: 10.1140/epjc/s10052-020-8030-7Identifying the nature of the QCD transition in heavy-ion collisions with deep learning
. Nucl. Phys. A 1005, 121891 (2021). doi: 10.1016/j.nuclphysa.2020.121891A machine learning study on spinodal clumping in heavy ion collisions
. Nucl. Phys. A 1005, 121867 (2021). doi: 10.1016/j.nuclphysa.2020.121867Deep learning stochastic processes with QCD phase transition
. Phys. Rev. D 103, 116023 (2021). doi: 10.1103/PhysRevD.103.116023An equation-of-state-meter for CBM using PointNet
. JHEP 21, 184 (2020). doi: 10.1007/JHEP10(2021)184Unsupervised Outlier Detection in Heavy-Ion Collisions
. Phys. Scripta 96, 064003 (2021). doi: 10.1088/1402-4896/abf214Finding signatures of the nuclear symmetry energy in heavy-ion collisions with deep learning
. Phys. Lett. B 822, 136669 (2021). doi: 10.1016/j.physletb.2021.136669Mapping out the thermodynamic stability of a QCD equation of state with a critical point using active learning
. (2022). arXiv:2203.13876Applications of deep learning to relativistic hydrodynamics
. Phys. Rev. Res. 3, 023256 (2021). doi: 10.1103/PhysRevResearch.3.023256Applications of deep learning to relativistic hydrodynamics
. Nucl. Phys. A 982, 927-930 (2019). doi: 10.1016/j.nuclphysa.2018.11.004New Developments in Relativistic Viscous Hydrodynamics
. Int. J. Mod. Phys. E 19, 1-53 (2010). doi: 10.1142/S0218301310014613Collective flow and viscosity in relativistic heavy-ion collisions
. Ann. Rev. Nucl. Part. Sci. 63, 123-151 (2013). doi: 10.1146/annurev-nucl-102212-170540Hydrodynamic Modeling of Heavy-Ion Collisions
. Int. J. Mod. Phys. A 28, 1340011 (2013). doi: 10.1142/S0217751X13400113Hydrodynamic modelling for relativistic heavy-ion collisions at RHIC and LHC
. Pramana 84, 703-715 (2015). doi: 10.1007/s12043-015-0971-2Collective flow and hydrodynamics in large and small systems at the LHC
. Nucl. Sci. Tech. 28, 99 (2017). doi: 10.1007/s41365-017-0245-4Hydrodynamic description of ultrarelativistic heavy ion collisions
. 634–714 (2003). arXiv:nucl-th/0305084Causal Viscous Hydrodynamics for Relativistic Heavy Ion Collisions
. Other thesis.Causal viscous hydrodynamics in 2+1 dimensions for relativistic heavy-ion collisions
. Phys. Rev. C 77, 064901 (2008). doi: 10.1103/PhysRevC.77.064901Suppression of elliptic flow in a minimally viscous quark-gluon plasma
. Phys. Lett. B 658, 279-283 (2008). doi: 10.1016/j.physletb.2007.11.019Glauber modeling in high energy nuclear collisions
. Ann. Rev. Nucl. Part. Sci. 57, 205-243 (2007). doi: 10.1146/annurev.nucl.57.090506.123020Eccentricity fluctuation effects on elliptic flow in relativistic heavy ion collisions
. Phys. Rev. C 79, 064904 (2009). doi: 10.1103/PhysRevC.79.064904Effects of fluctuations on the initial eccentricity from the Color Glass Condensate in heavy ion collisions
. Phys. Rev. C 75, 034905 (2007). doi: 10.1103/PhysRevC.75.034905High-order flow harmonics of identified hadrons in 2.76A TeV Pb + Pb collisions
. Phys. Rev. C 93, 064905 (2016). doi: 10.1103/PhysRevC.93.064905Collective flow in 2.76 A TeV and 5.02 A TeV Pb+Pb collisions
. Eur. Phys. J. C 77, 645 (2017). doi: 10.1140/epjc/s10052-017-5186-xAlternative ansatz to wounded nucleon and binary collision scaling in high-energy nuclear collisions
. Phys. Rev. C 92, 011901 (2015). doi: 10.1103/PhysRevC.92.011901Analytic continuation via domain knowledge free machine learning
. Phys. Rev. B 98, 245101 (2018). doi: 10.1103/PhysRevB.98.245101Artificial neural network approach to the analytic continuation problem
. Phys. Rev. Lett. 124, 056401 (2020). doi: 10.1103/PhysRevLett.124.056401Spectral Reconstruction with Deep Neural Networks
. Phys. Rev. D 102, 096001 (2020). doi: 10.1103/PhysRevD.102.096001Reconstructing spectral functions via automatic differentiation
. Phys. Rev. D 106, L051502 (2022). doi: 10.1103/PhysRevD.106.L051502Reconstructing QCD spectral functions with Gaussian processes
. Phys. Rev. D 105, 036014 (2022). doi: 10.1103/PhysRevD.105.036014Ghost spectral function from the spectral Dyson-Schwinger equation
. Phys. Rev. D 104, 074017 (2021). doi: 10.1103/PhysRevD.104.074017Reconstructing the gluon
. SciPost Phys. 5, 065 (2018). doi: 10.21468/SciPostPhys.5.6.065Automatic differentiation approach for reconstructing spectral functions with neural networks
. 2021. arXiv:2112.06206Rethinking the ill-posedness of the spectral function reconstruction — Why is it fundamentally hard and how Artificial Neural Networks can help
. Comput. Phys. Commun. 282, 108547 (2023). doi: 10.1016/j.cpc.2022.108547Neural network reconstruction of the dense matter equation of state from neutron star observables
. JCAP 08, 071 (2022). doi: 10.1088/1475-7516/2022/08/071Reconstructing the neutron star equation of state from observational data via automatic differentiation
. (2022). arXiv:2209.08883Continuum-extrapolated NNLO valence PDF of the pion at the physical point
. Phys. Rev. D 106, 114510 (2022). doi: 10.1103/PhysRevD.106.114510Application of radial basis functions neutral networks in spectral functions
. Phys. Rev. D 104, 076011 (2021). doi: 10.1103/PhysRevD.104.076011Heavy quark potential in the quark-gluon plasma: Deep neural network meets lattice quantum chromodynamics
. Phys. Rev. D 105, 014017 (2022). doi: 10.1103/PhysRevD.105.014017Excited bottomonia in quark-gluon plasma from lattice QCD
. Phys. Lett. B 800, 135119 (2020). doi: 10.1016/j.physletb.2019.135119Improved Gauss law model and in-medium heavy quarkonium at finite density and velocity
. Phys. Rev. D 101, 056010 (2020). doi: 10.1103/PhysRevD.101.056010Three learning phases for radial-basis-function networks
. Neural networks 14, 439-458 (2001).A point interpolation meshless method based on radial basis functions
. International Journal for Numerical Methods in Engineering 54, 1623-1648 (2002).Surface interpolation with radial basis functions for medical imaging
. IEEE transactions on medical imaging 16, 96-107 (1997).Deep rbfnet: Point cloud feature learning using radial basis functions (2018)
. arXiv preprint arXiv:1812.04302.Analytic continuation via domain knowledge free machine learning
. Phys. Rev. B 98, 245101 (2018). doi: 10.1103/PhysRevB.98.245101Artificial neural network approach to the analytic continuation problem
. Phys. Rev. Lett. 124, 056401 (2020). doi: 10.1103/PhysRevLett.124.056401Medium effects on charmonium production at ultrarelativistic energies available at the CERN Large Hadron Collider
. Phys. Rev. C 89, 054911 (2014). doi: 10.1103/PhysRevC.89.054911Heavy flavors under extreme conditions in high energy nuclear collisions
. Prog. Part. Nucl. Phys. 114, 103801 (2020). doi: 10.1016/j.ppnp.2020.103801Real-time static potential in hot QCD
. JHEP 03, 054 (2007). doi: 10.1088/1126-6708/2007/03/054Real and imaginary-time Q anti-Q correlators in a thermal medium
. Nucl. Phys. A 806, 312-338 (2008). doi: 10.1016/j.nuclphysa.2008.03.001Static quark-antiquark pairs at finite temperature
. Phys. Rev. D 78, 014017 (2008). doi: 10.1103/PhysRevD.78.014017Heavy Quarkonium in a weakly-coupled quark-gluon plasma below the melting temperature
. JHEP 09, 038 (2010). doi: 10.1007/JHEP09(2010)038Mean Field Effect on J/Ψ Production in Heavy Ion Collisions
. Phys. Rev. C 86, 034906 (2012). doi: 10.1103/PhysRevC.86.034906Deep-learning quasi-particle masses from QCD equation of state
. (2022). arXiv:2211.07994Radiative energy loss of high-energy quarks and gluons in a finite volume quark - gluon plasma
. Nucl. Phys. B 483, 291-320 (1997). doi: 10.1016/S0550-3213(96)00553-6Radiative energy loss and p(T) broadening of high-energy partons in nuclei
. Nucl. Phys. B 484, 265-282 (1997). doi: 10.1016/S0550-3213(96)00581-0Multiple collisions and induced gluon Bremsstrahlung in QCD
. Nucl. Phys. B 420, 583-614 (1994). doi: 10.1016/0550-3213(94)90079-5Multiple scattering, parton energy loss and modified fragmentation functions in deeply inelastic e A scattering
. Phys. Rev. Lett. 85, 3591-3594 (2000). doi: 10.1103/PhysRevLett.85.3591Gluon radiation off hard quarks in a nuclear environment: Opacity expansion
. Nucl. Phys. B 588, 303-344 (2000). doi: 10.1016/S0550-3213(00)00457-0Data-driven analysis for the temperature and momentum dependence of the heavy-quark diffusion coefficient in relativistic heavy-ion collisions
. Phys. Rev. C 97, 014907 (2018). doi: 10.1103/PhysRevC.97.014907Bayesian extraction of jet energy loss distributions in heavy-ion collisions
. Phys. Rev. Lett. 122, 252302 (2019). doi: 10.1103/PhysRevLett.122.252302Bayesian extraction of q^ with multi-stage jet evolution approach
. PoS 2018, 048 (2019). doi: 10.22323/1.345.0048Information field based global Bayesian inference of the jet transport coefficient.
(2022). arXiv:2206.01340Global constraint on the jet transport coefficient from single hadron, dihadron and γ-hadron spectra in high-energy heavy-ion collisions
. (2022). arXiv:2208.14419A Living Review of Machine Learning for Particle Physics.
(2021). arXiv:2102.02770Applications of deep learning in jet quenching
. Science China Physics, Mechanics Astronomy 52, 252017-(2022). doi: 10.1360/SSPMA-2022-0046Deep learning jet modifications in heavy-ion collisions
. JHEP 21, 206 (2020). doi: 10.1007/JHEP03(2021)206Jet Tomography in Heavy-Ion Collisions with Deep Learning
. Phys. Rev. Lett. 128, 012301 (2022). doi: 10.1103/PhysRevLett.128.012301Deep learning assisted jet tomography for the study of Mach cones in QGP
. (2022). arXiv:2206.02393Linear Boltzmann Transport for Jet Propagation in the Quark-Gluon Plasma: Elastic Processes and Medium Recoil
. Phys. Rev. C 91, 054908 (2015). [Erratum: Phys.Rev.C 97, 019902 (2018)]. doi: 10.1103/PhysRevC.91.054908Multistage Monte-Carlo simulation of jet modification in a static medium
. Phys. Rev. C 96, 024909 (2017). doi: 10.1103/PhysRevC.96.024909QLBT: a linear Boltzmann transport model for heavy quarks in a quark-gluon plasma of quasi-particles
. Eur. Phys. J. C 82, 350 (2022). doi: 10.1140/epjc/s10052-022-10308-xBreaking of factorization of two-particle correlations in hydrodynamics
. Phys. Rev. C 87, 031901 (2013). doi: 10.1103/PhysRevC.87.031901Collective phenomena in non-central nuclear collisions
. Landolt-Bornstein 23, 293-333 (2010). doi: 10.1007/978-3-642-01539-7_10Elliptic Flow: A Brief Review
. New J. Phys. 13, 055008 (2011). doi: 10.1088/1367-2630/13/5/055008Event-shape fluctuations and flow correlations in ultra-relativistic heavy-ion collisions
. J. Phys. G 41, 124003 (2014). doi: 10.1088/0954-3899/41/12/124003Principal Component Analysis of collective flow in Relativistic Heavy-Ion Collisions
. Eur. Phys. J. C 79, 870 (2019). doi: 10.1140/epjc/s10052-019-7379-yPrincipal component analysis of event-by-event fluctuations
. Phys. Rev. Lett. 114, 152301 (2015). doi: 10.1103/PhysRevLett.114.152301Subleading harmonic flows in hydrodynamic simulations of heavy ion collisions
. Phys. Rev. C 91, 044902 (2015). doi: 10.1103/PhysRevC.91.044902Fluctuations of harmonic and radial flow in heavy ion collisions with principal components
. Phys. Rev. C 93, 024913 (2016). doi: 10.1103/PhysRevC.93.024913Principal component analysis of the nonlinear coupling of harmonic modes in heavy-ion collisions
. Phys. Rev. C 97, 034905 (2018). doi: 10.1103/PhysRevC.97.034905Principal-component analysis of two-particle azimuthal correlations in PbPb and p Pb collisions at CMS
. Phys. Rev. C 96, 064902 (2017). doi: 10.1103/PhysRevC.96.064902Robustness of principal component analysis of harmonic flow in heavy ion collisions
. Phys. Rev. C 102, 024911 (2020). doi: 10.1103/PhysRevC.102.024911Detecting the chiral magnetic effect via deep learning
. Phys. Rev. C 106, L051901 (2022). doi: 10.1103/PhysRevC.106.L051901Machine learning-based jet and event classification at the Electron-Ion Collider with applications to hadron structure and spin physics
. J. High Energy Phys. 3, 1-35 (2023).Lorentz Group Equivariant Autoencoders
. (2022). arXiv:2212.07347
All authors declare that there are no competing interests.