Skip to content

Cellular Automata (CA) Solver

Overview

The CA solver evolves the grain microstructure that emerges as the laser-melted pool resolidifies. It works on a uniform 3D voxel grid that overlaps a sub-volume of the thermal-fluid domain and updates each cell's grain assignment based on undercooling-driven nucleation and crystallographic envelope growth.

The solver is implemented in Fortran with OpenMP, lives in code_base/solver_CA/, and runs as a separate process spawned from run.sh (PHOENIX_RUN_MODE=ca). It consumes the thermal field that the thermal-fluid solver writes to a binary frame queue — see Multiphysics Coupling for the inter-process protocol.

Module File Purpose
ca_grid mod_ca_grid.f90 Uniform CA mesh + cell index helpers
ca_state mod_ca_state.f90 Per-cell state: orient_id, temperature, etc.
ca_orient mod_ca_orient.f90 Append-only crystallographic orientation list (Bunge Euler angles)
ca_voronoi mod_ca_voronoi.f90 Initial substrate Voronoi tessellation
ca_octahedron mod_ca_octahedron.f90 Octahedral envelope geometry / capture test
ca_nucleation mod_ca_nucleation.f90 Surface and bulk nucleation Gaussian distributions
ca_solve mod_ca_solve.f90 Main time-integration loop (capture, growth, dt control)
ca_coupling mod_ca_coupling.f90 Reads thermal binary frames, trilinear + temporal interpolation
ca_vtr mod_ca_vtr.f90 VTK XML ImageData writer (zlib-compressed, pthread-async)
ca_driver mod_ca_driver.f90 Top-level orchestration: namelist parse → init → solve → close

Enabling the CA Solver

In code_base/inputfile/input_param.txt:

&ca_params CA_flag=1 /
&ca_params CA_flag=1, ca_export_every=1, ca_export_padding=2 /

In code_base/solver_CA/inputfile/input_param_ca.txt (selected blocks):

&ca_domain
  ca_origin    = 4.5d-4, 4.5d-4, 5.5d-4,
  ca_lateral   = 1.1d-3, 1.1d-3, 1.5d-4,
  ca_cell_size = 2.0d-6,
  ca_num_cells = 550, 550, 75
/

&ca_voronoi
  target_grain_size = 10.0d-6,
  voronoi_seed_in   = 42
/

&ca_time
  termination_time_in   = 2.5d-2,
  time_step             = 1.0d-6,
  time_step_factor_in   = 0.5d0
/

&ca_thermal_source
  coupling_dir          = '',
  frame_poll_interval_s = 1.0e-4
/

&ca_output
  output_frequency_in = 100
/
Parameter Description
ca_origin / ca_lateral CA volume corner + extent (m). Must lie inside the thermal-fluid domain and cover the scan footprint (+ a halo) for the coupling sampler to find valid temperatures.
ca_cell_size Uniform CA voxel size (m). Typical 2–5 μm for Ti-6Al-4V LPBF — fine enough to resolve dendrite-scale columnar growth.
ca_num_cells Voxel count per axis. ca_lateral / ca_cell_size must equal ca_num_cells.
target_grain_size Mean equivalent-sphere diameter for the Voronoi initial substrate (m). The seed count = volume / (4/3·π·(d/2)³).
voronoi_seed_in RNG seed; same value reproduces the initial Voronoi exactly.
termination_time_in Sim time to stop (s). Should match thermal timax.
frame_poll_interval_s Sleep between poll attempts when CA is waiting on the next thermal frame (s).
output_frequency_in Write a VTI frame every N CA substeps (ca_solve substep-internal counter, not thermal-step).

Governing Equations

CA does not solve a partial differential equation; it is an explicit voxel-update rule for a finite set of grain-orientation states, driven by the local thermal field. The equations below define those rules.

Local undercooling

For a cell at temperature \(T(\mathbf{x}, t)\) inside the mushy zone:

\[\Delta T(\mathbf{x}, t) = T_\text{liq} - T(\mathbf{x}, t)\]

where \(T_\text{liq}\) comes from the thermal-fluid material properties (constant scalar in single-alloy mode; tliquid_field(x) when species_flag=1). \(\Delta T < 0\) ⇒ cell is liquid, no growth or nucleation.

Envelope growth velocity (Kurz–Giovanola–Trivedi / Gandin–Rappaz)

Each grain \(g\) is represented by an octahedron centred on its nucleation site, with [100] crystal axes aligned to its Euler-angle orientation. Its half-diagonal \(L_g\) advances at:

\[v(\Delta T) = a_1 + a_2 \, \Delta T + a_3 \, \Delta T^2\]
\[\frac{dL_g}{dt} = v\bigl(\Delta T(\mathbf{c}_g, t)\bigr)\]

Default coefficients (Ti-6Al-4V, in mod_ca_solve.f90):

Symbol Value Unit
\(a_1\) 0.0 m s⁻¹
\(a_2\) 2.03 × 10⁻⁴ m s⁻¹ K⁻¹
\(a_3\) 5.44 × 10⁻⁵ m s⁻¹ K⁻²

Time integration is explicit Euler:

\[L_g^{n+1} = L_g^n + v(\Delta T_g^n) \, \Delta t\]

Capture criterion (octahedral envelope)

A liquid cell at position \(\mathbf{p}\) is captured by grain \(g\) when its position, rotated into the grain's crystal frame, lies inside the octahedron \(\|\xi\|_1 \le L_g\) in \(\ell^1\):

\[\bigl\| \mathbf{R}_g \,(\mathbf{p} - \mathbf{c}_g) \bigr\|_1 \;=\; \sum_{i=1}^{3} \bigl| \bigl(\mathbf{R}_g\,(\mathbf{p} - \mathbf{c}_g)\bigr)_i \bigr| \;\le\; L_g\]

where \(\mathbf{R}_g\) is the orthonormal rotation matrix from the lab frame to the grain's crystal frame, computed from Bunge Euler angles \((\varphi_1, \Phi, \varphi_2)\) as:

\[\mathbf{R}_g = R_z(\varphi_2)\,R_x(\Phi)\,R_z(\varphi_1)\]

When captured, the cell inherits \(g\)'s cell_orient_id and the capture event re-seeds a smaller daughter octahedron at the cell centre (preserving subsequent CCW-anisotropic growth). When T rises back above \(T_\text{liq}\) the cell remelts; its previous grain id is preserved so an arriving capture wave can resume the same orientation (epitaxial regrowth).

Nucleation density (Thévoz–Rappaz Gaussian)

Pre-computed surface and bulk site populations are sampled from a normal distribution of activation undercoolings. Each population has its own integrated total density \(n_\text{max}\), mean activation \(\Delta T_\text{max}\), and spread \(\sigma\):

\[\frac{dN}{d(\Delta T)} \;=\; \frac{n_\text{max}}{\sigma\sqrt{2\pi}} \exp\!\left( -\frac{(\Delta T - \Delta T_\text{max})^2}{2\sigma^2} \right)\]

A site fires the moment its sampled \(\Delta T_\text{site}\) is exceeded by the local \(\Delta T(\mathbf{x}, t)\), appending a fresh random orientation to orient_list and seeding a tiny octahedron at the site location.

Population Default mean \(\Delta T_\text{max}\) Default \(\sigma\) Where sites land
Surface 1.0 K 0.5 K Substrate-side faces of cells with has_been_melted=true
Bulk 5.0 K 2.0 K Anywhere inside the melt-pool volume

Densities \(n_\text{max}\) are namelist inputs (surface_site_density, bulk_site_density, m⁻² and m⁻³ respectively).

Adaptive time step (CFL-like)

\[\Delta t = \min\!\Bigl(\,\Delta t_\text{cap},\; f_c \cdot \dfrac{h}{v_\text{max}^n}\Bigr)\]

with \(h\) = ca_cell_size, \(f_c\) = time_step_factor_in (typically 0.5), \(v_\text{max}^n\) = max growth velocity across all active grains at step \(n\), and \(\Delta t_\text{cap}\) = time_step (hard ceiling). The 0.5 factor keeps any envelope from advancing more than half a cell per step, so the capture sweep stays one-cell-local.

Workflow

CA runs in two phases within a single executable invocation:

  1. Initial-grain generation (Voronoi):
  2. Build a Voronoi tessellation from n_grains = volume / target_grain_size³.
  3. Each grain gets a random Bunge Euler triplet (phi1, phi, phi2) appended to orient_list.
  4. Cells get a cell_orient_id pointing into that list.
  5. The result is dumped once to result/<case>/CA_results/initial_grain/.

  6. Time integration (capture + growth):

  7. At each CA substep, sample the thermal field at every cell center (trilinear in space + linear in time across two consecutive ca_input_*.bin frames produced by the thermal-fluid solver).
  8. Where T > tliquid: cell is liquid, cell_orient_id = 0 (no grain).
  9. Where solid neighbours touch a liquid cell during cooling: nucleate or capture.
    • Capture: an existing grain's octahedral envelope sweeps across the cell → cell inherits that grain's orient_id.
    • Nucleation: triggered by surface (deltaTs) or bulk (deltaTv) Gaussian undercooling; appends a new orientation to orient_list.
  10. Octahedral envelopes grow at v(ΔT) = a·ΔT² + b·ΔT³ until they capture neighbours or shrink (in case of remelt).
  11. Dump VTI every output_frequency_in substeps to result/<case>/CA_results/scan_grain/grainsTimeStates/.

The two phases share orient_list — initial Voronoi orientations stay valid through the scan; new ones are appended as nucleation fires.

Output

File Description
CA_results/initial_grain/ One-shot dump of the Voronoi substrate (microstructureInfo*.txt + diagnostics)
CA_results/scan_grain/Grains.vti.series ParaView-ready time-series JSON index
CA_results/scan_grain/grainsTimeStates/timeStep_*/Grains*.vti Per-frame VTK XML ImageData (zlib + base64, ~15–95 MB each) with grain_id (Int32, shuffled colormap) and grain_orientation (Float32 × 3 Euler angles) as CellData
CA_results/scan_grain/grainSizeStatistics.txt Grain-size histogram + mean / median / max diameters
CA_results/scan_grain/grain_size_histogram.png Auto-generated grain-size distribution plot
CA_results/scan_grain/timeLog.dat Phase-by-phase wall + CPU timings (voronoi, time_integration, vtr_write, …)
CA_results/scan_grain/memoryLog.dat Peak RSS and per-array memory accounting
CA_results/ca.log Stdout/stderr of the CA process (line-buffered via stdbuf -oL)

In ParaView open Grains.vti.series and color cells by grain_id (the mapped_grain_id shuffle scatters adjacent grains into different colormap slots, so neighbouring grains visually separate without any extra filter) or by a grain_orientation component.

Performance & Async I/O

The VTI writer runs on a persistent pthread, spawned at ca_vtr_open() and joined at ca_vtr_close(). The CA main thread fills the staging buffers (stage_id, stage_eu), hands a job off via mutex + condvar, and returns to the solver loop. The writer thread then:

  1. Compresses each DataArray with vtkZLibDataCompressor (32 KB blocks, level 1, OMP-parallel over blocks).
  2. Base64-encodes header + payload separately (OMP-parallel over triplets).
  3. Streams the XML envelope to disk.
  4. fsync + posix_fadvise(POSIX_FADV_DONTNEED) so the file flushes immediately and doesn't sit in the page cache pushing the concurrent thermal writer into dirty-page throttling.

Why pthread and not fork(): fork() only carries the calling thread into the child, and on WSL2 libgomp's atfork hook can't reliably rebuild the OpenMP team in the child — the first !$omp parallel in a fork-child deadlocks on a futex. A persistent pthread sidesteps that entirely. See projects/20260418_MERGING_CA/log.md for the full debugging trace.

For a 550×550×75 domain at output_frequency_in = 100, the writer runs at ~0.6 s per frame (compress + encode + fsync), CA main loop runs at ~48 substeps/s, and total CPU utilisation across thermal + CA + writer holds steady at ~76 % on a 24-core box — see CA-merge log for measurement details.

Validation

Standard sanity checks:

  1. Initial Voronoi grain-size distribution matches target_grain_size within 5 %.
  2. Final grain-size distribution under coupled thermal-fluid input shifts right and grows a long tail compared to the initial Voronoi — driven by epitaxial regrowth across multiple laser passes.
  3. At least one grain ≥ 5× the initial mean volume is expected after multi-track scanning; absence of such a tail usually means the thermal field never crossed the CA domain (check ca_origin / ca_lateral vs the toolpath footprint).
  4. Restart resilience smoke test: kill CA mid-run; on restart it bootstraps next_id from the lowest unconsumed coupling frame on disk, no manual state file needed.