Cellular Automata (CA) Solver¶

Overview¶

The CA solver evolves the grain microstructure that emerges as the laser-melted pool resolidifies. It works on a uniform 3D voxel grid that overlaps a sub-volume of the thermal-fluid domain and updates each cell's grain assignment based on undercooling-driven nucleation and crystallographic envelope growth.

The solver is implemented in Fortran with OpenMP, lives in code_base/solver_CA/, and runs as a separate process spawned from run.sh (PHOENIX_RUN_MODE=ca). It consumes the thermal field that the thermal-fluid solver writes to a binary frame queue — see Multiphysics Coupling for the inter-process protocol.

Module	File	Purpose
`ca_grid`	`mod_ca_grid.f90`	Uniform CA mesh + cell index helpers
`ca_state`	`mod_ca_state.f90`	Per-cell state: orient_id, temperature, etc.
`ca_orient`	`mod_ca_orient.f90`	Append-only crystallographic orientation list (Bunge Euler angles)
`ca_voronoi`	`mod_ca_voronoi.f90`	Initial substrate Voronoi tessellation
`ca_octahedron`	`mod_ca_octahedron.f90`	Octahedral envelope geometry / capture test
`ca_nucleation`	`mod_ca_nucleation.f90`	Surface and bulk nucleation Gaussian distributions
`ca_solve`	`mod_ca_solve.f90`	Main time-integration loop (capture, growth, dt control)
`ca_coupling`	`mod_ca_coupling.f90`	Reads thermal binary frames, trilinear + temporal interpolation
`ca_vtr`	`mod_ca_vtr.f90`	VTK XML ImageData writer (zlib-compressed, pthread-async)
`ca_driver`	`mod_ca_driver.f90`	Top-level orchestration: namelist parse → init → solve → close

Enabling the CA Solver¶

In code_base/inputfile/input_param.txt:

&ca_params CA_flag=1 /
&ca_params CA_flag=1, ca_export_every=1, ca_export_padding=2 /

In code_base/solver_CA/inputfile/input_param_ca.txt (selected blocks):

&ca_domain
  ca_origin    = 4.5d-4, 4.5d-4, 5.5d-4,
  ca_lateral   = 1.1d-3, 1.1d-3, 1.5d-4,
  ca_cell_size = 2.0d-6,
  ca_num_cells = 550, 550, 75
/

&ca_voronoi
  target_grain_size = 10.0d-6,
  voronoi_seed_in   = 42
/

&ca_time
  termination_time_in   = 2.5d-2,
  time_step             = 1.0d-6,
  time_step_factor_in   = 0.5d0
/

&ca_thermal_source
  coupling_dir          = '',
  frame_poll_interval_s = 1.0e-4
/

&ca_output
  output_frequency_in = 100
/

Parameter	Description
`ca_origin / ca_lateral`	CA volume corner + extent (m). Must lie inside the thermal-fluid domain and cover the scan footprint (+ a halo) for the coupling sampler to find valid temperatures.
`ca_cell_size`	Uniform CA voxel size (m). Typical 2–5 μm for Ti-6Al-4V LPBF — fine enough to resolve dendrite-scale columnar growth.
`ca_num_cells`	Voxel count per axis. `ca_lateral / ca_cell_size` must equal `ca_num_cells`.
`target_grain_size`	Mean equivalent-sphere diameter for the Voronoi initial substrate (m). The seed count = volume / (4/3·π·(d/2)³).
`voronoi_seed_in`	RNG seed; same value reproduces the initial Voronoi exactly.
`termination_time_in`	Sim time to stop (s). Should match thermal `timax`.
`frame_poll_interval_s`	Sleep between poll attempts when CA is waiting on the next thermal frame (s).
`output_frequency_in`	Write a VTI frame every N CA substeps (`ca_solve` substep-internal counter, not thermal-step).

Governing Equations¶

CA does not solve a partial differential equation; it is an explicit voxel-update rule for a finite set of grain-orientation states, driven by the local thermal field. The equations below define those rules.

Local undercooling¶

For a cell at temperature $T(\mathbf{x}, t)$ inside the mushy zone:

\[\Delta T(\mathbf{x}, t) = T_\text{liq} - T(\mathbf{x}, t)\]

where $T_\text{liq}$ comes from the thermal-fluid material properties (constant scalar in single-alloy mode; tliquid_field(x) when species_flag=1). $\Delta T < 0$ ⇒ cell is liquid, no growth or nucleation.

Envelope growth velocity (Kurz–Giovanola–Trivedi / Gandin–Rappaz)¶

Each grain $g$ is represented by an octahedron centred on its nucleation site, with [100] crystal axes aligned to its Euler-angle orientation. Its half-diagonal $L_g$ advances at:

\[v(\Delta T) = a_1 + a_2 \, \Delta T + a_3 \, \Delta T^2\]

\[\frac{dL_g}{dt} = v\bigl(\Delta T(\mathbf{c}_g, t)\bigr)\]

Default coefficients (Ti-6Al-4V, in mod_ca_solve.f90):

Symbol	Value	Unit
$a_1$	0.0	m s⁻¹
$a_2$	2.03 × 10⁻⁴	m s⁻¹ K⁻¹
$a_3$	5.44 × 10⁻⁵	m s⁻¹ K⁻²

Time integration is explicit Euler:

\[L_g^{n+1} = L_g^n + v(\Delta T_g^n) \, \Delta t\]

Capture criterion (octahedral envelope)¶

A liquid cell at position $\mathbf{p}$ is captured by grain $g$ when its position, rotated into the grain's crystal frame, lies inside the octahedron $\|\xi\|_1 \le L_g$ in $\ell^1$:

\[\bigl\| \mathbf{R}_g \,(\mathbf{p} - \mathbf{c}_g) \bigr\|_1 \;=\; \sum_{i=1}^{3} \bigl| \bigl(\mathbf{R}_g\,(\mathbf{p} - \mathbf{c}_g)\bigr)_i \bigr| \;\le\; L_g\]

where $\mathbf{R}_g$ is the orthonormal rotation matrix from the lab frame to the grain's crystal frame, computed from Bunge Euler angles $(\varphi_1, \Phi, \varphi_2)$ as:

\[\mathbf{R}_g = R_z(\varphi_2)\,R_x(\Phi)\,R_z(\varphi_1)\]

When captured, the cell inherits $g$'s cell_orient_id and the capture event re-seeds a smaller daughter octahedron at the cell centre (preserving subsequent CCW-anisotropic growth). When T rises back above $T_\text{liq}$ the cell remelts; its previous grain id is preserved so an arriving capture wave can resume the same orientation (epitaxial regrowth).

Nucleation density (Thévoz–Rappaz Gaussian)¶

Pre-computed surface and bulk site populations are sampled from a normal distribution of activation undercoolings. Each population has its own integrated total density $n_\text{max}$, mean activation $\Delta T_\text{max}$, and spread $\sigma$:

\[\frac{dN}{d(\Delta T)} \;=\; \frac{n_\text{max}}{\sigma\sqrt{2\pi}} \exp\!\left( -\frac{(\Delta T - \Delta T_\text{max})^2}{2\sigma^2} \right)\]

A site fires the moment its sampled $\Delta T_\text{site}$ is exceeded by the local $\Delta T(\mathbf{x}, t)$, appending a fresh random orientation to orient_list and seeding a tiny octahedron at the site location.

Population	Default mean $\Delta T_\text{max}$	Default $\sigma$	Where sites land
Surface	1.0 K	0.5 K	Substrate-side faces of cells with `has_been_melted`=true
Bulk	5.0 K	2.0 K	Anywhere inside the melt-pool volume

Densities $n_\text{max}$ are namelist inputs (surface_site_density, bulk_site_density, m⁻² and m⁻³ respectively).

Adaptive time step (CFL-like)¶

\[\Delta t = \min\!\Bigl(\,\Delta t_\text{cap},\; f_c \cdot \dfrac{h}{v_\text{max}^n}\Bigr)\]

with $h$ = ca_cell_size, $f_c$ = time_step_factor_in (typically 0.5), $v_\text{max}^n$ = max growth velocity across all active grains at step $n$, and $\Delta t_\text{cap}$ = time_step (hard ceiling). The 0.5 factor keeps any envelope from advancing more than half a cell per step, so the capture sweep stays one-cell-local.

Workflow¶

CA runs in two phases within a single executable invocation:

Initial-grain generation (Voronoi):
Build a Voronoi tessellation from n_grains = volume / target_grain_size³.
Each grain gets a random Bunge Euler triplet (phi1, phi, phi2) appended to orient_list.
Cells get a cell_orient_id pointing into that list.
The result is dumped once to result/<case>/CA_results/initial_grain/.
Time integration (capture + growth):
At each CA substep, sample the thermal field at every cell center (trilinear in space + linear in time across two consecutive ca_input_*.bin frames produced by the thermal-fluid solver).
Where T > tliquid: cell is liquid, cell_orient_id = 0 (no grain).
Where solid neighbours touch a liquid cell during cooling: nucleate or capture.
- Capture: an existing grain's octahedral envelope sweeps across the cell → cell inherits that grain's orient_id.
- Nucleation: triggered by surface (deltaTs) or bulk (deltaTv) Gaussian undercooling; appends a new orientation to orient_list.
Octahedral envelopes grow at v(ΔT) = a·ΔT² + b·ΔT³ until they capture neighbours or shrink (in case of remelt).
Dump VTI every output_frequency_in substeps to result/<case>/CA_results/scan_grain/grainsTimeStates/.

The two phases share orient_list — initial Voronoi orientations stay valid through the scan; new ones are appended as nucleation fires.

Output¶

File	Description
`CA_results/initial_grain/`	One-shot dump of the Voronoi substrate (microstructureInfo*.txt + diagnostics)
`CA_results/scan_grain/Grains.vti.series`	ParaView-ready time-series JSON index
`CA_results/scan_grain/grainsTimeStates/timeStep_/Grains.vti`	Per-frame VTK XML ImageData (zlib + base64, ~15–95 MB each) with `grain_id` (Int32, shuffled colormap) and `grain_orientation` (Float32 × 3 Euler angles) as CellData
`CA_results/scan_grain/grainSizeStatistics.txt`	Grain-size histogram + mean / median / max diameters
`CA_results/scan_grain/grain_size_histogram.png`	Auto-generated grain-size distribution plot
`CA_results/scan_grain/timeLog.dat`	Phase-by-phase wall + CPU timings (voronoi, time_integration, vtr_write, …)
`CA_results/scan_grain/memoryLog.dat`	Peak RSS and per-array memory accounting
`CA_results/ca.log`	Stdout/stderr of the CA process (line-buffered via `stdbuf -oL`)

In ParaView open Grains.vti.series and color cells by grain_id (the mapped_grain_id shuffle scatters adjacent grains into different colormap slots, so neighbouring grains visually separate without any extra filter) or by a grain_orientation component.

Performance & Async I/O¶

The VTI writer runs on a persistent pthread, spawned at ca_vtr_open() and joined at ca_vtr_close(). The CA main thread fills the staging buffers (stage_id, stage_eu), hands a job off via mutex + condvar, and returns to the solver loop. The writer thread then:

Compresses each DataArray with vtkZLibDataCompressor (32 KB blocks, level 1, OMP-parallel over blocks).
Base64-encodes header + payload separately (OMP-parallel over triplets).
Streams the XML envelope to disk.
fsync + posix_fadvise(POSIX_FADV_DONTNEED) so the file flushes immediately and doesn't sit in the page cache pushing the concurrent thermal writer into dirty-page throttling.

Why pthread and not fork(): fork() only carries the calling thread into the child, and on WSL2 libgomp's atfork hook can't reliably rebuild the OpenMP team in the child — the first !$omp parallel in a fork-child deadlocks on a futex. A persistent pthread sidesteps that entirely. See projects/20260418_MERGING_CA/log.md for the full debugging trace.

For a 550×550×75 domain at output_frequency_in = 100, the writer runs at ~0.6 s per frame (compress + encode + fsync), CA main loop runs at ~48 substeps/s, and total CPU utilisation across thermal + CA + writer holds steady at ~76 % on a 24-core box — see CA-merge log for measurement details.

Validation¶

Standard sanity checks:

Initial Voronoi grain-size distribution matches target_grain_size within 5 %.
Final grain-size distribution under coupled thermal-fluid input shifts right and grows a long tail compared to the initial Voronoi — driven by epitaxial regrowth across multiple laser passes.
At least one grain ≥ 5× the initial mean volume is expected after multi-track scanning; absence of such a tail usually means the thermal field never crossed the CA domain (check ca_origin / ca_lateral vs the toolpath footprint).
Restart resilience smoke test: kill CA mid-run; on restart it bootstraps next_id from the lowest unconsumed coupling frame on disk, no manual state file needed.

Symbol	Value	Unit
\(a_1\)	0.0	m s⁻¹
\(a_2\)	2.03 × 10⁻⁴	m s⁻¹ K⁻¹
\(a_3\)	5.44 × 10⁻⁵	m s⁻¹ K⁻²