Cellular Automata (CA) Solver¶
Overview¶
The CA solver evolves the grain microstructure that emerges as the laser-melted pool resolidifies. It works on a uniform 3D voxel grid that overlaps a sub-volume of the thermal-fluid domain and updates each cell's grain assignment based on undercooling-driven nucleation and crystallographic envelope growth.
The solver is implemented in Fortran with OpenMP, lives in code_base/solver_CA/, and runs as a separate process spawned from run.sh (PHOENIX_RUN_MODE=ca). It consumes the thermal field that the thermal-fluid solver writes to a binary frame queue — see Multiphysics Coupling for the inter-process protocol.
| Module | File | Purpose |
|---|---|---|
ca_grid |
mod_ca_grid.f90 |
Uniform CA mesh + cell index helpers |
ca_state |
mod_ca_state.f90 |
Per-cell state: orient_id, temperature, etc. |
ca_orient |
mod_ca_orient.f90 |
Append-only crystallographic orientation list (Bunge Euler angles) |
ca_voronoi |
mod_ca_voronoi.f90 |
Initial substrate Voronoi tessellation |
ca_octahedron |
mod_ca_octahedron.f90 |
Octahedral envelope geometry / capture test |
ca_nucleation |
mod_ca_nucleation.f90 |
Surface and bulk nucleation Gaussian distributions |
ca_solve |
mod_ca_solve.f90 |
Main time-integration loop (capture, growth, dt control) |
ca_coupling |
mod_ca_coupling.f90 |
Reads thermal binary frames, trilinear + temporal interpolation |
ca_vtr |
mod_ca_vtr.f90 |
VTK XML ImageData writer (zlib-compressed, pthread-async) |
ca_driver |
mod_ca_driver.f90 |
Top-level orchestration: namelist parse → init → solve → close |
Enabling the CA Solver¶
In code_base/inputfile/input_param.txt:
&ca_params CA_flag=1 /
&ca_params CA_flag=1, ca_export_every=1, ca_export_padding=2 /
In code_base/solver_CA/inputfile/input_param_ca.txt (selected blocks):
&ca_domain
ca_origin = 4.5d-4, 4.5d-4, 5.5d-4,
ca_lateral = 1.1d-3, 1.1d-3, 1.5d-4,
ca_cell_size = 2.0d-6,
ca_num_cells = 550, 550, 75
/
&ca_voronoi
target_grain_size = 10.0d-6,
voronoi_seed_in = 42
/
&ca_time
termination_time_in = 2.5d-2,
time_step = 1.0d-6,
time_step_factor_in = 0.5d0
/
&ca_thermal_source
coupling_dir = '',
frame_poll_interval_s = 1.0e-4
/
&ca_output
output_frequency_in = 100
/
| Parameter | Description |
|---|---|
ca_origin / ca_lateral |
CA volume corner + extent (m). Must lie inside the thermal-fluid domain and cover the scan footprint (+ a halo) for the coupling sampler to find valid temperatures. |
ca_cell_size |
Uniform CA voxel size (m). Typical 2–5 μm for Ti-6Al-4V LPBF — fine enough to resolve dendrite-scale columnar growth. |
ca_num_cells |
Voxel count per axis. ca_lateral / ca_cell_size must equal ca_num_cells. |
target_grain_size |
Mean equivalent-sphere diameter for the Voronoi initial substrate (m). The seed count = volume / (4/3·π·(d/2)³). |
voronoi_seed_in |
RNG seed; same value reproduces the initial Voronoi exactly. |
termination_time_in |
Sim time to stop (s). Should match thermal timax. |
frame_poll_interval_s |
Sleep between poll attempts when CA is waiting on the next thermal frame (s). |
output_frequency_in |
Write a VTI frame every N CA substeps (ca_solve substep-internal counter, not thermal-step). |
Governing Equations¶
CA does not solve a partial differential equation; it is an explicit voxel-update rule for a finite set of grain-orientation states, driven by the local thermal field. The equations below define those rules.
Local undercooling¶
For a cell at temperature \(T(\mathbf{x}, t)\) inside the mushy zone:
where \(T_\text{liq}\) comes from the thermal-fluid material properties (constant scalar in single-alloy mode; tliquid_field(x) when species_flag=1). \(\Delta T < 0\) ⇒ cell is liquid, no growth or nucleation.
Envelope growth velocity (Kurz–Giovanola–Trivedi / Gandin–Rappaz)¶
Each grain \(g\) is represented by an octahedron centred on its nucleation site, with [100] crystal axes aligned to its Euler-angle orientation. Its half-diagonal \(L_g\) advances at:
Default coefficients (Ti-6Al-4V, in mod_ca_solve.f90):
| Symbol | Value | Unit |
|---|---|---|
| \(a_1\) | 0.0 | m s⁻¹ |
| \(a_2\) | 2.03 × 10⁻⁴ | m s⁻¹ K⁻¹ |
| \(a_3\) | 5.44 × 10⁻⁵ | m s⁻¹ K⁻² |
Time integration is explicit Euler:
Capture criterion (octahedral envelope)¶
A liquid cell at position \(\mathbf{p}\) is captured by grain \(g\) when its position, rotated into the grain's crystal frame, lies inside the octahedron \(\|\xi\|_1 \le L_g\) in \(\ell^1\):
where \(\mathbf{R}_g\) is the orthonormal rotation matrix from the lab frame to the grain's crystal frame, computed from Bunge Euler angles \((\varphi_1, \Phi, \varphi_2)\) as:
When captured, the cell inherits \(g\)'s cell_orient_id and the capture event re-seeds a smaller daughter octahedron at the cell centre (preserving subsequent CCW-anisotropic growth). When T rises back above \(T_\text{liq}\) the cell remelts; its previous grain id is preserved so an arriving capture wave can resume the same orientation (epitaxial regrowth).
Nucleation density (Thévoz–Rappaz Gaussian)¶
Pre-computed surface and bulk site populations are sampled from a normal distribution of activation undercoolings. Each population has its own integrated total density \(n_\text{max}\), mean activation \(\Delta T_\text{max}\), and spread \(\sigma\):
A site fires the moment its sampled \(\Delta T_\text{site}\) is exceeded by the local \(\Delta T(\mathbf{x}, t)\), appending a fresh random orientation to orient_list and seeding a tiny octahedron at the site location.
| Population | Default mean \(\Delta T_\text{max}\) | Default \(\sigma\) | Where sites land |
|---|---|---|---|
| Surface | 1.0 K | 0.5 K | Substrate-side faces of cells with has_been_melted=true |
| Bulk | 5.0 K | 2.0 K | Anywhere inside the melt-pool volume |
Densities \(n_\text{max}\) are namelist inputs (surface_site_density, bulk_site_density, m⁻² and m⁻³ respectively).
Adaptive time step (CFL-like)¶
with \(h\) = ca_cell_size, \(f_c\) = time_step_factor_in (typically 0.5), \(v_\text{max}^n\) = max growth velocity across all active grains at step \(n\), and \(\Delta t_\text{cap}\) = time_step (hard ceiling). The 0.5 factor keeps any envelope from advancing more than half a cell per step, so the capture sweep stays one-cell-local.
Workflow¶
CA runs in two phases within a single executable invocation:
- Initial-grain generation (Voronoi):
- Build a Voronoi tessellation from
n_grains = volume / target_grain_size³. - Each grain gets a random Bunge Euler triplet
(phi1, phi, phi2)appended toorient_list. - Cells get a
cell_orient_idpointing into that list. -
The result is dumped once to
result/<case>/CA_results/initial_grain/. -
Time integration (capture + growth):
- At each CA substep, sample the thermal field at every cell center (trilinear in space + linear in time across two consecutive
ca_input_*.binframes produced by the thermal-fluid solver). - Where
T > tliquid: cell is liquid,cell_orient_id = 0(no grain). - Where solid neighbours touch a liquid cell during cooling: nucleate or capture.
- Capture: an existing grain's octahedral envelope sweeps across the cell → cell inherits that grain's orient_id.
- Nucleation: triggered by surface (
deltaTs) or bulk (deltaTv) Gaussian undercooling; appends a new orientation toorient_list.
- Octahedral envelopes grow at
v(ΔT) = a·ΔT² + b·ΔT³until they capture neighbours or shrink (in case of remelt). - Dump VTI every
output_frequency_insubsteps toresult/<case>/CA_results/scan_grain/grainsTimeStates/.
The two phases share orient_list — initial Voronoi orientations stay valid through the scan; new ones are appended as nucleation fires.
Output¶
| File | Description |
|---|---|
CA_results/initial_grain/ |
One-shot dump of the Voronoi substrate (microstructureInfo*.txt + diagnostics) |
CA_results/scan_grain/Grains.vti.series |
ParaView-ready time-series JSON index |
CA_results/scan_grain/grainsTimeStates/timeStep_*/Grains*.vti |
Per-frame VTK XML ImageData (zlib + base64, ~15–95 MB each) with grain_id (Int32, shuffled colormap) and grain_orientation (Float32 × 3 Euler angles) as CellData |
CA_results/scan_grain/grainSizeStatistics.txt |
Grain-size histogram + mean / median / max diameters |
CA_results/scan_grain/grain_size_histogram.png |
Auto-generated grain-size distribution plot |
CA_results/scan_grain/timeLog.dat |
Phase-by-phase wall + CPU timings (voronoi, time_integration, vtr_write, …) |
CA_results/scan_grain/memoryLog.dat |
Peak RSS and per-array memory accounting |
CA_results/ca.log |
Stdout/stderr of the CA process (line-buffered via stdbuf -oL) |
In ParaView open Grains.vti.series and color cells by grain_id (the mapped_grain_id shuffle scatters adjacent grains into different colormap slots, so neighbouring grains visually separate without any extra filter) or by a grain_orientation component.
Performance & Async I/O¶
The VTI writer runs on a persistent pthread, spawned at ca_vtr_open() and joined at ca_vtr_close(). The CA main thread fills the staging buffers (stage_id, stage_eu), hands a job off via mutex + condvar, and returns to the solver loop. The writer thread then:
- Compresses each DataArray with
vtkZLibDataCompressor(32 KB blocks, level 1, OMP-parallel over blocks). - Base64-encodes header + payload separately (OMP-parallel over triplets).
- Streams the XML envelope to disk.
fsync+posix_fadvise(POSIX_FADV_DONTNEED)so the file flushes immediately and doesn't sit in the page cache pushing the concurrent thermal writer into dirty-page throttling.
Why pthread and not fork(): fork() only carries the calling thread into the child, and on WSL2 libgomp's atfork hook can't reliably rebuild the OpenMP team in the child — the first !$omp parallel in a fork-child deadlocks on a futex. A persistent pthread sidesteps that entirely. See projects/20260418_MERGING_CA/log.md for the full debugging trace.
For a 550×550×75 domain at output_frequency_in = 100, the writer runs at ~0.6 s per frame (compress + encode + fsync), CA main loop runs at ~48 substeps/s, and total CPU utilisation across thermal + CA + writer holds steady at ~76 % on a 24-core box — see CA-merge log for measurement details.
Validation¶
Standard sanity checks:
- Initial Voronoi grain-size distribution matches
target_grain_sizewithin 5 %. - Final grain-size distribution under coupled thermal-fluid input shifts right and grows a long tail compared to the initial Voronoi — driven by epitaxial regrowth across multiple laser passes.
- At least one grain ≥ 5× the initial mean volume is expected after multi-track scanning; absence of such a tail usually means the thermal field never crossed the CA domain (check
ca_origin/ca_lateralvs the toolpath footprint). - Restart resilience smoke test: kill CA mid-run; on restart it bootstraps
next_idfrom the lowest unconsumed coupling frame on disk, no manual state file needed.