Multiphysics Coupling¶
PHOENIX integrates four physics solvers: thermal-fluid, species transport, mechanical residual stress, and cellular-automata grain microstructure. Different couplings use different mechanisms because their data volumes, update cadences, and language / runtime constraints are different. This page documents the choices and the protocols.
Overview¶
| Pair | Direction | Mechanism | Cadence | Coupling code |
|---|---|---|---|---|
| Thermal ↔ Species | two-way (in-process) | Shared Fortran arrays | Every thermal step | `solver_species/mod_species.f90` |
| Thermal → Mechanical | one-way (in-process or sub-process) | Shared arrays / on-disk binary | Every `mech_interval` thermal steps | `solver_mechanical/mod_mech_io.f90` |
| Thermal → CA | one-way (out-of-process) | Binary frame queue on disk | Every `ca_export_every` thermal steps | `solver_thermal_fluid/mod_ca_coupling_writer.f90` ↔ `solver_CA/mod_ca_coupling.f90` |
| Mechanical ↔ CA | none (independent) | — | — | — |
All four are launched from a single bash run.sh invocation; thread allocation is per-solver.
Why three different mechanisms?¶
- Species sits inside the thermal-fluid solver's main loop with no separate driver; sharing the `enthalpy`, `temp`, and velocity arrays directly is cheapest and gives genuine two-way coupling within one timestep.
- Mechanical is structured so it can optionally be split into a separate process (when `mech_threads > 0`), so its EBE (element-by-element) FEM solve doesn't steal cores from the CFD time loop. When in-process, it shares arrays; when out-of-process, it consumes per-step binary snapshots.
- CA is a separate executable (Fortran-OpenMP, launched with `PHOENIX_RUN_MODE=ca`) running on a different uniform mesh with its own time-step controller (typically 10× finer than thermal). Sharing in-process arrays would force a single timestep to serve both solvers, defeating both. A binary file queue cleanly decouples the timesteps and lets each solver run at its native cadence.
Thermal ↔ Species (shared memory)¶
Species concentration C (mass fraction of the primary alloy) is an additional scalar field allocated alongside enthalpy, temp, etc. inside the thermal-fluid solver. The species transport equation is solved as one more pass through the same mod_solve machinery using the same velocity field and convective-diffusive operators.
Coupling back into thermal-fluid:
- Solidus / liquidus temperatures become functions of `C` (`tsolid_field`, `tliquid_field`) → updates `fracl` directly.
- Density / viscosity / specific heat / latent heat use `C`-weighted averages between the two alloys' values.
- Marangoni stress at the free surface picks up an extra solutal term `dgdC * grad(C)` next to the thermal Marangoni term.
There is no inter-process protocol: both fields advance together on every step inside the same Fortran routines.
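A minimal sketch of the property blending and the solutal Marangoni term, in Python rather than the solver's Fortran; the function names and the numeric values are illustrative assumptions, not PHOENIX code:

```python
import numpy as np

def mix(prop_primary, prop_secondary, C):
    """C-weighted average of a material property between the two alloys.
    C is the mass fraction of the primary alloy (0..1)."""
    return C * prop_primary + (1.0 - C) * prop_secondary

def marangoni_shear(dgdT, grad_T_surf, dgdC, grad_C_surf):
    """Tangential surface stress: the thermal term plus the solutal term dgdC * grad(C)."""
    return dgdT * grad_T_surf + dgdC * grad_C_surf

# Illustrative use: blend density between two alloys on a small test field.
C = np.full((4, 4, 4), 0.7)          # stand-in species mass-fraction field
rho = mix(7800.0, 4400.0, C)         # kg/m^3; placeholder values, not material data
```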
Thermal → Mechanical¶
Two operating modes, selected by mech_threads in run.sh:
In-process (mech_threads = 0)¶
The mechanical solve fires inside the thermal main loop at `mod(step_idx, mech_interval) == 0`. It reads `temp` and the solid field directly from the thermal-fluid arrays, calls `solve_mech_cg` (preconditioned conjugate gradient on EBE assembly), and writes its output VTK files. The thermal loop then continues. There is no file I/O between the two solvers; the price is that mech and thermal share one OpenMP team.
Out-of-process (mech_threads > 0)¶
When the user passes a non-zero mechanical thread count, run.sh spawns a second cluster_main process with PHOENIX_RUN_MODE=mechanical. The thermal solver writes per-step binary snapshots:
result/<case>/mechanical_results/mech_input_<step>.bin
result/<case>/mechanical_results/mech_input_<step>.ready <- atomic-rename sentinel
The mech process polls for the `.ready` sentinel, reads the corresponding `.bin`, runs the solve, then deletes the consumed pair. The sentinel-file pattern (write the `.bin` first, rename `.tmp` → `.ready` last) guarantees the reader never sees a half-written payload; it is the same pattern used by CA's coupling queue (below).
The two processes share no memory; the OS scheduler treats them as independent and they can each saturate their own OpenMP team.
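The handshake is simple enough to sketch end to end. A minimal Python illustration of the same sentinel pattern (the real producer and consumer are Fortran; the helper names and poll interval here are assumptions):

```python
import glob
import os
import time

def publish(outdir, step, payload: bytes):
    """Producer side: write the payload first, publish the sentinel last (atomic rename)."""
    bin_path = os.path.join(outdir, f"mech_input_{step}.bin")
    with open(bin_path, "wb") as f:
        f.write(payload)                                      # reader must never see this half-done
    tmp_path = bin_path.replace(".bin", ".tmp")
    open(tmp_path, "w").close()                               # zero-byte marker
    os.rename(tmp_path, bin_path.replace(".bin", ".ready"))   # atomic on POSIX: appears all at once

def wait_and_consume(outdir, poll_s=0.01):
    """Consumer side: only open a .bin whose .ready sentinel already exists."""
    while True:
        ready = sorted(glob.glob(os.path.join(outdir, "mech_input_*.ready")))
        if ready:
            sentinel = ready[0]
            bin_path = sentinel.replace(".ready", ".bin")
            with open(bin_path, "rb") as f:
                payload = f.read()
            os.remove(bin_path)
            os.remove(sentinel)                               # delete the consumed pair
            return payload
        time.sleep(poll_s)
```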
The roller (Z-only Dirichlet) BC at the bottom face, `uz(:,:,1) = 0` with `ux`/`uy` free, applies in both modes.
Thermal → CA (binary frame queue)¶
CA runs as an out-of-process Fortran-OpenMP solver on a uniform sub-domain. It consumes the thermal field as a queue of per-step binary frames written by the thermal solver. The queue lives in result/<case>/coupling/:
ca_input_000001.bin
ca_input_000002.bin
ca_input_000003.bin
…
ca_input_END <- zero-byte sentinel; thermal touches this when its
time loop exits
Frame format¶
One little-endian binary file per exported thermal step:
Header (76 bytes):
int64 frame_id monotonic, matches the filename's step index
real64 t_sim physical time (s)
real64 x0, y0, z0 sub-array corner in global coordinates (m, left
face of cell 1; cell-i centre is x0 + (i-0.5)*hx)
real64 hx, hy, hz per-axis thermal cell size (m); each axis must be
uniform across the bbox — the writer aborts if it
finds non-uniform spacing in the sub-array
int32 nx, ny, nz sub-array dimensions (incl. 1-cell halo)
Payload:
real32 T(1:nx, 1:ny, 1:nz) Fortran column-major, K
Only the mushy-pool sub-volume is exported: the X-Y bounding box of cells where `temp > tsolid - DELTA_COOL`, padded with one halo cell on each side so trilinear interpolation at the bbox edge has valid neighbours. An empty frame (`nx = ny = nz = 0`) is written when there is no melt pool yet (before the first laser pass) or after full resolidification.
Three independent hx, hy, hz are stored because PHOENIX's Z grid is typically non-uniform globally (coarse in the substrate, fine in the powder zone) — using a single h for all three axes silently stretched the pool depth in CA's view by the X/Z spacing ratio. Each axis must still be uniform within the exported bbox; the writer asserts this and aborts the run with a clear error if violated, so the user catches grid mis-configurations early.
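The header layout maps directly onto a plain binary read. A minimal Python sketch of a frame reader, assuming only the field order and types documented above (this is not the bundled `inspect_frame.py`):

```python
import struct
import numpy as np

def read_ca_frame(path):
    """Read one ca_input_*.bin frame: 76-byte little-endian header + real32 payload."""
    with open(path, "rb") as f:
        hdr = f.read(76)
        # <q = int64, d = real64, i = int32 (little-endian, no padding)
        (frame_id, t_sim, x0, y0, z0,
         hx, hy, hz, nx, ny, nz) = struct.unpack("<q7d3i", hdr)
        T = None
        if nx > 0:                                       # empty frame means no melt pool this step
            T = np.fromfile(f, dtype="<f4", count=nx * ny * nz)
            T = T.reshape((nx, ny, nz), order="F")       # Fortran column-major
    return dict(frame_id=frame_id, t_sim=t_sim, x0=x0, y0=y0, z0=z0,
                hx=hx, hy=hy, hz=hz, T=T)

# frame = read_ca_frame("result/<case>/coupling/ca_input_000001.bin")
# print(frame["frame_id"], frame["t_sim"], None if frame["T"] is None else frame["T"].max())
```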
Producer side (thermal-fluid)¶
In solver_thermal_fluid/mod_ca_coupling_writer.f90. Inside the main time loop:
if (CA_flag == 1 .and. mod(step_idx, ca_export_every) == 0) then
call ca_export_frame(t_sim, frame_id)
end if
The writer extracts the mushy bbox from `mod_dimen.f90`'s flood-fill, packs the header plus the temperature sub-array, and writes `coupling/ca_input_<NNNNNN>.bin`. On normal exit it touches `coupling/ca_input_END` so CA can stop polling.
Consumer side (CA)¶
In `solver_CA/mod_ca_coupling.f90`. CA's integration advances by its own adaptive dt (typically 1 μs vs thermal's 20 μs). When CA's clock crosses the loaded `t_curr`, the reader:

- Polls for the next frame on disk (`frame_poll_interval_s` between checks; default 0.5 ms).
- On arrival, reads header + payload and slots them into a double-buffer (`T_prev`, `T_curr`).
- For each CA cell, does a trilinear spatial interpolation in the bbox plus a linear temporal blend between `T_prev` and `T_curr` to recover the temperature at exactly the CA substep's time (see the sketch below).
- Cells outside the current bbox return `null_mat_temp` (cool, no growth).
- Deletes consumed frames once they're no longer needed for temporal blending (`t > t_curr_next`), so disk usage stays bounded.
Restart resilience: on init, CA scans the coupling directory for the lowest unconsumed frame_id and starts from there. No external state file.
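The per-cell lookup referenced above (trilinear in space, linear in time) can be sketched as follows, using frames read as in the earlier reader sketch. Illustrative Python with assumed variable names, not the Fortran routine in `mod_ca_coupling.f90`:

```python
import numpy as np

def sample_temperature(xyz, t, frame_prev, frame_curr, null_mat_temp=300.0):
    """Temperature at CA point xyz (m) and time t (s), blending two thermal frames.

    Each frame is a dict with keys t_sim, x0, y0, z0, hx, hy, hz, T (nx,ny,nz float32).
    Points outside the current bbox fall back to null_mat_temp (cool, no growth);
    the default value here is an assumption.
    """
    def trilinear(frame, p):
        T = frame["T"]
        # cell-i centre sits at x0 + (i - 0.5) * hx (1-based in the file; 0-based here)
        fx = (p[0] - frame["x0"]) / frame["hx"] - 0.5
        fy = (p[1] - frame["y0"]) / frame["hy"] - 0.5
        fz = (p[2] - frame["z0"]) / frame["hz"] - 0.5
        i, j, k = int(np.floor(fx)), int(np.floor(fy)), int(np.floor(fz))
        if not (0 <= i < T.shape[0]-1 and 0 <= j < T.shape[1]-1 and 0 <= k < T.shape[2]-1):
            return None                                  # outside bbox (halo keeps edges valid)
        wx, wy, wz = fx - i, fy - j, fz - k
        c = T[i:i+2, j:j+2, k:k+2]
        return float(
            c[0,0,0]*(1-wx)*(1-wy)*(1-wz) + c[1,0,0]*wx*(1-wy)*(1-wz) +
            c[0,1,0]*(1-wx)*wy*(1-wz)     + c[0,0,1]*(1-wx)*(1-wy)*wz +
            c[1,1,0]*wx*wy*(1-wz)         + c[1,0,1]*wx*(1-wy)*wz +
            c[0,1,1]*(1-wx)*wy*wz         + c[1,1,1]*wx*wy*wz)

    Tp, Tc = trilinear(frame_prev, xyz), trilinear(frame_curr, xyz)
    if Tp is None or Tc is None:
        return null_mat_temp
    # linear temporal blend between the two bracketing thermal frames
    w = (t - frame_prev["t_sim"]) / (frame_curr["t_sim"] - frame_prev["t_sim"])
    return (1.0 - w) * Tp + w * Tc
```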
Cadence guidance¶
The shortest CA-relevant time scale is the solidification front residence in one CA cell: h / v_front ≈ 2.5 μm / 0.1 m·s⁻¹ = 25 μs. Frame spacing must be ≪ that for the linear time-interpolation to be faithful.
| `ca_export_every` | Frame spacing | Frames per cell-residence | I/O overhead | Notes |
|---|---|---|---|---|
| 1 | ~1e-6 s | ~25 (overkill) | ~0.1 % | Default for very fine CA grids |
| 10 | ~1e-5 s | ~2.5 (adequate) | <0.01 % | Recommended baseline |
| 100 | ~1e-4 s | ~0.25 (too sparse) | negligible | Time-interp visibly under-resolves |
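The arithmetic behind this table is easy to reproduce. A hypothetical helper (cell size, front velocity, and thermal step are numbers you supply; nothing is read from PHOENIX):

```python
def ca_cadence_check(h_ca, v_front, dt_thermal, ca_export_every):
    """Thermal frames available per solidification-front residence in one CA cell.

    h_ca        CA cell size (m), e.g. 2.5e-6
    v_front     solidification front speed (m/s), e.g. 0.1
    dt_thermal  thermal-fluid time step (s)
    """
    residence = h_ca / v_front                       # e.g. 2.5 um / 0.1 m/s = 25 us
    frame_spacing = ca_export_every * dt_thermal
    return residence, frame_spacing, residence / frame_spacing

# print(ca_cadence_check(2.5e-6, 0.1, 1e-6, 10))     # -> (2.5e-05, 1e-05, 2.5)
```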
Domain alignment¶
CA's volume (`ca_origin`, `ca_lateral`) must lie entirely inside the thermal-fluid domain and cover the toolpath footprint plus a small halo. If the alignment is off, even by tens of microns, every CA cell ends up outside every frame's bbox and the integration runs with zero physics effect (a silent pass-through; no error). The "Domain alignment fix" entry in `projects/20260418_MERGING_CA/log.md` traces exactly this failure mode in a real test run.
A useful sanity check: result/<case>/coupling/ should have non-empty files (>1 KB) once the first laser pass starts; if all coupling files come out at exactly the 76-byte empty-frame size (header only, nx=ny=nz=0), either the bbox is misaligned with the toolpath, or the Z fine-mesh zone doesn't reach the powder layer. Use python3 code_base/solver_CA/tools/inspect_frame.py <file> to print the header + temperature stats for any single frame.
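A quick mid-run way to spot the all-empty-frames symptom in bulk; a hypothetical one-off check that relies only on the 76-byte header-only size documented above:

```python
import glob
import os
import sys

case = sys.argv[1] if len(sys.argv) > 1 else "my_case"    # placeholder case name
frames = sorted(glob.glob(f"result/{case}/coupling/ca_input_*.bin"))
empty = [f for f in frames if os.path.getsize(f) <= 76]   # header-only => nx = ny = nz = 0
print(f"{len(frames)} frames on disk, {len(empty)} empty (header-only)")
if frames and len(empty) == len(frames):
    print("all frames empty: check ca_origin / ca_lateral alignment and the Z fine-mesh zone")
```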
The writer also emits a per-frame consistency line whenever its bbox is shorter than mod_dimen.f90's reported alen / width / depth by more than ~1.5 cells — useful to catch AMR / fine-zone sizing problems mid-run:
[ca_coupling_writer] frame 174 bbox shorter than dimen pool:
L 650.0 vs 982.2 um; W 210.0 vs 152.2 um; D 240.0 vs 114.3 um
→ AMR uniform zone or Z fine zone too small; CA will not see the
entire mushy zone in this frame.
Mechanical ↔ CA¶
Currently independent. Mechanical operates on the FEM grid (coarsened from thermal); CA operates on its own uniform sub-grid. No data flows between them.
A future "hot CA → mech" pathway would pass per-cell phase / orient information into mechanical for grain-resolved residual-stress modelling, but that's out of scope for the current codebase.
Putting it together¶
The full multi-process layout under `bash run.sh <case> 10 8 10 &`:
┌──────────────────────────────────────────────────────────────────┐
│ bash run.sh │
│ │
│ ┌────────────────────────────────┐ │
│ │ thermal-fluid (10 OMP threads) │ │
│ │ - main time loop │ writes mech_input_*.bin │
│ │ - species (in-process) │ ──────────────────────────► │
│ │ - mech (in-process if N_M=0) │ │
│ │ - ca_export_frame │ writes ca_input_*.bin │
│ └─────────────┬──────────────────┘ ──────────────────────────► │
│ │ (background &) │
│ ┌─────────────▼──────────────────┐ │
│ │ mechanical (8 OMP, separate │ ◄── reads mech_input_*.bin │
│ │ process, PHOENIX_RUN_MODE= │ │
│ │ mechanical) │ │
│ └────────────────────────────────┘ │
│ │
│ ┌────────────────────────────────┐ │
│ │ CA (10 OMP threads, separate │ ◄── reads ca_input_*.bin │
│ │ process, PHOENIX_RUN_MODE=ca) │ deletes after consume │
│ │ - main thread (CA solve) │ │
│ │ - pthread writer (VTI dump) │ writes Grains*.vti │
│ └────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────┘
OMP teams stay separate per process; each solver fully utilises its allocated cores. The disk-based protocols add ~14 MB/run of coupling-frame data (CA) and a few MB for mech, both consumed-and-deleted in flight, so steady-state queue size stays small.