Abstract
Rapid and accurate quantification of arsenic (As) in the hyperaccumulator Pteris vittata is vital for evaluating the efficacy of phytoremediation. Laser-induced breakdown spectroscopy (LIBS) offers rapid analysis, but its precision is often compromised by matrix effects in complex biological samples and by the limitations of conventional chemometric models. To address these challenges, this study proposes a novel quantitative model, the multi-view linearly constrained tabular prior-data fitted network (MLC-TabPFN), which combines data-driven learning with knowledge-driven physical priors. The baseline TabPFN model first outperformed traditional algorithms such as partial least squares regression (PLSR), achieving a prediction-set coefficient of determination (R²p) of 0.956 versus 0.933 for PLSR. Two modules were then added to this baseline. First, a linear constraint (LC) module incorporates the physical prior that spectral intensity is positively correlated with concentration. On the optimal second-derivative spectra, this constraint improved the physical plausibility and accuracy of the model, reducing the mean absolute percentage error of prediction (MAPEp) from 8.537 % to 7.641 % while maintaining a high R²p of 0.971. Second, a multi-view fusion module integrates complementary information from the original, first-derivative, and second-derivative spectra. This fusion raised R²p to 0.981 and lowered the root mean square error of prediction (RMSEp) to 40.189 mg/kg and MAPEp to 6.249 %. These results indicate that the proposed framework offers a high-precision solution for LIBS analysis in complex biological matrices and that combining advanced machine learning with physical domain knowledge improves quantitative performance.
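
The figures of merit reported above (R²p, RMSEp, MAPEp) and the three spectral views can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names `regression_metrics` and `spectral_views` are hypothetical, the metrics follow their standard textbook definitions, and `numpy.gradient` is only a stand-in for whatever derivative preprocessing the study actually used, which the abstract does not specify.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, MAPE in percent) for a prediction set.

    Standard definitions are assumed; y_true must not contain zeros,
    otherwise the percentage error is undefined.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    ss_res = np.sum(residual ** 2)                    # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    rmse = np.sqrt(np.mean(residual ** 2))            # same units as y (e.g. mg/kg)
    mape = 100.0 * np.mean(np.abs(residual / y_true))
    return r2, rmse, mape


def spectral_views(spectra):
    """Build original, first- and second-derivative views of the spectra.

    `spectra` has shape (n_samples, n_channels). Finite differences via
    np.gradient are an assumption made for illustration only.
    """
    spectra = np.asarray(spectra, dtype=float)
    first = np.gradient(spectra, axis=1)
    second = np.gradient(first, axis=1)
    return {
        "original": spectra,
        "first_derivative": first,
        "second_derivative": second,
    }


if __name__ == "__main__":
    # Made-up concentrations (mg/kg), for demonstration only.
    reference = [100.0, 200.0, 300.0, 400.0]
    predicted = [110.0, 190.0, 310.0, 390.0]
    r2, rmse, mape = regression_metrics(reference, predicted)
    print(f"R2={r2:.3f}  RMSE={rmse:.3f} mg/kg  MAPE={mape:.3f} %")

    views = spectral_views(np.random.default_rng(0).random((4, 16)))
    print({name: view.shape for name, view in views.items()})
```

Each view keeps the shape of the input matrix, so a separate model can be fitted per view and the per-view predictions combined afterwards. How MLC-TabPFN performs that fusion is not described in the abstract and is not reproduced here.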


