Search Resources — Somali Digital Library

Resource 2025 EN

The Validation Gap: A Mechanistic Analysis of How Language Models Compute Arithmetic but Fail to Validate It

Leonardo Bertolazzi · Philipp Mondorf · Barbara Plank +1 more

The ability of large language models (LLMs) to validate their output and identify potential errors is crucial for ensuring robustness and reliability. However, current research indicates that LLMs struggle with self-correction, encountering significant challenges in detecting errors. While studies have explored methods to enhance self-correction in LLMs, relatively little attention has been given to understanding the models' internal mechanisms underlying error detection. In this paper, we present a mechanistic analysis of error detection in LLMs, focusing on simple arithmetic problems. Through circuit analysis, we identify the computational subgraphs responsible for detecting arithmetic errors across four smaller-sized LLMs. Our findings reveal that all models heavily rely on $\textit{consistency heads}$: attention heads that assess surface-level alignment of numerical values in arithmetic solutions. Moreover, we observe that the models' internal arithmetic computation primarily occurs in higher layers, whereas validation takes place in middle layers, before the final arithmetic results are fully encoded. This structural dissociation between arithmetic computation and validation seems to explain why current LLMs struggle to detect even simple arithmetic errors.

Resource 2025 EN

WeedVision: Multi-Stage Growth and Classification of Weeds using DETR and RetinaNet for Precision Agriculture

Taminul Islam · Toqi Tahamid Sarker · Khaled R Ahmed +2 more

Weed management remains a critical challenge in agriculture, where weeds compete with crops for essential resources, leading to significant yield losses. Accurate detection of weeds at various growth stages is crucial for effective management yet challenging for farmers, as it requires identifying different species at multiple growth phases. This research addresses these challenges by utilizing advanced object detection models, specifically the Detection Transformer (DETR) with a ResNet50 backbone and RetinaNet with a ResNeXt101 backbone, to identify and classify 16 weed species of economic concern across 174 classes, spanning 11 weeks of growth from seedling to maturity. A robust dataset comprising 203,567 images was developed, meticulously labeled by species and growth stage. The models were rigorously trained and evaluated, with RetinaNet demonstrating superior performance, achieving a mean Average Precision (mAP) of 0.907 on the training set and 0.904 on the test set, compared to DETR's mAP of 0.854 and 0.840, respectively. RetinaNet also outperformed DETR in recall and in inference speed (7.28 FPS), making it more suitable for real-time applications. Both models showed improved accuracy as plants matured. This research provides crucial insights for developing precise, sustainable, and automated weed management strategies, paving the way for real-time, species-specific detection systems and advancing AI-assisted agriculture through continued innovation in model development and early detection accuracy.

Resource 2025 EN

All-in-one: Understanding and Generation in Multimodal Reasoning with the MAIA Benchmark

Davide Testa · Giovanni Bonetta · Raffaella Bernardi +5 more

We introduce MAIA (Multimodal AI Assessment), a native-Italian benchmark designed for fine-grained investigation of the reasoning abilities of visual language models on videos. MAIA differs from other available video benchmarks in its design, its reasoning categories, the metric it uses, and the language and culture of the videos. It evaluates Vision Language Models (VLMs) on two aligned tasks: a visual statement verification task and an open-ended visual question-answering task, both on the same set of video-related questions. It considers twelve reasoning categories that aim to disentangle language and vision relations by highlighting when one of the two alone encodes sufficient information to solve the tasks, when they are both needed, and when the full richness of the short video is essential instead of just a part of it. Thanks to its carefully thought-out design, it evaluates VLMs' consistency and visually grounded natural language comprehension and generation simultaneously through an aggregated metric. Last but not least, the video collection has been carefully selected to reflect Italian culture, and the language data are produced by native speakers.

Resource 2025 EN

Power sum expansions for Kromatic symmetric functions using Lyndon heaps

Laura Pierson

In arXiv:2301.02177, Crew, Pechenik, and Spirkl defined the Kromatic symmetric function $\overline{X}_G$ as a $K$-analogue of Stanley's chromatic symmetric function $X_G$, and one question they asked was how $\overline{X}_G$ expands in their $\overline{p}_\lambda$ basis, which they defined as a $K$-analogue of the classic power sum basis $p_\lambda$. In arXiv:2408.01395, we gave a formula that partially answered this question but did not explain the combinatorial significance of the coefficients. Here, we give combinatorial descriptions for the $\overline{p}$-coefficients of $\overline{X}_G$ and $\omega(\overline{X}_G)$, lifting the $p$-expansion of $X_G$ in terms of acyclic orientations that was given by Bernardi and Nadeau in arXiv:1904.01262. We also propose an alternative $K$-analogue $\overline{p}'$ of the $p$-basis that gives slightly cleaner expansion formulas. Our expansions are based on Lyndon heaps, introduced by Lalonde (1995), which are representatives for certain equivalence classes of acyclic orientations on clan graphs of $G$. Additionally, we show that knowing $\overline{X}_G$ is equivalent to knowing the multiset of independence polynomials of induced subgraphs of $G$, which gives shorter proofs of all our results from arXiv:2403.15929 that $\overline{X}_G$ can be used to determine the number of copies in $G$ of certain induced subgraphs. We also give power sum expansions for the Kromatic quasisymmetric function $\overline{X}_G(q)$ defined by Marberg in arXiv:2312.16474 in the case where $G$ is the incomparability graph of a unit interval order.

Resource 2025 EN

Bijections for faces of braid-type arrangements

Olivier Bernardi

We establish a general bijective framework for encoding faces of some classical hyperplane arrangements. Precisely, we consider hyperplane arrangements in $\mathbb{R}^n$ whose hyperplanes are all of the form $\{x_i-x_j=s\}$ for some $i,j\in[n]$ and $s\in\mathbb{Z}$. Such an arrangement $A$ is \emph{strongly transitive} if it satisfies the following condition: if $\{x_i-x_j=s\}\notin A$ and $\{x_j-x_k=t\}\notin A$ for some $i,j,k\in [n]$ and $s,t\geq 0$, then $\{x_i-x_k=s+t\}\notin A$. For any strongly transitive arrangement $A$, we establish a bijection between the faces of $A$ and some set of decorated plane trees.
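
The strong-transitivity condition stated above can be checked mechanically for a finite arrangement. The following is a minimal sketch and is not taken from the paper: the function names `normalize` and `is_strongly_transitive`, the triple encoding `(i, j, s)` for the hyperplane $\{x_i-x_j=s\}$, and the restriction to pairwise distinct indices $i,j,k$ are assumptions made for illustration.

```python
from itertools import permutations


def normalize(i, j, s):
    """Canonical key for {x_i - x_j = s}; it is the same set as {x_j - x_i = -s}."""
    return (i, j, s) if i < j else (j, i, -s)


def is_strongly_transitive(n, hyperplanes):
    """Check the condition for a finite arrangement in R^n.

    `hyperplanes` is an iterable of triples (i, j, s) with i != j in 1..n
    (hypothetical encoding). Indices i, j, k are assumed pairwise distinct.
    """
    arrangement = {normalize(i, j, s) for (i, j, s) in hyperplanes}
    # If s or t exceeds the largest |s| in the arrangement, then s + t does
    # too, so {x_i - x_k = s + t} is absent automatically. Checking
    # 0 <= s, t <= bound therefore suffices.
    bound = max((abs(s) for (_, _, s) in arrangement), default=0)
    for i, j, k in permutations(range(1, n + 1), 3):
        for s in range(bound + 1):
            if normalize(i, j, s) in arrangement:
                continue  # premise fails: first hyperplane is present
            for t in range(bound + 1):
                if normalize(j, k, t) in arrangement:
                    continue  # premise fails: second hyperplane is present
                if normalize(i, k, s + t) in arrangement:
                    return False  # both absent, but the conclusion is present
    return True


# Braid arrangement in R^3: all hyperplanes {x_i - x_j = 0}.
braid = [(i, j, 0) for i in range(1, 4) for j in range(i + 1, 4)]
# A single hyperplane {x_1 - x_3 = 0}: the two "bridging" hyperplanes are absent.
single = [(1, 3, 0)]

print(is_strongly_transitive(3, braid))   # True
print(is_strongly_transitive(3, single))  # False
```

The second example fails because $\{x_1-x_2=0\}$ and $\{x_2-x_3=0\}$ are both absent while $\{x_1-x_3=0\}$ is present.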

Resource 2025 EN

Tensor Learning and Compression of N-phonon Interactions

Yao Luo · Dhruv Mangtani · Shiyu Peng +3 more

Phonon interactions from lattice anharmonicity govern thermal properties and heat transport in materials. These interactions are described by n-th order interatomic force constants (n-IFCs), which can be viewed as high-dimensional tensors correlating the motion of n atoms, or equivalently encoding n-phonon scattering processes in momentum space. Here, we introduce a tensor decomposition to efficiently compress n-IFCs for arbitrary order n. Using tensor learning, we find optimal low-rank approximations of n-IFCs by solving the resulting optimization problem. Our approach reveals the inherent low dimensionality of phonon-phonon interactions and allows compression of the 3- and 4-IFC tensors by factors of up to $10^3$ to $10^4$ while retaining high accuracy in calculations of phonon scattering rates and thermal conductivity. Calculations of thermal conductivity using the compressed n-IFCs achieve a speed-up by nearly three orders of magnitude with >98% accuracy relative to the reference uncompressed solution. These calculations include both 3- and 4-phonon scattering and are shown for a diverse range of materials (Si, HgTe, MgO, and TiNiSn). In addition to accelerating state-of-the-art thermal transport calculations, the method shown here paves the way for modeling strongly anharmonic materials and higher-order phonon interactions.
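
To give a feel for why a low-rank tensor decomposition compresses storage, here is a minimal sketch that assumes a CP-style (sum of rank-one terms) factorization. The abstract does not say which decomposition the authors use, so the CP form, the function names `cp_entry` and `compression_ratio`, and the toy sizes are all illustrative assumptions. Real IFC tensors also carry symmetries that this sketch ignores.

```python
def cp_entry(factors, index):
    """Entry T[i_1, ..., i_n] = sum_r prod_k factors[k][i_k][r].

    `factors` is a list of n matrices (lists of rows), each of shape (dim, rank).
    """
    rank = len(factors[0][0])
    total = 0.0
    for r in range(rank):
        term = 1.0
        for matrix, i in zip(factors, index):
            term *= matrix[i][r]
        total += term
    return total


def compression_ratio(dim, order, rank):
    """Stored numbers in the full tensor divided by those in the CP factors."""
    full_size = dim ** order            # dim^order entries
    factored_size = order * dim * rank  # `order` matrices of shape (dim, rank)
    return full_size / factored_size


# Rank-1 toy tensor of shape 2 x 2 x 2 built from three column vectors.
a = [[1.0], [2.0]]
b = [[3.0], [4.0]]
c = [[5.0], [6.0]]
print(cp_entry([a, b, c], (1, 0, 1)))  # 2 * 3 * 6 = 36.0

# An order-3 tensor with 30 indices per mode, stored at rank 5.
print(compression_ratio(30, 3, 5))     # 27000 / 450 = 60.0
```

The ratio grows quickly with tensor order at fixed rank, which is consistent with higher-order tensors being the more attractive targets for compression.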

Resource 2025 EN

Detection of magnetic fields in superclusters of galaxies

G. V. Pignataro · S. P. O'Sullivan · A. Bonafede +3 more

Magnetic fields in large scale structure filaments beyond galaxy clusters remain poorly understood. Superclusters offer a unique setting to study these low density environments, where weak signals make detection challenging. The Faraday rotation measure (RM) of polarized sources along supercluster lines of sight helps constrain magnetic field properties in these regions. This study aims to determine magnetic field intensity in low density environments within superclusters using RM measurements at different frequencies. We analyzed three nearby ($z<0.1$) superclusters, Corona Borealis, Hercules, and Leo, where polarization observations were available at 1.4 GHz and 144 MHz. Our catalogue includes 4497 polarized background sources with RM values from literature and unpublished 144 MHz data. We constructed 3D density cubes for each supercluster to estimate density at RM measurement locations. By grouping RM values into three density bins (outskirts, filaments, and nodes) we examined RM variance linked to mean density. We found an RM variance excess of $2.5 \pm 0.5$ rad$^2$ m$^{-4}$ between the lowest-density regions outside the supercluster and the low-density filamentary regions within. This suggests an intervening magnetic field in the supercluster filaments. Modeling the RM variance with a single scale, randomly oriented magnetic field, we constrained the line of sight magnetic field to $B_{\parallel} = 19^{+50}_{-8}$ nG after marginalizing over reversal scale and path length. Our findings align with previous studies of large scale structure filaments, suggesting that adiabatic compression alone ($B_{\parallel} \sim 2$ nG) cannot fully explain the observed field strengths. Other amplification mechanisms likely contribute to the evolution of magnetic fields in superclusters.
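
For context, the rotation measure referred to above is conventionally defined as a line-of-sight integral, and a single-scale random-field model leads to a random-walk estimate of its variance. The block below is a sketch in standard notation rather than the paper's exact expressions: $n_e$ (thermal electron density), $L$ (path length), and $\Lambda$ (field reversal scale) are symbols assumed here, and the second expression assumes uniform density and field strength along the path, in the same units as the first.

```latex
\mathrm{RM} \simeq 0.812 \int_{\text{source}}^{\text{observer}}
  \left(\frac{n_e}{\mathrm{cm^{-3}}}\right)
  \left(\frac{B_{\parallel}}{\mu\mathrm{G}}\right)
  \frac{\mathrm{d}l}{\mathrm{pc}} \ \ \mathrm{rad\,m^{-2}},
\qquad
\sigma_{\mathrm{RM}}^{2} \approx 0.812^{2}\, n_e^{2}\, B_{\parallel}^{2}\, \Lambda\, L .
```

The variance expression follows from summing $L/\Lambda$ independent cells, each contributing $0.812\, n_e B_{\parallel} \Lambda$ with random sign.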

Resource 2025 EN

A novel metric for species vulnerability and coexistence in spatially-extended ecosystems

Davide Bernardi · Giorgio Nicoletti · Prajwal Padmanabha +4 more

We develop a theoretical framework to understand the persistence and coexistence of competitive species in a spatially explicit metacommunity model with a heterogeneous dispersal kernel. Our analysis, based on methods from the physics of disordered systems and non-Gaussian dynamical mean field theory, reveals that species coexistence is governed by a single key parameter, which we term competitive balance. From competitive balance, we derive a novel metric to quantitatively assess the vulnerability of a species, showing that abundance alone is not sufficient to determine it. Rather, a species' vulnerability crucially depends on the state of the metacommunity as a whole. We test our theory by analyzing two distinct tropical forest datasets, finding excellent agreement with our theoretical predictions. A key step in our analysis is the introduction of a new quantity, the competitive score, which disentangles the abundance distribution and enables us to circumvent the challenge of estimating both the colonization kernel and the joint abundance distribution. Our findings provide novel and fundamental insights into the ecosystem-level trade-offs underlying macroecological patterns and introduce a robust approach for estimating extinction risks.

Resource 2025 EN

Heterogeneously structured compartmental models of epidemiological systems: from individual-level processes to population-scale dynamics

Emanuele Bernardi · Tommaso Lorenzi · Mattia Sensi +1 more

We develop a general modelling framework for compartmental epidemiological systems structured by continuous variables which are linked to the levels of expression of compartment-specific traits. We start by formulating an individual-based model that describes the dynamics of single individuals in terms of stochastic processes. Then we formally derive: (i) the mesoscopic counterpart of this model, which is formulated as a system of integro-differential equations for the distributions of individuals over the structuring-variable domains of the different compartments; (ii) the corresponding macroscopic model, which takes the form of a system of ordinary differential equations for the fractions of individuals in the different compartments and the mean levels of expression of the traits represented by the structuring variables. We employ a reduced version of the macroscopic model to obtain a general formula for the basic reproduction number, $\mathcal{R}_0$, in terms of key parameters and functions of the underlying microscopic model, so as to illustrate how such a modelling framework makes it possible to draw connections between fundamental individual-level processes and population-scale dynamics. Finally, we apply the modelling framework to case studies based on classical compartmental epidemiological systems, for each of which we report on Monte Carlo simulations of the individual-based model as well as on analytical results and numerical solutions of the macroscopic model.

Resource 2025 EN

On the Convexity of the Bernardi Integral Operator

Johnny E. Brown

We prove that the Bernardi Integral Operator maps certain classes of bounded starlike functions into the class of convex functions, improving the result of Oros and Oros. We also present a general unified method for investigating various other integral operators that preserve many of the previously studied subclasses of univalent and p-valent functions.
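
For reference, the Bernardi integral operator is usually written as below for a function $f$ analytic in the unit disk with $f(0)=0$. The symbols $F$ and $c$ are conventional choices assumed here, and the normalization used in the paper (for instance in the p-valent setting) may differ.

```latex
F(z) = \frac{1+c}{z^{c}} \int_{0}^{z} t^{\,c-1} f(t)\, \mathrm{d}t , \qquad c > -1 .
```

With this normalization, $f(z)=z$ is mapped to $F(z)=z$, since $\int_0^z t^{c}\,\mathrm{d}t = z^{c+1}/(c+1)$.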
