Search Resources — Somali Digital Library

Resource 2023 EN

MaNtLE: Model-agnostic Natural Language Explainer

Rakesh R. Menon · Kerem Zaman · Shashank Srivastava

Understanding the internal reasoning behind the predictions of machine learning systems is increasingly vital, given their rising adoption and acceptance. While previous approaches, such as LIME, generate algorithmic explanations by attributing importance to input features for individual examples, recent research indicates that practitioners prefer examining language explanations that explain sub-groups of examples. In this paper, we introduce MaNtLE, a model-agnostic natural language explainer that analyzes multiple classifier predictions and generates faithful natural language explanations of classifier rationale for structured classification tasks. MaNtLE uses multi-task training on thousands of synthetic classification tasks to generate faithful explanations. Simulated user studies indicate that, on average, MaNtLE-generated explanations are at least 11% more faithful compared to LIME and Anchors explanations across three tasks. Human evaluations demonstrate that users can better predict model behavior using explanations from MaNtLE compared to other techniques.

Resource 2023 EN

Towards Language-Based Modulation of Assistive Robots through Multimodal Models

Philipp Wicke · Lütfi Kerem Şenel · Shengqiang Zhang +4 more

In the field of Geriatronics, enabling effective and transparent communication between humans and robots is crucial for enhancing the acceptance and performance of assistive robots. Our early-stage research project investigates the potential of language-based modulation as a means to improve human-robot interaction. We propose to explore real-time modulation during task execution, leveraging language cues, visual references, and multimodal inputs. By developing transparent and interpretable methods, we aim to enable robots to adapt and respond to language commands, enhancing their usability and flexibility. Through the exchange of insights and knowledge at the workshop, we seek to gather valuable feedback to advance our research and contribute to the development of interactive robotic systems for Geriatronics and beyond.

Resource 2023 EN

FourierLoss: Shape-Aware Loss Function with Fourier Descriptors

Mehmet Bahadir Erden · Selahattin Cansiz · Onur Caki +6 more

Encoder-decoder networks have become a popular choice for various medical image segmentation tasks. When they are trained with a standard loss function, these networks are not explicitly enforced to preserve the shape integrity of an object in an image. However, this ability of the network is important to obtain more accurate results, especially when there is a low-contrast difference between the object and its surroundings. In response to this issue, this work introduces a new shape-aware loss function, which we name FourierLoss. This loss function relies on quantifying the shape dissimilarity between the ground truth and the predicted segmentation maps through the Fourier descriptors calculated on their objects, and penalizing this dissimilarity in network training. Unlike previous studies, FourierLoss offers an adaptive loss function with trainable hyperparameters that control the importance of the level of the shape details that the network is enforced to learn in the training process. This control is achieved by the proposed adaptive loss update mechanism, which learns the hyperparameters end-to-end, simultaneously with the network weights, by backpropagation. As a result of using this mechanism, the network can dynamically change its attention from learning the general outline of an object to learning the details of its contour points, or vice versa, in different training epochs. Working on 2879 computed tomography images of 93 subjects, our experiments revealed that the proposed adaptive shape-aware loss function led to statistically significantly better results for liver segmentation, compared to its counterparts.
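
As a rough illustration of the idea of comparing shapes through Fourier descriptors, the sketch below computes normalized DFT magnitudes of a closed contour and sums their squared differences. It is not the FourierLoss from the paper: the function names, the unweighted squared-difference measure, and the assumption that both contours are sampled with the same number of points are all choices made here for illustration, and the paper's trainable hyperparameters are omitted.

```python
import cmath
import math

def fourier_descriptors(contour):
    """Return normalized DFT magnitudes of a closed contour given as (x, y) points."""
    n = len(contour)
    z = [complex(x, y) for x, y in contour]  # treat each point as x + iy
    mags = []
    for u in range(n):
        coeff = sum(z[k] * cmath.exp(-2j * math.pi * u * k / n) for k in range(n))
        mags.append(abs(coeff) / n)
    return mags

def shape_dissimilarity(contour_a, contour_b):
    """Unweighted sum of squared descriptor differences (illustrative only).

    Assumes both contours have the same number of sample points.
    """
    da = fourier_descriptors(contour_a)
    db = fourier_descriptors(contour_b)
    return sum((a - b) ** 2 for a, b in zip(da, db))

# Two toy contours with 8 samples each: a unit circle and a square outline.
n_points = 8
circle = [(math.cos(2 * math.pi * k / n_points), math.sin(2 * math.pi * k / n_points))
          for k in range(n_points)]
square = [(1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0)]

circle_desc = fourier_descriptors(circle)
circle_vs_square = shape_dissimilarity(circle, square)
```

For the sampled circle, only the first harmonic is non-zero, so any deviation from a circular outline shows up in the other descriptors. A weighted version of the sum would be one way to emphasize coarse or fine shape detail.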

Resource 2023 EN

Examining the Influence of Varied Levels of Domain Knowledge Base Inclusion in GPT-based Intelligent Tutors

Blake Castleman · Mehmet Kerem Turkcan

Recent advancements in large language models (LLMs) have facilitated the development of chatbots with sophisticated conversational capabilities. However, LLMs frequently give inaccurate responses to queries, hindering applications in educational settings. In this paper, we investigate the effectiveness of integrating a knowledge base (KB) with LLM intelligent tutors to increase response reliability. To achieve this, we design a scalable KB that affords educational supervisors seamless integration of lesson curricula, which is automatically processed by the intelligent tutoring system. We then detail an evaluation, where student participants were presented with questions about the artificial intelligence curriculum to respond to. GPT-4 intelligent tutors with varying hierarchies of KB access and human domain experts then assessed these responses. Lastly, students cross-examined the intelligent tutors' responses against the domain experts' and ranked their various pedagogical abilities. Results suggest that, although these intelligent tutors still demonstrate a lower accuracy compared to domain experts, the accuracy of the intelligent tutors increases when access to a KB is granted. We also observe that the intelligent tutors with KB access exhibit better pedagogical abilities to speak like a teacher and understand students than those of domain experts, while their ability to help students still lags behind that of domain experts.

Resource 2023 EN

Projected Push-Pull For Distributed Constrained Optimization Over Time-Varying Directed Graphs (extended version)

Orhan Eren Akgün · Arif Kerem Dayı · Stephanie Gil +1 more

We introduce the Projected Push-Pull algorithm that enables multiple agents to solve a distributed constrained optimization problem with private cost functions and global constraints, in a collaborative manner. Our algorithm employs projected gradient descent to deal with constraints and a lazy update rule to control the trade-off between the consensus and optimization steps in the protocol. We prove that our algorithm achieves geometric convergence over time-varying directed graphs while ensuring that the decision variable always stays within the constraint set. We derive explicit bounds for step sizes that guarantee geometric convergence based on the strong-convexity and smoothness of cost functions, and graph properties. Moreover, we provide additional theoretical results on the usefulness of lazy updates, revealing the challenges in the analysis of any gradient tracking method that uses projection operators in a distributed constrained optimization setting. We validate our theoretical results with numerical studies over different graph types, showing that our algorithm achieves geometric convergence empirically.
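
The abstract names projected gradient descent as the building block for handling constraints. The sketch below shows that building block only, for a single agent with a box constraint; it is not the Projected Push-Pull algorithm, and it contains no consensus step, gradient tracking, or lazy update. The function names, the quadratic cost, the box bounds, and the step size are all assumptions chosen for illustration.

```python
def project_onto_box(x, lower, upper):
    """Euclidean projection of x onto the box [lower, upper], coordinate-wise."""
    return [min(max(xi, lo), hi) for xi, lo, hi in zip(x, lower, upper)]

def projected_gradient_descent(grad, x0, lower, upper, step=0.1, iters=100):
    """Single-agent projected gradient descent (illustrative sketch).

    Each iterate is projected back onto the constraint set, so the
    decision variable never leaves the box.
    """
    x = project_onto_box(x0, lower, upper)
    for _ in range(iters):
        g = grad(x)
        x = project_onto_box([xi - step * gi for xi, gi in zip(x, g)], lower, upper)
    return x

# Hypothetical cost f(x) = sum((x_i - target_i)^2), whose unconstrained
# minimizer lies outside the unit box [0, 1] x [0, 1].
target = [3.0, -2.0]
grad = lambda x: [2.0 * (xi - ti) for xi, ti in zip(x, target)]

solution = projected_gradient_descent(grad, [0.5, 0.5], [0.0, 0.0], [1.0, 1.0])
```

Because the unconstrained minimizer is outside the box, the iterates settle on the nearest feasible corner, here the point (1, 0).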

Resource 2023 EN

LoHoRavens: A Long-Horizon Language-Conditioned Benchmark for Robotic Tabletop Manipulation

Shengqiang Zhang · Philipp Wicke · Lütfi Kerem Şenel +5 more

The convergence of embodied agents and large language models (LLMs) has brought significant advancements to embodied instruction following. Particularly, the strong reasoning capabilities of LLMs make it possible for robots to perform long-horizon tasks without expensive annotated demonstrations. However, public benchmarks for testing the long-horizon reasoning capabilities of language-conditioned robots in various scenarios are still missing. To fill this gap, this work focuses on the tabletop manipulation task and releases a simulation benchmark, LoHoRavens, which covers various long-horizon reasoning aspects spanning color, size, space, arithmetic and reference. Furthermore, there is a key modality bridging problem for long-horizon manipulation tasks with LLMs: how to incorporate the observation feedback during robot execution for the LLM's closed-loop planning, which has, however, been less studied by prior work. We investigate two methods of bridging the modality gap: caption generation and learnable interface for incorporating explicit and implicit observation feedback to the LLM, respectively. These methods serve as the two baselines for our proposed benchmark. Experiments show that both methods struggle to solve some tasks, indicating long-horizon manipulation tasks are still challenging for current popular models. We expect the proposed public benchmark and baselines can help the community develop better models for long-horizon tabletop manipulation tasks.

Resource 2023 EN

Dimensions of Disagreement: Unpacking Divergence and Misalignment in Cognitive Science and Artificial Intelligence

Kerem Oktar · Ilia Sucholutsky · Tania Lombrozo +1 more

The increasing prevalence of artificial agents creates a correspondingly increasing need to manage disagreements between humans and artificial agents, as well as between artificial agents themselves. Considering this larger space of possible agents exposes an opportunity for furthering our understanding of the nature of disagreement: past studies in psychology have often cast disagreement as two agents forming diverging evaluations of the same object, but disagreement can also arise from differences in how agents represent that object. AI research on human-machine alignment and recent work in computational cognitive science have focused on this latter kind of disagreement, and have developed tools that can be used to quantify the extent of representational overlap between agents. Understanding how divergence and misalignment interact to produce disagreement, and how resolution strategies depend on this interaction, is key to promoting effective collaboration between diverse types of agents.

Resource 2023 EN

A physics-informed GAN Framework based on Model-free Data-Driven Computational Mechanics

Kerem Ciftci · Klaus Hackl

Model-free data-driven computational mechanics, first proposed by Kirchdoerfer and Ortiz, replaces phenomenological models with numerical simulations based on sample data sets in strain-stress space. In this study, we integrate this paradigm within physics-informed generative adversarial networks (GANs). We enhance the conventional physics-informed neural network framework by implementing the principles of data-driven computational mechanics into GANs. Specifically, the generator is informed by physical constraints, while the discriminator utilizes the closest strain-stress data to discern the authenticity of the generator's output. This combined approach presents a new formalism to harness data-driven mechanics and deep learning to simulate and predict mechanical behaviors.

Resource 2023 EN

Double-Free-Layer Stochastic Magnetic Tunnel Junctions with Synthetic Antiferromagnets

Kemal Selcuk · Shun Kanai · Rikuto Ota +3 more

Stochastic magnetic tunnel junctions (sMTJ) using low-barrier nanomagnets have shown promise as fast, energy-efficient, and scalable building blocks for probabilistic computing. Despite recent experimental and theoretical progress, sMTJs exhibiting the ideal characteristics necessary for probabilistic bits (p-bit) are still lacking. Ideally, the sMTJs should have (a) voltage bias independence preventing read disturbance, (b) uniform randomness in the magnetization angle between the free layers, and (c) fast fluctuations without requiring external magnetic fields while being robust to magnetic field perturbations. Here, we propose a new design satisfying all of these requirements, using double-free-layer sMTJs with synthetic antiferromagnets (SAF). We evaluate the proposed sMTJ design with experimentally benchmarked spin-circuit models accounting for transport physics, coupled with the stochastic Landau-Lifshitz-Gilbert equation for magnetization dynamics. We find that the use of low-barrier SAF layers reduces dipolar coupling, achieving uncorrelated fluctuations at zero magnetic field surviving up to diameters exceeding ($D\approx 100$ nm) if the nanomagnets can be made thin enough ($\approx 1$-$2$ nm). The double-free-layer structure retains bias-independence and the circular nature of the nanomagnets provides near-uniform randomness with fast fluctuations. Combining our full sMTJ model with advanced transistor models, we estimate the energy to generate a random bit as $\approx$ 3.6 fJ, with fluctuation rates of $\approx$ 3.3 GHz per p-bit. Our results will guide the experimental development of superior stochastic magnetic tunnel junctions for large-scale and energy-efficient probabilistic computation for problems relevant to machine learning and artificial intelligence.

Resource 2023 EN

Heisenberg machines with programmable spin-circuits

Saleh Bunaiyan · Supriyo Datta · Kerem Y. Camsari

We show that we can harness two recent experimental developments to build a compact hardware emulator for the classical Heisenberg model in statistical physics. The first is the demonstration of spin-diffusion lengths in excess of microns in graphene even at room temperature. The second is the demonstration of low barrier magnets (LBMs) whose magnetization can fluctuate rapidly even at sub-nanosecond rates. Using experimentally benchmarked circuit models, we show that an array of LBMs driven by an external current source has a steady-state distribution corresponding to a classical system with an energy function of the form $E = -\frac{1}{2}\sum_{i,j} J_{ij} (\hat{m}_i \cdot \hat{m}_j)$. This may seem surprising for a non-equilibrium system but we show that it can be justified by a Lyapunov function corresponding to a system of coupled Landau-Lifshitz-Gilbert (LLG) equations. The Lyapunov function we construct describes LBMs interacting through the spin currents they inject into the spin neutral substrate. We suggest ways to tune the coupling coefficients $J_{ij}$ so that it can be used as a hardware solver for optimization problems involving continuous variables represented by vector magnetizations, similar to the role of the Ising model in solving optimization problems with binary variables. Finally, we implement a Heisenberg AND gate based on a network of three coupled stochastic LLG equations, illustrating the concept of probabilistic computing with a programmable Heisenberg model.
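
To make the energy function in the abstract concrete, the sketch below evaluates the classical Heisenberg energy, minus one half of the sum over i, j of J_ij times the dot product of unit vectors m_i and m_j, for a small set of magnetization vectors. It only evaluates the formula; it does not model the LLG dynamics or the spin circuits described in the paper. The function name, the two-magnet coupling matrix, and the example vectors are assumptions made for illustration.

```python
def heisenberg_energy(J, m):
    """Evaluate E = -1/2 * sum_{i,j} J[i][j] * (m_i . m_j).

    J is a square coupling matrix (list of lists) and m is a list of
    3-component unit vectors, one per magnet.
    """
    n = len(m)
    total = 0.0
    for i in range(n):
        for j in range(n):
            dot = sum(a * b for a, b in zip(m[i], m[j]))
            total += J[i][j] * dot
    return -0.5 * total

# Hypothetical two-magnet system with symmetric coupling and no self-coupling.
J = [[0.0, 1.0],
     [1.0, 0.0]]
aligned = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]   # parallel magnetizations
opposed = [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]  # antiparallel magnetizations

energy_aligned = heisenberg_energy(J, aligned)
energy_opposed = heisenberg_energy(J, opposed)
```

With a positive coupling, the parallel configuration has the lower energy, which is the sense in which the sign and size of each J_ij steer the preferred relative orientation of a pair of magnets.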
