Domain generalization for semantic segmentation: a survey
TH Rafi, R Mahjabin, E Ghosh, YW Ko… - Artificial Intelligence …, 2024 - Springer
Deep neural networks (DNNs) have proven explicit contributions in making autonomous
driving cars and related tasks such as semantic segmentation, motion tracking, object …
Unveiling deception in Arabic: optimization of deceptive text detection across formal and informal genres
In recent years, social media has significantly influenced how we share information and
exchange messages. However, a significant issue arises from the fast dissemination of …
What Makes Multimodal In-Context Learning Work?
Large Language Models have demonstrated remarkable performance across
various tasks exhibiting the capacity to swiftly acquire new skills such as through In-Context …
Rethinking the evaluation protocol of domain generalization
Domain generalization aims to solve the challenge of Out-of-Distribution (OOD)
generalization by leveraging common knowledge learned from multiple training domains to …
Auto-Encoding Morph-Tokens for Multimodal LLM
For multimodal LLMs, the synergy of visual comprehension (textual output) and generation
(visual output) presents an ongoing challenge. This is due to a conflicting objective: for …
Many-Shot In-Context Learning in Multimodal Foundation Models
Large language models are well-known to be effective at few-shot in-context learning (ICL).
Recent advancements in multimodal foundation models have enabled unprecedentedly …
Counterfactually Augmented Event Matching for De-biased Temporal Sentence Grounding
Temporal Sentence Grounding (TSG), which aims to localize events in untrimmed videos
with a given language query, has been widely studied in the last decades. However …
TopViewRS: Vision-Language Models as Top-View Spatial Reasoners
Top-view perspective denotes a typical way in which humans read and reason over different
types of maps, and it is vital for localization and navigation of humans as well as of non …
In-context learning in presence of spurious correlations
Large language models exhibit a remarkable capacity for in-context learning, where they
learn to solve tasks given a few examples. Recent work has shown that transformers can be …
A Picture is Worth A Thousand Numbers: Enabling LLMs Reason about Time Series via Visualization
H Liu, C Liu, BA Prakash - arXiv preprint arXiv:2411.06018, 2024 - arxiv.org
Large language models (LLMs), with demonstrated reasoning abilities across multiple
domains, are largely underexplored for time-series reasoning (TsR), which is ubiquitous in …