LANCE: Stress-testing Visual Models by Generating Language-guided Counterfactual Images. We propose an automated algorithm to stress-test a trained visual model by generating language-guided counterfactual test images (LANCE). Our method leverages recent progress in large language…
@_akhaliq
-
Transformers’ Limitations in Compositional Reasoning Tasks
By
–
Faith and Fate: Limits of Transformers on Compositionality. Transformer large language models (LLMs) have sparked admiration for their exceptional performance on tasks that demand intricate multi-step reasoning. Yet, these models simultaneously show failures on surprisingly…
-
PaLI-X: Scaling Multilingual Vision and Language Models
By
–
PaLI-X: On Scaling up a Multilingual Vision and Language Model. We present the training recipe and results of scaling up PaLI-X, a multilingual vision and language model, both in terms of size of the components and the breadth of its training task mixture. Our model achieves new…
-
KAFA: Knowledge-Augmented Vision-Language Models for Image Ad Understanding
By
–
KAFA: Rethinking Image Ad Understanding with Knowledge-Augmented Feature Adaptation of Vision-Language Models. Image ad understanding is a crucial task with wide real-world applications. Although highly challenging with the involvement of diverse atypical scenes, real-world…
-
HiFA: Advanced Diffusion Guidance for High-Fidelity Text-to-3D Synthesis
By
–
HiFA: High-fidelity Text-to-3D with Advanced Diffusion Guidance
— AK (@_akhaliq) May 31, 2023
Automatic text-to-3D synthesis has achieved remarkable advancements through the optimization of 3D models. Existing methods commonly rely on pre-trained text-to-image generative models, such as diffusion models,…
-
StyleAvatar3D: High-Fidelity 3D Avatar Generation Using Diffusion Models
By
–
StyleAvatar3D: Leveraging Image-Text Diffusion Models for High-Fidelity 3D Avatar Generation
— AK (@_akhaliq) May 31, 2023
We present a novel method for generating high-quality, stylized 3D avatars that utilizes pre-trained image-text diffusion models for data generation and a Generative Adversarial Network…
-
Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation
By
–
Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation. We propose Make-an-Audio 2, a latent diffusion-based text-to-audio (T2A) method that builds on the success of Make-an-Audio. Our approach includes several techniques to improve semantic alignment and temporal consistency: firstly, we use…
-
Nested Diffusion Processes for Anytime Image Generation
By
–
Nested Diffusion Processes for Anytime Image Generation. We propose an anytime diffusion-based method that can generate viable images when stopped at arbitrary times before completion. Using existing pretrained diffusion models, we show that the generation scheme can be recomposed…
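To make the "anytime" idea concrete, here is a minimal, self-contained sketch, not the paper's nested-diffusion algorithm: a toy DDPM-style sampling loop with a placeholder denoiser that records the predicted clean image at every step, so interrupting the loop at any point still returns a usable estimate. The dummy_denoiser, noise schedule, and anytime_sample helper are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def dummy_denoiser(x_t, t):
    # Placeholder noise predictor: a real pipeline would call a trained
    # diffusion model (e.g. a U-Net) here to estimate the added noise.
    return 0.1 * x_t

def anytime_sample(shape=(64, 64, 3), num_steps=50, stop_after=None, seed=0):
    """Toy DDPM-style loop that keeps a usable intermediate image at every step."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    x = rng.standard_normal(shape)   # start from pure noise
    best_guess = x                   # the "anytime" output, refined each step

    for i, t in enumerate(reversed(range(num_steps))):
        eps_hat = dummy_denoiser(x, t)
        # Predicted clean image x0 from the current noisy sample (standard DDPM identity).
        x0_hat = (x - np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alpha_bars[t])
        best_guess = x0_hat          # a viable result if sampling stops now

        # One reverse-diffusion step toward timestep t-1.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        noise = rng.standard_normal(shape) if t > 0 else np.zeros(shape)
        x = mean + np.sqrt(betas[t]) * noise

        if stop_after is not None and i + 1 >= stop_after:
            break                    # interrupted early: return the current estimate

    return best_guess

preview = anytime_sample(stop_after=10)  # coarse image after only 10 of 50 steps
final = anytime_sample()                 # full sampling budget
```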
-
RIVAL: Diffusion-Based Real-World Image Variation Pipeline
By
–
Real-World Image Variation by Aligning Diffusion Inversion Chain. We propose a novel inference pipeline called Real-world Image Variation by ALignment (RIVAL) that utilizes diffusion models to generate image variations from a single image exemplar. Our pipeline enhances the…