@_akhaliq - AI Dynamics

BBF: Value-Based RL Agent Achieves Super-Human Atari Performance

By

@_akhaliq

–

01 June 2023 6h02

Bigger, Better, Faster: Human-level Atari with human-level efficiency. We introduce a value-based RL agent, which we call BBF, that achieves super-human performance on the Atari 100K benchmark. BBF relies on scaling the neural networks used for value estimation, as well as a number…

→ View original post on X — @_akhaliq

1 June 2023

Improving CLIP Training with Language Rewrites via LaCLIP

By

@_akhaliq

–

01 June 2023 5h18

Improving CLIP Training with Language Rewrites. We introduce Language augmented CLIP (LaCLIP), a simple yet highly effective approach to enhance CLIP training through language rewrites. Leveraging the in-context learning capability of large language models, we rewrite the text…

→ View original post on X — @_akhaliq

1 June 2023

OpenAI Releases Process Supervision Method for Mathematical Reasoning

By

@_akhaliq

–

31 May 2023 20h52

OpenAI releases paper + dataset: Let’s Verify Step by Step. OpenAI trained a model to achieve a new state-of-the-art in mathematical problem solving by rewarding each correct step of reasoning (“process supervision”) instead of simply rewarding the correct final answer (“outcome…

→ View original post on X — @_akhaliq

31 May 2023
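
The contrast the OpenAI item above draws between process and outcome supervision can be illustrated with a toy sketch. This is not OpenAI's training code: the function names are hypothetical, and the per-step correctness labels are simply assumed to be given.

```python
from typing import List


def outcome_reward(final_answer_correct: bool) -> float:
    """Outcome supervision: one reward, based only on the final answer."""
    return 1.0 if final_answer_correct else 0.0


def process_rewards(step_correct: List[bool]) -> List[float]:
    """Process supervision: one reward per reasoning step."""
    return [1.0 if ok else 0.0 for ok in step_correct]


# Hypothetical solution: the second step is wrong, but the final answer is right.
steps = [True, False, True]
print(outcome_reward(final_answer_correct=True))  # 1.0
print(process_rewards(steps))                     # [1.0, 0.0, 1.0]
```

In this toy case the outcome signal cannot tell a lucky solution from a sound one, while the per-step signal flags the faulty step.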

Tab-CoT: Zero-shot Tabular Chain of Thought Framework

By

@_akhaliq

–

31 May 2023 20h11

Tab-CoT: Zero-shot Tabular Chain of Thought. We propose a new Chain-of-Thought framework, Tab-CoT, which uses a tabular format to conduct complex reasoning processes in a highly structured manner. Despite its simplicity, we show that our approach is capable of performing reasoning…

→ View original post on X — @_akhaliq

31 May 2023

Photoshop AI Generative Fill Used for Its Intended Purpose

By

@_akhaliq

–

31 May 2023 18h59

Photoshop AI Generative Fill was used for its intended purpose

→ View original post on X — @_akhaliq

31 May 2023

Photoshop Generative Fill Beta Expands Midjourney Photos

By

@_akhaliq

–

31 May 2023 18h01

Photoshop Generative Fill Beta used to expand Midjourney photos

→ View original post on X — @_akhaliq

31 May 2023

Concept Decomposition for Visual Exploration Using Vision-Language Models

By

@_akhaliq

–

31 May 2023 17h35

Concept Decomposition for Visual Exploration and Inspiration. We propose a method to decompose a visual concept, represented as a set of images, into different visual aspects encoded in a hierarchical tree structure. We utilize large vision-language models and their rich latent…

→ View original post on X — @_akhaliq

31 May 2023

VisorGPT: Learning Visual Prior via Generative Pre-Training

By

@_akhaliq

–

31 May 2023 8h05

VisorGPT: Learning Visual Prior via Generative Pre-Training. We propose to learn visual prior via Generative Pre-Training, dubbed VisorGPT. By discretizing visual locations of objects, e.g., bounding boxes, human pose, and instance masks, into sequences, VisorGPT can model visual prior…

→ View original post on X — @_akhaliq

31 May 2023
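
The idea in the VisorGPT item above of discretizing object locations into sequences can be sketched for bounding boxes as follows. This is a minimal illustration under stated assumptions, not the paper's tokenizer: the bin count `NUM_BINS`, the token layout, and the helper names are all hypothetical.

```python
from typing import List, Tuple

NUM_BINS = 512  # assumed quantization resolution, not taken from the paper

Box = Tuple[str, float, float, float, float]  # (label, x0, y0, x1, y1) in pixels


def quantize(value: float, size: float, num_bins: int = NUM_BINS) -> int:
    """Map a pixel coordinate in [0, size] to an integer bin in [0, num_bins - 1]."""
    ratio = min(max(value / size, 0.0), 1.0)
    return min(int(ratio * num_bins), num_bins - 1)


def boxes_to_sequence(boxes: List[Box], width: float, height: float) -> List[str]:
    """Flatten labelled boxes into one token sequence: label, x0, y0, x1, y1, ..."""
    tokens: List[str] = []
    for label, x0, y0, x1, y1 in boxes:
        tokens.append(label)
        for value, size in ((x0, width), (y0, height), (x1, width), (y1, height)):
            tokens.append(str(quantize(value, size)))
    return tokens


print(boxes_to_sequence([("cat", 0, 0, 320, 240)], width=640, height=480))
# ['cat', '0', '0', '256', '256']
```

Once locations are expressed as discrete tokens like these, a sequence model can be trained on them in the same way as on text.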

LibriTTS-R: Restored Multi-Speaker Text-to-Speech Dataset

By

@_akhaliq

–

31 May 2023 8h01

LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus. This paper introduces a new speech dataset called “LibriTTS-R” designed for text-to-speech (TTS) use. It is derived by applying speech restoration to the LibriTTS corpus, which consists of 585 hours of speech data at 24 kHz…

→ View original post on X — @_akhaliq

31 May 2023