Reddit thread: https://
reddit.com/r/ProgrammerHu
mor/comments/145z7g6/do_not_pretend_too_much_from_ai/
…
@_akhaliq
-
Reddit thread about not expecting too much from AI
By
–
-

Trending AI News Stories and Papers
By
–
Trending AI news stories + papers https://
open.substack.com/pub/akhaliq/p/
trending-ai-news-stories-papers-486
… -
Nvidia releases ATT3D for text-to-3D object synthesis
By
–
Nvidia just released ATT3D: Amortized Text-To-3D Object Synthesis
— AK (@_akhaliq) 9 juin 2023
project page: https://t.co/3yAK3Hh4II
Text-to-3D modeling has seen exciting progress by combining generative text-to-image models with image-to-3D methods like Neural Radiance Fields. DreamFusion recently… pic.twitter.com/tOvP31EFS0Nvidia just released ATT3D: Amortized Text-To-3D Object Synthesis project page: https://
research.nvidia.com/labs/toronto-a
i/ATT3D/
… Text-to-3D modeling has seen exciting progress by combining generative text-to-image models with image-to-3D methods like Neural Radiance Fields. DreamFusion recently -
Meta Releases MusicGen: Controllable Music Generation Model
By
–
Meta just released MusicGen, a simple and controllable model for music generation
— AK (@_akhaliq) 9 juin 2023
MusicGen is a single stage auto-regressive Transformer model trained over a 32kHz EnCodec tokenizer with 4 codebooks sampled at 50 Hz. Unlike existing methods like MusicLM, MusicGen doesn't not… pic.twitter.com/kFCOrAmLShMeta just released MusicGen, a simple and controllable model for music generation MusicGen is a single stage auto-regressive Transformer model trained over a 32kHz EnCodec tokenizer with 4 codebooks sampled at 50 Hz. Unlike existing methods like MusicLM, MusicGen doesn't not
-

Background Prompting for Improved Object Depth Estimation
By
–
Background Prompting for Improved Object Depth paper page: https://
huggingface.co/papers/2306.05
428
… Estimating the depth of objects from a single image is a valuable task for many vision, robotics, and graphics applications. However, current methods often fail to produce accurate depth for -
Test-Time Optimization for Dense Motion Tracking in Videos
By
–
Tracking Everything Everywhere All at Once
— AK (@_akhaliq) 9 juin 2023
paper page: https://t.co/UwLxRYPGvb
present a new test-time optimization method for estimating dense and long-range motion from a video sequence. Prior optical flow or particle video tracking algorithms typically operate within limited… pic.twitter.com/3ryHUA4c9nTracking Everything Everywhere All at Once paper page: https://
huggingface.co/papers/2306.05
422
… present a new test-time optimization method for estimating dense and long-range motion from a video sequence. Prior optical flow or particle video tracking algorithms typically operate within limited -

Scaling Spherical CNNs: Spectral Domain Convolutions
By
–
Scaling Spherical CNNs paper page: https://
huggingface.co/papers/2306.05
420
… Spherical CNNs generalize CNNs to functions on the sphere, by using spherical convolutions as the main linear operation. The most accurate and efficient way to compute spherical convolutions is in the spectral domain -

R-MAE: Regions Meet Masked Autoencoders for Vision Tasks
By
–
R-MAE: Regions Meet Masked Autoencoders paper page: https://
huggingface.co/papers/2306.05
411
… Vision-specific concepts such as "region" have played a key role in extending general machine learning frameworks to tasks like object detection. Given the success of region-based detectors for -

LU-NeRF: Scene and Pose Estimation Using Local Unposed NeRFs
By
–
LU-NeRF: Scene and Pose Estimation by Synchronizing Local Unposed NeRFs paper page: https://
huggingface.co/papers/2306.05
410
… A critical obstacle preventing NeRF models from being deployed broadly in the wild is their reliance on accurate camera poses. Consequently, there is growing interest in -
Grounded Text-to-Image Synthesis with Attention Refocusing
By
–
Grounded Text-to-Image Synthesis with Attention Refocusing
— AK (@_akhaliq) 9 juin 2023
paper page: https://t.co/3DfgBmfB2I
Driven by scalable diffusion models trained on large-scale paired text-image datasets, text-to-image synthesis methods have shown compelling results. However, these models still fail… pic.twitter.com/nQRFzGoSLjGrounded Text-to-Image Synthesis with Attention Refocusing paper page: https://
huggingface.co/papers/2306.05
427
… Driven by scalable diffusion models trained on large-scale paired text-image datasets, text-to-image synthesis methods have shown compelling results. However, these models still fail