Live in London: it was an honor for our team to present the first Deep Learning large language model purpose-built and pre-trained specifically for the financial services industry. #ai #artificialintelligence #financialservices #nlp #deeplearning #deeplearningai #aifs23
@sambanovaai
-
NYSE Hosts SambaNovaAI CEO Rodrigo Liang
Much appreciation to @NYSE for hosting @SambaNovaAI CEO and Co-Founder @RodrigoLiang today in NYC.
-
SambaNova CEO Discusses Enterprise Generative AI in Oil and Gas
"You better better know how to go with the flow." PODCAST:
@SambaNovaAI CEO and Co-founder
@RodrigoLiang joins the Digital Innovations in Oil and Gas with @geoffreycann Podcast. Listen to the full episode at https://digitaloilgas.libsyn.com/rodrigo-liang-on-using-enterprise-generative-ai-tools-in-oil-and-gas
-
SambaNova Releases SN-13B-8k-Instruct Language Model
@huggingface | SN-13B-8k-Instruct, a 13 billion parameter model: https://huggingface.co/sambanovasystems/SN-13B-8k-Instruct Join our Discord to ask questions and discuss: https://discord.gg/8z2Pe7cpRv Read the full blog and technical details at https://sambanova.ai/blog/training-long-sequence-size-models-on-sambanova/ (10/10)
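For anyone who wants to try the checkpoint, here is a minimal sketch of loading it with the Hugging Face transformers library; the prompt and generation settings are illustrative assumptions, not a recommended configuration.

```python
# Minimal sketch: load SN-13B-8k-Instruct from the Hugging Face Hub and
# generate text. Prompt and decoding settings are illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sambanovasystems/SN-13B-8k-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Summarize the following document:\n..."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
-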
Open License Model Released for Long Sequence Understanding
It is available under an open license and was created to explore long sequence capabilities. It is not meant to be a drop-in replacement for chat models. We are excited to see how people use the checkpoint. (9/10)
-
Open Source AI Checkpoint Released on Hugging Face
When we run a Likert-scale human evaluation of our checkpoint and compare it with other models, we achieve comparable results. We are open-sourcing the checkpoint on @huggingface for the community to try. (8/10)
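A toy sketch of how such a Likert-scale comparison can be aggregated; the ratings below are made-up placeholders, not the evaluation data.

```python
from statistics import mean

# Toy Likert aggregation: raters score each response on a 1-5 scale and we
# compare mean ratings per model. All numbers are placeholder values.
ratings = {
    "SN-13B-8k-Instruct": [4, 3, 5, 4, 4],
    "baseline-model":     [4, 4, 4, 3, 5],
}
for name, scores in ratings.items():
    print(f"{name}: mean rating {mean(scores):.2f} over {len(scores)} samples")
```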
-
Curriculum Learning Boosts Model Training From 2K to 8K Tokens
To train this model, we use curriculum learning, gradually increasing the token lengths trained on from 2K to 8K. We additionally train on an instruction-tuning dataset sampled from various popular sources and curated in-house. (4/10)
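A minimal sketch of what such a sequence-length curriculum can look like; pack_to_length and train_steps are assumed helpers, and the stage lengths and step counts are illustrative, not the actual schedule.

```python
# Hypothetical sketch of a sequence-length curriculum: train in stages,
# raising the maximum token length from 2K to the 8K target.
CURRICULUM = [
    (2048, 10_000),  # (max sequence length, training steps) -- illustrative
    (4096, 5_000),
    (8192, 5_000),
]

def train_with_curriculum(model, corpus):
    for seq_len, steps in CURRICULUM:
        # Re-pack the corpus so each example fits the current stage length
        # (pack_to_length and train_steps are assumed helpers, not real APIs).
        staged = pack_to_length(corpus, seq_len)
        train_steps(model, staged, seq_len=seq_len, steps=steps)
```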
-
Instruction List Technique Enhances Long Sequence Attention
Second, we use a technique we developed called instruction list, which synthetically creates tasks that encourage more long sequence attention and instruction following. (6/10)
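The exact construction has not been published here, but one plausible reading is sketched below: pack several independent instructions into a single numbered prompt and require answers in order, so the model must attend across the whole sequence. Function and field names are illustrative.

```python
import random

def make_instruction_list_example(pool, k=8, seed=0):
    """Pack k independent (instruction, answer) pairs into one long,
    numbered prompt that must be answered in order -- a guess at the
    'instruction list' idea, not the exact recipe."""
    rng = random.Random(seed)
    items = rng.sample(pool, k)
    prompt = "Answer each numbered instruction in order.\n" + "\n".join(
        f"{i + 1}. {instruction}" for i, (instruction, _) in enumerate(items)
    )
    target = "\n".join(f"{i + 1}. {answer}" for i, (_, answer) in enumerate(items))
    return {"prompt": prompt, "target": target}
```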
-
13B Parameter Long Sequence Model Achieves Competitive Accuracy
Using this recipe, we are able to train a competitive long sequence model at the 13B parameter scale. We achieve 2-12 points better accuracy across a wide variety of long sequence tasks from SCROLLS and ZeroSCROLLS. (7/10)
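As a pointer for reproducing that kind of measurement, a sketch of pulling one SCROLLS task with the Hugging Face datasets library follows; this is not the evaluation harness used here, and the dataset and config names assume the public "tau/scrolls" release.

```python
from datasets import load_dataset

# Illustrative only: fetch the GovReport task from the public SCROLLS
# release on the Hugging Face Hub and inspect one long-input example.
scrolls = load_dataset("tau/scrolls", "gov_report", split="validation")
example = scrolls[0]
print(example["input"][:500])   # long source document
print(example["output"])        # reference summary
```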
-
Dataset Curation for Long Sequence Instruction Following
To curate a dataset that encourages long sequence instruction following, we use two techniques. First, we find tasks that truly benefit from longer sequences and add them to our instruction-tuning datasets. (5/10)
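A minimal sketch of that first filtering step, assuming the model's own tokenizer and an illustrative 4,096-token cutoff; the real selection criteria were not published.

```python
from transformers import AutoTokenizer

# Keep only candidate examples long enough that an 8K context window
# actually matters; the cutoff and tokenizer choice are assumptions.
tokenizer = AutoTokenizer.from_pretrained("sambanovasystems/SN-13B-8k-Instruct")

def needs_long_context(example, min_tokens=4096):
    return len(tokenizer(example["input"])["input_ids"]) >= min_tokens

candidate_tasks = []  # {"input": ..., "output": ...} examples from public sources
long_tasks = [ex for ex in candidate_tasks if needs_long_context(ex)]
```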