#highflyer

Habr<p>HAI LLM: How did DeepSeek cut training and generation costs severalfold without losing quality?</p><p>HighFlyer built architectural features such as Multi-Head Latent Attention, Mixture of Experts (MoE) with Auxiliary-Loss-Free Load Balancing, and Multi-Token Prediction into its LLM. However, all of these innovations had already appeared in other LLMs: GPT-4, Llama, Mistral, and others. Leafing through the HighFlyer white paper, you can come across a description of its own non-public training framework, HAI LLM, which uses genuinely new techniques that allow significant savings on model training. It seems to me that one of DeepSeek's main innovations lies in this framework, and that is what I would like to discuss below. Enjoy the read!</p><p><a href="https://habr.com/ru/companies/bothub/articles/878742/" target="_blank" rel="nofollow noopener noreferrer" translate="no"><span class="invisible">https://</span><span class="ellipsis">habr.com/ru/companies/bothub/a</span><span class="invisible">rticles/878742/</span></a></p><p><a href="https://zhub.link/tags/deepseek" class="mention hashtag" rel="tag">#<span>deepseek</span></a> <a href="https://zhub.link/tags/hai_llm" class="mention hashtag" rel="tag">#<span>hai_llm</span></a> <a href="https://zhub.link/tags/HighFlyer" class="mention hashtag" rel="tag">#<span>HighFlyer</span></a> <a href="https://zhub.link/tags/llm" class="mention hashtag" rel="tag">#<span>llm</span></a> <a href="https://zhub.link/tags/%D0%B8%D0%B8" class="mention hashtag" rel="tag">#<span>ии</span></a> <a href="https://zhub.link/tags/%D0%B8%D0%B8_%D0%B8_%D0%BC%D0%B0%D1%88%D0%B8%D0%BD%D0%BD%D0%BE%D0%B5_%D0%BE%D0%B1%D1%83%D1%87%D0%B5%D0%BD%D0%B8%D0%B5" class="mention hashtag" rel="tag">#<span>ии_и_машинное_обучение</span></a> <a href="https://zhub.link/tags/deepseek_v3" class="mention hashtag" rel="tag">#<span>deepseek_v3</span></a></p>
Nonilex<p><a href="https://masto.ai/tags/DeepSeek" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>DeepSeek</span></a> is a <a href="https://masto.ai/tags/China" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>China</span></a>-based start-up that last week launched a free <a href="https://masto.ai/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> assistant that it says can operate at a lower cost than American AI models like <a href="https://masto.ai/tags/ChatGPT" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ChatGPT</span></a>. The company was founded in 2023 by <a href="https://masto.ai/tags/LiangWenfeng" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>LiangWenfeng</span></a>, co-founder of the hedge fund <a href="https://masto.ai/tags/HighFlyer" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>HighFlyer</span></a>. By Monday it had rocketed to the top of downloads on the Apple Store.<br>
DeepSeek has shaken the market because it purports to need fewer &amp; less advanced chips than other <a href="https://masto.ai/tags/ArtificialIntelligence" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ArtificialIntelligence</span></a> models, while still performing as well as US rivals….</p>