
Kanana

Kanana: Compute-efficient Bilingual Language Models


<p align="center"> <br> <picture> <source media="(prefers-color-scheme: dark)" srcset="assets/logo/kanana-logo-light.png"> <source media="(prefers-color-scheme: light)" srcset="assets/logo/kanana-logo-dark.png"> <img alt="Kanana Logo" src="assets/logo/kanana-logo-light.png" width="400" style="margin: 40px auto;"> </picture> <br> </p> <p align="center"> 🤗 <a href="https://kko.kakao.com/kananallm">Kanana-1.5 HF Models</a> &nbsp;|&nbsp; 📕 <a href="https://tech.kakao.com/search?q=kanana">Blogs</a> &nbsp; <br> </p>


Kanana 1.5

Kanana 1.5, the latest version of the Kanana model family, substantially improves coding, mathematics, and function calling over the previous version, enabling application to more complex real-world problems. It natively handles contexts of up to 32K tokens, and up to 128K tokens with YaRN, so the model stays coherent across long documents and extended conversations. A refined post-training process also makes conversations more natural and accurate.
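As a rough illustration of the 32K to 128K extension, the sketch below builds a YaRN `rope_scaling` entry in the style used by Hugging Face `transformers` configs. The key names follow that convention, the helper `yarn_rope_scaling` is hypothetical, and the factor is simply the ratio of the two context lengths. The values Kanana actually recommends are not stated here, so check the model card before relying on them.

```python
# Minimal sketch: derive a YaRN scaling factor from the two context lengths.
# Assumption: "32K" and "128K" mean 32 * 1024 and 128 * 1024 tokens.
NATIVE_CONTEXT = 32 * 1024      # handled natively
EXTENDED_CONTEXT = 128 * 1024   # reachable with YaRN


def yarn_rope_scaling(native_ctx: int, target_ctx: int) -> dict:
    """Build a Hugging Face-style `rope_scaling` dict for YaRN (illustrative only)."""
    if target_ctx <= native_ctx:
        raise ValueError("YaRN scaling is only needed when target_ctx > native_ctx")
    return {
        "rope_type": "yarn",
        "factor": target_ctx / native_ctx,  # ratio of extended to native length
        "original_max_position_embeddings": native_ctx,
    }


# Hypothetical patch to merge into a model config before loading.
config_patch = {
    "rope_scaling": yarn_rope_scaling(NATIVE_CONTEXT, EXTENDED_CONTEXT),
    "max_position_embeddings": EXTENDED_CONTEXT,
}
print(config_patch)
```

With these assumptions the factor comes out to 4.0. A released checkpoint may use a slightly different factor or extra YaRN parameters.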

✨ Updated Kanana-1.5-15.7B-A3B

Kanana-1.5-15.7B-A3B is the first Mixture-of-Experts (MoE) model in the Kanana family. Its sparse architecture delivers capabilities comparable to the Kanana-1.5-8B dense model while using only 37% of the FLOPs per token, making it inference-efficient and cost-effective for real-world applications. It is trained with our newly enhanced post-training strategy, which applies on-policy distillation followed by reinforcement learning.
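The 37% figure can be sanity-checked with a back-of-the-envelope estimate. The sketch below assumes that "A3B" means roughly 3B active parameters per token, that the dense model has exactly 8B parameters, and that forward-pass cost is about 2 FLOPs per active parameter per token. It ignores attention and routing overhead, so it is only an approximation of whatever accounting the authors used.

```python
def forward_flops_per_token(active_params: float) -> float:
    """Approximate forward-pass FLOPs per token: one multiply and one add per active weight."""
    return 2.0 * active_params


DENSE_ACTIVE = 8e9   # Kanana-1.5-8B: every parameter is used for each token (assumed 8B)
MOE_ACTIVE = 3e9     # Kanana-1.5-15.7B-A3B: "A3B" read as ~3B active parameters (assumption)
MOE_TOTAL = 15.7e9   # total parameters, from the model name

flops_ratio = forward_flops_per_token(MOE_ACTIVE) / forward_flops_per_token(DENSE_ACTIVE)
active_share = MOE_ACTIVE / MOE_TOTAL

print(f"MoE FLOPs per token relative to dense 8B: {flops_ratio:.1%}")
print(f"Share of MoE parameters active per token: {active_share:.1%}")
```

Under these assumptions the ratio is 3/8 = 37.5%, which is in line with the stated 37%.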

<!-- <p align="center"> <picture> <img src="assets/performance/kanana-1.5-radar.png", width="700" style="margin: 40px auto;"> </picture> -->

[!Note] Neither the pre-training nor the post-training data includes Kakao user data.

Performance

Base Model Evaluation

<table>
<tr> <th>Models</th> <th>MMLU</th> <th>KMMLU</th> <th>HAE-RAE</th> <th>HumanEval</th> <th>MBPP</th> <th>GSM8K</th> </tr>
<tr> <td><strong>Kanana-Flag-1.5-32.5B</strong></td> <td align="center">76.76</td> <td align="center">61.90</td> <td align="center">89.18</td> <td align="center">73.17</td> <td align="center">65.60</td> <td align="center">81.50</td> </tr>
<tr> <td><strong>Kanana-Flag-32.5B</strong></td> <td align="center">77.68</td> <td align="center">62.10</td> <td align="center">90.47</td> <td align="center">51.22</td> <td align="center">63.40</td> <td align="center">70.05</td> </tr>
<tr> <td><strong>Kanana-Essence-1.5-9.8B</strong></td> <td align="center">68.27</td> <td align="center">52.78</td> <td align="center">86.34</td> <td align="center">64.63</td> <td align="center">61.60</td> <td align="center">71.57</td> </tr>
<tr> <td><strong>Kanana-Essence-9.8B</strong></td> <td align="center">67.61</td> <td align="center">50.57</td> <td align="center">84.97</td> <td align="center">40.24</td> <td align="center">53.60</td> <td align="center">63.61</td> </tr>
<!-- <tr> <td><strong>Kanana-1.5-8B</strong></td> <td align="center">64.24</td> <td align="center">48.94</td> <td align="center">82.77</td> <td align="center">61.59</td> <td align="center">57.80</td> <td align="center">63.53</td> </tr>
<tr> <td><strong>Kanana-8B</strong></td> <td align="center">64.22</td> <td align="center">48.30</td> <td align="center">83.41</td> <td align="center">40.24</td> <td align="center">51.40</td> <td align="center">57.09</td> </tr> -->
<tr> <td><strong>Kanana-Nano-1.5-3B</strong></td> <td align="center">59.23</td> <td align="center">47.30</td> <td align="center">78.00</td> <td align="center">46.34</td> <td align="center">46.80</td> <td align="center">61.79</td> </tr>
<!-- <tr> <td><strong>Kanana-1.5-2.1B</strong></td> <td align="center">56.30</td> <td align="center">45.10</td> <td align="center">77.46</td> <td align="center">52.44</td> <td align="center">47.00</td> <td align="center">55.95</td> </tr> -->
<tr> <td><strong>Kanana-Nano-2.1B</strong></td> <td align="center">54.83</td> <td align="center">44.80</td> <td align="center">77.09</td> <td align="center">31.10</td> <td align="center">46.20</td> <td align="center">46.32</td> </tr>
</table>

Open-source Base Model Evaluation

<table>
<tr> <th>Models</th> <th>MMLU</th> <th>KMMLU</th> <th>HAE-RAE</th> <th>HumanEval</th> <th>MBPP</th> <th>GSM8K</th> </tr>
<tr> <td><strong>Kanana-1.5-15.7B-A3B (<em>New!</em>)</strong></td> <td align="center">64.79</td> <td align="center">51.77</td> <td align="center">83.23</td> <td align="center">59.76</td> <td align="center">60.10</td> <td align="center">61.18</td> </tr>
<tr> <td><strong>Kanana-1.5-8B</strong></td> <td align="center">64.24</td> <td align="center">48.94</td> <td align="center">82.77</td> <td align="center">61.59</td> <td align="center">57.80</td> <td align="center">63.53</td> </tr>
<tr> <td><strong>Kanana-8B*</strong></td> <td align="center">64.22</td> <td align="center">48.30</td> <td align="center">83.41</td> <td align="center">40.24</td> <td align="center">51.40</td> <td align="center">57.09</td> </tr>
<tr> <td><strong>Kanana-1.5-2.1B</strong></td> <td align="center">56.30</td> <td align="center">45.10</td> <td align="center">77.46</td> <td align="center">52.44</td> <td align="center">47.00</td> <td align="center">55.95</td> </tr>
<tr> <td><strong>Kanana-Nano-2.1B</strong></td> <td align="center">54.83</td> <td align="center">44.80</td> <td align="center">77.09</td> <td align="center">31.10</td> <td align="center">46.20</td> <td align="center">46.32</td> </tr>
</table>

* This model is not open-sourced; it is listed only for comparison with Kanana-1.5-8B.
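To make the MoE versus dense comparison concrete, the sketch below computes per-benchmark score differences between Kanana-1.5-15.7B-A3B and Kanana-1.5-8B. The scores are copied from the open-source base model table above; the variable names and the unweighted mean are illustrative choices, not an official metric.

```python
# Base-model scores copied from the open-source evaluation table.
KANANA_1_5_MOE = {  # Kanana-1.5-15.7B-A3B
    "MMLU": 64.79, "KMMLU": 51.77, "HAE-RAE": 83.23,
    "HumanEval": 59.76, "MBPP": 60.10, "GSM8K": 61.18,
}
KANANA_1_5_8B = {   # Kanana-1.5-8B (dense)
    "MMLU": 64.24, "KMMLU": 48.94, "HAE-RAE": 82.77,
    "HumanEval": 61.59, "MBPP": 57.80, "GSM8K": 63.53,
}

# Positive delta: the MoE model scores higher on that benchmark.
deltas = {
    name: round(KANANA_1_5_MOE[name] - KANANA_1_5_8B[name], 2)
    for name in KANANA_1_5_MOE
}
# Unweighted mean across benchmarks (an illustrative summary only).
mean_delta = round(sum(deltas.values()) / len(deltas), 2)

for name, delta in deltas.items():
    print(f"{name:<10} {delta:+.2f}")
print(f"{'Mean':<10} {mean_delta:+.2f}")
```

The MoE model is ahead on MMLU, KMMLU, HAE-RAE, and MBPP and behind on HumanEval and GSM8K, with every gap under three points.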

<br>

Instruct Model Evaluation

<table>
<tr> <th>Models</th> <th>MT-Bench</th> <th>KoMT-Bench</th> <th>IFEval</th> <th>HumanEval+</th> <th>MBPP+</th> <th>GSM8K (0-shot)</th> <th>MATH</th> <th>MMLU (0-shot, CoT)</th> <th>KMMLU (0-shot, CoT)</th> <th>FunctionChatBench</th> </tr>
<tr> <td><strong>Kanana-Flag-1.5-32.5B</strong></td> <td align="center">8.13</td> <td align="center">8.12</td> <td align="center">82.70</td> <td align="center">79.88</td> <td align="center">71.96</td> <td align="center">93.03</td> <td align="center">75.96</td> <td align="center">82.76</td> <td align="center">64.10</td> <td align="center">67.17</td> </tr>
<tr> <td><strong>Kanana-Flag-32.5B</strong></td> <td align="center">8.33</td> <td align="center">8.03</td> <td align="center">84.59</td> <td align="center">78.66</td> <td align="center">69.84</td> <td align="center">91.66</td> <td align="center">58.08</td> <td align="center">81.08</td> <td align="center">64.19</td> <td align="center">65.67</td> </tr>
<tr> <td><strong>Kanana-Essence-1.5-9.8B</strong></td> <td align="center">7.88</td> <td align="center">7.35</td> <td align="center">76.34</td> <td align="center">72.56</td> <td align="center">66.93</td> <td align="center">90.07</td> <td align="center">62.02</td> <td align="center">72.85</td> <td align="center">52.00</td> <td align="center">51.43</td> </tr>
<tr> <td><strong>Kanana-E