#OpenSource #LLM Teuken-7B has been officially launched and is available for download. This milestone is part of the OpenGPT-X initiative, dedicated to the creation of large AI language models “Made in Germany” 🇩🇪, designed for both business and research needs. Even with limited resources (just 18% of the computing power used to train models like Meta’s Llama3 8B), Teuken-7B offers some interesting features.
Here’s what makes Teuken-7B stand out:
Key Features:
• Multilingual & Open Source: Built to support all 24 official EU languages, emphasizing European linguistic diversity.
• Trustworthy & Versatile: Tailored to Europe’s wide range of cultures and businesses, with a strong focus on openness and community collaboration.
Technical Innovations:
• Custom Multilingual Tokenizer: Specifically optimized for European languages for enhanced efficiency and performance. It might also help with including other languages in LLMs.
• Efficient Training: Developed using just 18% of the computing power required for models such as Meta’s Llama3 8B, and less than 1% of that required for larger models such as Llama3 405B. This makes it an interesting approach for resource-constrained environments and for lowering the energy consumed in training.
Training Data:
• Over 50% non-English content, including training content in languages like Maltese.
• Its own benchmarks for multilingualism and for achieving comparable output quality in all supported languages.
With Teuken-7B, it seems that Europe is showing some degree of resilience in the geostrategic competition for influence in the emerging AI market.
For more details, check out the project: Teuken-7B: opengpt-x.de/en/models…
#AI #OpenSource #Teuken7B #Innovation #MultilingualAI #OpenGPTX #MachineLearning #EuropeanTech
