⚡ TLDRs on Google’s Gemma, spoiler: don’t use Gemma 7B (yet?)

👞 Non-technical TLDR

  • Google “open weighted” 2 models, Gemma-2B and Gemma-7B, direct competitors of Microsoft’s Phi-2 and Mistral AI’s Mistral-7B respectively

  • They report that their 7B model is better on most benchmarks than Mistral-7B (the current leader for this “size class”)

  • The community (and I) see the contrary so far, but it wouldn’t be the first time that implementation errors hurt performance; it has only been 4 days, let’s give it some time

  • Google created a Gemma tuning competition on Kaggle, awarding $10k per winner: they are pushing adoption

  • If Gemma is indeed worse, it could just be seen as a move by Google to get a seat at the “LLM open sourcers” table without contributing significant value

🔬 Technical TLDR

  • Google released weights for Gemma-2B & Gemma-7B + a technical report + a standalone C++ inference engine implementation

  • 2B trained on 2T tokens, 7B trained on 6T tokens. 6T is a lot but rumours say Mistral-7B was trained on 8T

  • Both models are now available in llama.cpp and Ollama (minimal run sketch after this list)

  • Vocabulary 8x larger than Llama 2’s (256k vs 32k)

  • Google observes better performance with Gemma-7B than with Mistral-7B; the community (me included) observes the contrary so far

  • I tested it using llama.cpp; it would not be the first time that implementation errors degrade performance, so it’s maybe too early to judge, let’s see

  • Google introduced yet another chat template (check the attached screenshot; formatting sketch after this list)

  • They kinda lie about the parameter count: Mistral-7B has 7.2B parameters, Gemma-7B has 8.5B (verification sketch after this list)

  • You can try Gemma on Hugging Face Chat & Perplexity: link in the comments

  • Waiting for Chatbot Arena benchmarks
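
Since both checkpoints landed in llama.cpp, here is a minimal sketch of how I’d run the instruct variant through the llama-cpp-python bindings. The GGUF file name is a placeholder (use whichever quant you downloaded), and whether the prompt’s control tokens get parsed as special tokens depends on the bindings version, exactly the kind of implementation detail that can sink benchmark numbers.

```python
# Minimal sketch: run a Gemma GGUF through llama.cpp's Python bindings.
# Assumptions: llama-cpp-python is installed and a Gemma GGUF file exists
# locally; the file name below is a placeholder, not an official artifact.
from llama_cpp import Llama

llm = Llama(
    model_path="gemma-7b-it.Q4_K_M.gguf",  # placeholder: point this at your quant
    n_ctx=4096,                            # context window to allocate
)

# Plain completion call with a hand-formatted Gemma chat prompt (see next sketch).
out = llm(
    "<start_of_turn>user\nWhy is the sky blue?<end_of_turn>\n<start_of_turn>model\n",
    max_tokens=128,
    stop=["<end_of_turn>"],                # stop at Gemma's end-of-turn marker
)
print(out["choices"][0]["text"])
```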
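
About the new chat template: the instruct checkpoints expect each turn wrapped in <start_of_turn> / <end_of_turn> markers with a role name (user or model) right after the opening tag. Here is a minimal formatting sketch for a single-turn prompt; the control tokens are taken from Gemma’s published format, double-check them against the screenshot or the tokenizer config before relying on them.

```python
def format_gemma_prompt(user_message: str) -> str:
    """Wrap a single user message in Gemma's instruct chat template.

    The prompt deliberately ends with an open 'model' turn: the model is
    expected to complete it and emit <end_of_turn> when it is done.
    """
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

print(format_gemma_prompt("Why is the sky blue?"))
```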
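
On the parameter count (and the 256k vocabulary), it is easy to verify yourself with transformers, assuming you have accepted the license for the gated google/gemma-7b repo and have enough RAM to load the full weights; a rough sketch:

```python
# Rough sketch: check Gemma-7B's actual parameter count and vocabulary size.
# Assumes access to the gated "google/gemma-7b" repo on Hugging Face.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b")
model = AutoModelForCausalLM.from_pretrained("google/gemma-7b")

n_params = sum(p.numel() for p in model.parameters())
print(f"vocab size: {tokenizer.vocab_size}")   # ~256k, vs 32k for Llama 2
print(f"parameters: {n_params / 1e9:.2f}B")    # closer to 8.5B than 7B
```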

Links in the comments