Gemma: Google's lightweight open-model challenger to Llama

Building on the work behind its Gemini models, Google has introduced the Gemma family of open models. Gemma is available in two sizes, 2B and 7B, each with pre-trained and instruction-tuned variants. According to Google, Gemma outperforms Meta's LLM, Llama 2, across key benchmarks. For example, Gemma's 7-billion-parameter model surpassed Llama 2 in areas such as reasoning and mathematics, with an overall accuracy of 64.3%.

Users can start working with Gemma today through the free tier of Colab notebooks and free access on Kaggle. Additionally, first-time Google Cloud users can take advantage of $300 in credits, and researchers can apply for Google Cloud credits of up to $500,000 to accelerate their work. Gemma is available in two variants: Gemma 2B, with 2 billion parameters, and Gemma 7B, with 7 billion parameters. Each size is released in pre-trained and instruction-tuned versions. Gemma also comes with a new Responsible Generative AI Toolkit, which provides guidance and tools for building safer AI applications.

Other features include:

- Toolchains for inference and supervised fine-tuning (SFT) across all major frameworks (JAX, PyTorch, and TensorFlow) through native Keras 3.0.
- Ready-to-use Colab and Kaggle notebooks, along with integration with popular tools such as Hugging Face, MaxText, NVIDIA NeMo, and TensorRT-LLM, which make it easy to get started.
- Pre-trained and instruction-tuned Gemma models that run on a laptop, a desktop, or Google Cloud, with easy deployment to Vertex AI and Google Kubernetes Engine (GKE).
- Optimizations across multiple AI hardware platforms, including Google Cloud TPUs and NVIDIA GPUs, which Google says deliver industry-leading performance.
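As a rough illustration of the Hugging Face integration listed above, the following Python sketch loads a Gemma checkpoint with the transformers library. The model IDs (for example `google/gemma-2b`) follow the naming used on the Hugging Face Hub and are an assumption here, as are the hypothetical helper names `gemma_model_id` and `complete`. Downloading the weights requires network access and acceptance of Gemma's terms on the Hub, so treat this as a sketch under those assumptions rather than an official recipe.

```python
"""Sketch: loading a Gemma checkpoint through Hugging Face transformers."""

SIZES = ("2b", "7b")


def gemma_model_id(size: str = "2b", instruction_tuned: bool = False) -> str:
    """Build a Hub model ID such as 'google/gemma-7b-it' (assumed naming)."""
    if size not in SIZES:
        raise ValueError(f"size must be one of {SIZES}, got {size!r}")
    suffix = "-it" if instruction_tuned else ""
    return f"google/gemma-{size}{suffix}"


def complete(prompt: str, size: str = "2b", max_new_tokens: int = 64) -> str:
    """Download the pre-trained model and continue a prompt.

    Requires the transformers and torch packages plus Hub access.
    """
    # Imported lazily so gemma_model_id works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = gemma_model_id(size)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Calling `complete("The capital of France is")` would fetch the 2B pre-trained weights on first use; the 7B model needs considerably more memory.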

The terms of use permit responsible commercial use and distribution by organizations of any size. Several benchmarks, including MMLU, HellaSwag, and HumanEval, show Gemma performing better than Llama 2. Gemma supports Keras 3.0 natively and can be used with TensorFlow, PyTorch, and JAX, which makes it broadly compatible. The release also provides pre-built notebooks for Kaggle and Colab, as well as integrations with widely used tools such as Hugging Face, MaxText, NVIDIA NeMo, and TensorRT-LLM.
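To make the multi-framework point concrete, here is a minimal Python sketch of how a Keras 3 backend might be chosen before loading Gemma through KerasNLP. Keras 3 reads the `KERAS_BACKEND` environment variable when it is first imported. The helper names `select_backend` and `load_gemma` are hypothetical, the preset name `gemma_2b_en` is assumed from the KerasNLP presets, and loading the weights requires the keras-nlp package and Kaggle access.

```python
"""Sketch: choosing a Keras 3 backend before loading Gemma via KerasNLP."""
import os

SUPPORTED_BACKENDS = ("jax", "torch", "tensorflow")


def select_backend(name: str) -> str:
    """Set KERAS_BACKEND; this must run before keras is first imported."""
    if name not in SUPPORTED_BACKENDS:
        raise ValueError(f"backend must be one of {SUPPORTED_BACKENDS}, got {name!r}")
    os.environ["KERAS_BACKEND"] = name
    return name


def load_gemma(preset: str = "gemma_2b_en"):
    """Load a Gemma causal language model from a KerasNLP preset."""
    # Imported lazily so the backend chosen above is the one Keras picks up.
    import keras_nlp

    return keras_nlp.models.GemmaCausalLM.from_preset(preset)


# Example usage (downloads weights, so it is left commented out):
# select_backend("jax")
# gemma_lm = load_gemma()
# print(gemma_lm.generate("What is Keras?", max_length=64))
```

The same script should run on another backend by changing only the argument to `select_backend`, which is the practical meaning of native Keras 3.0 support.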

Gemma models are optimized for NVIDIA GPUs and Google Cloud TPUs, where Google says they deliver industry-leading performance, but they can also run on a variety of other platforms, including workstations, laptops, and Google Cloud. The release follows Google's recent introduction of Gemini 1.5, which boasts a 1-million-token context window, the largest of any NLP model to date. By comparison, the context windows of GPT-4 Turbo and Claude 2.1 are 128K and 200K tokens, respectively.
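For a sense of scale, the context window figures quoted above can be compared with a few lines of Python. This is simple arithmetic on the numbers in the text, assuming "K" means 1,000 tokens; the function name `relative_to` is hypothetical.

```python
# Context window sizes in tokens, as quoted in the article (K taken as 1,000).
CONTEXT_WINDOWS = {
    "Gemini 1.5": 1_000_000,
    "Claude 2.1": 200_000,
    "GPT-4 Turbo": 128_000,
}


def relative_to(baseline: str, windows: dict = CONTEXT_WINDOWS) -> dict:
    """Return how many times larger the baseline window is than each other model's."""
    base = windows[baseline]
    return {name: base / size for name, size in windows.items() if name != baseline}


ratios = relative_to("Gemini 1.5")
for name, ratio in ratios.items():
    print(f"Gemini 1.5's window is {ratio:.1f}x that of {name}")
```

On these figures, Gemini 1.5's window is 5 times that of Claude 2.1 and roughly 7.8 times that of GPT-4 Turbo.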

