Cellm Models Hub

This repository contains a collection of optimized Large Language Models (LLMs) in the .cellm format. The models are tuned for high-performance inference with the Cellm engine, which provides Metal-accelerated kernels and loads weights through memory mapping.

Models

Gemma 3 1B IT (Int8)

  • Path: gemma-3-1b-it-int8/gemma-3-1b-it-int8.cellmd
  • Size: 1.3 GB
  • Type: Quantized Int8 (Symmetric Weight-Only)
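
To illustrate what "Symmetric Weight-Only" Int8 quantization generally means, here is a minimal Rust sketch. It assumes a single per-tensor scale with the zero-point fixed at 0. The function names are hypothetical and are not part of Cellm's API, and the actual .cellm file may use a different granularity (for example per-channel scales). "Weight-only" means only the stored weights are quantized; activations stay in floating point.

```rust
/// Symmetric int8 quantization with one scale for the whole tensor.
/// Hypothetical helper for illustration only; not Cellm's API.
pub fn quantize_symmetric_i8(weights: &[f32]) -> (Vec<i8>, f32) {
    // The largest magnitude is mapped to +/-127; the zero-point is always 0.
    let max_abs = weights.iter().fold(0.0_f32, |m, &w| m.max(w.abs()));
    // Guard against an all-zero tensor so we never divide by zero.
    let scale = if max_abs > 0.0 { max_abs / 127.0 } else { 1.0 };
    let quantized = weights
        .iter()
        .map(|&w| (w / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    (quantized, scale)
}

/// Recover approximate f32 weights from the int8 values and the scale.
pub fn dequantize_i8(quantized: &[i8], scale: f32) -> Vec<f32> {
    quantized.iter().map(|&q| q as f32 * scale).collect()
}
```

Under these assumptions each weight occupies one byte plus a shared scale, and the round-trip error per weight is at most about half the scale.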

Gemma 4 2.3B IT (LiteRT)

  • Path: gemma-4-2p3b-it-litert/gemma-4-2p3b-it-litert.cellm
  • Size: 2.4 GB
  • Type: LiteRT-optimized for Cellm

Usage

You can use these models directly with the Cellm CLI or in your applications:

cellm run --model-path jeffasante/cellm-models/gemma-4-2p3b-it-litert/gemma-4-2p3b-it-litert.cellm
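
For use from an application, one simple option is to invoke the same CLI command from Rust. The sketch below assumes a `cellm` binary is available on PATH and uses only the `run` subcommand and `--model-path` flag shown above. The helper name is hypothetical, and any library-level Cellm API is not covered here.

```rust
use std::process::Command;

/// Build (without running) the `cellm run` invocation for a given model path.
/// Hypothetical helper; assumes a `cellm` executable is on PATH.
pub fn cellm_run_command(model_path: &str) -> Command {
    let mut cmd = Command::new("cellm");
    cmd.arg("run").arg("--model-path").arg(model_path);
    cmd
}
```

Calling `.status()` on the returned `Command` starts the process and waits for it to exit; it returns an error if the binary cannot be found.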

About Cellm

Cellm is a high-performance inference engine for local LLMs, written in Rust with a focus on Metal GPU acceleration and minimal memory overhead.

License

These models are subject to the Gemma Terms of Use. By downloading or using these weights, you agree to the terms listed at ai.google.dev/gemma/terms.
