Cellm Models Hub

This repository contains a collection of optimized Large Language Models (LLMs) in the .cellm format. These models are specifically tuned for high-performance inference using the Cellm engine, featuring Metal-accelerated kernels and memory-mapped efficiency.

Models

Gemma 3 1B IT (Int8)

Path: gemma-3-1b-it-int8/gemma-3-1b-it-int8.cellmd
Size: 1.3 GB
Type: Quantized Int8 (Symmetric Weight-Only)

Gemma 4 2.3B IT (LiteRT)

Path: gemma-4-2p3b-it-litert/gemma-4-2p3b-it-litert.cellm
Size: 2.4 GB
Type: LiteRT-optimized for Cellm

Usage

You can use these models directly with the Cellm CLI or in your applications:

cellm run --model-path jeffasante/cellm-models/gemma-4-2p3b-it-litert/gemma-4-2p3b-it-litert.cellm

About Cellm

Cellm is a high-performance inference engine for local LLMs, written in Rust with a focus on Metal GPU acceleration and minimal memory overhead.

License

This model is subject to the Gemma Terms of Use. By downloading or using these weights, you agree to the terms listed at ai.google.dev/gemma/terms.

Downloads last month: -; Downloads are not tracked for this model. How to track

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support