The 1.7B LM models are incorrectly converted

#1
by concedo - opened

They are missing a .model prefix across all their tensor names, causing a load failure

Yes :) solved. It's a typo from the upstream .safetensors — they likely serialized the inner model instead of the wrapper, so every tensor name lacks the `model.` prefix.

diff --git a/convert.py b/convert.py
index 72b0a47..ba8816b 100755
--- a/convert.py
+++ b/convert.py
@@ -117,6 +117,11 @@ def add_tensors_from_sf(w, sf_path, tag):

     for name in names:
         info = meta[name]
+
+        # normalize: some upstream checkpoints omit the "model." prefix
+        if not name.startswith("model."):
+            name = "model." + name
+
         dtype_str = info["dtype"]
         shape = info["shape"]
         off0, off1 = info["data_offsets"]
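The same normalization can also be applied offline to a checkpoint's tensor map before conversion. A minimal sketch (checkpoint loading/saving is elided; the tensor names and `normalize_names` helper are illustrative, not part of `convert.py`):

```python
def normalize_names(tensors):
    """Prefix each tensor name with "model." when the upstream
    checkpoint serialized the inner model instead of the wrapper."""
    return {
        name if name.startswith("model.") else "model." + name: t
        for name, t in tensors.items()
    }

# Illustrative checkpoint: one bare name, one already-prefixed name.
ckpt = {"layers.0.attn.w": b"...", "model.embed.w": b"..."}
fixed = normalize_names(ckpt)
print(sorted(fixed))  # ['model.embed.w', 'model.layers.0.attn.w']
```

Note the `startswith` guard: it keeps the fix idempotent, so checkpoints that already carry the prefix pass through unchanged.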
