The 1.7B LM models are incorrectly converted
#1
by concedo - opened
They are missing the `model.` prefix across all their tensor names, causing a load failure.
Yes :) Solved. It's a typo in the upstream .safetensors files: they likely serialized the inner model instead of the wrapper.
diff --git a/convert.py b/convert.py
index 72b0a47..ba8816b 100755
--- a/convert.py
+++ b/convert.py
@@ -117,6 +117,11 @@ def add_tensors_from_sf(w, sf_path, tag):
for name in names:
info = meta[name]
+
+ # normalize: some upstream checkpoints omit the "model." prefix
+ if not name.startswith("model."):
+ name = "model." + name
+
dtype_str = info["dtype"]
shape = info["shape"]
off0, off1 = info["data_offsets"]
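For reference, the renaming step in the patch can be shown in isolation. This is a minimal sketch, not the converter itself; the tensor names used here are illustrative:

```python
def normalize_name(name: str) -> str:
    # Some upstream checkpoints serialize the inner model, dropping the
    # "model." wrapper prefix; add it back so the loader finds the tensors.
    if not name.startswith("model."):
        name = "model." + name
    return name

# Example: one name missing the prefix, one already correct.
names = ["embed_tokens.weight", "model.norm.weight"]
print([normalize_name(n) for n in names])
# → ['model.embed_tokens.weight', 'model.norm.weight']
```

Note that the lookup `info = meta[name]` in the patch happens before the rename, so the metadata is still read under the original (unprefixed) key and only the emitted tensor name is normalized.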