Proposal: new chat_template_arg `enable_history_reasoning` for reusing prompt cache among querys within Agents .

#39
by Abioy - opened

Currently reasoning contents before the last user query msg will be ignored.
This might cause prompt cache miss, especially within agents (eg. Coding Agents / Deep Agents) that just calling tools many time before the last user query msg.
So, here I propose a new chat template arg enable_history_reasoning for (optionally) keep the history reasoning contents in the final prompt, and reusing prompt cache (better) in such cases.

Ready to merge
This branch is ready to get merged automatically.

Sign up or log in to comment