server/public_simplechat - basic builtin data store related tool calls added - use builtin browser/client side tool calling with minimal setup #16852
base: master
Enable streaming by default, to check the handshake before going on to change the code, given that I haven't looked into this for more than a year now and have been busy with totally different stuff. Also updated the user messages used for testing a bit.
Define the meta that needs to be passed to the GenAi Engine. Define the logic that implements the tool call, if called. Implement the flow/structure such that a single tool call implementation file can define multiple tool calls.
Make tooljs structure and flow more generic. Add a simple_calculator tool/function call logic. Add initial skeleton wrt the main tools.mjs file.
Changed latestResponse type to an object instead of a string. In turn it contains entries for content, toolname and toolargs. Added a custom clear logic due to the same and used it to replace the previous simple assigning of an empty string to latestResponse. For now, in all places where latestResponse is used, I have replaced it with latestResponse.content. Next need to handle identifying the field being streamed and in turn append to it. Also need to add logic to call the tool, when a tool_call is triggered by genai.
Update response_extract_stream to check which field is currently being streamed, i.e. whether it is normal content, tool call func name or tool call func args, and then return the field name and the extracted value. Previously it was always assumed that only normal content will be returned. Currently it is assumed that the server will stream only one of the 3 supported fields at any time and not more than one of them at the same time. TODO: Have to also add logic to extract the reasoning field later, i.e. wrt gen ai models which give out their thinking. Have updated append_response to expect both the key and the value wrt the latestResponse object, which it will manipulate. Previously it was always assumed that content is what will be got and in turn appended.
I was wrongly checking for finish_reason to be non null before trying to extract the genai content/toolcalls; have fixed this oversight with the new flow in progress. I had added a few debug logs to identify the above issue, need to remove them later. Note: debug logs are disabled by replacing the debug function during this program's initialisation, which I had forgotten about, so I didn't get the debug messages and had to scratch my head a bit before realising this and the other issue ;) Also, either when I had originally implemented simplechat 1+ years back, or later due to changes on the server end, the streaming flow sends an initial null wrt the content, where it only sets the role. This was not handled in my flow on the client side, so a null was getting prepended to the chat messages/responses from the server. This has been fixed now in the new generic flow.
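The field detection described in the two commits above can be sketched roughly as follows. This is only an illustration, assuming an OpenAI-compatible streaming chunk layout (`choices[0].delta` with `content` or `tool_calls[0].function`); the names `extractStreamField` and `appendResponse` and the returned `{ key, value }` shape are hypothetical and not the PR's actual code.

```javascript
// Sketch only: extractStreamField, appendResponse and the { key, value }
// shape are hypothetical; the chunk layout assumes an OpenAI-compatible
// streaming response.
function extractStreamField(chunk) {
    const delta = chunk?.choices?.[0]?.delta;
    if (!delta) {
        return null;
    }
    // Assumption from the commit message: only one of the three fields
    // is streamed in any given chunk.
    const toolFunc = delta.tool_calls?.[0]?.function;
    if (toolFunc?.name) {
        return { key: "toolname", value: toolFunc.name };
    }
    if (toolFunc?.arguments) {
        return { key: "toolargs", value: toolFunc.arguments };
    }
    // The first chunk may set only the role and carry content === null;
    // skipping it avoids prepending "null" to the response.
    if (typeof delta.content === "string" && delta.content !== "") {
        return { key: "content", value: delta.content };
    }
    return null;
}

// Append the extracted value to the matching field of the cached response.
function appendResponse(latestResponse, field) {
    if (field !== null) {
        latestResponse[field.key] += field.value;
    }
}
```

Note that the sketch does not look at finish_reason at all when extracting a field, which matches the fix described above.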
Make latestResponse into a new class based type instance wrt the ai assistant response, which is what it represents. Move clearing, appending of fields' values and getting the assistant's response info (irrespective of a content or toolcall response) into this new class and in turn use the same.
Switch the oneshot handler to use AssistantResponse; it currently handles only the normal content in the response. TODO: Any tool_calls in the oneshot response are currently not handled. In turn switch the generic/toplevel handle response logic to use the AssistantResponse class, given that both the oneshot and the multipart/streaming flows use/return it. In turn add a trimmedContent member to the AssistantResponse class and make the generic handle response logic save the trimmed content into this. Update users of trimmed to work with this structure.
As there could be a failure wrt getting the response from the ai server somewhere in between a long response spread over multiple parts, the logic uses latestResponse to cache the response as it is being received. However once the full response is received, one needs to transfer it to a new instance of the AssistantResponse class, so that latestResponse can be cleared, while the new instance can be used in other locations in the flow as needed. Achieve the same now.
Previously if content was empty, it would have always sent the toolcall info related version even if there was no toolcall info in it. Fixed now to return empty string, if both content and toolname are empty.
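A rough sketch of the response class described in the last few commits is given below. The class name `AssistantResponseSketch`, its method names and the tag layout used for the tool call text are all assumptions made for illustration; only the three fields (content, toolname, toolargs), the trimmedContent member and the empty-string behaviour come from the commit messages.

```javascript
// Sketch only: method names and the tool-call tag layout are assumptions,
// not the PR's actual AssistantResponse implementation.
class AssistantResponseSketch {
    constructor() {
        this.clear();
    }

    // Reset all fields, so the instance can cache the next response.
    clear() {
        this.response = { content: "", toolname: "", toolargs: "" };
        this.trimmedContent = "";
    }

    // Append a streamed value to the named field.
    append(key, value) {
        if (!(key in this.response)) {
            throw new Error(`unknown response field: ${key}`);
        }
        this.response[key] += value;
    }

    // Transfer a fully received response into a fresh instance, so the
    // cache instance can be cleared without losing the data.
    static copyOf(other) {
        const copy = new AssistantResponseSketch();
        copy.response = { ...other.response };
        copy.trimmedContent = other.trimmedContent;
        return copy;
    }

    // Content if any, else the tool call info, else an empty string.
    contentEquivalent() {
        if (this.response.content !== "") {
            return this.response.content;
        }
        if (this.response.toolname !== "") {
            const { toolname, toolargs } = this.response;
            return `<tool_call>\n<name>${toolname}</name>\n<args>${toolargs}</args>\n</tool_call>`;
        }
        return "";
    }
}
```

Copying before clearing keeps the streaming cache and the finished response independent, which is the point of the transfer step mentioned above.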
The implementations of javascript and simple_calculator now use provided helpers to trap console.log messages when they execute the code / expression provided by GenAi and in turn store the captured log messages in the newly added result key in tc_switch. This should help trap the output generated by the provided code or expression, as the case may be, and in turn return the same to the GenAi for its further processing.
Checks whether toolname is defined or not in the GenAi's response. If toolname is set, then check if a corresponding tool/func exists, and if so call the same by passing it the GenAi provided toolargs as an object. In turn the text generated by the tool/func is captured and put into the user input entry text box, with a tool_response tag around it.
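The lookup-and-call step above could look something like this. It is a minimal sketch: `toolTable`, `handleToolCall` and the `demo_add` tool are hypothetical stand-ins, and the exact layout inside the tool_response tag is an assumption.

```javascript
// Sketch only: toolTable, handleToolCall and demo_add are hypothetical.
const toolTable = new Map([
    // Stand-in tool; the PR's real tools are simple_calculator and
    // the javascript runner.
    ["demo_add", (args) => String(Number(args.a) + Number(args.b))],
]);

function handleToolCall(toolname, toolargsJson) {
    if (!toolname) {
        return null; // plain content response, nothing to call
    }
    const handler = toolTable.get(toolname);
    let text;
    if (handler === undefined) {
        text = `Unknown tool: ${toolname}`;
    } else {
        try {
            // The GenAi sends the arguments as a JSON string; the tool
            // receives them as an object.
            text = handler(JSON.parse(toolargsJson));
        } catch (err) {
            text = `Tool call failed: ${err.message}`;
        }
    }
    return `<tool_response>\n${text}\n</tool_response>`;
}
```

The returned string is what would be placed into the user input text box for review before it is submitted.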
The output generated by any tool/function call is currently placed into the TextArea provided for the end user (for their queries), because the GenAi (engine/LLM) may be expecting the tool response to be sent as user role data with a tool_response tag surrounding the results from the tool call. For the same reason, at the end of the submit btn click handling, the end user input text area is now not cleared if there was a tool call handled. Also, given that running a simple arithmetic expression in itself doesn't generate any output, wrap it in a console.log, to help capture the result using the console.log trapping flow that is already set up.
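The console.log trapping and the wrapping of the arithmetic expression can be sketched as below. The helper names `runWithTrappedConsoleLog` and `simpleCalculatorSketch` are hypothetical, and using `new Function` to run the expression is an assumption about the mechanism, not something the commit messages state.

```javascript
// Sketch only: helper names are hypothetical, and evaluating via
// new Function is an assumed mechanism.
function runWithTrappedConsoleLog(fn) {
    const original = console.log;
    const captured = [];
    // Temporarily replace console.log so the output is collected
    // instead of printed.
    console.log = (...args) => {
        captured.push(args.map(String).join(" "));
    };
    try {
        fn();
    } finally {
        // Always restore, even if the provided code throws.
        console.log = original;
    }
    return captured.join("\n");
}

function simpleCalculatorSketch(arithExpr) {
    // A bare expression produces no output, so wrap it in console.log
    // and let the trap capture the result.
    const run = new Function(`console.log(${arithExpr})`);
    return runWithTrappedConsoleLog(run);
}
```

For example, `simpleCalculatorSketch("2 + 3 * 4")` returns the string "14". Since this runs whatever expression the model supplies, it should only be used on input the user has had a chance to review.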
and inform the GenAi/LLM about the same
Should hopefully ensure that the GenAi/LLM will generate appropriate code/expression as the argument to pass to these tool calls, to some extent.
i.e. in VS Code with ts-check
Move tool calling logic into tools module. Try to trap async promise failures by awaiting the results of tool calling and putting the full thing in an outer try catch. Have forgotten the nitty gritties of JS flow, this might help, need to check.
So that when the tool handler writes the result to the tc_switch, it can make use of the same, to write to the right location. NOTE: This also fixes the issue of my forgetting to rename the key in js_run wrt writing of the result.
to better describe how it will be run, so that genai/llm while creating the code to run, will hopefully take care of any nuances required.
Also as part of same, wrap the request details in the assistant block using a similar tagging format as the tool_response in user block.
Instead of automatically calling the requested tool with the supplied arguments, allow the user to verify things before triggering the tool. NOTE: The user is already given control over the tool_response before submitting it to the ai assistant.
Instead of automatically calling any tool requested by the GenAi / llm from the tail end of the handle user submit btn click, now if the GenAi/LLM has requested any tool to be called, enable the Tool Run related UI elements and fill them with the tool name and tool args. In turn the user can verify if they are ok with the tool being called and the arguments being passed to it. They can even fix any errors in the tool usage, like the arithmetic expr to calculate that is being passed to simple_calculator or the javascript code being passed to run_javascript_function_code. If the user is ok with the tool call being requested, then trigger the same. The results, if any, will be automatically placed into the user query text area. The user can cross verify if they are ok with the result and/or modify it suitably if required and in turn submit the same to the GenAi/LLM.
Also avoid showing Tool calling UI elements, when not needed to be shown.
Also take care of updating the toolcall ui if needed from within this.
Fix up the initial skeleton / logic as needed. Remember that we are working with potentially a subset of chat messages from the session, given the sliding window logic of context managing on the client ui side, so fix up the logic to use the right subset of the messages array and not the global xchat when deciding whether a message is the last or last but one, which need special handling wrt Assistant (with toolcall) and Tool (i.e. response) messages. Moving the tool call ui setup as well as the tool call response got ui setup into ChatShow of MultiChatUI ensures that switching between chat sessions properly handles the tool call triggering ui and the tool call response submission related ui as needed. Even when loading a previously auto saved chat session, if it had a tool call or tool call response to be handled, the chat ui will be set up as needed to continue that session properly.
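The idea of deciding on the tool call UI from the windowed subset rather than the full message list might be sketched like this. Everything here is hypothetical: the function names, the message shape (`role`, `toolname`, `toolargs`, `content`), the `kind` values, and the convention that a negative window size means no limit. For brevity the sketch inspects only the last shown message.

```javascript
// Sketch only: names, message shape and "kind" values are hypothetical.
function recentWindow(allMsgs, windowSize) {
    if (windowSize < 0) {
        return allMsgs.slice(); // assumed convention: negative means no limit
    }
    if (windowSize === 0) {
        return []; // slice(-0) would return everything, so handle 0 explicitly
    }
    return allMsgs.slice(-windowSize);
}

// Decide which tool related UI, if any, the shown messages call for.
function toolUiState(shownMsgs) {
    const last = shownMsgs[shownMsgs.length - 1];
    if (last === undefined) {
        return { kind: "none" };
    }
    if (last.role === "assistant" && last.toolname) {
        return { kind: "tool-call-pending", toolname: last.toolname, toolargs: last.toolargs };
    }
    if (last.role === "tool") {
        return { kind: "tool-response-pending", content: last.content };
    }
    return { kind: "none" };
}
```

Because the decision is derived purely from the messages being shown, the same call works after switching sessions or restoring a saved one.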
Also clean up the minimal based showing of chat messages a bit. And add github.com to the allowed list.
Add a newline between name and content in the xml representation of the tool response, so that it is easier to distinguish things. Add github, linkedin and apnews domains to allowed.domains for simpleproxy.py.
Separate out the message ui block into a container containing a role block and a contents container block. This will allow theming of these separately, if required. As part of the same, currently the role has been put to the side of the message with vertical text flow.
Also make reasoning easily identifiable in the chat
Define rules to ensure that chat message contents wrap, so as to avoid overflowing beyond the size of the screen being viewed. The style used for the chat message role, placed as vertically oriented text on the side adjacent to the actual message content, seems to be creating an issue with blank pages in some browsers, so avoid that styling when one is printing.
Create the DB store. Try Get and Set operations. The post back to the main thread is done from asynchronous paths. NOTE: Given that it has been ages since I used indexed db, this is a logical implementation by referring to mdn as needed.
Update tooldb logic to match that needed for the db logic and its web worker. Bring in the remaining aspects of db helpers into tools flow.
So mention that maybe the ai can send complex objects in stringified form. Once the type of value is set to string, the ai should normally do it, but there is no harm in hinting.
In the eagerness of the initial skeleton, I had forgotten that the root/generic tool call router takes care of parsing the json string into an object before calling the tool call, so there is no need to try to parse it again. Fixed the same. I hadn't converted the object based response from the data store related calls in the db web worker into a json string before passing it to the generic tool response callback; fixed the same. - Rather, the thought of making ChatMsgEx.createAllInOne handle string or object is set aside for now, to keep things simple and consistent to the greatest extent possible across different flows. And good news - the flow is working at least for the overall happy path. Need to check what corner cases are lurking; calling set on the same key more than once seemed to have some flow oddity, which I need to check later. Also maybe change the field name in the response to get from data to value, to match the field name convention of set. GPT-OSS is fine with it, but micro / nano / pico models may trip up in the worst case, so better to keep things consistent.
And indexedDB add isn't the one to be happy with updating an existing key (add fails if the key already exists, whereas put overwrites it).
Update the descriptions of set and get to indicate the possible corner cases, or rather the semantics in such situations. Update the readme also a bit. The auto save and restore mentioned has nothing to do with the new data store mechanism.
Will be adding basic list and delete tool calls wrt data store to logically complete this PR, however the data store is usable independent of the same. The other PRs are ready for merging as is wrt their features.
The basic skeleton added on the web worker side for listing keys. TODO: Avoid duplication of similar code to an extent across some of these db ops.
Avoid the duplicate plumbing code and use a common ops plumbing helper. Remove the args[key] oversight from the DataStoreList msg on the web worker.
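The shape of the data store tool calls sharing one plumbing helper can be illustrated as follows. This is a sketch under loud assumptions: a `Map` stands in for the IndexedDB object store so that the example runs outside a browser, and the op names, the helper `dataStoreOp` and the response fields (`op`, `status`, `key`, `value`, `keys`, `msg`) are all hypothetical. In the browser, the overwrite-on-set behaviour shown here would correspond to `IDBObjectStore.put()` rather than `add()`.

```javascript
// Sketch only: a Map stands in for the IndexedDB object store; op names
// and response fields are hypothetical.
const store = new Map();

// Common plumbing: run one op and always return a JSON string, since the
// generic tool response callback expects a string.
function dataStoreOp(opName, args, opFn) {
    try {
        return JSON.stringify({ op: opName, status: "ok", ...opFn(args) });
    } catch (err) {
        return JSON.stringify({ op: opName, status: "error", msg: err.message });
    }
}

const dataStoreOps = {
    // set overwrites any existing value for the key (put semantics).
    set: (args) => dataStoreOp("set", args, ({ key, value }) => {
        store.set(key, value);
        return { key };
    }),
    get: (args) => dataStoreOp("get", args, ({ key }) => {
        if (!store.has(key)) {
            throw new Error(`no such key: ${key}`);
        }
        return { key, value: store.get(key) };
    }),
    // list takes no key argument.
    list: () => dataStoreOp("list", {}, () => ({ keys: [...store.keys()] })),
    delete: (args) => dataStoreOp("delete", args, ({ key }) => ({
        key,
        deleted: store.delete(key),
    })),
};
```

Using the same field name (`value` here) in both the set arguments and the get response follows the consistency concern raised earlier about smaller models.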
All needed basic data store tool calls are implemented now. Along with your ai model, use it to collate and augment the ai model's knowledge, or to summarise and reduce the context window that one has to worry about, or ...
The alternate web client ui tools/server/public_simplechat's client side tool calling has now been updated to support a basic data store using the browser's indexedDB (from within a web worker context, to isolate it from the browser's normal indexedDB usage), thus not requiring any additional setup for the same (and the other builtin client side tool calls) other than running llama-server and pointing to this alternate web client ui.
This builds on other PRs in this series
#16819 - added reasoning support, some minimal client ui and flow cleanup / updates
#16791 - added web search support to simpleproxy based builtin tools, bearer auth for bundled simpleproxy server, ...
#16563 - added support for toolcalls and in turn a few direct builtin tools like calculator and javascript execution, as well as support for fetching web pages using a bundled simpleproxy.py server, ...
NOTE: As the other PRs mentioned above aren't merged into main/master yet, commits related to those PRs also appear in the chain of commits shown below