NexusRaven-V2-13B Online Demo: The New Standard in Function Calling Beats GPT4

Introduction

NexusRaven-V2-13B is an open-source, commercially viable function calling Large Language Model (LLM) that surpasses the state of the art in function calling capabilities. It offers versatile function calling, full explainability, superior performance, and generalization to unseen functions. It is commercially permissive, and its training does not involve any data generated by proprietary LLMs such as GPT-4.

Why Raven?

Not only is Raven good at function calling, it also explains why it issued each function call.

The explanation ties the arguments it filled into the call back to the user’s prompt, which makes the generation more interpretable.

NexusRaven-V2 13B Online Demo

Google Places API Copilot Demo, Driven by NexusRaven-V2 13B

Quick Start Prompting Guide

Please refer to our notebook, How-To-Prompt.ipynb, for more advanced tutorials on using NexusRaven-V2!

  1. When giving docstrings to Raven, provide well-indented, detailed, and well-written docstrings, as this helps accuracy.
  2. Raven does better when every function provided to it has at least one argument, either required or optional (i.e. func(dummy_arg) is preferred over func()).
  3. We strongly recommend setting sampling to False when prompting NexusRaven-V2.
  4. We strongly recommend a very low temperature (~0.001).
  5. We strongly recommend following the prompting style used in the notebook.
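
The tips above can be sketched as a small prompt builder. This is a sketch under assumptions, not the official template: the "Function:" / "User Query:" layout and the `<human_end>` marker reflect the public model card as best understood here and should be checked against How-To-Prompt.ipynb. The names `render_function`, `build_raven_prompt`, and `GENERATION_KWARGS` are hypothetical.

```python
import inspect
import textwrap


def get_weather_data(coordinates):
    """
    Fetches weather data from the Open-Meteo API for the given latitude and longitude.
    """


def get_coordinates_from_city(city_name):
    """
    Fetches the latitude and longitude of a given city name using the Maps.co Geocoding API.
    """


def render_function(func):
    """Render a function as its signature plus a consistently indented docstring."""
    signature = f"def {func.__name__}{inspect.signature(func)}:"
    doc = inspect.getdoc(func) or ""
    body = textwrap.indent(f'"""\n{doc}\n"""', "    ")
    return f"Function:\n{signature}\n{body}\n"


def build_raven_prompt(functions, query):
    """Assemble one block per function, followed by the user query."""
    blocks = "\n".join(render_function(f) for f in functions)
    # ASSUMPTION: "<human_end>" closes the user turn; verify against the notebook.
    return f"{blocks}\nUser Query: {query}<human_end>\n"


# Hypothetical settings mirroring tips 3 and 4: sampling off, very low temperature.
# These are the kind of keyword arguments a Hugging Face text-generation call accepts.
GENERATION_KWARGS = {"do_sample": False, "temperature": 0.001, "max_new_tokens": 2048}

prompt = build_raven_prompt(
    [get_weather_data, get_coordinates_from_city],
    "What's the weather like in Seattle right now?",
)
print(prompt)
```

Building the prompt from `inspect.signature` and `inspect.getdoc` keeps the indentation uniform no matter how the original functions were formatted, which is the point of tip 1.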

Key Features of NexusRaven-V2

Feature | Description
Versatile Function Calling Capability | NexusRaven-V2 is capable of generating single function calls, nested calls, and parallel calls in many challenging cases.
Fully Explainable | NexusRaven-V2 is capable of generating very detailed explanations for the function calls it generates. This behavior can be turned off to save tokens during inference.
Performance Highlights | NexusRaven-V2 surpasses GPT-4 by 7% in function calling success rates in human-generated use cases involving nested and composite functions.
Generalization to the Unseen | NexusRaven-V2 has never been trained on the functions used in evaluation.
Commercially Permissive | The training of NexusRaven-V2 does not involve any data generated by proprietary LLMs such as GPT-4. You have full control of the model when deployed in commercial applications.
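
To illustrate how the explanation might be separated from the call, here is a minimal sketch. It assumes the raw completion looks like "Call: <expression><bot_end>" followed by an optional "Thought: ..." block. That output shape and the `<bot_end>` marker are assumptions based on the public model card, and `split_call_and_thought` is a hypothetical helper. Under the same assumption, using `<bot_end>` as a stop sequence would end generation before the explanation, which is one way to save tokens.

```python
def split_call_and_thought(raw_output):
    """Split a raw completion into (call, thought); thought is None if absent."""
    # ASSUMPTION: "<bot_end>" separates the call from the optional explanation.
    call_part, _, rest = raw_output.partition("<bot_end>")

    call = call_part.strip()
    if call.startswith("Call:"):
        call = call[len("Call:"):].strip()

    thought = rest.strip()
    if thought.startswith("Thought:"):
        thought = thought[len("Thought:"):].strip()

    return call, (thought or None)


# Made-up sample completion, for illustration only (not real model output).
sample = (
    "Call: get_weather_data(coordinates=get_coordinates_from_city(city_name='Seattle'))<bot_end> \n"
    "Thought: The user asks about Seattle, so city_name is 'Seattle'."
)

call, thought = split_call_and_thought(sample)
print(call)
print(thought)
```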

NexusRaven-V2’s Capabilities

NexusRaven-V2 can generate deeply nested function calls, parallel function calls, and simple single calls. It can also justify the function calls it generated.

Example of Function Call Generation

def get_weather_data(coordinates):
    """
    Fetches weather data from the Open-Meteo API for the given latitude and longitude.
    """

def get_coordinates_from_city(city_name):
    """
    Fetches the latitude and longitude of a given city name using the Maps.co Geocoding API.
    """

# User Query: "What's the weather like in Seattle right now?"

# Result: get_weather_data(coordinates=get_coordinates_from_city(city_name='Seattle'))
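
A generated call like the one above is just a string, so the application still has to execute it. The sketch below shows one cautious way to do that: parse the string with Python's `ast` module and run only functions on an allowlist, instead of passing model output to `eval`. The two functions are stubs that return placeholder values rather than calling the real APIs, and `evaluate_call` and `run_generated_call` are hypothetical helpers.

```python
import ast


def get_coordinates_from_city(city_name):
    # Stub: a fixed approximate value stands in for the geocoding API call.
    return {"Seattle": (47.61, -122.33)}[city_name]


def get_weather_data(coordinates):
    # Stub: echoes its input instead of calling the weather API.
    return {"coordinates": coordinates, "source": "stub"}


# Only functions listed here may be invoked by generated calls.
ALLOWED = {f.__name__: f for f in (get_weather_data, get_coordinates_from_city)}


def evaluate_call(node):
    """Recursively evaluate literals and allowlisted calls; reject everything else."""
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, (ast.List, ast.Tuple)):
        values = [evaluate_call(element) for element in node.elts]
        return values if isinstance(node, ast.List) else tuple(values)
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
        if node.func.id not in ALLOWED:
            raise ValueError(f"Function not allowed: {node.func.id}")
        args = [evaluate_call(arg) for arg in node.args]
        kwargs = {kw.arg: evaluate_call(kw.value) for kw in node.keywords}
        return ALLOWED[node.func.id](*args, **kwargs)
    raise ValueError(f"Unsupported expression: {ast.dump(node)}")


def run_generated_call(call_string):
    """Parse a single generated call expression and execute it."""
    tree = ast.parse(call_string.strip(), mode="eval")
    return evaluate_call(tree.body)


result = run_generated_call(
    "get_weather_data(coordinates=get_coordinates_from_city(city_name='Seattle'))"
)
print(result)
```

Because nested calls are evaluated from the inside out, the inner `get_coordinates_from_city` result is passed to `get_weather_data` without any extra orchestration code.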

Using NexusRaven-V2 with OpenAI FC Schematics and LangChain

NexusRaven-V2 can be easily integrated with workflows built around OpenAI’s function calling. A package is provided to help drop in NexusRaven-V2. There’s also a small demo for using NexusRaven-V2 with LangChain.
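
To give a feel for what such an integration involves, the sketch below converts an OpenAI-style function definition (a name, a description, and JSON Schema parameters) into a Python signature with a docstring, the form Raven is prompted with. This is not the API of the provided package; `openai_schema_to_stub` is a hypothetical helper, and the example schema is written for illustration.

```python
def openai_schema_to_stub(schema):
    """Render an OpenAI-style function definition as a Python stub with a docstring."""
    params = schema.get("parameters", {})
    properties = params.get("properties", {})
    required = set(params.get("required", []))

    # Required arguments come first; optional ones get a None default.
    ordered = [n for n in properties if n in required]
    ordered += [n for n in properties if n not in required]
    args = ", ".join(n if n in required else f"{n}=None" for n in ordered)

    lines = [
        f"def {schema['name']}({args}):",
        '    """',
        f"    {schema.get('description', '')}",
        "",
        "    Args:",
    ]
    for name in ordered:
        spec = properties[name]
        lines.append(
            f"        {name} ({spec.get('type', 'any')}): {spec.get('description', '')}"
        )
    lines.append('    """')
    return "\n".join(lines)


# Illustrative schema in the OpenAI function calling format.
example_schema = {
    "name": "get_coordinates_from_city",
    "description": "Fetches the latitude and longitude of a given city name.",
    "parameters": {
        "type": "object",
        "properties": {
            "city_name": {"type": "string", "description": "Name of the city."},
        },
        "required": ["city_name"],
    },
}

stub = openai_schema_to_stub(example_schema)
print(stub)
```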

Evaluation

NexusRaven-V2 surpasses GPT-4 by up to 7% in function calling success rates in human-generated use cases involving nested and composite functions. It has never been trained on the functions used in evaluation.

Function Calling Average Accuracy

Model | Average Accuracy
NexusRaven-V2 | 7% higher than GPT-4
GPT-4 | (baseline)

Limitations and License

While NexusRaven-V2 is a powerful tool, it has limitations. When there are many candidate functions, it works best when paired with a retriever that narrows them down. The model can generate incorrect calls, and the explanations it generates might also be incorrect. The model is licensed under the Nexusflow community license.
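
The retriever idea can be sketched as a shortlisting step that runs before the prompt is built. The example below ranks functions by plain keyword overlap between the query and each function's name and docstring. It is a deliberately simple stand-in: a production retriever would more likely use embeddings. `shortlist_functions`, the stopword list, the query, and the `send_email` entry are all hypothetical.

```python
import re

# Small illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "of", "a", "an", "and", "in", "for", "to", "its"}


def tokenize(text):
    """Lowercase, split into alphanumeric tokens, and drop stopwords."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def shortlist_functions(query, functions, top_k=2):
    """Return up to top_k function names ranked by keyword overlap with the query."""
    query_tokens = tokenize(query)
    scored = []
    for name, doc in functions.items():
        candidate_tokens = tokenize(name.replace("_", " ") + " " + doc)
        scored.append((len(query_tokens & candidate_tokens), name))
    # Highest overlap first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for overlap, name in scored[:top_k] if overlap > 0]


functions = {
    "get_weather_data": "Fetches weather data from the Open-Meteo API for the given latitude and longitude.",
    "get_coordinates_from_city": "Fetches the latitude and longitude of a given city name using the Maps.co Geocoding API.",
    "send_email": "Sends an email message to a recipient.",
}

shortlist = shortlist_functions(
    "Find the coordinates of the city of Seattle and fetch its weather", functions
)
print(shortlist)
```

Only the shortlisted functions would then be rendered into the prompt, which keeps the prompt short when the full catalogue is large.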

Conclusion

NexusRaven-V2 represents a significant step forward in the development of function calling LLMs. It outperforms GPT-4 in function calling capabilities and offers a range of features that make it an ideal choice for both commercial and open-source applications.

Prompting Notebook (Colab)

https://colab.research.google.com/drive/19JYixRPPlanmW5q49WYi_tU8rhHeCEKW?usp=sharing#scrollTo=r2g172hXUMm0

NexusRaven-V2-13B Github

https://github.com/nexusflowai/NexusRaven-V2

Nexus_Function_Calling_Leaderboard

https://huggingface.co/spaces/Nexusflow/Nexus_Function_Calling_Leaderboard

FAQs

  1. What is NexusRaven-V2?
    NexusRaven-V2 is an open-source and commercially viable function calling Large Language Model (LLM) that surpasses the state-of-the-art in function calling capabilities.
  2. What are the capabilities of NexusRaven-V2?
    NexusRaven-V2 can generate deeply nested function calls, parallel function calls, and simple single calls. It can also justify the function calls it generated.
  3. How does NexusRaven-V2 perform compared to GPT-4?
    NexusRaven-V2 surpasses GPT-4 by up to 7% in function calling success rates in human-generated use cases involving nested and composite functions.
  4. What are the limitations of NexusRaven-V2?
    When there are many candidate functions, NexusRaven-V2 works best when paired with a retriever. The model can generate incorrect calls, and the explanations it generates might also be incorrect.
  5. What is the license of NexusRaven-V2?
    The model is licensed under the Nexusflow community license.
