Large language models (LLMs) have revolutionized the way we interact with artificial intelligence, enabling a wide range of applications, from text generation to code writing and beyond. While powerful models like GPT-4 have gained widespread adoption, their computational requirements and costs can be prohibitive for many use cases, especially those involving real-time or on-device processing.
This is where local LLMs come into play. Smaller, more efficient models such as Mistral 7B, or the many open models available on Hugging Face, can be deployed on local hardware, providing faster inference and greater privacy. However, these models often struggle with complex instructions, and they need more detailed, carefully crafted system prompts to produce accurate outputs.
Introducing promptrefiner
promptrefiner is a Python tool designed to bridge this gap, empowering users to create highly effective system prompts for their local LLMs with the help of the powerful GPT-4 model. By iteratively refining the prompts based on evaluation examples and the local model's responses, promptrefiner ensures that the final system prompt is tailored to the specific task at hand, unlocking the full potential of smaller LLMs.
At its core, promptrefiner comprises three key components:
- AbstractLLM: A base class that serves as a blueprint for integrating local LLMs into the tool (see the sketch after this list).
- PromptTrackerClass: Responsible for keeping track of evaluation prompts, system prompts, and the local LLM's responses during the refinement process.
- OpenaiCommunicator: Handles communication with the OpenAI GPT-4 API, leveraging its capabilities to suggest improved system prompts.
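To make the contract concrete, here is a minimal sketch of what the AbstractLLM interface looks like from a user's perspective: subclasses must implement a predict method that takes an input text and a system prompt and returns the model's response as a string. (This is an illustration of the contract, not the package's exact source.)

from abc import ABC, abstractmethod

class AbstractLLM(ABC):
    """Blueprint for a local LLM backend (illustrative sketch)."""

    @abstractmethod
    def predict(self, input_text: str, system_prompt: str) -> str:
        """Return the model's response to input_text under system_prompt."""
        ...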
Setting up promptrefiner is straightforward. First, we import the necessary modules and classes:
from dotenv import load_dotenv
from openai import OpenAI
from promptrefiner.abstract_llm import AbstractLLM
from promptrefiner.prompt_tracker import PromptTrackerClass
from promptrefiner.openai_communicator import OpenaiCommunicator

# Load OPENAI_API_KEY (and any other secrets) from a local .env file
load_dotenv()
Next, we create a custom class that inherits from AbstractLLM, tailored to our specific local LLM. In this example, we'll use the Mistral 7B model served via llama-cpp-python:
class LlamaCPPModel(AbstractLLM):
    def __init__(self, base_url, api_key, temperature=0.1, max_tokens=200):
        # llama-cpp-python serves an OpenAI-compatible API, so we can reuse
        # the OpenAI client by pointing it at the local server's base URL.
        self.client = OpenAI(base_url=base_url, api_key=api_key)
        self.temperature = temperature
        self.max_tokens = max_tokens

    def predict(self, input_text, system_prompt):
        # The model name should match whatever your llama-cpp-python server
        # was launched with; "mistral-7b" is used here as an example.
        response = self.client.completions.create(
            model="mistral-7b",
            prompt=f"{system_prompt}\n\nHuman: {input_text}\nAssistant:",
            temperature=self.temperature,
            max_tokens=self.max_tokens,
            top_p=1,
            frequency_penalty=0,
            presence_penalty=0,
            stop=["Human:", "Assistant:"],
        )
        return response.choices[0].text.strip()
With our local LLM class set up, we can create an instance of it and define our evaluation examples:
# Create an instance of your local LLM; llama-cpp-python's OpenAI-compatible
# server exposes its endpoints under /v1
llm = LlamaCPPModel(base_url="http://localhost:8080/v1", api_key="my_api_key")
# Define evaluation examples
evaluation_inputs = [
"Elon Musk is the founder of Tesla and SpaceX. He is a visionary entrepreneur and engineer.",
"Steve Jobs was the co-founder of Apple. He was a brilliant innovator and marketed products like the iPhone and iPad.",
]
evaluation_outputs = [
"['Elon Musk: founder of Tesla and SpaceX, visionary entrepreneur and engineer']",
"['Steve Jobs: co-founder of Apple, brilliant innovator, marketed products like the iPhone and iPad']",
]
Now, we can initialize our PromptTrackerClass instance with the evaluation examples and an initial system prompt:
# Define an initial system prompt
initial_system_prompt = "Extract the key details about individuals from the given text and format them as Python strings in a list."
# Create a PromptTrackerClass instance
prompt_tracker = PromptTrackerClass(llm, evaluation_inputs, evaluation_outputs, initial_system_prompt)
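Conceptually, the tracker's job is bookkeeping: it holds the evaluation inputs and target outputs, records each system prompt tried, and stores the local model's responses under each prompt so GPT-4 can compare them against the targets. A simplified, hypothetical stand-in might look like this (promptrefiner's actual class has more to it):

class SimplePromptTracker:
    """Hypothetical, stripped-down stand-in for PromptTrackerClass."""

    def __init__(self, llm, evaluation_inputs, evaluation_outputs, initial_prompt):
        self.llm = llm
        self.evaluation_inputs = evaluation_inputs
        self.evaluation_outputs = evaluation_outputs
        self.llm_system_prompts = [initial_prompt]  # one entry per iteration
        self.llm_responses = []

    def run_evaluation(self):
        # Run the local LLM on every evaluation input with the latest prompt
        prompt = self.llm_system_prompts[-1]
        self.llm_responses.append(
            [self.llm.predict(text, prompt) for text in self.evaluation_inputs]
        )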
With everything set up, we can create an instance of OpenaiCommunicator and begin the refinement process:
# Create an OpenaiCommunicator instance
openai_communicator = OpenaiCommunicator(model="gpt-4")
# Run the refinement process
number_of_iterations = 3
openai_communicator.refine_prompts(prompt_tracker, number_of_iterations)
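Under the hood, each refinement iteration boils down to three steps: run the local LLM on the evaluation inputs with the current system prompt, show GPT-4 the inputs, the target outputs, and the local model's actual responses, and ask it to propose an improved prompt. The following is a simplified sketch of a single pass, not promptrefiner's actual implementation; it assumes your OpenAI API key is available (e.g. as OPENAI_API_KEY, loaded earlier by load_dotenv()):

def refine_once(client, llm, inputs, targets, current_prompt):
    # 1. Run the local LLM on every evaluation input
    responses = [llm.predict(text, current_prompt) for text in inputs]

    # 2. Build a report comparing target outputs with actual responses
    report = "\n\n".join(
        f"Input: {i}\nTarget output: {t}\nLocal model output: {r}"
        for i, t, r in zip(inputs, targets, responses)
    )

    # 3. Ask GPT-4 for an improved system prompt
    completion = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "You improve system prompts for a small local LLM."},
            {"role": "user",
             "content": f"Current system prompt:\n{current_prompt}\n\n{report}\n\n"
                        "Suggest an improved system prompt. Reply with the prompt only."},
        ],
    )
    return completion.choices[0].message.content.strip()

promptrefiner repeats this kind of exchange for the requested number of iterations, with each suggested prompt appended to the tracker's history.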
After the refinement process completes, we can access the final, optimized system prompt and use it with our local LLM for future tasks:
# Get the final system prompt
final_system_prompt = prompt_tracker.llm_system_prompts[-1]
# Use the final system prompt with your local LLM
new_input_text = "Jeff Bezos founded Amazon, an e-commerce giant. He is a successful entrepreneur and businessman."
output = llm.predict(new_input_text, final_system_prompt)
print(output)
This output should now accurately reflect the desired format, thanks to the optimized system prompt generated by promptrefiner.
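For instance, the printed result might look like the following (illustrative; the exact wording depends on the model):

['Jeff Bezos: founder of Amazon, successful entrepreneur and businessman']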
Unleashing Local LLM Potential
The true power of promptrefiner lies in its ability to unlock the full potential of local LLMs, enabling them to tackle complex tasks previously reserved for larger, more resource-intensive models. By leveraging the deep understanding and instruction-following capabilities of GPT-4, promptrefiner ensures that your local LLM receives a highly tailored and effective system prompt, significantly improving its performance and output accuracy.
One of the key advantages of using local LLMs is their fast inference times, making them well-suited for real-time applications or scenarios where low latency is crucial. Additionally, local deployments offer greater privacy and control over data, as sensitive information never leaves the local environment.
promptrefiner opens up new possibilities for using local LLMs in a wide range of applications, from text summarization and question answering to code generation and data analysis. By optimizing system prompts, the tool enables these smaller models to handle complex tasks with precision and accuracy, all while benefiting from their computational efficiency and privacy advantages.
Conclusion
In the rapidly evolving landscape of artificial intelligence, promptrefiner stands out as a powerful tool that bridges the gap between powerful but resource-intensive LLMs and their smaller, more efficient counterparts. By harnessing the capabilities of GPT-4 to refine and optimize system prompts, promptrefiner empowers users to unleash the full potential of their local LLMs, unlocking new possibilities for real-time, privacy-focused, and on-device AI applications.
Whether you're a developer, researcher, or simply an AI enthusiast, promptrefiner offers a seamless and intuitive way to enhance the performance of your local LLMs, opening up a world of possibilities for innovative and impactful AI solutions.