LangGraph Examples

LangGraph is a framework for constructing and managing workflows that involve language-model agents, APIs, and other tools in a composable, flexible manner. In this example, we build a simple LangGraph application that compares responses from Groq and ChatGPT, describing the components and how they work within LangGraph as we go.

Parts of the Application

1. Node Definitions: Nodes represent discrete tasks in LangGraph. We define nodes for calling the Groq API, ChatGPT, and a comparison step.

2. Edges (Workflow): Edges define how control flows between nodes. LangGraph handles dependencies and parallel execution based on these edges.

3. State and Outputs: The nodes share a common state; the comparison node reads both responses from it and compares them. The final output is logged or returned to the user.

Code Implementation

Here is how the code might look, using LangGraph's StateGraph API, with simulated model calls standing in for real ones:

from typing import TypedDict

from langgraph.graph import StateGraph, START, END

# 1. Define the shared state that flows through the graph
class ComparisonState(TypedDict):
    prompt: str
    groq_response: str
    chatgpt_response: str
    comparison: str

# 2. Define the nodes for Groq and ChatGPT
def groq_node(state: ComparisonState) -> dict:
    # Simulated API call to Groq
    response = f"Groq's response to '{state['prompt']}'"  # Replace with actual API logic
    return {"groq_response": response}

def chatgpt_node(state: ComparisonState) -> dict:
    # Simulated API call to ChatGPT
    response = f"ChatGPT's response to '{state['prompt']}'"  # Replace with actual API logic
    return {"chatgpt_response": response}

# 3. Define the comparison node
def comparison_node(state: ComparisonState) -> dict:
    # Simulate a basic comparison
    if state["groq_response"] == state["chatgpt_response"]:
        result = "Both responses are identical."
    else:
        result = (
            f"Responses differ:\n"
            f"- Groq: {state['groq_response']}\n"
            f"- ChatGPT: {state['chatgpt_response']}"
        )
    return {"comparison": result}

# 4. Construct the LangGraph application
def create_comparison_app():
    builder = StateGraph(ComparisonState)
    builder.add_node("groq", groq_node)
    builder.add_node("chatgpt", chatgpt_node)
    builder.add_node("compare", comparison_node)
    # Fan out from START so both model nodes run in parallel,
    # then join at the comparison node
    builder.add_edge(START, "groq")
    builder.add_edge(START, "chatgpt")
    builder.add_edge("groq", "compare")
    builder.add_edge("chatgpt", "compare")
    builder.add_edge("compare", END)
    return builder.compile()

# 5. Execute the graph
def main():
    prompt = input("Enter your prompt: ")
    graph = create_comparison_app()
    final_state = graph.invoke({"prompt": prompt})
    print("\nComparison Result:")
    print(final_state["comparison"])

if __name__ == "__main__":
    main()

Explanation of Components

1. State: - `ComparisonState` is a `TypedDict` describing the data shared across the graph. Each node reads from the state and returns a partial update that LangGraph merges back in.

2. Nodes: - `groq_node` and `chatgpt_node` are plain functions whose job is to send the prompt to their respective APIs and write the responses back to the state. `comparison_node` consumes both responses and writes the comparison result.

3. Graph: - The `StateGraph` orchestrates the workflow. It manages execution order and merges node outputs into the shared state based on the defined edges.

4. Edges: - `add_edge` connects nodes and specifies how control flows through the graph. The two edges out of `START` fan out so both model nodes run in parallel, and `compare` waits until both have finished.

5. Execution Flow: - `compile()` turns the builder into a runnable graph. `invoke()` feeds in the prompt, resolves dependencies, executes the nodes, and returns the final state.
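As a quick usage sketch (the prompt string is just an illustrative example), the compiled graph can also be invoked directly, and the returned dictionary holds every key the nodes wrote to the state:

graph = create_comparison_app()
state = graph.invoke({"prompt": "Summarize LangGraph in one sentence."})
# state now contains prompt, groq_response, chatgpt_response, and comparison
print(state["comparison"])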

How to Extend the Application

1. Real API Calls: - Replace the simulated responses in `groq_node` and `chatgpt_node` with actual API calls (see the sketch after this list).

2. Advanced Comparison: - Implement more sophisticated logic in `comparison_node`, such as evaluating coherence, tone, or factual accuracy (the Comparison Measurements section below lists candidates).

3. Logging and Error Handling: - Add logging and exception handling to monitor and debug the workflow.
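For the first extension, here is a minimal sketch of `groq_node` backed by a real call. It assumes the official `groq` Python client is installed, `GROQ_API_KEY` is set in the environment, and the model name below is still offered (check Groq's documentation for current models):

from groq import Groq

def groq_node(state: ComparisonState) -> dict:
    client = Groq()  # reads GROQ_API_KEY from the environment
    completion = client.chat.completions.create(
        model="llama-3.1-8b-instant",  # assumed model name; substitute a current one
        messages=[{"role": "user", "content": state["prompt"]}],
    )
    return {"groq_response": completion.choices[0].message.content}

`chatgpt_node` can be updated the same way with the `openai` client.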

This example introduces a basic yet functional LangGraph application that demonstrates core concepts like nodes, edges, shared state, and workflows.

Comparison Measurements

When comparing the output of two different Large Language Models on the same prompt, it is crucial that the comparison be rigorous and methodologically sound. By combining the best practices and measurements below, a programmer can compare the outputs of different LLMs and draw meaningful insights about each model's relative performance in various contexts.

Here is a list of measurements to consider when comparing two responses to the same prompt:

1. Clarity and Coherence

2. Relevance and Appropriateness

3. Factual Accuracy

4. Performance Metrics

5. Diversity and Creativity

6. Tone and Sentiment

7. Efficiency and Latency

8. Robustness and Reliability

9. Bias and Fairness

10. User Experience Metrics

11. Explainability

12. Language Metrics

13. Human-Like Features
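
Most of these measurements require human judgment or a model-based evaluator, but a few can be automated directly. Here is a minimal sketch, using only the Python standard library, that adds two easily computed measurements (lexical similarity and response length) to the comparison:

import difflib

def measure(groq_response: str, chatgpt_response: str) -> dict:
    # Lexical similarity: 1.0 means the two texts are identical
    similarity = difflib.SequenceMatcher(None, groq_response, chatgpt_response).ratio()
    return {
        "similarity": round(similarity, 3),
        "groq_words": len(groq_response.split()),
        "chatgpt_words": len(chatgpt_response.split()),
    }

Latency (item 7) can be measured by timing each node's API call with time.perf_counter(); qualities like factual accuracy, tone, or bias still call for human review or an LLM-as-judge evaluator.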