Detailed Guide to Tracing a LangChain LLM App with OpenTelemetry

November 20, 2024

You might be wondering why anyone would go through the effort of tracing a Python app, right? Well, tracing is like having a magic map that shows you exactly what's happening under the hood of your application.

Here’s why tracing can be super helpful:

Keeping an Eye on Performance
  • Spotting Slowdowns: Tracing helps you pinpoint where your app might be dragging its feet. If a part of your app is slower than a sleepy tortoise, tracing can point it out.
  • Resource Efficiency: It shows you how your app is using resources like CPU and memory, so you can make it run more smoothly without guzzling too much power.
Smoothing Out the Bugs
  • Zeroing in on Errors: If something goes haywire, traces lead you like a flashlight straight to where the problem started.
  • Quick Fixes: By providing a timeline, tracing makes it easier to figure out what went wrong and get things back on track faster.
Getting to Know Your App Better
  • Visualizing the Journey: Tracing gives you a neat picture of how data flows through your app. It's like having a GPS for your software!
  • Gaining Insights: It’s all about understanding how your app behaves under different conditions, which is great for making improvements.
Boosting Reliability
  • Catching Odd Behavior: Traces can help spot issues before they turn into big problems by revealing unusual patterns.
  • Setting a Baseline: When you know what "normal" looks like, it's easier to catch when things go amiss.
Staying Compliant and Secure
  • Audit Trails: Tracing logs offer a record of what's happening inside your app, which is crucial if you’re working in a sector with strict compliance needs.
  • Spotting Security Issues: It can also highlight unusual access patterns that might signal a security breach.

In the following sections, we'll walk through instrumenting a Python application with OpenTelemetry and demonstrate several OTel exporter configurations.

Trace Your LangChain Application

Prerequisites
  1. Python Installation: Ensure Python is installed (the OpenTelemetry Python SDK requires Python 3.8 or later).
  2. LangChain Library: Your application should use the LangChain library. Install it if it isn't already:
     pip install langchain
  3. OpenTelemetry Components: We'll need the OpenTelemetry API, SDK, and an exporter to send trace data.
Step 1: Install Required Packages

Install the OpenTelemetry API, SDK, and an exporter for the backend service you want to use.

pip install opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp

If you're using a specific exporter like Jaeger or Zipkin, install those as well:

pip install opentelemetry-exporter-jaeger
pip install opentelemetry-exporter-zipkin

Note that the Python Jaeger exporter has been deprecated upstream; recent Jaeger releases ingest OTLP directly, so the OTLP exporter above is usually the better choice.

Step 2: Set Up OpenTelemetry in Your Application

  1. Configure the Tracer Provider: Set up a tracer provider with a resource identifier for your application.
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider

# Define a resource for the tracer (metadata that accompanies every trace);
# Resource.create also merges in SDK defaults and environment-supplied attributes
resource = Resource.create({"service.name": "langchain-llm-service"})

# Set the global tracer provider
trace.set_tracer_provider(TracerProvider(resource=resource))
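
Alternatively, if you construct the TracerProvider without an explicit resource, the SDK reads this metadata from the environment (OTEL_SERVICE_NAME and OTEL_RESOURCE_ATTRIBUTES), which keeps deployment-specific values out of your code:

export OTEL_SERVICE_NAME=langchain-llm-service
export OTEL_RESOURCE_ATTRIBUTES=deployment.environment=dev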
  2. Set Up Exporter and Span Processor: Choose an exporter based on where you want to send the trace data.

Using OTLP (default):

from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# Configure the OTLP exporter
otlp_exporter = OTLPSpanExporter(
    endpoint="http://localhost:4317",
    insecure=True
)

# Create a batch span processor and add it to the tracer provider
span_processor = BatchSpanProcessor(otlp_exporter)
trace.get_tracer_provider().add_span_processor(span_processor)

Using Jaeger:

from opentelemetry.exporter.jaeger.thrift import JaegerExporter

# Configure the Jaeger exporter
jaeger_exporter = JaegerExporter(
    agent_host_name='localhost',
    agent_port=6831,
)

trace.get_tracer_provider().add_span_processor(
    BatchSpanProcessor(jaeger_exporter)
)

Using Zipkin:

from opentelemetry.exporter.zipkin.json import ZipkinExporter

# Configure the Zipkin exporter
zipkin_exporter = ZipkinExporter(
    endpoint="http://localhost:9411/api/v2/spans",
)

trace.get_tracer_provider().add_span_processor(
    BatchSpanProcessor(zipkin_exporter)
)
  3. Get a Tracer: Obtain a tracer instance with a specific name for your application component.
tracer = trace.get_tracer(__name__)
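
Before wiring up a full backend, you can sanity-check the setup by printing spans to stdout with the SDK's built-in ConsoleSpanExporter (an optional debugging aid, not part of the production pipeline):

from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

# Print each finished span to stdout -- handy for verifying instrumentation
# locally before pointing an exporter at a real backend
trace.get_tracer_provider().add_span_processor(
    SimpleSpanProcessor(ConsoleSpanExporter())
)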

Step 3: Instrument Your LangChain Application

Integrate tracing logic into your LangChain application components wherever you want to observe execution. The example below uses LangChain's ChatOpenAI chat model (install with pip install langchain-openai and set OPENAI_API_KEY); substitute whichever model class your application actually uses.

from langchain_openai import ChatOpenAI  # assumes: pip install langchain-openai

# Example function to be traced
def predict_weather():
    with tracer.start_as_current_span("predict-weather-function-span"):
        # Call an LLM through LangChain (ChatOpenAI is just one provider option)
        llm = ChatOpenAI(model="gpt-4o-mini")
        response = llm.invoke("What is the weather today?")
        print(response.content)
        return response

# Using traces in a Langchain component
def main():
    with tracer.start_as_current_span("main-span"):
        predict_weather()

if __name__ == "__main__":
    main()
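
Spans become far more useful for debugging LLM calls when they carry the prompt, model, and outcome. Here is a minimal sketch of the same function with attributes and error recording (the llm.* attribute keys are illustrative choices, not an official semantic convention):

def predict_weather_with_attributes():
    prompt = "What is the weather today?"
    with tracer.start_as_current_span("predict-weather") as span:
        # Attribute keys are illustrative; pick names that fit your own conventions
        span.set_attribute("llm.prompt", prompt)
        span.set_attribute("llm.model", "gpt-4o-mini")
        try:
            response = ChatOpenAI(model="gpt-4o-mini").invoke(prompt)
            span.set_attribute("llm.response", response.content)
            return response
        except Exception as exc:
            # Record the failure on the span so it is visible in the trace UI
            span.record_exception(exc)
            span.set_status(trace.StatusCode.ERROR)
            raise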

Step 4: Configure the OpenTelemetry Collector
  1. Installation: Download and install the OpenTelemetry Collector. Follow the OpenTelemetry Collector documentation for setup.
  2. Configuration: Configure the collector (collector.yaml or equivalent configuration file) to receive traces and export to your observability backend (e.g., Jaeger or Zipkin).

Example configuration for receiving OTLP trace data and exporting to Jaeger:

receivers:
  otlp:
    protocols:
      grpc:

exporters:
  jaeger:
    endpoint: "localhost:14250"
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [jaeger]

Run the collector using:

otelcol --config collector.yaml
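
Note that recent Collector releases removed the dedicated jaeger exporter, since Jaeger can now ingest OTLP natively. On newer Collector versions, point a plain otlp exporter at Jaeger's OTLP gRPC port instead (the jaeger:4317 endpoint below is an assumption; adjust the host to wherever Jaeger runs in your deployment):

exporters:
  otlp/jaeger:
    endpoint: "jaeger:4317"   # Jaeger's native OTLP gRPC port
    tls:
      insecure: true

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/jaeger]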

Step 5: Test and Validate
  1. Run the Application: Execute your Python application to start generating traces.
python your_langchain_app.py

  2. Monitor Traces in Backend: Visit the Jaeger UI (http://localhost:16686) or Zipkin UI (http://localhost:9411) to view the incoming trace data.

Step 6: Adjust and Monitor
  • Sampling Configuration: Adjust sampling rates with a probability sampler if required. The sampler must be supplied when the TracerProvider is constructed, so configure it up front:
from opentelemetry.sdk.trace.sampling import TraceIdRatioBased

# The sampler is fixed at provider construction time; this keeps ~10% of traces
trace.set_tracer_provider(
    TracerProvider(resource=resource, sampler=TraceIdRatioBased(0.1))
)
  • Resource Attributes: Enrich traces with additional resource attributes to help with filtering and querying.
  • Context Propagation: Ensure distributed tracing context is propagated accurately across different services. OpenTelemetry's instrumentation packages handle this for HTTP clients, gRPC, etc.; see the sketch after this list.
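
As a concrete example of context propagation, the opentelemetry-instrumentation-requests package (install with pip install opentelemetry-instrumentation-requests; this sketch assumes your services talk over HTTP via the requests library) patches outgoing calls so they carry the W3C traceparent header automatically:

import requests
from opentelemetry.instrumentation.requests import RequestsInstrumentor

# Patch the requests library so every outgoing HTTP call carries the
# active span's context to the downstream service
RequestsInstrumentor().instrument()

with tracer.start_as_current_span("call-downstream"):
    # The traceparent header is injected into this request automatically
    requests.get("http://localhost:8000/health")
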
Conclusion

By following these steps, you should be able to trace your LangChain application with OpenTelemetry and monitor the resulting traces in the backend of your choice. Adjust the configuration to match your environment, and refer to the official OpenTelemetry documentation for advanced features and settings.