OpenTelemetry Instrumentation Plan Builder
Plan OpenTelemetry traces, metrics, logs, and dashboards for a service so teams can debug latency, errors, and dependency failures faster.
Prompt Template
You are a senior observability engineer. Create an OpenTelemetry instrumentation plan for this system. Application/service: [name and purpose] Architecture: [monolith / microservices / serverless / queue workers] Tech stack: [language, framework, database, message queue, cloud provider] Critical user journeys: [checkout, login, report generation, API request, background job] Current pain points: [slow requests, unknown failures, queue delays, third-party API issues] Telemetry backend: [Datadog / Grafana Tempo / Honeycomb / New Relic / self-hosted / unknown] Compliance/security constraints: [PII redaction, sampling rules, retention limits] Operational goals: [reduce MTTR, improve SLOs, diagnose latency, cost control] Provide: 1. Instrumentation goals and success metrics 2. Trace spans to add for each critical journey 3. Metrics to capture with names, labels, and units 4. Log enrichment strategy and correlation IDs 5. Error and latency attributes to standardize 6. Sampling strategy for normal and incident periods 7. Dashboard and alert recommendations 8. Implementation checklist with code-level guidance
Example Output
OpenTelemetry Plan — Checkout API
Goals
- Cut checkout incident diagnosis time from 45 minutes to under 15 minutes
- Identify whether latency comes from payment provider, inventory DB, or queue publishing
Trace Design
| Journey | Span | Attributes |
|---|---|---|
| POST /checkout | checkout.request | user_tier, cart_size, region, request_id |
| Payment call | payment.authorize | provider, status_code, retry_count, timeout_ms |
| Inventory reserve | inventory.reserve | sku_count, db_statement_type, lock_wait_ms |
| Queue publish | order.event.publish | queue_name, message_size_bytes |
Metrics
- checkout_requests_total{status,region}
- checkout_latency_ms histogram{region,user_tier}
- payment_provider_latency_ms histogram{provider}
- order_queue_publish_failures_total{queue_name}
Log Enrichment
Add request_id, trace_id, span_id, customer_tier, and sanitized order_id to every checkout log. Never log card details, email addresses, or full addresses.
Sampling
- 10% baseline head sampling for successful requests
- 100% sampling for errors, requests over 2 seconds, and payment retries
- Temporary 50% sampling during release windows
Dashboard
Panels: p95 checkout latency, error rate by provider, queue publish failures, top slow spans, dependency latency heatmap.
Tips for Best Results
- 💡List your most important user journeys; generic instrumentation plans become noisy fast.
- 💡Ask for attribute names and cardinality warnings so metrics do not explode in cost.
- 💡Include privacy constraints because observability tools can accidentally capture sensitive customer data.
Related Prompts
Code Review Assistant
Get a thorough, senior-level code review with actionable feedback on quality, security, performance, and best practices.
Debugging Detective
Systematically debug errors and unexpected behavior with root cause analysis and fix suggestions.
Code Refactoring Advisor
Transform messy, complex code into clean, maintainable, well-structured code with clear explanations.