Best practice for testing Grafana app plugins without using live production data before publishing?

Hi Grafana community :waving_hand:

I’m building an app plugin (an incident-first reliability / RCA-style workflow) that correlates metrics, logs, traces, and Kubernetes context into a guided investigation view.

The plugin includes:

  • A backend (Go) that aggregates data and builds incident timelines

  • A frontend app UI inside Grafana

  • A TestData-based “chaos / simulation mode” for safe testing (deterministic, synthetic, real-time streams via SSE)

  • Clear labeling of test vs prod mode (no real telemetry is queried in test mode)
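To make the “deterministic, synthetic” part concrete, here’s a minimal sketch of the kind of seeded generator I mean — this is illustrative TypeScript, not my actual plugin code, and all the names are made up:

```typescript
// Sketch: a seeded PRNG (mulberry32) drives incident generation so the same
// seed always produces the same synthetic incident list across test runs.

interface SyntheticIncident {
  id: string;
  startMs: number;
  severity: 'critical' | 'warning';
  service: string;
}

// mulberry32: tiny deterministic PRNG returning floats in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

function generateIncidents(seed: number, count: number, baseMs: number): SyntheticIncident[] {
  const rand = mulberry32(seed);
  const services = ['checkout', 'payments', 'ingest'];
  return Array.from({ length: count }, (_, i) => ({
    id: `sim-${seed}-${i}`,
    startMs: baseMs + Math.floor(rand() * 3_600_000), // within one simulated hour
    severity: rand() < 0.3 ? 'critical' : 'warning',
    service: services[Math.floor(rand() * services.length)],
  }));
}
```

The determinism means e2e snapshots and timeline assertions stay stable between runs, which is why I’d like to lean on this instead of live traffic.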

My current challenge

I’m not able to test against real production telemetry, which means:

  • I rely on Grafana TestData and synthetic events for most testing

  • Some incident lists / lifecycle flows are generated or simulated

  • End-to-end behavior is realistic, but the data source is not live production Prometheus/Loki/Tempo

I’ve followed Grafana plugin docs and TestData guidance, but I want to confirm best practices from experienced plugin authors.

My questions

  1. Is it acceptable to publish a plugin for review when:

    • Testing is done using Grafana TestData / synthetic sources

    • Production datasources are supported but not exercised with real prod traffic yet?

  2. Are reviewers generally okay with:

    • Deterministic synthetic incidents

    • Explicit “test/simulation mode” clearly labeled in the UI and docs?

  3. Is there a recommended staging or non-prod testing pattern for app plugins that require complex observability data, before catalog publication?

I want to make sure I’m aligning with Grafana’s expectations and not over-engineering pre-publish testing.

Any guidance from Grafana team members or plugin authors would be hugely appreciated :folded_hands:
Thanks!



Hi!

Totally understand the challenge. I’m afraid we currently don’t have a single best practice or standardized way to mock datasource or third-party responses for plugins, since datasources vary a lot. In general, solid e2e coverage is what matters, and it’s up to the plugin author to decide where mocks make sense and where live integrations are needed.

Using deterministic synthetic incidents and a clearly labeled simulation/test mode is fine and can be a good way to validate workflows and UI safely.

There may also be cases where Playwright network mocking is useful. The @grafana/plugin-e2e package includes some mocking abstractions too (it’s partly covered here).
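As a rough sketch of what Playwright network mocking can look like here: you build a canned `/api/ds/query` response (the shape below follows Grafana’s data frame JSON, but treat the exact payload as an assumption) and fulfill the route with it in your test. The builder is plain TypeScript; the Playwright wiring is shown in the trailing comment.

```typescript
// Hypothetical builder for a mocked Grafana /api/ds/query response.
// Field names and frame shape are assumptions based on Grafana's
// data frame JSON format, not taken from the plugin under discussion.
function mockDsQueryResponse(): object {
  return {
    results: {
      A: {
        status: 200,
        frames: [
          {
            schema: {
              refId: 'A',
              fields: [
                { name: 'time', type: 'time' },
                { name: 'value', type: 'number' },
              ],
            },
            // Column-oriented values: one array per field.
            data: { values: [[1700000000000, 1700000060000], [0.5, 0.9]] },
          },
        ],
      },
    },
  };
}

// Intended use inside a Playwright test (sketch):
//
//   await page.route('**/api/ds/query*', (route) =>
//     route.fulfill({
//       status: 200,
//       contentType: 'application/json',
//       body: JSON.stringify(mockDsQueryResponse()),
//     })
//   );
```

This keeps e2e runs hermetic for the flows where a live datasource adds nothing but flakiness.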

One important thing for catalog review though: we don’t have a shared testing environment for plugin submissions. The review team still needs a reliable way to test the plugin with actual data to approve it. In the submission form you’re asked to provide setup details for the review environment (how to provision datasources, required config, any credentials needed, sample data, etc), so make that as actionable as possible.
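To make that actionable, one pattern that works well is shipping a datasource provisioning file (plus a docker-compose stack) in the repo so reviewers get working datasources on first start. This is only a sketch — the service names, ports, and file path are placeholders, not requirements:

```yaml
# provisioning/datasources/review.yaml (hypothetical)
# URLs assume docker-compose service names on the default ports.
apiVersion: 1
datasources:
  - name: Prometheus (review)
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
  - name: Loki (review)
    type: loki
    access: proxy
    url: http://loki:3100
  - name: Tempo (review)
    type: tempo
    access: proxy
    url: http://tempo:3200
```

Pair it with a short README section telling reviewers how to start the stack and where sample data comes from.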

Whether a specific test strategy is sufficient for approval is hard to answer without seeing the repo and how the plugin behaves in practice. If you’re reasonably confident in your current setup, submit it and we’ll give concrete feedback during review.

Sorry I can’t give a more definitive answer, but I hope this helps set expectations and gives you a reasonable path forward.

/Erik