how it works

Three steps.
Then we don't bother you.

Setup takes about twenty minutes. After that, the goal is silence — broken only by the occasional "here's what broke and how to fix it" note in your inbox.

01
connect

Point us at the URLs that matter.

Drop your app URL and the URLs of the flows you care about — login, checkout, webhook receivers, scheduled jobs. For Bubble, Webflow, Softr, and similar, we ask for read-only editor access. For custom code we add a single webhook for deploys. No SDK, no code change.

  • Read-only access only — nothing we do can break your live app.
  • Setup wizard for the major no-code platforms; manual config for everything else.
  • Twenty minutes from "I signed up" to "we're watching."
[Fig. 01 · Connection diagram: your app (Bubble · Webflow) and Nightlamp (watch · diagnose · recommend); probes go to your app, diagnoses come back.]
02
we watch

Around the clock. Every flow that matters.

We probe each flow on a schedule — every minute for the critical ones, every five for the rest. We watch for response codes, latency drift, schema changes, and silent failures (the cron job that didn't fire). When something looks wrong, we re-check three times before we declare it a real incident.

  • Confirmed retries from us-east-1, so a transient network blip has to fail repeatedly before it pages.
  • Latency baselines per-flow, per-hour — we know what "normal" looks like at 3 a.m.
  • Webhook contracts pinned to your last-known-good schema.
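The confirmed-retry idea above can be sketched in a few lines. This is a minimal illustration, not Nightlamp's actual prober: the function name `confirm_incident` and the `rechecks` parameter are hypothetical, and a real prober would wait between re-checks rather than run them back to back.

```python
from typing import Callable


def confirm_incident(probe: Callable[[], bool], rechecks: int = 3) -> bool:
    """Return True only if the first check and every re-check fail.

    `probe` returns True when the flow looks healthy. `rechecks` is an
    assumed setting that mirrors "re-check three times" in the text.
    """
    if probe():
        return False  # first check passed, nothing to confirm
    for _ in range(rechecks):
        if probe():
            return False  # a re-check passed: treat it as a transient blip
    return True  # every attempt failed: declare a real incident


# A blip that recovers on the second re-check does not page anyone.
results = iter([False, False, True])
paged = confirm_incident(lambda: next(results))
```

Short-circuiting on the first healthy re-check keeps a recovered flow from paging at all.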
[Figure · Watch pattern over 24h (00:00 to 24:00), with confirmed retry: check, retry 1, retry 2; anomaly at 08:30.]
03
we diagnose it

A real engineer pages you with the fix recipe.

When something breaks, an on-call engineer (a real one, not a tier-one chatbot) takes the page, reproduces the break, and replies with the root cause plus a recommended fix in plain English. You (or your developer, agency, or no-code builder) ship the fix in your platform. We re-probe the flow afterward to confirm it's healthy again.

  • First response within 5 minutes, 24/7 on Priority and White-glove — contractual.
  • Root cause + fix recipe within 60 minutes on Priority — contractual, backed by the auto-credit guarantee.
  • We determine a root cause and recommend a likely fix — no more than that.
  • Every incident gets a written post-mortem the same day.
[Fig. 03 · Diagnosis flow, illustrative example: detected 02:14, diagnosed 02:21, recommended 02:31, verified 02:38; fix recipe @ 02:42.]
the deliverable

This is what we actually send you.

Not a stack trace, not a red dashboard — a written report your whole team can act on. Here's a sample, built on the failure mode we diagnose most often.

Incident report · Checkout → Stripe webhook
Illustrative example — sample report

Checkout webhook failing — Stripe events not reaching Bubble

Flow
Checkout → POST /stripe_checkout_completed
Detected
02:14 EST · confirmed after 3 consecutive probe failures
Status
Fix recommended · 02:31

Root cause, in plain English

Your live Stripe webhook still points at the development URL, the one with /version-test/ in it. Test-mode events reach the development version of your app, so everything looked fine while you were testing. Live events hit a workflow that doesn't exist on the deployed app, so Stripe gets a 502 and queues retries. Nothing in your Bubble editor is broken; the pointer is.

Fix recipe

  1. Open the live webhook endpoint in Stripe

    Stripe Dashboard → Developers → Webhooks. You have one endpoint receiving both test and live events — that's the first thing to untangle.

  2. Replace the URL with your deployed workflow URL

    The live endpoint still points at the development URL — the one containing /version-test/. Drop that segment so live events reach the deployed version of the workflow.

  3. Keep test and live endpoints separate

    One endpoint per mode, each storing its own signing secret. Mixed keys are the next failure waiting behind this one.

  4. Replay the failed events

    In Stripe's delivery log, resend the failed checkout.session.completed events. The four affected checkouts complete without customers having to retry.

After you ship it, we re-probe the flow and confirm it's healthy — you get a one-line "all clear."

What we checked

  • Live endpoint URL vs deployed workflow URL: mismatch (root cause)
  • Event-type subscription (checkout.session.completed): subscribed
  • Endpoint signing secret matches the workflow: match
  • Workflow exposed without user authentication: publicly callable
  • Adjacent flows (login, password reset, cron): healthy
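The first check in the list, and step 2 of the fix recipe, come down to spotting and dropping the /version-test/ path segment. Here is a rough sketch of that comparison. The host `example-app.bubbleapps.io` and the full path are made up for illustration; only the /version-test/ segment and the workflow name come from the report above.

```python
from urllib.parse import urlsplit, urlunsplit

DEV_SEGMENT = "version-test"  # the segment named in the report


def points_at_development(url: str) -> bool:
    """True if the URL path contains the development segment."""
    return DEV_SEGMENT in urlsplit(url).path.split("/")


def live_url(url: str) -> str:
    """Return the same URL with the development segment dropped."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s != DEV_SEGMENT]
    return urlunsplit(parts._replace(path="/".join(segments)))


# Hypothetical endpoint, shaped like the one in the sample report.
endpoint = (
    "https://example-app.bubbleapps.io/version-test"
    "/api/1.1/wf/stripe_checkout_completed"
)
if points_at_development(endpoint):
    print("Live webhook points at development. Suggested URL:")
    print(live_url(endpoint))
```

Matching on a whole path segment, not a substring, avoids false positives on workflow names that happen to contain the same text.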
Start the trial — get reports like this
free 14 days · no card · keep a free monitor when it ends
the engineers on call

Who actually answers the page.

Nightlamp is founder-operated. Incident response is led by the founding engineer, and every responder on the rotation is named on this page — we don't outsource diagnosis to anonymous contractors or a tier-one support farm.

Priority and White-glove pages go to a 24/7 on-call rotation. Baseline incidents are handled during business hours with a written post-mortem on every incident.

On-call engineers work from the read-only access you grant: scoped editor logins, webhook payloads from monitored flows, response codes, and platform logs for those flows. They cannot push code, deploy releases, or touch your customer database.

How we access your app — read-only, audited →
Yoni Ryabinski · Founder · on-call engineer · US-East (ET)

Built Nightlamp and leads the on-call rotation personally. The person who picks up your incident page works directly with the product, the probes, and the post-mortems — not a tier-one queue.

Honest avatars, real names. As the rotation grows, every responder is added here — never stock photos, never invented engineers.

what we monitor

Six things that quietly carry the business.

01 · auth

Login & signup flows

Synthetic users running every minute. We notice when password reset stops sending email before your customer does.

02 · money

Payment webhooks

Stripe, Paddle, Lemon Squeezy. We probe the contract — payload shape, status code, retry behaviour.
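To make "probing the contract" concrete, here is a minimal sketch of a payload-shape check against a pinned schema. It is an assumption about how such a check could look, not the product's implementation: `contract_drift` is a hypothetical helper, and the pinned fields are a simplified subset of a webhook event.

```python
from __future__ import annotations

from typing import Any


def contract_drift(pinned: dict[str, type], payload: dict[str, Any]) -> list[str]:
    """Compare a payload with the last-known-good field names and types."""
    problems = []
    for field, expected in pinned.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected):
            actual = type(payload[field]).__name__
            problems.append(
                f"type changed: {field} is {actual}, expected {expected.__name__}"
            )
    # Fields the pinned schema has never seen are reported too.
    for field in sorted(payload.keys() - pinned.keys()):
        problems.append(f"new field: {field}")
    return problems


# Simplified pinned schema, for illustration only.
PINNED = {"id": str, "type": str, "created": int}
event = {"id": "evt_1", "type": "checkout.session.completed", "created": 1700000000}
print(contract_drift(PINNED, event))
```

An empty list means the payload still matches the pinned shape. Anything else is a candidate for a closer look before it breaks the receiving workflow.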

03 · vendors

Third-party API calls

Klaviyo, Twilio, Postmark, OpenAI. We log when their latency drifts and when their schema changes break yours.
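Latency drift is easiest to picture as a comparison against a baseline kept per flow and per hour of day. The sketch below is one simple way to do that, under stated assumptions: the class name, the 2x-median threshold, and the five-sample minimum are all illustrative choices, not documented Nightlamp settings.

```python
from collections import defaultdict
from statistics import median


class LatencyBaseline:
    """Keeps latency samples keyed by (flow, hour of day)."""

    def __init__(self, factor: float = 2.0, min_samples: int = 5) -> None:
        self.factor = factor            # assumed drift threshold
        self.min_samples = min_samples  # assumed warm-up size
        self.samples = defaultdict(list)

    def record(self, flow: str, hour: int, latency_ms: float) -> None:
        self.samples[(flow, hour)].append(latency_ms)

    def is_drift(self, flow: str, hour: int, latency_ms: float) -> bool:
        history = self.samples.get((flow, hour), [])
        if len(history) < self.min_samples:
            return False  # not enough data to know what "normal" is yet
        return latency_ms > self.factor * median(history)


baseline = LatencyBaseline()
for ms in [100, 110, 90, 105, 95]:
    baseline.record("checkout", hour=3, latency_ms=ms)

# 250 ms at 3 a.m. is well above twice the 3 a.m. median of 100 ms.
print(baseline.is_drift("checkout", hour=3, latency_ms=250))
```

Keying on the hour means a slow nightly batch window does not make daytime traffic look healthy by comparison, and the reverse.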

04 · cron

Scheduled jobs

Daily digests, weekly billing runs, hourly sync jobs. We track expected cadence and notice silence.
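"Noticing silence" can be sketched as a check on how long ago a job last reported in, compared with its expected cadence. The function name and the five-minute grace period are assumptions for illustration.

```python
from datetime import datetime, timedelta


def is_silent(
    last_seen: datetime,
    expected_every: timedelta,
    now: datetime,
    grace: timedelta = timedelta(minutes=5),  # assumed tolerance for late runs
) -> bool:
    """True if the job has been quiet longer than its cadence plus grace."""
    return now - last_seen > expected_every + grace


# A daily digest that last ran 25 hours ago has gone quiet.
now = datetime(2024, 1, 2, 9, 0)
print(is_silent(datetime(2024, 1, 1, 8, 0), timedelta(days=1), now))
```

The grace period keeps a job that runs a few minutes late from being reported as missing.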

05 · reach

Email delivery

Magic-link, OTP, signup confirmation, and round-trip messages. AgentDraft lets us verify that the email actually reached an inbox.

06 · paths

Custom user journeys

Describe the path that matters in plain English. We turn it into a probe and watch it like the rest.

debugging right now?
Use the Bubble + Stripe live-mode webhook checklist
what "diagnosed" actually means

The response commitments, plan by plan.

One canonical table backs every response-time claim on this site — the same numbers you'll find on pricing.

Plan | First response | Written diagnosis | Coverage | Commitment type
Watchtower | No human-response SLA, alerting only | Best-effort plain-English summary attached to alerts | Automated alerts in near real-time when a check fails | descriptive
Baseline | First response within 1 business day | Written post-mortem on every incident | Business hours, weekdays | descriptive
Priority | First response within 5 minutes, 24/7 | Root cause + fix recipe within 60 minutes (typically ~30 minutes; typical, not a commitment) | 24/7 on-call | contractual
White-glove | First response within 5 minutes, 24/7 | Same-day deep root-cause investigation · dedicated engineer | 24/7 on-call · dedicated engineer | contractual
Studio | Priority SLAs across all client apps | Priority SLAs across all client apps | 24/7 on-call | contractual

Contractual numbers are backed by the SLA auto-credit. Miss a contractual SLA and we credit the full month — automatically on your next invoice. No tickets, no negotiation. Numbers labeled typical are observed medians, not commitments. Descriptive rows carry no minute-level promise.

common questions

How Nightlamp works, in five questions.

How long does setup take?

About twenty minutes. Drop your app URL, point us at the flows that matter (login, checkout, webhooks, cron), and grant read-only access. No SDK and no code change required for the no-code platforms we support.

What happens when a flow breaks?

A real on-call engineer takes the page, reproduces the failure, and replies with a plain-English root cause plus a recommended fix. Priority and White-glove carry a contractual first response within 5 minutes, 24/7; Priority also carries a contractual root cause + fix recipe within 60 minutes — see the table above.

Does Nightlamp deploy the fix?

No. We diagnose and recommend; you (or your developer or no-code builder) ship the change in your platform. After deploy, we re-probe the flow and confirm it's healthy.

What if I'm not technical?

The setup wizard speaks plain English: paste your URL, click the platforms you use, describe the flows that matter. The dashboard is a list of "things working" and "things we're handling."

How is this different from Pingdom or UptimeRobot?

URL pings tell you a page is up. We probe whether the actual flow works (login, checkout, webhook contract, scheduled job) and send a diagnosed root cause plus a recommended fix when it doesn't.

always-on flow monitoring

See it on your own app.
Twenty-minute setup.

no card required · cancel from settings · same-day setup