When place matters: living with cutaneous lupus

I haven’t written much about health before, but over the last few months it’s become part of the backdrop of my decision-making — so this feels worth sharing.

A while ago, after a persistent facial rash that worsened with sun exposure, I went through a biopsy. At the time, skin cancer was a genuine concern.

A really good friend of mine from my university days passed away last year from skin cancer, so I was acutely aware of the danger.

So when the results came back not cancer, there was real relief — the kind you feel immediately and deeply.

The doctor came to see me as I was getting the stitch taken out and handed me the report – her first words were reassuring, and important to hear.

“Read this, then come through and see me. Don’t panic – it’s totally manageable, but you’ll need a lot of sunscreen…”

That relief was then tempered by a different diagnosis: cutaneous lupus erythematosus (CLE).

CLE is a skin-limited autoimmune condition — not systemic lupus — and in day-to-day terms it’s very manageable. There’s a clear plan, specialist care, and no immediate impact on my ability to work or live fully. But it does come with one very clear and non-negotiable trigger: UV exposure.

That combination of relief and recalibration became a quiet inflection point — the moment we started thinking more deliberately about environment, sustainability, and where we wanted to be long-term.

Living in Australia, that question takes on a particular weight.

Australia’s sunlight is extraordinary — and unforgiving. Even with good sun habits, high-SPF protection, and sensible precautions, the baseline UV exposure here is simply higher than in most parts of the world. For most people, that’s a lifestyle footnote. For someone with a photosensitive autoimmune condition, it becomes a constant background constraint.

None of this is dramatic or debilitating — but it is cumulative. Managing CLE well is about reducing repeated immune activation over time, not “pushing through” flare after flare. Environment plays a role in that, whether we like it or not.

At the same time, this diagnosis prompted some honest reflection. I have a family history of lupus, and while my own condition is different and far milder, it does sharpen your sense of perspective. You start asking quieter questions about sustainability, stress, proximity to support, and what you want the next phase of life to look like.

That combination — health management, environment, and family — ultimately fed into a broader decision my wife and I were already circling: to return to the UK.

This isn’t a reaction, and it isn’t fear-driven. It’s a deliberate choice to put ourselves in an environment that makes long-term health management simpler, not harder — lower ambient UV, easier moderation, and closer proximity to family.

Australia has been an incredible chapter. We’ve built memories here that will always matter. But sometimes the most adult decision is recognising when a place that’s wonderful isn’t the right place anymore.

I’m sharing this not for sympathy, but for completeness. Health doesn’t always arrive as a crisis — sometimes it arrives as information, and what matters is what you do with it.

For me, it’s meant choosing an environment that works with me, not against me.

Bet Placement: Why It Fails (and What the Architecture Is Usually Trying to Tell You)

Bet placement is the most latency-sensitive, revenue-critical, and failure-prone path in any iGaming platform. It sits at the intersection of customer experience, trading risk, regulatory control, and real-time data churn. When it fails, it fails loudly – often under peak load, with money on the line, and very little tolerance for excuses.

What’s interesting is that bet placement failures are rarely caused by a single “bug”. They are almost always the result of architectural tension: too many responsibilities, unclear boundaries, or optimistic assumptions about dependency behaviour.

Below are the most common technical reasons bet placement fails, drawn from real-world operation of high-volume wagering platforms.


1. Too many synchronous dependencies

The fastest way to break bet placement is to make it depend synchronously on everything.

Common offenders include:

  • Identity and session validation
  • KYC / jurisdiction checks
  • Wallet balance and limits
  • Market state
  • Pricing confirmation
  • Trading approval
  • Promotions / bonuses
  • Payments (yes, people still do this)

Every synchronous hop adds latency and multiplies failure probability.

Under peak load, even a small slowdown in one dependency can push the entire request over its latency budget.

What the system is telling you:

The bet placement path should be short, deterministic, and aggressively bounded.

Anything that doesn’t need to be synchronous shouldn’t be.
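One way to keep the path short and bounded is to run only the must-have checks inline, under an explicit latency budget, and defer everything else. A minimal Python sketch of that shape – all names and checks here are illustrative, not taken from any real platform:

```python
import time
from dataclasses import dataclass

@dataclass
class PlacementResult:
    accepted: bool
    reason: str

class BetPlacement:
    """Sketch: only essential checks run synchronously, under an explicit
    latency budget; non-essential work is deferred to an async channel."""

    def __init__(self, market_check, balance_check, budget_ms=50):
        self.market_check = market_check      # must-have sync dependency
        self.balance_check = balance_check    # must-have sync dependency
        self.budget_ms = budget_ms
        self.deferred = []  # stands in for an async bus (promos, analytics, ...)

    def place(self, bet):
        deadline = time.monotonic() + self.budget_ms / 1000.0
        for name, check in (("market", self.market_check),
                            ("balance", self.balance_check)):
            if time.monotonic() > deadline:
                return PlacementResult(False, "TIMEOUT")  # fail fast, in budget
            if not check(bet):
                return PlacementResult(False, f"REJECT_{name.upper()}")
        # Everything non-essential happens after acceptance, asynchronously.
        self.deferred.append(("bet_placed", bet))
        return PlacementResult(True, "OK")
```

The point isn’t the specific checks – it’s that the synchronous list is closed, ordered, and budgeted, and that adding a dependency to it is a deliberate decision rather than a default.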


2. Market state churn and race conditions

Markets move. Prices change. Selections suspend and re-open.

Feed updates arrive in bursts.

If bet placement:

  • Reads market state from multiple sources
  • Relies on stale caches without invalidation
  • Doesn’t enforce price staleness windows

…you get classic race conditions, namely:

  • Bets accepted on suspended markets
  • Bets rejected even though the UI showed availability
  • Duplicate retries hitting different market states

Failure mode: Customers see “technical error” or inconsistent rejections during peak trading.

What the system is telling you:

Market state must be versioned, cached close to placement, and validated with explicit tolerances (“price valid for X ms”).
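That validation can be tiny. A hedged sketch of a versioned snapshot with an explicit staleness window – field names and reason codes are made up for illustration:

```python
from dataclasses import dataclass

@dataclass
class MarketSnapshot:
    market_id: str
    version: int      # bumped on every feed update
    price: float
    suspended: bool
    ts: float         # monotonic timestamp of the last feed update

def validate_quote(snap, quoted_version, now, max_age_ms=500):
    """Return (ok, reason) for a bet quoted against a given market version."""
    if snap.suspended:
        return (False, "SUSPENDED")
    if snap.version != quoted_version:
        return (False, "PRICE_CHANGED")   # price moved since the customer saw it
    if (now - snap.ts) * 1000 > max_age_ms:
        return (False, "PRICE_STALE")     # feed may be lagging; don't trust it
    return (True, "OK")
```

Because the version is checked against what the customer was actually quoted, the classic race – accept on a suspended market, or on a price that no longer exists – becomes an explicit, explainable rejection instead of a silent inconsistency.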


3. Wallet design flaws (the silent killer)

Wallet issues are typically responsible for a disproportionate number of bet placement failures.

Typical problems that I’ve seen include:

  • No true reservation/hold model
  • Weak or missing idempotency
  • Balance checks separated from debits
  • Ledger writes mixed with business logic (a classic – you’d be surprised how many times it happens!)

Under concurrency, this leads to:

  • Double spends
  • Phantom insufficient-funds errors
  • Reconciliation nightmares after recovery

What the system is telling you:

Wallets must be boring, deterministic, and mathematically correct.

If your wallet logic is clever, it’s probably broken under high traffic – and often even without it.
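“Boring and deterministic” in practice means a reservation/hold model with idempotent operations. A minimal sketch under those assumptions (a real wallet would persist all of this transactionally; names are illustrative):

```python
class Wallet:
    """Sketch of a hold-based wallet: funds are reserved, then settled,
    and every reservation is idempotent on its key."""

    def __init__(self, balance):
        self.balance = balance
        self.holds = {}      # hold_id -> reserved amount
        self.processed = {}  # idempotency_key -> stored result

    def reserve(self, key, hold_id, amount):
        if key in self.processed:
            return self.processed[key]  # retry replays the original outcome
        available = self.balance - sum(self.holds.values())
        if amount > available:
            result = (False, "INSUFFICIENT_FUNDS")
        else:
            self.holds[hold_id] = amount
            result = (True, hold_id)
        self.processed[key] = result
        return result

    def settle(self, hold_id):
        # Debit happens exactly once, from a hold that already exists.
        amount = self.holds.pop(hold_id)
        self.balance -= amount
```

Note what this buys you: the balance check and the debit are never separated, retries can’t double-spend, and recovery is a matter of replaying keys – no reconciliation archaeology.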


4. Trading decisions that don’t degrade gracefully

Trading systems often assume they’ll always respond quickly.

Newsflash: they won’t.

When trading:

  • Times out
  • Is under heavy load
  • Is partially unavailable

…bet placement frequently has no clear fallback. The result can be long timeouts that cascade back to the edge, rather than fast, explainable rejections.

Better behaviour patterns include:

  • Explicit time budgets for trading decisions
  • Default reject on timeout with a clear reason code
  • Rapid market suspension when instability is detected

What the system is telling you:

A fast reject is better than a slow maybe.
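The mechanics of “fast reject” are simple to sketch: give the trading call an explicit time budget and return a reasoned rejection when it expires. An illustrative Python version (in production you’d also want cancellation and circuit breaking, which this deliberately omits):

```python
import concurrent.futures
import time

# Shared executor standing in for an async client to the trading system.
_executor = concurrent.futures.ThreadPoolExecutor(max_workers=4)

def trading_decision(approve_fn, bet, budget_s=0.05):
    """Ask trading for approval, but never wait longer than the budget.
    On timeout, default to a reject with an explicit reason code."""
    future = _executor.submit(approve_fn, bet)
    try:
        return future.result(timeout=budget_s)
    except concurrent.futures.TimeoutError:
        # The slow call may still be running; the customer gets an
        # immediate, explainable answer regardless.
        return (False, "TRADING_TIMEOUT")
```

A fast `(False, "TRADING_TIMEOUT")` is something the UI, the trading desk, and the incident channel can all reason about; a 30-second hang is not.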


5. Retry storms and idempotency gaps

During peak events, clients retry.

Load balancers retry.

Upstream services retry.

If bet placement:

  • Doesn’t enforce idempotency keys
  • Treats retries as new requests
  • Emits side effects before commit

…you get duplicate bets, duplicate wallet postings, or corrupted state.

What the system is telling you:

Idempotency is not an optimisation.

It’s a core correctness requirement.
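At the edge, the requirement is small: dedupe on the idempotency key, commit first, and only then emit side effects. A hedged sketch (an in-memory stand-in for what would be a durable store):

```python
class IdempotentPlacement:
    """Sketch: retries with the same key replay the stored response,
    and side effects are emitted only after the result is committed."""

    def __init__(self, handler):
        self.handler = handler
        self.results = {}  # idempotency_key -> committed response
        self.events = []   # side effects (settlement, notifications, ...)

    def place(self, key, bet):
        if key in self.results:
            return self.results[key]  # retry: same answer, no new effects
        result = self.handler(bet)
        self.results[key] = result                 # commit first...
        self.events.append(("bet_accepted", key))  # ...then emit side effects
        return result
```

With this shape, a client retry, a load-balancer retry, and an upstream retry all collapse into one bet and one wallet posting – the retry storm becomes harmless noise.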


6. Overloaded databases and hidden coupling

Bet placement often looks stateless at the API level, but is tightly coupled to:

  • Shared databases
  • Hot tables (balances, open bets)
  • Lock-heavy schemas

Under load, lock contention silently destroys throughput, leading to sudden, nonlinear failure.

What the system is telling you:

If throughput collapses before CPU does, you have a data contention problem, not a scaling problem.
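A common contention trap on hot balance rows is the read-then-write pattern, which holds locks across two round-trips. Collapsing the check and the debit into one conditional statement shortens the lock window and makes the check atomic. A small SQLite illustration of the idea (table and column names are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE balances (account_id TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO balances VALUES ('a1', 100)")

def debit(conn, account_id, amount):
    # One conditional UPDATE: the balance check and the debit happen
    # atomically, in a single statement, instead of SELECT-then-UPDATE.
    cur = conn.execute(
        "UPDATE balances SET balance = balance - ? "
        "WHERE account_id = ? AND balance >= ?",
        (amount, account_id, amount))
    return cur.rowcount == 1  # 0 rows touched means insufficient funds
```

The same shape works on any relational store; the broader point is that when throughput collapses under load, the fix is usually shrinking the locked critical section, not adding nodes.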


7. Poor observability in the critical path

When bet placement fails and you can’t answer:

  • Where did the time go?
  • Which dependency failed?
  • Was this a reject, a timeout, or a partial commit?

…you lose the ability to respond confidently during incidents.

This leads to:

  • Over-suspension (“turn everything off”)
  • Over-engineering after the fact
  • Loss of trust from trading and operations

What the system is telling you:

If you can’t see it under pressure, you can’t control it.
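Answering “where did the time go?” starts with per-stage timing in the placement path itself. A minimal, illustrative trace helper (a real system would feed this into distributed tracing rather than a dict):

```python
import time

class PlacementTrace:
    """Sketch: record how long each stage of placement took, in ms,
    plus a single explicit outcome for the request."""

    def __init__(self):
        self.stages = {}   # stage name -> duration in ms
        self.outcome = None  # e.g. "ACCEPTED", "REJECTED", "TIMEOUT"

    def stage(self, name):
        return _Stage(self, name)

class _Stage:
    def __init__(self, trace, name):
        self.trace, self.name = trace, name

    def __enter__(self):
        self.start = time.monotonic()

    def __exit__(self, exc_type, exc, tb):
        self.trace.stages[self.name] = (time.monotonic() - self.start) * 1000
        return False  # never swallow exceptions in the critical path
```

With stage timings and a single explicit outcome per request, the incident question changes from “is it us?” to “the wallet stage went from 4 ms to 400 ms at 19:02” – which is the difference between targeted suspension and turning everything off.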


The pattern behind all failures

Nearly all bet placement failures share one root cause:

The system is trying to do too much, too synchronously, with unclear ownership of outcomes.

Healthy bet placement services are:

  • Thin at the edge, thick in the domain
  • Ruthless about timeouts and failure modes
  • Explicit about what they will and will not guarantee

A better mental model

Think of bet placement as:

  • A transaction coordinator, not a workflow engine
  • A risk gate, not a business-logic dumping ground
  • A trust boundary, not an integration hub

If something feels awkward to implement in bet placement, that’s usually your architecture asking for a boundary to be moved.


Final thought

When bet placement fails, teams often reach for more caching, more hardware, or more retries.

Rarely do those fixes address the underlying problem.

The real work is harder: simplifying the synchronous path, tightening ownership, and designing for rejection as a first-class outcome.

Platforms that get this right don’t just place bets faster—they fail more gracefully, recover more predictably, and earn trust when it matters most.

Why L1 / L2 / L3 Support Models Fail Without Ownership

The L1 / L2 / L3 support model is one of the most widely adopted – and most poorly understood – operating patterns in modern technology organisations.

On paper, it looks really clean and rational: first-line support handles intake, second-line investigates, third-line engineers fix root causes.

Escalation is orderly. Responsibilities are clear. Everyone knows their lane.

In practice, many organisations discover the uncomfortable truth: without clear ownership, L1/L2/L3 doesn’t reduce incidents – it institutionalises confusion.

After years of operating platforms in regulated, high-availability environments, I’ve seen the same failure modes repeat with remarkable consistency. The issue is rarely the model itself. It’s the absence of real accountability at the seams.

The illusion of escalation

The biggest misconception is that escalation equals ownership.

In weak implementations, an incident “moves up the stack” without ever truly belonging to anyone. L1 logs the ticket and hands it off. L2 adds commentary and escalates. L3 investigates when time permits.

Meanwhile, the system remains degraded, customers are impacted, and no single individual feels responsible for resolution.

Escalation becomes a mechanism for risk transfer, not problem solving.

When nobody owns the outcome end-to-end – and I mean technical fix, communication, and crucially learning – the model devolves into a queueing system that optimises for local convenience rather than global reliability.

L1 without ownership becomes a call centre

L1 is often positioned as “just intake”: logging tickets, resetting passwords, acknowledging alerts. But when L1 lacks clear ownership boundaries, it becomes little more than a message relay.

Effective L1 teams do more than triage.

They:

  • Own initial diagnosis, not just categorisation
  • Apply runbooks with authority, not fear of escalation
  • Decide whether an issue is noise, delay, or degradation

Without ownership, L1 staff are incentivised to escalate early and often—because escalation feels safe. The result is alert fatigue upstream and a complete lack of signal discipline.

L2 becomes a dumping ground

L2 is where many models quietly collapse.

You’ve seen it. I’ve seen it. We’ve all rolled our eyes, collectively and individually, and groaned.

In theory, L2 provides deeper technical investigation and remediation within defined limits. In reality, L2 often inherits ambiguity: unclear service boundaries, incomplete documentation, and no authority to make changes.

When L2 doesn’t own specific systems or outcomes, it becomes a holding pen for unresolved problems. Tickets stall. Context is lost.

Engineers re-diagnose the same issue repeatedly because nobody is accountable for closing the loop.

This is how mean time to resolution quietly stretches from minutes to hours – without anyone feeling explicitly at fault.

L3 without ownership breeds resentment

L3 teams (usually product or platform engineers) are where the real fixes happen.

But when ownership isn’t explicit, L3 becomes reactive and defensive.

Common symptoms that I’ve seen usually include:

  • Engineers pulled into incidents with no context or priority clarity
  • Fixes made under pressure without time for proper remediation
  • Repeated incidents caused by known issues that never get scheduled work

From the engineer’s perspective, L3 becomes an interruption tax.

From the business’s perspective, it’s a black box.

Neither side is well served.

Everyone loses!

The real failure: nobody owns the service

The core problem isn’t the number of layers – it’s the absence of service ownership.

In healthy organisations:

  • Every system has a clearly identified owner (individual or team)
  • That owner is accountable for availability, performance, and support outcomes
  • L1/L2/L3 act as capability layers, not responsibility boundaries

To be clear – and many people make this mistake – ownership does not mean “doing everything yourself”!

It means being accountable for:

  • Decision-making during incidents
  • Trade-offs between speed, risk, and correctness
  • Ensuring learning happens after recovery

Without this, post-incident reviews become blame-avoidance exercises rather than improvement mechanisms.

What works instead

Successful support models invert the usual thinking:

  1. Service ownership first – Define who owns each system. Make that ownership visible and unambiguous.
  2. L1 and L2 operate under delegated authority – Runbooks, thresholds, and decision rights matter more than escalation paths.
  3. L3 owns root cause, not just fixes – If an issue repeats, it’s an ownership failure—not a support failure.
  4. Incidents have a named incident owner – One person is accountable for coordination, communication, and closure, regardless of where the fix lands.
  5. Support is a feedback loop, not a firewall – Good support improves the system. Bad support merely absorbs pain.

The uncomfortable truth

L1/L2/L3 models don’t fail because they’re outdated. They fail because they’re often implemented as organisational insulation, designed to protect teams from responsibility rather than enabling reliable delivery and clear learnings.

Here’s the truth – true ownership is uncomfortable.

It forces clarity.

It exposes weak interfaces, poor documentation, and brittle systems.

But it’s also the only thing that turns support from a cost centre into a reliability engine.

If your support model feels busy but ineffective, the question isn’t whether you need another layer – it’s whether anyone truly owns the outcome.