• It’s Not Just What Model — The Where Matters Too

    Why the infrastructure running a large language model can change its behavior

    An analysis of inference-time variability in modern LLMs

    The hidden variable in every LLM conversation

    For most users, large language models are treated as singular entities. People talk about them the way they would talk about a single piece of software:

    • “Use GPT-4.”
    • “Claude Opus is better.”
    • “This model got worse.”
    • “That prompt works better on Model X.”

    But there is an under-discussed reality in modern LLM systems: the hardware and inference environment running the model can materially affect the output. Not just latency. Not just throughput. The actual behavior of the model.

    A frontier model running on a rack of NVIDIA Blackwell GPUs may not behave identically to the same model deployed on older Hopper or Ampere hardware, even when the weights themselves are unchanged. This matters because users increasingly try to correlate prompt quality, model capability, token consumption, reasoning consistency, and output reliability. If the inference environment itself introduces variability, those correlations become statistically noisy.

    The myth of perfect determinism

    Most people intuitively assume LLMs behave like traditional software: same input, same output. But modern transformer inference systems are probabilistic numerical systems operating at massive scale. At their core, LLMs repeatedly compute a probability distribution over possible next tokens:

    P( tₙ₊₁ | t₁, t₂, …, tₙ )

    Tiny numerical differences during inference can alter which token is selected. Once generation diverges by even one token, the entire downstream reasoning path may change. This is not hypothetical. The official PyTorch reproducibility documentation states it plainly:

    “Completely reproducible results are not guaranteed across PyTorch releases, individual commits, or different platforms. … Results may not be reproducible between CPU and GPU executions, even when using identical seeds.”

    Source: PyTorch documentation — Reproducibility notes

    Floating-point arithmetic is not perfectly stable

    Modern LLM inference relies on massive chains of floating-point operations. Floating-point arithmetic is not associative:

    (a + b) + c  ≠  a + (b + c)

    Execution order matters. Parallel GPU systems reorder operations constantly for performance, and different GPU architectures, kernels, or scheduling paths may accumulate values differently, producing slightly different numerical results.
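    The non-associativity above is easy to reproduce in ordinary double precision. The snippet below is a minimal sketch in plain Python, not GPU code; the helper name sum_in_order is made up for illustration.

```python
# Floating-point addition is not associative: grouping changes the rounding.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)  # False
print(left, right)    # 0.6000000000000001 0.6


def sum_in_order(values):
    """Accumulate left to right, the way one sequential loop would."""
    total = 0.0
    for v in values:
        total += v
    return total


# Same three numbers, two accumulation orders.
forward = sum_in_order([1e16, 1.0, -1e16])    # the 1.0 is absorbed by 1e16
reordered = sum_in_order([1e16, -1e16, 1.0])  # the 1.0 survives
print(forward, reordered)  # 0.0 1.0
```

    A parallel reduction on a GPU is free to pick either order, or a tree of partial sums, which is where the run-to-run differences come from.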

    Normally these differences are microscopic. But LLMs are highly sensitive dynamical systems — a tiny change early in generation can alter later token probabilities enough to flip a greedy-decoded argmax and meaningfully change the output.
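    To make the "one flipped argmax" point concrete, here is a toy greedy decoder. Everything in it is an assumption made for illustration: the five-word vocabulary, the hand-written logit table, and the drift parameter that stands in for a tiny numerical difference at the first step. It is not how a real model computes logits.

```python
VOCAB = ["cat", "dog", "sat", "ran", "."]

# Hypothetical next-token logits, keyed by the previous token.
# After "the", the candidates "cat" and "dog" are almost tied.
LOGITS = {
    "the": [2.0, 1.9999999, 0.0, 0.0, 0.0],
    "cat": [0.0, 0.0, 3.0, 1.0, 0.5],
    "dog": [0.0, 0.0, 1.0, 3.0, 0.5],
    "sat": [0.0, 0.0, 0.0, 0.0, 4.0],
    "ran": [0.0, 0.0, 0.0, 0.0, 4.0],
}


def greedy_decode(drift, max_steps=5):
    """Greedy decoding; `drift` perturbs the "dog" logit at step 0 only."""
    tokens = ["the"]
    for step in range(max_steps):
        logits = list(LOGITS[tokens[-1]])
        if step == 0:
            logits[1] += drift  # stand-in for a tiny numerical difference
        best = max(range(len(VOCAB)), key=lambda i: logits[i])
        tokens.append(VOCAB[best])
        if tokens[-1] == ".":
            break
    return tokens


baseline = greedy_decode(drift=0.0)
drifted = greedy_decode(drift=2e-7)
print(baseline)  # ['the', 'cat', 'sat', '.']
print(drifted)   # ['the', 'dog', 'ran', '.']
```

    The perturbation is applied once, yet every later token differs, because each step conditions on the tokens already chosen.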

    A refinement: it’s not only the floating point

    Recent research from Thinking Machines Lab (“Defeating Nondeterminism in LLM Inference,” Sept 2025) argues that the conventional “concurrency plus floating point” story is incomplete. They show that the dominant driver of nondeterminism in production LLM endpoints is batch invariance failure: server-side dynamic batching means a single request is co-batched with whichever other requests happen to be active at that moment, and standard kernels for normalization, matmul, and attention are subtly batch-sensitive. The fix exists — batch-invariant kernels — but it carries a meaningful throughput cost, which is why most production systems do not use it.

    This refinement doesn’t weaken the broader point of this article — it strengthens it. The thing producing the output is not just the model; it is the model plus the kernels, plus the batching policy, plus the hardware, plus everything else in the serving stack.
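    The batch-sensitivity idea can be sketched without a GPU. In the toy below, chunk_size is a hypothetical stand-in for the split strategy a kernel might choose depending on how many requests share a batch; the function is an illustration of the effect, not a reproduction of any real kernel or of the Thinking Machines implementation.

```python
def chunked_sum(values, chunk_size):
    """Sum `values` by first reducing fixed-size chunks, then the partials."""
    partials = []
    for start in range(0, len(values), chunk_size):
        partial = 0.0
        for v in values[start:start + chunk_size]:
            partial += v
        partials.append(partial)
    total = 0.0
    for p in partials:
        total += p
    return total


activations = [1e16, 1.0, -1e16, 1.0]

# Same data, two reduction strategies.
alone = chunked_sum(activations, chunk_size=4)       # one pass over everything
co_batched = chunked_sum(activations, chunk_size=2)  # two partial sums
print(alone, co_batched)  # 1.0 0.0
```

    A batch-invariant kernel, in this picture, is one that commits to a single reduction strategy regardless of load, which is also why it gives up some throughput.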

    GPU determinism has a cost

    NVIDIA has published technical material discussing explicit determinism controls in CUDA reduction operations. Strict GPU-to-GPU determinism requires specialized accumulation strategies that are slower than the standard high-performance execution paths. The tradeoff is straightforward:

    Optimization Goal | Resulting Behavior
    Maximum throughput | Higher numerical variability
    Strict reproducibility | Lower performance and higher cost

    Modern inference providers generally optimize for latency, utilization, throughput, and serving cost — not perfect determinism. As a result, public LLM deployments routinely operate in environments where small numerical divergences are expected.

    Hardware changes can change model behavior

    Different GPU generations support different tensor precisions, fused kernels, memory hierarchies, compiler optimizations, and scheduling behaviors. The precision formats themselves illustrate this:

    • FP16 — 16-bit, classic mixed-precision training and inference
    • BF16 — 16-bit with FP32-range exponent, more stable in training
    • TF32 — NVIDIA-specific 19-bit format for Ampere tensor cores
    • FP8 — 8-bit, introduced on Hopper, widely used on Blackwell
    • FP4 / NVFP4 — 4-bit, native on Blackwell tensor cores

    Each introduces different precision and accumulation characteristics. Newer architectures such as Blackwell are designed to aggressively accelerate low-precision workloads — NVFP4 delivers a ~1.8x memory reduction versus FP8 and Blackwell’s fifth-generation tensor cores natively run FP4, FP6, and FP8 paths. That improves scale economics dramatically, but it can also increase approximation effects relative to older hardware.
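    The precision gap between formats can be seen by rounding one value through each of them. This is a rough sketch using only the Python standard library: FP16 and FP32 use the struct module's native formats, while the BF16 helper is a hand-rolled emulation (round to nearest, ties to even, finite inputs only) and is an assumption of this example, not a library API. FP8 and FP4 are left out because they also depend on scaling schemes not covered here.

```python
import struct


def to_fp32(x):
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_fp16(x):
    """Round a Python float to the nearest IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_bf16(x):
    """Emulate BF16 by keeping the top 16 bits of the FP32 encoding.

    Rounds to nearest, ties to even. Finite inputs only (no NaN handling).
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)
    bits &= 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


value = 0.1
for name, convert in [("FP32", to_fp32), ("FP16", to_fp16), ("BF16", to_bf16)]:
    rounded = convert(value)
    print(f"{name}: {rounded!r}  abs error = {abs(rounded - value):.3e}")
```

    BF16 keeps FP32's exponent range but fewer mantissa bits than FP16, so for a value like 0.1 its rounding error is the largest of the three.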

    This does not mean the model becomes a different intelligence. It can mean different verbosity, altered reasoning depth, different failure modes, inconsistent chain-of-thought behavior, or changing token consumption.

    Mixture-of-Experts models amplify the problem

    Many frontier models now use Mixture-of-Experts (MoE) architectures, in which tokens are dynamically routed between specialized subnetworks called “experts.” Routing typically picks the top-scoring experts from a softmax over the gating logits, so it is sensitive to tiny logit differences: when two experts score almost identically, a change in the last representable bit can be enough to swap which one is selected.

    Small numerical drift can therefore:

    • activate different experts on the same token,
    • alter the reasoning trajectory downstream,
    • or produce stylistic divergence between otherwise-identical runs.

    As models become more sparse, distributed, and dynamic, inference reproducibility becomes harder, not easier.
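    A toy top-k router shows how little drift it takes to change expert selection. The gate logits and the size of the drift below are made-up numbers, and top_k_experts is a hypothetical helper. Real routers differ in detail, but because softmax preserves ordering, ranking the raw logits gives the same top-k set.

```python
def top_k_experts(gate_logits, k=2):
    """Return the indices of the k highest-scoring experts, sorted."""
    ranked = sorted(range(len(gate_logits)),
                    key=lambda i: gate_logits[i],
                    reverse=True)
    return sorted(ranked[:k])


# Experts 1 and 2 are nearly tied for the second slot.
gate_logits = [3.0, 1.0000001, 1.0, -2.0]

# Simulate a tiny numerical difference on expert 2's logit.
drifted_logits = list(gate_logits)
drifted_logits[2] += 2e-7

print(top_k_experts(gate_logits))     # [0, 1]
print(top_k_experts(drifted_logits))  # [0, 2]
```

    From that point on, the token is processed by a different subnetwork, so the difference is no longer microscopic.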

    Prompt engineering becomes statistical, not deterministic

    This has major implications for how people think about prompts. The intuitive assumption is: “a better prompt produces a better answer.” But if inference itself is probabilistic and infrastructure-sensitive, prompt optimization is a statistical exercise, not a deterministic one. The observed output is closer to a function of many variables:

    Output = f(Prompt, Model, Sampling, Hardware, Runtime, Quantization, Routing, Batch)

    The user controls one variable directly: the prompt. Everything else may shift underneath the surface. This explains a long list of common user complaints:

    • “The same prompt worked yesterday.”
    • “The API behaves differently than the web UI.”
    • “The model seems dumber now.”
    • “Token usage suddenly increased.”

    Sometimes the weights changed. Sometimes the infrastructure did. Often, the user cannot tell which.

    Variable | Controlled By
    Prompt | The user (the only directly controlled input)
    Model weights | The provider; may change silently with updates
    Sampling parameters | User or provider defaults (temperature, top-p, seed)
    Hardware | Provider: GPU generation, interconnects, memory
    Runtime / kernels | Provider: CUDA version, kernel selection, scheduling
    Quantization & precision | Provider: FP8, BF16, FP16, TF32, FP4
    Batch composition | Provider: varies with concurrent traffic
    Expert routing (MoE) | Provider: sensitive to logit drift
    What actually determines the output you see
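    If prompt quality is a statistical property, it has to be measured like one: run the prompt several times and report a pass rate with an uncertainty band. The sketch below uses the Wilson score interval. The run_once callable is a placeholder for "send the prompt and check the answer", and the simulated 70% success rate is invented for the demo, not a measurement of any model.

```python
import math
import random
from typing import Callable, Tuple


def wilson_interval(successes: int, trials: int,
                    z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (about 95% for z=1.96)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials
                         + z * z / (4 * trials * trials)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def pass_rate(run_once: Callable[[], bool], trials: int):
    """Call `run_once` repeatedly; return the observed rate and its interval."""
    successes = sum(1 for _ in range(trials) if run_once())
    return successes / trials, wilson_interval(successes, trials)


# Stand-in for a real "call the model and grade the output" function.
rng = random.Random(0)
rate, (low, high) = pass_rate(lambda: rng.random() < 0.7, trials=50)
print(f"pass rate {rate:.2f}, interval [{low:.2f}, {high:.2f}]")
```

    Two prompts whose intervals overlap heavily have not been shown to differ, however different a single pair of outputs looked.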

    Token consumption is affected too

    Small inference divergences compound over long generations. One execution path may answer concisely, terminate early, and consume 400 tokens. Another — same prompt, same nominal model — may self-correct repeatedly, explore alternate reasoning paths, and consume 2,000+ tokens. For API users and enterprise deployments this is a real operational concern, because token consumption directly impacts cost, latency, throughput, and agent reliability.
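    The cost side of that spread is simple arithmetic. The price below is a hypothetical placeholder, not any provider's actual rate; only the 400 and 2,000 token figures come from the paragraph above.

```python
def generation_cost(output_tokens, usd_per_million_tokens):
    """Cost in USD of one response at a flat per-token output price."""
    return output_tokens * usd_per_million_tokens / 1_000_000


PRICE = 10.0  # hypothetical USD per million output tokens

concise = generation_cost(400, PRICE)
verbose = generation_cost(2_000, PRICE)
ratio = verbose / concise

print(f"concise: ${concise:.4f}  verbose: ${verbose:.4f}  ratio: {ratio:.1f}x")
```

    Whatever the real price is, the ratio stays the same: the longer path costs five times as much for the same prompt.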

    Researchers already acknowledge this problem

    The machine-learning ecosystem has been discussing reproducibility challenges for years. PyTorch’s deterministic-algorithm documentation explains that deterministic execution often requires explicitly disabling nondeterministic optimizations. Researchers routinely encounter reproducibility drift across:

    • GPU architectures and CUDA versions,
    • cuDNN implementations,
    • threading models and atomic operation ordering,
    • distributed execution and collective communication, and
    • server-side batching policies.

    Recent academic and industry work — including the Thinking Machines analysis cited above — has continued to expose how low-level GPU and serving behavior contribute to reproducibility challenges across architectures and precision modes.
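    For a model you run yourself, the switches described in the PyTorch reproducibility notes look roughly like this. It is a sketch: the enable_determinism wrapper and its return value are invented for illustration, and the function simply skips the PyTorch calls if the library is not installed. It pins down one process on one machine; it does nothing for a hosted endpoint and does not guarantee identical results across different hardware.

```python
import os
import random


def enable_determinism(seed=0):
    """Apply the usual local reproducibility settings; report what was set."""
    # Needed by some cuBLAS operations when deterministic mode is on.
    os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")
    random.seed(seed)
    applied = {"python_seed": seed, "torch": False}

    try:
        import torch
    except ImportError:
        return applied  # PyTorch not available; only Python's RNG was seeded

    torch.manual_seed(seed)
    # Raise an error instead of silently using a nondeterministic kernel.
    torch.use_deterministic_algorithms(True)
    # Disable cuDNN autotuning, which can pick different kernels per run.
    torch.backends.cudnn.benchmark = False
    applied["torch"] = True
    return applied


settings = enable_determinism(seed=123)
print(settings)
```

    These settings usually cost speed, which mirrors the throughput-versus-reproducibility tradeoff described earlier.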

    The emerging reality

    As LLM systems scale, the distinction between “the model” and “the infrastructure running the model” is becoming increasingly blurry. The public often treats models like static software artifacts. In reality, modern LLMs are distributed probabilistic systems whose behavior emerges from an interaction between weights, hardware, runtime optimizations, batching, routing, precision strategies, and serving infrastructure.

    The consequence is subtle but important: it’s not just what model you are using. The where matters too.

    Sources referenced: PyTorch Reproducibility documentation; NVIDIA Developer Blog (Blackwell Ultra, NVFP4); Thinking Machines Lab, “Defeating Nondeterminism in LLM Inference” (Sept 2025).

  • “AI” is nothing more than dynamic regular expressions.

    In recent years, large language models (LLMs) have been described as revolutionary, intelligent, and even proto-conscious systems.
    However, a compelling counter-position argues that these systems are nothing more than extraordinarily sophisticated pattern-matching machines – essentially dynamic, probabilistic regular expression engines operating at massive scale.

    This article presents a steelman version of that argument: the strongest, most intellectually rigorous case that LLMs are fundamentally advanced statistical pattern processors rather than thinking entities.

    1. Next-Token Prediction as Pattern Completion

    At their core, LLMs are trained to predict the next token in a sequence. Given prior tokens, the system calculates the probability distribution of possible continuations and selects one based on learned statistical weights.

    This is pattern completion. Regular expression engines also operate on sequences, identifying matches based on structured symbolic rules. While regex uses deterministic transitions and fixed syntax, LLMs use probabilistic transitions and learned weights. In both cases, the system maps input sequences to outputs based on pattern structure rather than understanding.
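    The prediction step described here can be written down in a few lines. This is a minimal sketch with a made-up three-word vocabulary and made-up logits; a real model produces logits over tens of thousands of tokens, but the softmax and selection steps have the same shape.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_next(vocab, logits):
    """Always pick the most probable token."""
    probs = softmax(logits)
    return vocab[max(range(len(vocab)), key=lambda i: probs[i])]


def sample_next(vocab, logits, rng, temperature=1.0):
    """Draw a token at random, weighted by its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]


vocab = ["mat", "roof", "moon"]   # hypothetical continuations
logits = [2.0, 1.0, 0.1]          # hypothetical scores

print(softmax(logits))
print(greedy_next(vocab, logits))                    # mat
print(sample_next(vocab, logits, random.Random(0)))  # one weighted draw
```

    Greedy selection is deterministic given the logits; sampling is where the "probabilistic transitions" in the regex comparison come from.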

    2. Transformers as Probabilistic State Machines

    Modern LLMs rely on the transformer architecture, which computes attention scores between tokens and assigns weights to contextual relationships. Conceptually, this resembles a vast probabilistic state machine operating in high-dimensional vector space.

    A traditional regular expression compiles to a finite state automaton with deterministic transitions. An LLM can be seen as a soft, differentiable automaton whose transitions are weighted by learned statistical correlations.
    The structure differs in scale and flexibility, but the functional role — sequence processing via state transitions — remains analogous.
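    The contrast between a hard automaton and a "soft" one can be sketched side by side. The first half is a hand-built DFA for the regex ab*c; the second is a toy table of transition probabilities whose numbers are invented for illustration. It shows the analogy the argument relies on and is not a model of how a transformer works internally.

```python
import re

# Deterministic automaton for the regex  ab*c
TRANSITIONS = {(0, "a"): 1, (1, "b"): 1, (1, "c"): 2}
ACCEPTING = {2}


def dfa_accepts(text):
    """Return True only if `text` drives the DFA into an accepting state."""
    state = 0
    for symbol in text:
        state = TRANSITIONS.get((state, symbol))
        if state is None:
            return False  # no transition defined: hard reject
    return state in ACCEPTING


# "Soft" automaton: every transition is allowed, each with a weight.
# "^" marks the start of the sequence. All numbers are hypothetical.
SOFT = {
    "^": {"a": 0.90, "b": 0.05, "c": 0.05},
    "a": {"a": 0.05, "b": 0.60, "c": 0.35},
    "b": {"a": 0.05, "b": 0.50, "c": 0.45},
    "c": {"a": 0.40, "b": 0.30, "c": 0.30},
}


def sequence_probability(text):
    """Multiply the transition weights along the sequence."""
    prob = 1.0
    previous = "^"
    for symbol in text:
        prob *= SOFT[previous][symbol]
        previous = symbol
    return prob


for candidate in ["abc", "abbbc", "cba"]:
    print(candidate, dfa_accepts(candidate), sequence_probability(candidate))
```

    The DFA answers yes or no; the weighted table never rejects anything outright and only ranks sequences as more or less likely.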

    3. Statistical Correlation Without Grounded Semantics

    Regular expressions do not understand what they match. They recognize structure.

    Similarly, LLMs do not possess intrinsic semantic grounding. They model statistical relationships between tokens in training data.
    Their outputs reflect learned correlations rather than lived experience or intentional meaning.
    The appearance of understanding may emerge from scale and complexity, but internally the system manipulates symbol patterns.

    4. Emergent Behavior Does Not Imply Cognition

    Critics of the regex analogy point to reasoning, planning, and abstraction capabilities in LLMs.
    However, the steelman position argues that emergent behavior from sufficiently complex statistical systems does not constitute true cognition.

    Chess engines evaluate massive search trees without understanding chess.
    Similarly, LLM reasoning may be structured interpolation across learned distributions rather than deliberate thought.
    Complex pattern simulation can mimic reasoning without instantiating it.

    5. The Compression Perspective

    Another powerful framing views LLMs as compression engines. During training, vast corpora of text are compressed into parameter weights. During inference, those weights generate plausible continuations — effectively decompressing structured language patterns.

    Regular expressions also encode compressed pattern descriptions. LLMs simply encode patterns at a scale and dimensionality far beyond manual symbolic systems.

    6. Turing Completeness and Category Errors

    Some argue that because transformers are Turing-complete in principle, they transcend simple pattern matching. The steelman response notes that Turing completeness alone does not imply intelligence. Many simple systems are computationally universal yet devoid of cognition.

    Thus, the ability to simulate reasoning does not entail genuine reasoning — only sufficient structural complexity.

    Conclusion

    The strongest version of the argument concludes:

    • LLMs operate purely on statistical token prediction.
    • They lack intrinsic semantic grounding.
    • Their internal processes are weighted pattern transitions.
    • Apparent reasoning is structured probability, not cognition.

    Under this interpretation, LLMs are not minds, thinkers, or agents.
    They are adaptive, high-dimensional, probabilistic pattern-matching systems — dynamic regular expression engines operating at planetary scale.

  • Does AI Dominate All Programming Languages Equally?

    No. AI performs far better in some languages (like JavaScript, Python, TypeScript) and far worse in others (like Rust, Haskell, COBOL, or highly specialized embedded languages).

    AI Dominates Best in “High-Surface-Area” Languages With Huge Training Data

    AI models perform best when a language:

    • has massive open-source code ecosystems
    • has lots of GitHub repositories (JS, TS, Python)
    • appears everywhere in tutorials, forums, StackOverflow, docs
    • is used in web development (simple, repetitive patterns ideal for LLMs)

    The biggest beneficiaries:

    • JavaScript / TypeScript
    • Python
    • SQL
    • HTML/CSS (small domain, predictable)

    These languages account for the majority of all public code on Earth, so AI models naturally excel at them. This is why AI coding demos overwhelmingly use JS, TS, or Python.

    JavaScript Case

    JavaScript is a dream for LLMs:

    • It’s ubiquitous
    • It’s forgiving
    • It has millions of nearly identical code examples
    • Prompts like “make a todo app” or “fetch this API” always produce workable patterns

    This is why Bun, Vercel, Cloudflare, Anthropic, OpenAI, and GitHub all showcase AI coding features using JavaScript/TypeScript.

    AI is Weaker in “Correct-by-Construction” or “Strict” Languages

    Languages built around correctness, strict typing, manual memory management, or advanced type systems are harder for AI to generate correctly.

    Examples where AI struggles more:

    Rust

    • Strong ownership and borrowing semantics
    • Compiler is strict
    • Memory safety rules are non-negotiable
    • Small ecosystem relative to JS

    AI can write Rust that looks right but fails to compile.

    Haskell / OCaml / F#

    • Strong functional paradigms
    • Less real-world training data
    • Abstract math-heavy patterns

    C and C++

    • Manual memory management
    • Project structure complexities
    • Undefined behavior hazards
    • Inconsistent patterns across codebases

    AI often generates code that segfaults or is not safe.

    Go

    • AI is moderately strong in Go but weaker than in JS/Python: Go codebases tend to be simple, but there are far fewer of them.

    AI Is Worst at Specialized or Legacy Languages

    Low-resource, domain-specific languages give available models very little to learn from:

    • COBOL (mainframes, banks)
    • ABAP (SAP)
    • VHDL/Verilog (hardware description languages)
    • MATLAB (scientific, proprietary)
    • LabVIEW
    • Embedded C dialects

    These domains:

    • Have minimal public code
    • Often include proprietary/internal code
    • Require deep domain knowledge beyond syntax

    AI frequently generates syntactically valid but practically unusable code.

    AI Performs Better at Glue Code Than Systems Code

    AI excels at:

    • CRUD
    • API wrappers
    • UI components
    • CLI tools
    • JSON transformations
    • Refactoring
    • Writing tests

    But struggles with:

    • Performance tuning
    • Concurrency
    • Systems-level code
    • Distributed systems
    • Memory-safety critical code
    • Security-sensitive code

    This is why AI produces React components with ease but often fails to write a correct multi-threaded C++ scheduler.

    Model Architecture Matters

    Different AI models are better with different languages:

    Claude, ChatGPT, Cursor

    • Best at JS, TS, Python, SQL

    DeepSeek Coder (latest generation)

    • Stronger at C, C++, Rust, Go
    • Explicitly trained on low-level code
    • Still worse at JavaScript-scale “glue work,” but better at systems engineering tasks

    Copilot (OpenAI o-series & gpt-4o)

    • Extremely strong with TypeScript
    • GitHub training bias

    Tooling Influences AI Dominance

    Languages with strong AI-integrated dev tools:

    • JS/TS → Vercel AI, Copilot, Cursor
    • Python → Jupyter, PyCharm, GitHub Copilot
    • SQL → Natural language queries via AI

    Languages lacking good AI-native IDEs (COBOL, ABAP) remain AI-poor.

    Summary Chart

    Language | AI Strength | Reason
    JavaScript / TypeScript | ⭐⭐⭐⭐⭐ | Tons of training data, easy patterns, used everywhere
    Python | ⭐⭐⭐⭐⭐ | Hugely popular, simple syntax, ML + scripting
    SQL | ⭐⭐⭐⭐⭐ | Small domain, predictable patterns
    Go | ⭐⭐⭐⭐ | Moderate data, simple syntax
    Java | ⭐⭐⭐⭐ | Enterprise-heavy data, many repeatable patterns
    C# | ⭐⭐⭐ | Good ecosystem, less open-source surface
    C / C++ | ⭐⭐ | Hard to reason, memory issues
    Rust | ⭐⭐ | Strict compiler rules
    Haskell / OCaml |  | Niche, complex semantics
    COBOL, ABAP, HDL |  | Very limited public training data

  • AI Is Reshaping What Engineering Work Looks Like

    The rapid rise of generative-AI in software development has sparked proclamations that AI will obsolete human programmers. But a closer look at major tech firms’ behavior — and recent research — paints a different picture. Rather than eliminating engineering jobs, AI seems to be reshaping what engineering work looks like — and in many cases increasing demand for engineers with the right skills to oversee, integrate, and scale AI-driven systems.

    Why the Narrative of “AI Will Kill Coding Jobs” Isn’t Holding Up

    It’s common to hear executives or media suggest AI will replace large swathes of software-engineering roles. Terms like “vibe coding” (where a developer prompts an AI and accepts its output) are gaining popularity as shorthand for an AI-first future. (Wikipedia and others)

    Yet firms continue hiring — and often hiring more. That suggests that even as AI becomes more capable, human engineers remain essential.

    Concrete Example: Anthropic + Bun Acquisition

    Take a recent, telling move: in December 2025, Anthropic acquired Bun, a JavaScript/TypeScript runtime and toolkit, at the same time it shared that its AI coding assistant (Claude Code) had reached a milestone of $1 billion in annualized run-rate revenue. (IT Pro and others)

    Bun is not a simple toy — it’s a foundational runtime used for building, bundling, and running JavaScript/TypeScript applications. By bringing Bun into its fold, Anthropic signaled confidence not just in AI-generated code, but in needing human engineers to build the infrastructure, integrate AI into real-world systems, and maintain stability and performance.

    In other words: even as Anthropic pushes forward with code-generating AI, it still invests in human engineering talent and software infrastructure. The acquisition underlines that AI alone isn’t sufficient — human oversight, toolchain maintenance, and architectural work remain essential.

    Broader Industry Pattern — Not Just One-Off

    This isn’t unique to Anthropic. Across the industry, there’s mounting evidence that AI adoption often correlates with continued, or even increased, hiring of engineers, especially those with expertise in AI, infrastructure, or integration. Analysts argue generative AI doesn’t so much replace developers as augment them, requiring new kinds of human skills. (AP News, Zen van Riel, and others)

    At the same time, studies of AI-generated code point to serious limitations: security flaws, lack of deep understanding of context, inefficient or sub-optimal patterns, and a need for human review. For example, one empirical study found that nearly 30% of AI-generated Python snippets and 24% of AI-generated JavaScript snippets showed security weaknesses when merged into real-world projects. (arXiv)

    These findings suggest that AI-generated code — while useful — remains far from “set it and forget it.” In many cases, human engineers must still review, debug, secure, and integrate the code, often adding complexity rather than removing it.

    Languages Matter — AI Doesn’t Dominate All Programming Languages Equally

    An important insight: AI’s effectiveness depends heavily on the programming language and the nature of the task. Recent research into Large Language Model (LLM) preferences for code generation confirms that AI tends to favour certain languages. In one 2025 study, LLMs generated code in Python 90–97% of the time for language-agnostic benchmark tasks, even when Python wasn’t the most suitable language for production code. (arXiv)

    That bias reflects both the abundance of training examples in languages like Python (and by extension, also languages like JavaScript or TypeScript) and the simplicity and flexibility of those languages. AI tools have a far easier time generating correct boilerplate, simple scripts, web-app code, and high-level logic in such languages. (Zen van Riel, IntuitionLabs, and others)

    On the other hand, languages that require stricter type systems, memory safety, or low-level control (such as Rust, C++, or other languages used for systems programming) remain more challenging for AI. Some developers and community reports note that while AI can scaffold basic code in those languages, it often fails to satisfy compiler constraints, produce efficient or safe output, or handle complex system-level logic. (BigGo, AICodes, and others)

    Thus, AI’s impact is highly uneven: it’s more likely to disrupt or assist in web development, scripting, or high-level logic, while offering limited benefit in systems, embedded, or high-performance programming.

    What Remains the Role of Human Engineers — and What’s Changing

    Given these trends and limitations, here’s how the role of human engineers is evolving:

    • More emphasis on oversight, review, and integration: AI-generated code often needs human validation, security hardening, and testing. Engineers must review, debug, and refine output rather than simply accept it.
    • Higher demand for infrastructure, tooling, and system-level expertise: As firms build AI-assisted pipelines (like Anthropic integrating Bun), demand grows for engineers who can architect, maintain, and scale complex toolchains.
    • Shift toward languages and domains where AI is weaker: Systems programming, security-critical code, performance-sensitive components—areas where AI struggles—become more reliant on human expertise.
    • New hybrid workflows combining AI + human intelligence: Engineers increasingly act as supervisors, guides, and quality-assurance agents, using AI for boilerplate, scaffolding, or prototyping, while steering overall design.

    In many ways, AI is serving as a force multiplier — increasing what a given team can produce, but also increasing the need for thoughtful engineering, oversight, and human judgment.

    Why “AI Replaces All Engineers” Is Overstated — For Now

    Based on current evidence, the narrative that AI will wholesale eliminate software-engineering jobs seems overstated. Instead:

    • AI performs best on high-level, structured, well-documented languages — precisely where a large share of current web and app development lives.
    • AI-generated code is error-prone, insecure, or inefficient — making human review and expertise indispensable.
    • Companies adopting AI (like Anthropic) are also investing in building engineering teams — not downsizing them.

    Thus, the present and near-term future seem to favor more engineers — especially those capable of working with AI rather than being replaced by it.

    Conclusion — The Shape of Engineering Is Changing, Not Disappearing

    What we see in late 2025 is not the end of software engineering — but its transformation. AI doesn’t so much eliminate the need for human engineers as it shifts what kinds of engineering work matter:

    • From writing boilerplate or repetitive code
    • Toward system architecture, integration, maintenance, security, performance tuning, and high-level design
    • Toward languages and domains where AI assistance is less reliable

    In effect, AI becomes a toolchain multiplier, enabling faster development — but not eliminating the need for human engineers. If anything, demand for engineers adept at working in an AI-augmented world is rising.

  • Why Silver Fell at the Start of the Pandemic — Even While the Fed “Printed” Money

    When COVID-19 crashed into markets in March 2020, silver did something that surprised many people who view it mainly as a hedge against monetary expansion: it plunged sharply. That drop looks at odds with the “money printer” narrative (QE, fiscal stimulus, and fears of inflation), but a closer look shows the crash was driven by short-term mechanics — liquidity, industrial demand collapse, margin calls and the quirks of paper vs. physical markets — while the monetary effects showed up later. Below I explain the sequence and the drivers, and provide sources so you can read deeper.

    The crash was first a liquidity event, not a long-term monetary signal

    In mid-March 2020 global markets seized up. Investors needed cash immediately and sold whatever they could convert to dollars. Central bank actions to restore market functioning (swap lines, repo operations and emergency facilities) came as markets were already plunging and often after forced selling had begun. In short: the panic to raise cash caused indiscriminate selling across asset classes, including silver, before the Fed’s longer-term QE and balance-sheet expansion could support safe-haven or inflation-hedge flows. (Federal Reserve and others)

    Silver is a “dual-use” metal — half investment, half industry

    Unlike gold, a large share of silver’s demand is industrial: electronics, solar panels, medical equipment and other manufacturing uses. When factories shut and demand for industrial inputs fell in early 2020, that removed a major natural buyer of silver at the same time panic selling hit financial markets. The combination amplified downward pressure on price relative to gold, which is overwhelmingly a monetary/investment asset. Data and industry analyses repeatedly show industrial demand as a major component of annual silver consumption. (The Silver Institute)

    Paper silver crashed while physical demand surged — the disconnect widened price moves

    One confusing feature of 2020 was that the “paper” silver market (futures and ETF share redemptions) saw violent selling while retail buyers rushed toward physical bullion and coins. This created a divergence: futures prices fell with forced liquidations, but coin premiums rose as dealers ran low on inventory. In other words, paper and physical are linked, but under stress they can move differently because physical supply chains and dealer inventories don’t instantly reprice the futures market. Evidence of unusually strong physical buying and sell-outs of popular coins appeared in March 2020. (BullionByPost and others)

    Margin calls and forced liquidations magnified the sell-off

    Margin requirements and forced redemptions were another transmission channel. When equities and other positions tanked, brokers issued margin calls; funds and traders liquidated positions to meet those calls. Leveraged positions in commodity futures, including silver, were prime candidates for sudden selling. That mechanical forced-selling pushes futures prices down regardless of longer-term views about money supply or inflation, and it happened before fiscal and monetary stimulus could fully propagate to markets. (Brookings)
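    The mechanics of a margin call can be sketched with a few lines of arithmetic. A standard COMEX silver futures contract covers 5,000 troy ounces; everything else below (the entry price, the later price, and both margin levels) is a hypothetical figure chosen for round numbers, not actual March 2020 exchange data.

```python
CONTRACT_OUNCES = 5_000  # one standard COMEX silver futures contract


def margin_top_up(entry_price, current_price,
                  initial_margin, maintenance_margin, contracts=1):
    """Cash needed to restore a long position to initial margin.

    Returns 0.0 if account equity is still at or above maintenance margin.
    """
    loss = (entry_price - current_price) * CONTRACT_OUNCES * contracts
    equity = initial_margin * contracts - loss
    if equity >= maintenance_margin * contracts:
        return 0.0
    return initial_margin * contracts - equity


# Hypothetical trader: long one contract, posted $10,000 initial margin,
# with a $9,000 maintenance level.
small_dip = margin_top_up(17.0, 16.9, 10_000, 9_000)
sharp_drop = margin_top_up(17.0, 12.0, 10_000, 9_000)

print(small_dip)   # 0.0
print(sharp_drop)  # 25000.0
```

    In this example a five-dollar drop demands a cash top-up of two and a half times the original deposit. A trader who cannot raise that cash immediately has to sell, which pushes the futures price down further.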

    The Fed’s “printing” (QE and balance-sheet expansion) came fast — but its price effects lagged

    It’s true that the Fed and fiscal authorities responded aggressively in the weeks after March 2020: massive QE, repo facilities, and large fiscal packages followed. The Fed’s balance sheet expanded rapidly from roughly $4.7 trillion in mid-March 2020 to about $7 trillion within two months as various facilities and open-ended asset purchases were used to restore market functioning. Those actions removed liquidity stresses, lowered real rates and eventually created an environment in which precious metals would rally, but that rally arrived after the initial crash. The timing mismatch (instant panic vs. policy implementation and transmission) explains much of the apparent contradiction. (Congress.gov and others)

    Market microstructure and delivery/backwardation issues

    Under extreme market stress, futures markets can display backwardation (near-term prices above deferred prices) or other distortions reflecting acute delivery risk or shortages of physical metal at certain delivery points. These microstructure signals can cause additional volatility and disconnects between spot, futures and physical dealer pricing, and they played a role in the spring and later phases of 2020 when delivery logistics and dealer inventories were strained. Analysts and bullion traders documented elevated premiums and localized strains in physical availability during crunch periods. (Sprott and others)

    What happened next — why silver rebounded

    Once liquidity returned, aided by Fed actions, swap lines, and fiscal packages, investors refocused on the macro environment. The combination of near-zero interest rates, unprecedented balance sheet expansion and later concerns about inflation and currency dilution increased demand for monetary hedges and speculative interest in silver. At the same time, as economies recovered and industrial activity resumed, industrial demand for silver improved. Those forces combined to push silver significantly higher later in 2020. The initial crash was therefore a short-term liquidity and industrial-demand shock; the subsequent rally reflected the monetary and recovery story. (Brookings and others)

    Takeaway: two timelines — an immediate liquidity contraction and a slower monetary expansion

    If you step back, the apparent paradox resolves into two distinct timelines:

    1. Immediate (days–weeks): panic, flight to cash, margin calls and collapsing industrial demand. Price falls. (Federal Reserve and others)
    2. Medium term (months): policy response, restored liquidity, and renewed demand for hedges and industrial goods. Price rises. (Congress.gov and others)

    That duality is why silver can both “align” with monetary expansion over the medium term and still plunge during a liquidity-driven crash.


    Sources (key references)

    • Federal Reserve staff analysis of the COVID-19 crisis and Fed policy response. (Federal Reserve)
    • Brookings overview of the Fed’s March 2020 actions and QE program expansion. (Brookings)
    • Congressional Research Service: Fed balance sheet expansion and policy timeline (spring 2020). (Congress.gov)
    • Silver Institute: supply, demand and the industrial share of silver usage. (The Silver Institute)
    • Reports of physical coin sell-outs and dealer shortages in March 2020 (US Mint / bullion dealers). (BullionByPost)
  • We found a magic loophole

    Apparently the right conditions allowed us to peel about 98% of the bottom in under 8 hours. The majority of the gel coat came off in slabs with a heavy acetic acid layer between it and the laminate. There was plenty of general osmosis but not as many blisters as one would expect.
    Every spot should have been a blister, but instead the pressure was just released into the layer between the gel coat and laminate.

    You could see a number of blisters that did finish forming.

    This is how big the majority of hull pieces were. We had five 13-gallon bags to dump at the end of the work.

    Here are some pictures after we completed the first run of scraping.

  • We’re going to need a bigger paint brush.

    Inspecting the hull after haul-out, we identified some damage on the port aft section. A 0.5-square-foot section of the gel coat had come off, exposing the laminate.
    So it’s time to investigate the condition of the gel coat and moisture below the waterline.


    Well, sure enough, we see elevated moisture throughout the majority of the hull, peaking as high as 26% around the keel and the damaged gel coat sections. Time to do some exploratory gel coat removal and see what is below.

    With a plastic paint scraper we took this off in under 10 minutes. The gel coat so far is not adhered to the hull. A consistent layer of acetic acid and blisters shows a hull wet from osmosis. Even at this point, none of the gel coat surfaces is sound enough to repair except the one above the waterline, which is rock solid.
    Time to investigate solutions and make some decisions.

  • Fall Haul-Out Project

    or “It’s only a scratch.”

    Fall came and it was time to pull the boat from the water. Our diver had reported some hull damage underneath, and it was our first haul-out for the boat, so off we went to Duck Creek to attend to our hull issues.

    The boat hauled out