I have spent a lifetime thinking about the “Observer Effect” (where the act of measuring a system changes its state); I was doing it long before I knew there was such a thing as the Measurement Problem.

I think I was eight years old when I first heard the riddle: If a tree falls in the forest and there is no one there to hear it, does it make a sound? I really thought about this question. A lot. I wondered if the tree even fell at all.

I wasn’t there when my father died. At his funeral, asked if I wanted to view him, to say goodbye, I declined. I believed that if I didn’t see him, he would still be alive for me. A small part of me still believes it.

These thoughts never really stopped. What if observation matters? What if our minds interact with information in ways we don’t yet understand?

Experiments 3 and 4 were my attempt to build an experimental design strong enough that, if something changed in the data, I could rule out hardware quirks, random drift, and timing artifacts.

Quantum Consciousness Assay (QCA)

I designed what I call the Quantum Consciousness Assay (QCA) to test whether the intent of a subject can affect the randomness of a quantum system. Instead of relying on the simple hit-and-miss scores of past studies, the QCA looks for whether intent can impose order on the quantum system itself, squeezing entropy out of the bitstream (making it less random).

To make sure I was seeing “psi” and not just a glitch or environmental drift, I built in three safeguards.

1. The Automated Hardware Audit

To ensure the integrity of the experiment, I inserted an automated hardware audit every 5 blocks. The system fetches a raw 2,048-bit sample directly from the hardware entropy pool and subjects it to a battery of tests, including a Shannon entropy check. If the bitstream’s entropy falls below 0.999 or the 50/50 balance drifts by more than 2%, the system flags a hardware exception. Throughout Experiment 4, the audits confirmed the QRNG remained in a state of maximum entropy, so any significant signal could not be blamed on machine failure.
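
For readers who want the mechanics, here is a minimal sketch (in Python) of the kind of check the audit performs. The thresholds are the ones above; the function names and the 0/1-string representation of the sample are my own illustration, not the production code:

```python
import math

def shannon_entropy(bits: str) -> float:
    """Shannon entropy, in bits per bit, of a string of '0's and '1's."""
    n = len(bits)
    p1 = bits.count("1") / n
    if p1 in (0.0, 1.0):
        return 0.0
    p0 = 1.0 - p1
    return -(p0 * math.log2(p0) + p1 * math.log2(p1))

def audit_sample(sample: str) -> None:
    """Flag a hardware exception if entropy or the 50/50 balance drifts."""
    assert len(sample) == 2048, "audit expects a raw 2,048-bit sample"
    h = shannon_entropy(sample)
    ones = sample.count("1") / len(sample)
    if h < 0.999 or abs(ones - 0.5) > 0.02:
        raise RuntimeError(f"hardware exception: H={h:.5f}, ones={ones:.4f}")
```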

2. The Baseline and the AI

I ran humans alongside a baseline agent and an AI agent.

  • The Baseline Agent: I created an automated script that ran the experiment like a human would, but with zero intent. This was my “dead control.”
  • The AI Agent: I used a language model explicitly instructed to favor targets to see if silicon-based intent could influence the bits.

By running these side-by-side with humans, I could see if the effects were unique to biological consciousness or if any “Observer” could create the tilt.

3. Common Mode Rejection

The core of the QCA is the Subject-Demon Split.

Each QRNG request is split in real time into two streams:

  • one assigned to the subject
  • one assigned to a shadow control (the “Demon”)

Both streams originate from the same hardware call, at the same moment.

Noise-cancelling headphones use a second microphone to listen to the outside noise and subtract it from what you hear. My experiment does the same thing: the Demon is a second data stream that listens to the hardware noise at the exact same millisecond as the human, allowing direct subtraction of shared noise. Engineers know this technique as common-mode rejection.

Traditional PSI studies often compare subject data to baselines collected earlier or later. This creates a comparability problem: measurements span different temporal and environmental contexts, making causal attribution fundamentally ambiguous.

By contrast, in this design, hardware drift, environmental fluctuations, and pipeline artifacts appear equally in both streams and cancel out under subtraction.
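
To make the cancellation concrete, here is a small simulation (my own illustration, with invented numbers): a slow drift in the hardware’s bias toward “1” shows up in the raw subject stream but vanishes in the subject-minus-demon delta:

```python
import numpy as np

rng = np.random.default_rng(0)
n_blocks, bits_per_block = 30, 150

# A slow common-mode drift in the hardware's bias toward "1".
drift = 0.5 + 0.01 * np.sin(np.linspace(0, np.pi, n_blocks))

# Both streams come from the same call, so both inherit the drift.
subject_hits = rng.binomial(bits_per_block, drift)
demon_hits = rng.binomial(bits_per_block, drift)

# The raw subject hit rate is pulled above 0.5 by the drift...
print(f"raw subject rate: {(subject_hits / bits_per_block).mean():.4f}")
# ...but the paired delta is centered on zero: the drift cancels.
delta = (subject_hits - demon_hits) / bits_per_block
print(f"mean delta:       {delta.mean():+.4f}")
```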

What the Demon Guards Against

This architecture neutralizes several common false positives:

  1. Start-up effects
    Early-session deviations appear in both streams and cancel.
  2. Environmental or hardware drift
    Temperature, voltage, or EM fluctuations affect both streams equally.
  3. End-of-stream artifacts
    Buffering or session-boundary effects are removed by subtraction.

Random sampling error still exists, which is why statistical testing is required. Only differences exceeding what random fluctuation predicts are considered meaningful.
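
As a sketch of what that testing looks like, here is a generic one-sample t-test on the per-block deltas (my illustration with invented numbers, not the exact analysis used):

```python
import numpy as np
from scipy import stats

# Hypothetical per-block subject-minus-demon hit differences, in bits.
delta = np.array([3, -5, 1, 0, -2, 4, -1, 2, -3, 1])

# Under the null, each paired difference has mean zero; the test asks whether
# the observed mean exceeds what random sampling fluctuation predicts.
t, p = stats.ttest_1samp(delta, popmean=0.0)
print(f"t = {t:.3f}, p = {p:.3f}")  # only a small p would count as meaningful
```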

Why This Surpasses Prior QRNG Methodology

Rather than searching for high scores, this design searches for persistent divergence from a real-time control under subtraction.

| Aspect | Design Limitations in PSI Studies | QCA |
| --- | --- | --- |
| Baseline comparison | Outcomes compared to historical baselines | Outcomes compared to real-time baselines |
| Control structure | No simultaneous controls | Subject and control measured simultaneously |
| Hardware assumptions | Assume hardware stationarity | Allows for divergence if present |
| Infrastructure noise | Infrastructure noise cannot be subtracted | Infrastructure noise can be subtracted |
| Drift confounds | Subject effects conflated with device drift | Subject effects separated from device drift |
| Effect size bounds | Effect size cannot be bounded | Hierarchical subtraction constrains ε (see FAQ Q7) |

Experiment 4 Design

To allow fair comparison with AI agents, each block was stateless and brief.

Each QRNG request was split into two simultaneous bitstreams:

  • a subject stream (human, AI, or baseline process)
  • a demon stream (non-intentional control)

Both streams shared the same hardware source, timing, and call structure. The demon stream was not assigned a target and served as a real-time paired control, enabling direct subtraction of QRNG and pipeline noise.

Sessions consisted of ~30 short blocks (~3 minutes max). Each block lasted ~1-2 seconds (~150 bits), with the target randomly switching every block.
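
In pseudocode-level Python, a block looks roughly like this (the interleaved split and the stand-in entropy source are my assumptions; the real system calls the QRNG hardware):

```python
import secrets

BLOCKS, BITS = 30, 150  # ~30 blocks per session, ~150 subject bits per block

def qrng_request(n_bits: int) -> str:
    """Stand-in for the hardware QRNG call (OS entropy, for illustration)."""
    return format(secrets.randbits(n_bits), f"0{n_bits}b")

for block in range(BLOCKS):
    target = secrets.choice("01")          # target re-randomized every block
    raw = qrng_request(2 * BITS)           # one hardware call per block...
    subject, demon = raw[0::2], raw[1::2]  # ...split into two simultaneous streams
    subject_hits = subject.count(target)   # subject stream: intent directed here
    demon_hits = demon.count(target)       # demon stream: scored, but no intent
```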

A substantial portion of participants were recruited via an online task-exchange platform, likely resulting in heterogeneous motivation and shallow engagement. While this does not introduce systematic bias due to paired demon controls, it may reduce sensitivity to effort-dependent effects.

Analyses Performed

Gemini, ChatGPT, and I conducted multiple complementary analyses:

  • Session-level subject–demon comparisons
  • Bayesian hierarchical bit-level modeling estimating per-bit bias (ε) (see the sketch after this list)
  • Demon-only audits to validate QRNG stability
  • Stratification by meditation frequency
  • Stratification by belief (“sheep vs goat”)
  • Cumulative within-session ramp-up tests
  • AI-specific intent testing
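
For the hierarchical model in particular, here is a minimal sketch of per-bit bias estimation, assuming PyMC (which may not be the exact tooling used); the priors and the per-session counts are illustrative:

```python
import arviz as az
import numpy as np
import pymc as pm

# Hypothetical per-session subject-stream counts (replace with real data).
hits = np.array([2260, 2248, 2275, 2251])  # target-matching bits per session
n = np.array([4500, 4500, 4500, 4500])     # total subject bits per session

with pm.Model():
    # Group-level bias and between-session spread; priors keep p near 0.5.
    mu_eps = pm.Normal("mu_eps", mu=0.0, sigma=0.01)
    sigma_eps = pm.HalfNormal("sigma_eps", sigma=0.01)
    # Partial pooling: each session's epsilon shrinks toward the group mean.
    eps = pm.Normal("eps", mu=mu_eps, sigma=sigma_eps, shape=len(n))
    p = pm.Deterministic("p", 0.5 + eps)   # per-bit probability of a hit
    pm.Binomial("obs", n=n, p=p, observed=hits)
    idata = pm.sample(2000, tune=1000, target_accept=0.9)

# A credible interval for mu_eps that includes zero supports the null.
print(az.summary(idata, var_names=["mu_eps"]))
```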

Key Findings

  • AI agents, despite being explicitly instructed to favor a target, showed no detectable deviation from chance and no divergence from demon controls across all analyses.
  • Baseline processes also showed no effects.
  • Humans exhibited a very small posterior asymmetry in hierarchical modeling (ε ≈ 0.001–0.002), but:
    • credible intervals included zero
    • no session-level hit advantage was observed
    • no cumulative ramp-up occurred within sessions
  • Demon streams were statistically indistinguishable across human, AI, and baseline sessions, ruling out QRNG hardware, access-path, or timing artifacts.
  • Meditation frequency and belief (sheep vs goat) stratifications showed no group-level directional effects.
  • While directional means were null, human sessions exhibited slightly higher outcome dispersion (variance) relative to baseline controls, suggesting a possible interaction with system volatility rather than directional bias.

Interpretation

After all subtractions, the remaining human asymmetry is extremely small and remains within noise bounds. The experiment does not demonstrate a robust mind–matter interaction.

However, it does allow unusually strong conclusions about what cannot explain the data: hardware artifacts, timing artifacts, and non-simultaneous baseline confounds are sharply constrained by the simultaneous paired-control design.

This experiment rules out mind–matter effects operating at millisecond timescales under rapidly switching targets and low-engagement conditions, but it does not address slower, sustained-intent regimes.

Significance

Unlike prior QRNG experiments, this study employed simultaneous paired controls, allowing direct subtraction of hardware noise and enabling strong null conclusions.

While no definitive mind–matter effect was demonstrated, the experiment places unusually tight bounds on where such effects could plausibly exist.

Future Work

The Updated February 2026 Redesign

To test for slow or sustained effects, Experiment 3 will pivot from the “stateless” rapid switching of previous trials to a Singular Focused Target protocol. This design is specifically built to see if a consistent mental anchor can “bias” the quantum stream over a significant duration.

The Experiment 3 Specifications:

Real-Time Demon Subtraction: As always, the Simultaneous Paired Control (SPC) will pull an identical bitstream. Because the session is longer, the “Demon” is even more critical here to track and subtract any slow, 10-minute thermal or voltage drifts in the hardware.

Stable 10-Minute Targets: The target (High or Low) remains constant for the entire session. This removes the “switching noise” and allows for the study of Long-Range Temporal Integration.

30-Second Data Blocks: While the target is constant, the QRNG is queried in 30-second “bouts.” This provides 20 distinct data points per session, allowing us to track the Evolution of the Delta from minute 1 to minute 10.

Visual Modulation: To combat “Attentional Blink” or habituation, the visual interface toggles between two different representations of the same data every 30 seconds. This keeps the observer engaged without changing the underlying intentional goal.
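
A sketch of how the 20 bouts would be scored (placeholder streams and an assumed bit rate; the real sessions draw from the paired QRNG streams):

```python
import numpy as np

BOUTS, BOUT_SECONDS, BITS_PER_SECOND = 20, 30, 75  # bit rate is my assumption
rng = np.random.default_rng(1)

deltas = []
for bout in range(BOUTS):
    n = BOUT_SECONDS * BITS_PER_SECOND
    subject = rng.integers(0, 2, n)  # placeholder for the subject stream
    demon = rng.integers(0, 2, n)    # placeholder for the paired demon stream
    deltas.append(subject.mean() - demon.mean())

# "Evolution of the Delta": running mean from minute 1 through minute 10.
running = np.cumsum(deltas) / np.arange(1, BOUTS + 1)
print(np.round(running, 4))
```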

Reader FAQ

Q1: Isn’t this just another way of re-labeling noise as “psi”?

No.
The design explicitly removes shared noise before inference is made. Any deviation must survive:

  • simultaneous subtraction,
  • statistical testing against random fluctuation,
  • and cross-agent comparison (human vs AI vs baseline).

Any effect that depends on non-simultaneous baselines, unmodeled drift, or post-hoc windowing would not survive this design (a limitation present in many historical experiments).

Q2: Could hardware drift still explain the results?

Only if drift affects the subject stream but not the demon stream, which is drawn from the same hardware call at the same moment.

Because both streams originate from the same QRNG request, temperature, voltage, entropy source aging, and API-level effects appear equally in both and are removed under subtraction.

Residual divergence would have to be asymmetric within the same hardware call, which is not supported by QRNG physics.

Q3: Why not just compare results to chance (50/50)?

Because chance comparisons ignore non-stationarity.

QRNGs and their access pipelines are not guaranteed to be stationary over time. Comparing subject outcomes to historical baselines conflates subject effects with environmental drift.

This experiment instead compares subject and control in real time, eliminating the need to assume stationarity.

Q4: Could the demon stream itself be biased or corrupted?

The demon stream is not special; it is simply the other half of the same QRNG output. Demon-only audits show it behaves identically across human, AI, and baseline sessions.

If the demon were biased, that bias would appear consistently across all conditions—which it does not.

Q5: Why introduce AI agents at all?

AI agents serve two purposes:

  1. They act as a non-biological intentional system, testing whether formal semantic intent alone is sufficient.
  2. They function as a pipeline control. If prompt semantics, instruction framing, or API timing artifacts were sufficient to induce apparent effects, they would have appeared in AI sessions. Formal semantic intent implemented in this way does not produce measurable divergence under this design.

AI sessions showed clean null results.

Q6: Doesn’t rapid target switching suppress real effects?

Possibly, and intentionally.

This experiment was designed to test instantaneous, stateless influence, not slow or cumulative effects.

If a mind–matter interaction requires sustained focus or temporal integration, it would not be detectable here. That limitation is acknowledged and motivates Experiment 3.

Q7: Why trust Bayesian modeling here?

Bayesian hierarchical modeling allows:

  • estimation of per-bit bias (ε),
  • explicit uncertainty bounds,
  • partial pooling across sessions without inflating false positives.

Importantly, the posterior credible intervals include zero, reinforcing the null conclusion rather than overstating weak effects.

Q8: Isn’t ε ≈ 0.001 evidence of a real effect?

Not on its own.

An effect must:

  • exclude zero with high credibility,
  • accumulate or replicate across conditions,
  • and survive alternative explanations.

Here, ε is small, non-accumulating, and absent in AI and baseline agents. It is statistically indistinguishable from noise.

Q9: Could increased variance in human sessions indicate something real?

Possibly, but variance increases alone do not imply directional influence.

The observed increase in dispersion could also reflect:

  • human interaction with the task interface,
  • attentional fluctuations,
  • unmodeled behavioral dynamics.

It does not alone support a claim of directed mind–matter causation.

Q10: Why should we trust null results in PSI research?

Because strong nulls require strong designs.

This experiment:

  • uses simultaneous controls,
  • avoids historical baselines,
  • bounds effect sizes,
  • and rules out major classes of artifacts.

Null results under such tight conditions are far more informative than typical nulls: they constrain theory space rather than expand it.

Q11: What would convince you that an effect is real?

At minimum:

  • persistent divergence from demon controls,
  • accumulation over time,
  • replication across sessions and agents,
  • survival under subtraction and drift modeling,
  • and exclusion of alternative explanations.

This experiment did not meet those criteria.

Q12: So is this a failure?

No.

This work shows how easily non-simultaneous baselines and unmodeled drift can create apparent deviations in small-effect regimes, and it demonstrates a paired-control framework that can tightly bound (and potentially detect) any effect that survives real-time subtraction.

