Research Journey November 16, 2025 10 min read

The Research History of Omega Mod M: When a Bug Becomes a Paper

Research trajectories rarely follow linear paths. This is the story of an unexpected discovery made during baryogenesis research—how a suspected coding error turned into a published number theory paper.



The Context: Why Was I Studying Prime Factors Anyway?

2024-early 2025: I was investigating ternary structures for understanding matter-antimatter asymmetry. The core idea: maybe the universe has three sectors (matter, antimatter, buffer) operating on mod 3 arithmetic, and this somehow prevents complete annihilation.

To test this computationally, I needed tools that understood how integers distribute across residue classes modulo 3. One foundational check: verify that prime factors of random integers distribute uniformly across mod 3 equivalence classes.

Spoiler: They don’t.


The “Wait, What?” Moment

Early 2025, during development of a ternary sorting verification tool, a routine diagnostic check flagged an anomaly:

Testing ternary sorting algorithm...
Checking prime factor distribution mod 3:
  Factors ≡ 1 (mod 3): 67.2% of occurrences
  Factors ≡ 2 (mod 3): 32.8% of occurrences
  Expected ratio: 1:1
  Observed ratio: 2.05:1

First reaction: Is this a bug in my prime factorization routine?


The Investigation: Is This Real?

Standard debugging procedures kicked in:

1. Code review: Manual inspection of factorization algorithm → No errors found

2. Expand sample size: Increased from 10⁴ to 10⁶ integers → Ratio persisted at 2.05:1

3. Rewrite from scratch: Different algorithm, same result → Still 2:1 imbalance

4. Hand computation: Small cases (n ≤ 30) verified manually → Pattern confirmed

The imbalance was genuine. For example:

  • Ω(12) = Ω(2² · 3) = 3 factors: two are ≡ 2 (mod 3), one is 3 ≡ 0
  • Ω(28) = Ω(2² · 7) = 3 factors: two are ≡ 2, one is ≡ 1 (mod 3)
  • Ω(15) = Ω(3 · 5) = 2 factors: one is ≡ 0, one is ≡ 2

Counting across all integers up to N showed persistent bias toward residue class 1.
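The hand computations above are easy to reproduce. Here is a minimal sketch using plain trial division; it is not the tool described in this post, and the names `prime_factors` and `residue_tally` are made up for illustration.

```python
from collections import Counter

def prime_factors(n):
    """Return the prime factors of n with multiplicity, by trial division."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)  # whatever remains is a single prime
    return factors

def residue_tally(n, m=3):
    """Count the prime factors of n (with multiplicity) in each residue class mod m."""
    return Counter(p % m for p in prime_factors(n))

for n in (12, 28, 15):
    print(n, prime_factors(n), dict(residue_tally(n)))
```

For 12, 28 and 15 this lists the factors (2, 2, 3), (2, 2, 7) and (3, 5), with the same residues mod 3 as the worked examples.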


The “Is This Known?” Phase

Initial hypotheses for the asymmetry:

  1. Dirichlet’s theorem artifact: Primes are equidistributed in residue classes, but Ω(n) counts with multiplicity—maybe higher powers favor certain residues?

  2. Selberg-Delange influence: Classical analytic number theory provides asymptotic formulas for additive functions like Ω(n)—could these imply finite-size deviations?

  3. Computational bias: Perhaps consecutive integer sampling introduced artifacts vs. random sampling?

These questions pointed to something deeper than a coding bug.


Testing Generalization: Does This Work for Other Moduli?

Natural question: Is this a mod 3 quirk, or universal?

Computational test (Spring-Summer 2025):

| Modulus m | Type | Expected | Observed | Pattern |
|---|---|---|---|---|
| 3 | Prime | 1:1 | 2.05:1 | Strong bias |
| 5 | Prime | Uniform | Non-uniform | Imbalanced |
| 7 | Prime | Uniform | Non-uniform | Imbalanced |
| 11 | Prime | Uniform | Non-uniform | Imbalanced |
| 6 | Composite | 1:1 | 1.15:1 | Weaker effect |

Discovery: The imbalance is universal for prime moduli, though specific ratios vary. This pointed to fundamental prime factorization structure, not a mod 3 accident.


The Literature Deep Dive

July 2025: Literature review to check if this was known.

What I found:

Selberg-Delange Method (1950s-60s)

Provides asymptotics for sums of complex-valued multiplicative functions such as z^Ω(n). For the Fourier sum \(S(x) = \sum_{n \leq x} e^{2\pi i\,\Omega(n)/m}\), it predicts decay of the form:

\[|S(x)|/x \sim C_m (\log x)^{\alpha_m}\]

where:

  • \(\alpha_m = \cos(2\pi/m) - 1\) (theoretically known)
  • \(C_m\) = constants computable via Euler products (not tabulated!)
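The exponent formula is simple to evaluate. The short sketch below (the function name `alpha` is my own) computes it for m = 3 to 6. The reason for the cosine: the Selberg-Delange leading term for a sum of z^Ω(n) carries a factor (log x)^(z-1), whose modulus is (log x)^(Re z - 1), and Re z = cos(2π/m) when z = e^(2πi/m).

```python
import math

def alpha(m):
    """Decay exponent alpha_m = cos(2*pi/m) - 1."""
    return math.cos(2 * math.pi / m) - 1

for m in (3, 4, 5, 6):
    print(m, round(alpha(m), 3))
# 3 -1.5
# 4 -1.0
# 5 -0.691
# 6 -0.5
```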

The gap I found: Theory existed, but nobody had:

  1. Computed explicit numerical values of \(C_m\) to high precision
  2. Verified the decay law at accessible finite x (not just x → ∞)
  3. Investigated short-interval behavior [x, x+H]
  4. Provided open-source reproducible implementation

The opportunity: Computational verification was an actionable contribution.


The Pivot Decision

August 2025: ~80 hours invested in omega-mod-m investigation:

  • Algorithm development: 40 hours
  • Literature review: 15 hours
  • Computational verification: 20 hours
  • Visualization: 5 hours

Decision point:

Option A: Document as technical appendix to future baryogenesis paper

  • Pros: Maintains connection to original motivation
  • Cons: Delays dissemination, dilutes audience

Option B: Publish standalone mathematical paper

  • Pros: Reaches appropriate audience, demonstrates breadth
  • Cons: Requires pausing baryogenesis work

I chose Option B because:

  1. Self-contained result: The omega-mod-m pattern requires no physics context
  2. Fast execution: Computational verification done, could publish in 2-3 weeks
  3. Portfolio value: Demonstrates recognizing unexpected findings and executing rapid publication
  4. Definitive foundation: Citeable reference for future work using these tools
  5. Priority: Risk of being scooped—pattern is discoverable by anyone studying Ω(n) distributions

The Computational Approach

Methodology (February-August 2025):

1. Sieve-Based Factorization

  • Smallest prime factor (SPF) sieve preprocessing
  • Enables O(log n) factorization per integer
  • Efficient computation up to N = 10⁸
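A smallest-prime-factor sieve of the kind listed above can be sketched in a few lines. This is a generic textbook version, not the code from the repository; `spf_sieve` and `factorize` are illustrative names.

```python
def spf_sieve(limit):
    """Return a list spf where spf[n] is the smallest prime factor of n (for n >= 2)."""
    spf = list(range(limit + 1))
    p = 2
    while p * p <= limit:
        if spf[p] == p:  # p has no smaller prime factor, so p is prime
            for multiple in range(p * p, limit + 1, p):
                if spf[multiple] == multiple:  # not yet claimed by a smaller prime
                    spf[multiple] = p
        p += 1
    return spf

def factorize(n, spf):
    """Factor n by repeatedly dividing out spf[n]; takes Omega(n) <= log2(n) steps."""
    factors = []
    while n > 1:
        factors.append(spf[n])
        n //= spf[n]
    return factors

spf = spf_sieve(100)
print(factorize(84, spf))  # [2, 2, 3, 7]
```

Each division removes one prime factor, which is where the O(log n) bound per integer comes from. A plain Python list is only practical for modest limits; reaching 10⁸ would presumably need a typed array or a compiled routine, which is my assumption rather than a detail from the project.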

2. Fourier Analysis

Represent residue class imbalances via discrete Fourier coefficients:

\[S(x) = \sum_{n \leq x} \omega^{\Omega(n)}, \quad \omega = e^{2\pi i/m}\]
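To make the definition concrete, here is a small sketch that evaluates S(x) directly. It first tallies how many n ≤ x have Ω(n) ≡ r (mod m), then combines the tallies with powers of ω, since S(x) = Σ_r N_r(x) ω^r. The helper names `big_omega_table` and `fourier_sum` are hypothetical, and the approach is only suitable for small x.

```python
import cmath

def big_omega_table(limit):
    """Return a list om with om[n] = Omega(n) for 1 <= n <= limit (om[0] is unused)."""
    spf = list(range(limit + 1))
    for p in range(2, int(limit ** 0.5) + 1):
        if spf[p] == p:
            for multiple in range(p * p, limit + 1, p):
                if spf[multiple] == multiple:
                    spf[multiple] = p
    om = [0] * (limit + 1)
    for n in range(2, limit + 1):
        om[n] = om[n // spf[n]] + 1  # one more prime factor than n / spf[n]
    return om

def fourier_sum(x, m):
    """S(x) = sum over n <= x of w**Omega(n), with w = exp(2*pi*i/m)."""
    om = big_omega_table(x)
    counts = [0] * m  # counts[r] = #{n <= x : Omega(n) = r mod m}
    for n in range(1, x + 1):
        counts[om[n] % m] += 1
    w = cmath.exp(2j * cmath.pi / m)
    return sum(c * w ** r for r, c in enumerate(counts))

print(abs(fourier_sum(10, 2)))          # essentially 0: five n <= 10 have even Omega, five odd
print(abs(fourier_sum(10**5, 3)) / 10**5)
```

For m = 2 the sum reduces to the summatory Liouville function, which is a handy sanity check on small inputs.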

3. Dyadic Shell Regression

  • Fit power-law decay on intervals [2^k, 2^(k+1)]
  • Reduces autocorrelation bias from cumulative sums
  • More accurate exponent estimates than naive least-squares
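One plausible reading of the dyadic shell idea is sketched below: average the summands over each shell [2^k, 2^(k+1)), then regress the log of the shell averages on log log x. The post does not spell out the exact procedure, so the shell statistic, the geometric midpoint, and all function names here are assumptions.

```python
import math

def dyadic_shells(n_max):
    """Yield half-open integer ranges [2**k, 2**(k+1)) covering 2..n_max."""
    lo = 2
    while lo <= n_max:
        yield lo, min(2 * lo, n_max + 1)
        lo *= 2

def fit_log_power_law(xs, ys):
    """Least-squares fit of y = C * (log x)**alpha; returns (alpha, C)."""
    u = [math.log(math.log(x)) for x in xs]
    v = [math.log(y) for y in ys]  # requires every y > 0
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    slope = (sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
             / sum((a - u_bar) ** 2 for a in u))
    return slope, math.exp(v_bar - slope * u_bar)

def shell_points(terms):
    """Average terms[n] over each dyadic shell; return (midpoints, |shell averages|).

    terms[n] is the n-th summand (e.g. w**Omega(n)); terms[0] and terms[1] are ignored.
    """
    xs, ys = [], []
    for lo, hi in dyadic_shells(len(terms) - 1):
        xs.append(math.sqrt(lo * hi))  # geometric midpoint of the shell
        ys.append(abs(sum(terms[lo:hi])) / (hi - lo))
    return xs, ys

# Synthetic check: exact power-law data should return the exponent it was built from.
xs = [2 ** k for k in range(4, 21)]
ys = [1.7 * math.log(x) ** -1.5 for x in xs]
print(fit_log_power_law(xs, ys))
```

Because each shell uses disjoint integers, the fitted points do not share the long memory of a cumulative sum. Note that shell averages share the exponent of |S(x)|/x only asymptotically, so the fitted constant would need a correction before being compared with C_m.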

4. Bootstrap Uncertainty Quantification

  • 1000-sample resampling (seed=42)
  • 95% confidence intervals
  • First application to Selberg-Delange constant estimation
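A percentile bootstrap with the settings listed above (1000 resamples, seed 42, 95% level) might look like the sketch below. What exactly gets resampled in the paper is not stated here, so the input `samples` is a stand-in, for instance a list of per-shell estimates, and `bootstrap_ci` is a hypothetical name.

```python
import random
import statistics

def bootstrap_ci(samples, stat=statistics.mean, n_resamples=1000, level=0.95, seed=42):
    """Percentile bootstrap confidence interval for stat(samples)."""
    rng = random.Random(seed)  # fixed seed makes the interval reproducible
    n = len(samples)
    estimates = sorted(
        stat(rng.choices(samples, k=n))  # resample with replacement
        for _ in range(n_resamples)
    )
    lo = estimates[int((1 - level) / 2 * n_resamples)]
    hi = estimates[min(n_resamples - 1, int((1 + level) / 2 * n_resamples))]
    return lo, hi

estimates = [1.69, 1.72, 1.70, 1.73, 1.71, 1.68, 1.72]  # made-up illustrative values
print(bootstrap_ci(estimates))
```

Using a private `random.Random` instance rather than the global generator keeps the result independent of any other random calls in the same program.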

The Key Results

Main contribution: Explicit computation of Selberg-Delange constants \(C_m\) to 6 decimal places:

| m | α_m | C_m (theory) | C_m (empirical) | Agreement |
|---|---|---|---|---|
| 3 | -1.500 | 1.708456 | 1.708 ± 0.025 | Excellent |
| 4 | -1.000 | 1.555237 | 1.555 ± 0.020 | Excellent |
| 5 | -0.691 | 1.273375 | 1.273 ± 0.015 | Excellent |
| 6 | -0.500 | 1.117734 | 1.118 ± 0.012 | Excellent |

For mod 3 specifically: \[\alpha_3 = \cos(2\pi/3) - 1 = -3/2\]

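The empirical estimates agree with the six-decimal theoretical values to within their stated uncertainties (about three decimal places), which supports both the theoretical prediction and the computational method.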

Computational Innovation

Dyadic shell regression: Novel application reducing bias from autocorrelation in cumulative sums

Bootstrap uncertainty: First systematic application to these constants

Reproducibility: All code, data, figures archived publicly (MIT/CC-BY licenses)

Short-Interval Extension

Extended analysis to intervals [x, x+H] revealed threshold behavior:

  • H ≲ x^0.6: Noise dominates, no clear pattern
  • H ≳ x^0.6: Decay law emerges with same exponent α_m
  • Optimal: H ≈ x^0.8 balances sample size vs. asymptotic regime
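The short-interval quantity can be sketched as follows, taking H = ⌊x^θ⌋ and normalizing by the number of integers in [x, x+H]. This is an illustrative brute-force version using trial division; the function names and the normalization are my assumptions, not the paper's implementation.

```python
import cmath

def big_omega(n):
    """Omega(n): number of prime factors of n counted with multiplicity."""
    count, p = 0, 2
    while p * p <= n:
        while n % p == 0:
            count += 1
            n //= p
        p += 1
    return count + (1 if n > 1 else 0)

def short_interval_mean(x, theta, m):
    """|sum of w**Omega(n) over x <= n <= x + H| / (H + 1), with H = floor(x**theta)."""
    h = int(x ** theta)
    w = cmath.exp(2j * cmath.pi / m)
    total = sum(w ** big_omega(n) for n in range(x, x + h + 1))
    return abs(total) / (h + 1)

# Compare a window below the reported threshold with one near the suggested optimum.
for theta in (0.5, 0.8):
    print(theta, short_interval_mean(10**5, theta, 3))
```

A single window is noisy by nature, so any threshold behavior would only show up after averaging over many values of x.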

These results may have implications for prime number races and probabilistic number theory.


Publication Timeline

| Period | Activity |
|---|---|
| Feb-Mar 2025 | Initial observation during baryogenesis tool development |
| Apr-Jun 2025 | Computational verification across moduli m = 3, 4, 5, 6 |
| Jul 2025 | Literature review; Selberg-Delange connection identified |
| Aug 2025 | Pivot decision → standalone publication |
| Sep 2025 | Code refactoring, figure generation |
| Oct 1-10 | AI-assisted peer review (ChatGPT, Claude) |
| Oct 11-20 | Final fact-checking, arXiv package prep |
| Oct 24, 2025 | Published to Zenodo with DOI |

Total time: 8 months from observation to publication

Total investment: ~150 hours

Output:

  • 21-page paper
  • Full codebase (GitHub: omega-mod-m)
  • 8 publication-quality figures
  • Interactive web tool (Prime Factor Distribution Explorer)


Why Zenodo?

Zenodo was chosen for:

  • Immediate DOI assignment and citeability
  • Open access without article processing charges
  • Flexibility for independent researchers (no institutional affiliation needed)
  • Integration with arXiv-style preprint workflow

Early impact (as of November 2025):

  • Impact is being tracked through Zenodo download and view counts
  • First portfolio piece demonstrating publication capability
  • Interactive tool available for exploration and education


Value of the Detour

What It Gave the Baryogenesis Research

  1. Computational infrastructure: SPF sieve and Fourier analysis tools now used for testing tripartite operators

  2. Deeper intuition: Understanding how additive functions distribute informs expectations for complex observables

  3. Credibility boost: Published mathematical result demonstrates rigor extends beyond speculative physics

Current baryogenesis work (November 2025):

  • Proving targeted non-annihilation theorems for F₃ tripartite sorting
  • Investigating 432-element group structure
  • Exploring discrete gauge theory connections

What It Gave My Skill Set

Analytic number theory: Deep familiarity with Selberg-Delange method, additive function asymptotics

Computational rigor: Best practices for numerical verification, error quantification, reproducibility

Mathematical writing: Adapting technical content for different audiences

Strategic research management: Balancing immediate opportunities against long-term commitments


The Publication Efficiency Story

Timeline breakdown:

| Phase | Duration | Percentage |
|---|---|---|
| Observation & verification | 6 weeks | 23% |
| Literature review | 3 weeks | 12% |
| Computational experiments | 8 weeks | 31% |
| Writing & figures | 5 weeks | 19% |
| AI peer review | 2 weeks | 8% |
| Publication logistics | 2 weeks | 8% |
| Total | 26 weeks | 100% |

Efficiency factors:

  • Reused codebase: Factorization tools from baryogenesis work (saved ~3 weeks)
  • AI-assisted review: ChatGPT/Claude rapid feedback (saved ~2 weeks vs. traditional peer review)
  • Zenodo workflow: Direct publication without journal overhead (saved ~6-12 months)
  • Evening/weekend work: Parallel execution minimized disruption

Cost-benefit for independent researcher:

  • Investment: ~150 hours over 8 months
  • Output: First-author publication, full reproducible codebase
  • Timeline: Observation → published paper in one semester

This represents efficient use of limited time when building a publication record outside traditional academia.


Lessons in Opportunistic Research Management

How to Recognize Publishable Surprises

Not every unexpected observation warrants pursuit. What elevated this case:

  1. Generality: Pattern held across multiple moduli → suggests deep structure
  2. Computational verifiability: Rigorous testing possible with available resources
  3. Self-containment: Publishable independent of motivating context
  4. Literature gap: No existing computational verification of these constants

How to Execute Without Derailing Main Research

Keys to completing in 6 months without excessive disruption:

  1. Clear scope: Computational verification only, defer deep analytic proofs
  2. Existing infrastructure: Leveraged code already written for other purposes
  3. Parallel work: Writing during computational runs (evenings/weekends)
  4. Realistic timeline: Set 3-month target, achieved in 2.5 months

When to Return to Main Work

After publication, I immediately resumed baryogenesis work instead of opening new side threads. Possible follow-ups were triaged as follows:

  • ✅ ω(n) extension (distinct prime factors) → Completed as 3-page appendix
  • ❌ Investigating multiplicative functions μ(n) → Deferred
  • ❌ Developing pedagogical materials → Deferred

Discipline: Bounded excursions prevent scope creep while allowing valuable side projects.


Cross-Domain Thinking at Work

The discovery demonstrates value of working at disciplinary boundaries:

Physics → Mathematics

  • Physics intuition: Matter/antimatter asymmetry motivated exploring mod 3 structure
  • Mathematical formalism: Fourier analysis provided tools to characterize patterns
  • Computational verification: Bootstrap resampling validated theoretical predictions

Adaptability Required

Successfully publishing required:

  • Audience shift: Particle physicists → number theorists
  • Presentation adjustment: Emphasize computational methods over physical motivation
  • Standard adherence: Open-science practices (reproducible code, open data)

Strategic Positioning

The omega-mod-m project occupies a valuable niche:

  • Not deep analytic number theory: Doesn’t prove new theorems about primes
  • Not physics: Makes no claims about particle physics/cosmology
  • Computational contribution: First systematic numerical verification of Selberg-Delange predictions with explicit constants

This interdisciplinary position is valuable for independent researchers who can move fluidly between domains.


The Math (Simplified)

For those curious about what was actually discovered:

The Prime Factor Count Function

Ω(n) counts prime factors with multiplicity:

  • Ω(12) = Ω(2² · 3) = 3 (counting 2, 2, 3)
  • Ω(30) = Ω(2 · 3 · 5) = 3 (counting 2, 3, 5)

The Question

When you look at all prime factors of integers from 1 to N, how do they distribute across residue classes modulo m?

Naive expectation: Uniformly (Dirichlet’s theorem says primes are equidistributed)

Reality: Counting with multiplicity creates bias

The Decay Law

The imbalance decreases as:

\[\frac{|S(x)|}{x} \sim C_m (\log x)^{\alpha_m}\]

where:

  • α_m = cos(2π/m) - 1 (theoretically predicted)
  • C_m = constants defined by Euler products, which I tabulated numerically for the first time

For m=3: α₃ = -3/2, and C₃ ≈ 1.708456…

What I Verified

  1. The theoretical predictions, computed to 6 decimals, match the empirical estimates within their uncertainties
  2. The decay law holds for accessible N (10⁶ - 10⁸), not just asymptotically
  3. Short intervals show threshold behavior at H ~ x^0.6
  4. The constants can be computed via convergent Euler products
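For m = 4 the Euler product can be checked against a closed form. The sketch below assumes the constant has the usual Selberg-Delange shape C_m = |G(ω)| / |Γ(ω)| with G(z) = Π_p (1 - z/p)^(-1) (1 - 1/p)^z; this post does not state the formula, so treat that as my assumption. With ω = i, each factor of |G(i)| is (1 + 1/p²)^(-1/2), the full product equals π/√15 because Π_p (1 + p⁻²) = ζ(2)/ζ(4) = 15/π², and |Γ(i)|² = π/sinh(π). Under that assumption the result matches the tabulated theoretical value for m = 4.

```python
import math

def primes_up_to(limit):
    """Sieve of Eratosthenes returning all primes <= limit."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = False
    return [n for n, is_p in enumerate(sieve) if is_p]

def euler_factor_modulus(limit):
    """Truncated product over p <= limit of (1 + 1/p**2) ** (-1/2), i.e. |G(i)| truncated."""
    log_product = sum(-0.5 * math.log1p(1.0 / (p * p)) for p in primes_up_to(limit))
    return math.exp(log_product)

abs_G = math.pi / math.sqrt(15)                          # exact value of |G(i)|
inv_abs_gamma = math.sqrt(math.sinh(math.pi) / math.pi)  # 1 / |Gamma(i)|
c4 = abs_G * inv_abs_gamma                               # = sqrt(pi * sinh(pi) / 15)

print(round(c4, 6))                        # 1.555237
print(euler_factor_modulus(10_000), abs_G) # truncated product vs. its limit
```

The truncated product converges quickly because the neglected tail is of order 1/(P log P) for a cutoff P. Other moduli have no comparably simple closed form and would need a complex gamma function, which the Python standard library does not provide.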

Scope Definition: Keeping It Focused

To maintain feasibility, strict boundaries were set:

Pure number theory presentation: No mention of baryogenesis, antimatter, or ternary sorting in main text

Computational focus: Emphasize verification and reproducibility over deep analytic proofs

Acknowledgment only: Brief note that work originated during physics research

Page limit: 15-20 pages for readability and focus

This framing made the work accessible to mathematicians, physicists, and computer scientists alike.


What the Paper Demonstrates

For Mathematics

  • First tabulated values of Selberg-Delange constants C_m
  • Novel dyadic shell regression technique
  • Bootstrap methods for asymptotic constant estimation
  • Reproducible computational framework

For Portfolio

Ability to:

  • Recognize unexpected findings worthy of independent pursuit
  • Execute rapid publication while maintaining rigor
  • Adapt presentation to different mathematical audiences
  • Manage multiple research threads simultaneously

Skills demonstrated:

  • Analytic number theory (Euler products, asymptotic methods)
  • Computational mathematics (numerical verification, error quantification)
  • Statistical methods (bootstrap, confidence intervals)
  • Software engineering (reproducible code, documentation)
  • Scientific writing (mathematical exposition, audience adaptation)


Return to Baryogenesis: The Detour Strengthened the Main Path

Following publication on October 24, 2025, work on ternary structures resumed with enhanced foundation:

Infrastructure: The computational tools built for omega-mod-m are now core components of the baryogenesis verification pipeline

Confidence: Successfully publishing rigorous mathematics validated the research approach

Network effect: The paper reached computational number theorists who might never have seen baryogenesis work, creating unexpected connections

Portfolio breadth: Demonstrates versatility beyond single-domain expertise


The Psychological Benefit

Here’s something not mentioned in the paper: finishing a well-defined, bounded project provided momentum during the longer, more uncertain baryogenesis program.

When you’re working on speculative theoretical physics with uncertain experimental prospects, having a concrete publication—a DOI, a GitHub repo with stars, an interactive tool people actually use—provides tangible evidence of progress.


What I’d Do Differently

Looking back over the 8 months of the project:

What worked:

  • Clear scope boundaries (computational only, defer deep theory)
  • AI-assisted review (fast iteration, caught errors early)
  • Public reproducibility (GitHub + Zenodo workflow)
  • Parallel execution (evenings/weekends, didn’t block main work)

What I’d change:

  • Start the literature review EARLIER (I almost reinvented known theory)
  • Set up the GitHub repo at the BEGINNING (not after 4 months of local work)
  • Create the interactive tool FIRST (would have guided paper development)
  • Reach out to number theory community sooner (got valuable feedback only in Oct)

The meta-lesson: Even successful projects have inefficiencies. Document them for next time.


The Bigger Picture: Independent Research in 2025

This case study illustrates what’s now possible for independent researchers:

What you CAN do (2025 tools):

  • Rigorous computational mathematics
  • AI-assisted peer review and fact-checking
  • Immediate publication (Zenodo, arXiv)
  • Open-source dissemination
  • Building credibility paper-by-paper

What’s still HARD:

  • No institutional credibility initially
  • No built-in collaborator network
  • Self-funded (time = money)
  • Higher bar to prove rigor without university affiliation

But the barriers are lower than they were in 2015. Tools matter. AI assistance matters. Open science infrastructure matters.

For career changers (like me, coming from project management into data science/research):

  • You can produce publication-quality work in months, not years
  • You can learn by doing real research, not just tutorials
  • You can build a portfolio that demonstrates actual capabilities
  • You don’t need a PhD to contribute rigorously to mathematics

Current Status & Impact

As of November 2025:

  • Published and citeable
  • Part of broader 5-paper portfolio demonstrating research breadth
  • Foundation for ongoing baryogenesis computational work
  • Example case study in opportunistic research management

The Takeaway

Research productivity requires both:

  • Planned systematic investigation (the baryogenesis program)
  • Opportunistic pursuit of emergent findings (omega-mod-m)

The omega-mod-m case exemplifies the latter: seeing an unexpected pattern, recognizing its value, executing efficiently, and returning to primary work with enhanced capability and broader portfolio.

Key principle: Not every anomaly is worth pursuing. But when you find one that’s:

  • Generalizable
  • Computationally verifiable
  • Self-contained
  • Filling a literature gap

…you pursue it rigorously, publish it quickly, and move on stronger.

The baryogenesis program continues. The omega-mod-m detour strengthened it. And the journey from “wait, what?” to “here’s the paper” took one semester of focused work.

That’s the story of how a suspected bug became my first publication. 🎯


Oksana Sudoma

Independent Researcher