The Research History of Omega Mod M: When a Bug Becomes a Paper
Research trajectories rarely follow linear paths. This is the story of an unexpected discovery made during baryogenesis research—how a suspected coding error turned into a published number theory paper.
The Context: Why Was I Studying Prime Factors Anyway?
2024-early 2025: I was investigating ternary structures for understanding matter-antimatter asymmetry. The core idea: maybe the universe has three sectors (matter, antimatter, buffer) operating on mod 3 arithmetic, and this somehow prevents complete annihilation.
To test this computationally, I needed tools that understood how integers distribute across residue classes modulo 3. One foundational check: verify that prime factors of random integers distribute uniformly across mod 3 equivalence classes.
Spoiler: They don’t.
The “Wait, What?” Moment
Early 2025, during development of a ternary sorting verification tool, a routine diagnostic check flagged an anomaly:
Testing ternary sorting algorithm...
Checking prime factor distribution mod 3:
Factors ≡ 1 (mod 3): 67.2% of occurrences
Factors ≡ 2 (mod 3): 32.8% of occurrences
Expected ratio: 1:1
Observed ratio: 2.05:1
First reaction: Is this a bug in my prime factorization routine?
The Investigation: Is This Real?
Standard debugging procedures kicked in:
1. Code review: Manual inspection of factorization algorithm → No errors found
2. Expand sample size: Increased from 10⁴ to 10⁶ integers → Ratio persisted at 2.05:1
3. Rewrite from scratch: Different algorithm, same result → Still 2:1 imbalance
4. Hand computation: Small cases (n ≤ 30) verified manually → Pattern confirmed
The imbalance was genuine. For example:
- Ω(12) = Ω(2² · 3) = 3 factors: two are ≡ 2 (mod 3), one is 3 ≡ 0
- Ω(28) = Ω(2² · 7) = 3 factors: two are ≡ 2, one is ≡ 1 (mod 3)
- Ω(15) = Ω(3 · 5) = 2 factors: one is ≡ 0, one is ≡ 2
Counting across all integers up to N showed persistent bias toward residue class 1.
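The hand checks above are easy to reproduce. Here is a minimal sketch, assuming plain trial division; it is not the code from the repository, and the names `prime_factors` and `residue_tally` are illustrative only:

```python
from collections import Counter


def prime_factors(n: int) -> list[int]:
    """Return the prime factors of n with multiplicity, by trial division."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:  # whatever is left is a single prime
        factors.append(n)
    return factors


def residue_tally(n: int, m: int = 3) -> Counter:
    """Count the prime factors of n (with multiplicity) by residue class mod m."""
    return Counter(p % m for p in prime_factors(n))


for n in (12, 28, 15):
    print(n, prime_factors(n), dict(residue_tally(n)))
```

Summing these tallies over a range of n gives the kind of per-class percentages shown in the diagnostic output.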
The “Is This Known?” Phase
Initial hypotheses for the asymmetry:
Dirichlet’s theorem artifact: Primes are equidistributed in residue classes, but Ω(n) counts with multiplicity—maybe higher powers favor certain residues?
Selberg-Delange influence: Classical analytic number theory provides asymptotic formulas for additive functions like Ω(n)—could these imply finite-size deviations?
Computational bias: Perhaps consecutive integer sampling introduced artifacts vs. random sampling?
These questions pointed to something deeper than a coding bug.
Testing Generalization: Does This Work for Other Moduli?
Natural question: Is this a mod 3 quirk, or universal?
Computational test (Spring-Summer 2025):
| Modulus m | Type | Expected | Observed | Pattern |
|---|---|---|---|---|
| 3 | Prime | 1:1 | 2.05:1 | Strong bias |
| 5 | Prime | Uniform | Non-uniform | Imbalanced |
| 7 | Prime | Uniform | Non-uniform | Imbalanced |
| 11 | Prime | Uniform | Non-uniform | Imbalanced |
| 6 | Composite | 1:1 | 1.15:1 | Weaker effect |
Discovery: The imbalance appeared for every prime modulus tested (3, 5, 7, 11), though specific ratios vary. This pointed to fundamental prime factorization structure, not a mod 3 accident.
The Literature Deep Dive
July 2025: Literature review to check if this was known.
What I found:
Selberg-Delange Method (1950s-60s)
Provides asymptotics for sums of the form \(\sum_{n \leq x} z^{\Omega(n)}\) with complex z, where Ω(n) is an additive function. Predicts Fourier coefficients should decay as:
\[|S(x)|/x \sim C_m (\log x)^{\alpha_m}\]
where:
- \(\alpha_m = \cos(2\pi/m) - 1\) (theoretically known)
- \(C_m\) = constants computable via Euler products (not tabulated!)
The gap I found: Theory existed, but nobody had:
1. Computed explicit numerical values of \(C_m\) to high precision
2. Verified the decay law at accessible finite x (not just x → ∞)
3. Investigated short-interval behavior [x, x+H]
4. Provided open-source reproducible implementation
The opportunity: Computational verification was an actionable contribution.
The Pivot Decision
August 2025: ~80 hours invested in omega-mod-m investigation:
- Algorithm development: 40 hours
- Literature review: 15 hours
- Computational verification: 20 hours
- Visualization: 5 hours
Decision point:
Option A: Document as technical appendix to future baryogenesis paper
- Pros: Maintains connection to original motivation
- Cons: Delays dissemination, dilutes audience
Option B: Publish standalone mathematical paper
- Pros: Reaches appropriate audience, demonstrates breadth
- Cons: Requires pausing baryogenesis work
I chose Option B because:
- Self-contained result: The omega-mod-m pattern requires no physics context
- Fast execution: Computational verification done, could publish in 2-3 weeks
- Portfolio value: Demonstrates recognizing unexpected findings and executing rapid publication
- Definitive foundation: Citeable reference for future work using these tools
- Priority: Risk of being scooped—pattern is discoverable by anyone studying Ω(n) distributions
The Computational Approach
Methodology (February-August 2025):
1. Sieve-Based Factorization
- Smallest prime factor (SPF) sieve preprocessing
- Enables O(log n) factorization per integer
- Efficient computation up to N = 10⁸
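A minimal sketch of the SPF idea, in pure Python for readability. It is illustrative only and far too slow for N = 10⁸; the function names are my own here, not the repository's API:

```python
def build_spf(limit: int) -> list[int]:
    """spf[n] = smallest prime factor of n, for 2 <= n <= limit."""
    spf = list(range(limit + 1))
    p = 2
    while p * p <= limit:
        if spf[p] == p:  # p has no smaller factor, so it is prime
            for multiple in range(p * p, limit + 1, p):
                if spf[multiple] == multiple:
                    spf[multiple] = p
        p += 1
    return spf


def factorize(n: int, spf: list[int]) -> list[int]:
    """Prime factors of n with multiplicity; each step divides out spf[n]."""
    factors = []
    while n > 1:
        factors.append(spf[n])
        n //= spf[n]
    return factors


def big_omega(n: int, spf: list[int]) -> int:
    """Omega(n): number of prime factors counted with multiplicity."""
    return len(factorize(n, spf))


spf = build_spf(100)
print(factorize(84, spf), big_omega(84, spf))
```

Each division removes a factor of at least 2, so the loop in `factorize` runs at most log₂(n) times, which is where the O(log n) bound comes from.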
2. Fourier Analysis
Represent residue class imbalances via discrete Fourier coefficients:
\[S(x) = \sum_{n \leq x} \omega^{\Omega(n)}, \quad \omega = e^{2\pi i/m}\]
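The sum above can be evaluated directly once Ω(n) is tabulated. A small sketch, assuming a pure-Python sieve and illustrative names (`omega_table`, `fourier_sum`) that are not taken from the repository:

```python
import cmath
import math


def omega_table(limit: int) -> list[int]:
    """big_omega[n] = number of prime factors of n, counted with multiplicity."""
    spf = list(range(limit + 1))  # smallest-prime-factor sieve
    for p in range(2, math.isqrt(limit) + 1):
        if spf[p] == p:
            for multiple in range(p * p, limit + 1, p):
                if spf[multiple] == multiple:
                    spf[multiple] = p
    big_omega = [0] * (limit + 1)
    for n in range(2, limit + 1):
        # Omega is additive: Omega(n) = Omega(n / p) + 1 for a prime p dividing n
        big_omega[n] = big_omega[n // spf[n]] + 1
    return big_omega


def fourier_sum(x: int, m: int) -> complex:
    """S(x) = sum over n <= x of omega**Omega(n), with omega = exp(2*pi*i/m)."""
    big_omega = omega_table(x)
    roots = [cmath.exp(2j * cmath.pi * r / m) for r in range(m)]
    return sum(roots[big_omega[n] % m] for n in range(1, x + 1))


x = 10**5
print(abs(fourier_sum(x, 3)) / x)
```

For m = 2 the summand is the Liouville function, which makes a convenient sanity check against known values.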
3. Dyadic Shell Regression
- Fit power-law decay on intervals [2^k, 2^(k+1)]
- Reduces autocorrelation bias from cumulative sums
- More accurate exponent estimates than naive least-squares
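One way such a shell-based fit could look, as a sketch under stated assumptions: each shell contributes one point (geometric midpoint, |shell mean|), shells are taken half-open as [2^k, 2^(k+1)) so no integer is counted twice, and the exponent comes from ordinary least squares of log y on log log x. The paper's exact procedure may differ, and the function names are hypothetical:

```python
import math


def shell_points(terms, k_min, k_max):
    """One (x, y) point per dyadic shell [2^k, 2^(k+1)).

    terms[n] is the n-th summand (assumed here to be omega**Omega(n));
    x is the shell's geometric midpoint and y is |shell mean|.
    """
    points = []
    for k in range(k_min, k_max + 1):
        lo, hi = 2 ** k, 2 ** (k + 1)
        shell_sum = sum(terms[lo:hi])
        points.append((2 ** (k + 0.5), abs(shell_sum) / (hi - lo)))
    return points


def fit_log_power_law(points):
    """Least-squares fit of y = C * (log x)**alpha; returns (alpha, C)."""
    us = [math.log(math.log(x)) for x, _ in points]
    vs = [math.log(y) for _, y in points]
    n = len(points)
    u_mean, v_mean = sum(us) / n, sum(vs) / n
    sxx = sum((u - u_mean) ** 2 for u in us)
    sxy = sum((u - u_mean) * (v - v_mean) for u, v in zip(us, vs))
    alpha = sxy / sxx
    return alpha, math.exp(v_mean - alpha * u_mean)


# Synthetic check: exact power-law data should be recovered by the fit.
synthetic = [(2 ** (k + 0.5), 1.7 * math.log(2 ** (k + 0.5)) ** -1.5)
             for k in range(10, 27)]
print(fit_log_power_law(synthetic))
```

Because every shell uses disjoint integers, the points are far less correlated than successive values of the cumulative sum S(x).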
4. Bootstrap Uncertainty Quantification
- 1000-sample resampling (seed=42)
- 95% confidence intervals
- First application to Selberg-Delange constant estimation
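A percentile bootstrap with these settings can be sketched as follows. This assumes the resampled units are per-shell estimates of the constant, which is my simplification for illustration; the data values below are made up for the demo and are not results from the paper:

```python
import random


def mean(values):
    return sum(values) / len(values)


def bootstrap_ci(data, statistic, n_resamples=1000, seed=42, level=0.95):
    """Percentile bootstrap confidence interval for statistic(data)."""
    rng = random.Random(seed)  # fixed seed keeps the interval reproducible
    estimates = []
    for _ in range(n_resamples):
        resample = [rng.choice(data) for _ in data]  # draw with replacement
        estimates.append(statistic(resample))
    estimates.sort()
    tail = (1.0 - level) / 2.0
    lower = estimates[int(tail * n_resamples)]
    upper = estimates[int((1.0 - tail) * n_resamples) - 1]
    return lower, upper


# Hypothetical per-shell estimates, for demonstration only.
demo = [1.70, 1.72, 1.69, 1.71, 1.73, 1.68, 1.70, 1.71]
print(mean(demo), bootstrap_ci(demo, mean))
```

Seeding a private `random.Random` instance instead of the global generator means other code cannot disturb the resampling sequence.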
The Key Results
Main contribution: Explicit computation of Selberg-Delange constants \(C_m\) to 6 decimal places:
| m | α_m | C_m (theory) | C_m (empirical) | Agreement |
|---|---|---|---|---|
| 3 | -1.500 | 1.708456 | 1.708 ± 0.025 | Excellent |
| 4 | -1.000 | 1.555237 | 1.555 ± 0.020 | Excellent |
| 5 | -0.691 | 1.273375 | 1.273 ± 0.015 | Excellent |
| 6 | -0.500 | 1.117734 | 1.118 ± 0.012 | Excellent |
For mod 3 specifically: \[\alpha_3 = \cos(2\pi/3) - 1 = -3/2\]
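The theoretical values are computed to 6 decimals, and the empirical estimates agree with them to the 3 decimals shown, well inside the bootstrap intervals. This supports both the theoretical prediction and the computational method.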
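The α_m column follows directly from the formula α_m = cos(2π/m) − 1. A quick check (the function name `decay_exponent` is just for illustration):

```python
import math


def decay_exponent(m: int) -> float:
    """alpha_m = cos(2*pi/m) - 1, the exponent of log x in the decay law."""
    return math.cos(2 * math.pi / m) - 1


for m in (3, 4, 5, 6):
    print(m, round(decay_exponent(m), 3))
```

The exponent is negative for every m ≥ 2 and closest to zero for large m, so the imbalance decays most slowly for large moduli.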
Computational Innovation
Dyadic shell regression: Novel application reducing bias from autocorrelation in cumulative sums
Bootstrap uncertainty: First systematic application to these constants
Reproducibility: All code, data, figures archived publicly (MIT/CC-BY licenses)
Short-Interval Extension
Extended analysis to intervals [x, x+H] revealed threshold behavior:
- H ≲ x^0.6: Noise dominates, no clear pattern
- H ≳ x^0.6: Decay law emerges with same exponent α_m
- Optimal: H ≈ x^0.8 balances sample size vs. asymptotic regime
This may have implications for prime number races and probabilistic number theory.
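A short-interval version of the Fourier sum can be sketched like this. Assumptions: the window is taken as x < n ≤ x + H so that it equals S(x+H) − S(x), the sieve covers the whole range up to x + H (a segmented sieve would be needed for large x), and `short_interval_sum` is an illustrative name only:

```python
import cmath
import math


def short_interval_sum(x: int, H: int, m: int) -> complex:
    """Sum of omega**Omega(n) over x < n <= x + H, omega = exp(2*pi*i/m)."""
    limit = x + H
    spf = list(range(limit + 1))  # smallest-prime-factor sieve
    for p in range(2, math.isqrt(limit) + 1):
        if spf[p] == p:
            for multiple in range(p * p, limit + 1, p):
                if spf[multiple] == multiple:
                    spf[multiple] = p
    big_omega = [0] * (limit + 1)
    for n in range(2, limit + 1):
        big_omega[n] = big_omega[n // spf[n]] + 1
    roots = [cmath.exp(2j * cmath.pi * r / m) for r in range(m)]
    return sum(roots[big_omega[n] % m] for n in range(x + 1, limit + 1))


x, m = 10**5, 3
for theta in (0.5, 0.6, 0.8):
    H = int(x ** theta)  # window length H = x**theta
    print(theta, H, abs(short_interval_sum(x, H, m)) / H)
```

Dividing by H rather than x makes the short-interval magnitudes comparable across window sizes.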
Publication Timeline
| Period | Activity |
|---|---|
| Feb-Mar 2025 | Initial observation during baryogenesis tool development |
| Apr-Jun 2025 | Computational verification across moduli m = 3, 4, 5, 6 |
| Jul 2025 | Literature review; Selberg-Delange connection identified |
| Aug 2025 | Pivot decision → standalone publication |
| Sep 2025 | Code refactoring, figure generation |
| Oct 1-10, 2025 | AI-assisted peer review (ChatGPT, Claude) |
| Oct 11-20, 2025 | Final fact-checking, arXiv package prep |
| Oct 24, 2025 | Published to Zenodo with DOI |
Total time: 8 months from observation to publication
Total investment: ~150 hours
Output:
- 21-page paper (Download PDF)
- Full codebase (GitHub: omega-mod-m)
- 8 publication-quality figures
- Interactive web tool (Prime Factor Distribution Explorer)
Why Zenodo?
Zenodo was chosen for:
- Immediate DOI assignment and citeability
- Open access without article processing charges
- Flexibility for independent researchers (no institutional affiliation needed)
- Integration with arXiv-style preprint workflow
Early impact (as of November 2025):
- Zenodo downloads and views are being tracked as an impact measure
- First portfolio piece demonstrating publication capability
- Interactive tool available for exploration and education
Value of the Detour
What It Gave the Baryogenesis Research
Computational infrastructure: SPF sieve and Fourier analysis tools now used for testing tripartite operators
Deeper intuition: Understanding how additive functions distribute informs expectations for complex observables
Credibility boost: Published mathematical result demonstrates rigor extends beyond speculative physics
Current baryogenesis work (November 2025):
- Proving targeted non-annihilation theorems for F₃ tripartite sorting
- Investigating 432-element group structure
- Exploring discrete gauge theory connections
What It Gave My Skill Set
Analytic number theory: Deep familiarity with Selberg-Delange method, additive function asymptotics
Computational rigor: Best practices for numerical verification, error quantification, reproducibility
Mathematical writing: Adapting technical content for different audiences
Strategic research management: Balancing immediate opportunities against long-term commitments
The Publication Efficiency Story
Timeline breakdown:
| Phase | Duration | Percentage |
|---|---|---|
| Observation & verification | 6 weeks | 23% |
| Literature review | 3 weeks | 12% |
| Computational experiments | 8 weeks | 31% |
| Writing & figures | 5 weeks | 19% |
| AI peer review | 2 weeks | 8% |
| Publication logistics | 2 weeks | 8% |
| Total | 26 weeks | 100% |
Efficiency factors:
- Reused codebase: Factorization tools from baryogenesis work (saved ~3 weeks)
- AI-assisted review: ChatGPT/Claude rapid feedback (saved ~2 weeks vs. traditional peer review)
- Zenodo workflow: Direct publication without journal overhead (saved ~6-12 months)
- Evening/weekend work: Parallel execution minimized disruption
Cost-benefit for independent researcher:
- Investment: ~150 hours over 8 months
- Output: First-author publication, full reproducible codebase
- Timeline: Observation → published paper in one semester
This represents efficient use of limited time when building a publication record outside traditional academia.
Lessons in Opportunistic Research Management
How to Recognize Publishable Surprises
Not every unexpected observation warrants pursuit. What elevated this case:
- Generality: Pattern held across multiple moduli → suggests deep structure
- Computational verifiability: Rigorous testing possible with available resources
- Self-containment: Publishable independent of motivating context
- Literature gap: No existing computational verification of these constants
How to Execute Without Derailing Main Research
Keys to completing in 6 months without excessive disruption:
- Clear scope: Computational verification only, defer deep analytic proofs
- Existing infrastructure: Leveraged code already written for other purposes
- Parallel work: Writing during computational runs (evenings/weekends)
- Realistic timeline: Set 3-month target, achieved in 2.5 months
When to Return to Main Work
After publication, I immediately resumed baryogenesis work and kept follow-up ideas bounded:
- ✅ ω(n) extension (distinct prime factors) → Completed as 3-page appendix
- ❌ Investigating multiplicative functions such as μ(n) → Deferred
- ❌ Developing pedagogical materials → Deferred
Discipline: Bounded excursions prevent scope creep while allowing valuable side projects.
Cross-Domain Thinking at Work
The discovery demonstrates value of working at disciplinary boundaries:
Physics → Mathematics
- Physics intuition: Matter/antimatter asymmetry motivated exploring mod 3 structure
- Mathematical formalism: Fourier analysis provided tools to characterize patterns
- Computational verification: Bootstrap resampling validated theoretical predictions
Adaptability Required
Successfully publishing required:
- Audience shift: Particle physicists → number theorists
- Presentation adjustment: Emphasize computational methods over physical motivation
- Standard adherence: Open-science practices (reproducible code, open data)
Strategic Positioning
The omega-mod-m project occupies a valuable niche:
- Not deep analytic number theory: Doesn’t prove new theorems about primes
- Not physics: Makes no claims about particle physics/cosmology
- Computational contribution: First systematic numerical verification of Selberg-Delange predictions with explicit constants
This interdisciplinary position is valuable for independent researchers who can move fluidly between domains.
The Math (Simplified)
For those curious about what was actually discovered:
The Prime Factor Count Function
Ω(n) counts prime factors with multiplicity:
- Ω(12) = Ω(2² · 3) = 3 (counting 2, 2, 3)
- Ω(30) = Ω(2 · 3 · 5) = 3 (counting 2, 3, 5)
The Question
When you look at all prime factors of integers from 1 to N, how do they distribute across residue classes modulo m?
Naive expectation: Uniformly (Dirichlet’s theorem says primes are equidistributed among the residue classes coprime to m)
Reality: Counting with multiplicity creates bias
The Decay Law
The imbalance decreases as:
\[\frac{|S(x)|}{x} \sim C_m (\log x)^{\alpha_m}\]
where:
- α_m = cos(2π/m) - 1 (theoretically predicted)
- C_m = mysterious constants I computed for the first time
For m=3: α₃ = -3/2, and C₃ ≈ 1.708456…
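To get a feel for the size of the effect, the leading-order prediction can be evaluated at a few values of x. A small sketch using the C₃ value quoted in this post; `predicted_magnitude` is an illustrative name and lower-order correction terms are ignored:

```python
import math


def predicted_magnitude(x: float, m: int, c_m: float) -> float:
    """Leading-order prediction |S(x)|/x ~ C_m * (log x)**alpha_m."""
    alpha = math.cos(2 * math.pi / m) - 1
    return c_m * math.log(x) ** alpha


C3 = 1.708456  # constant quoted in this post for m = 3
for x in (1e6, 1e7, 1e8):
    print(f"x = {x:.0e}: predicted |S(x)|/x = {predicted_magnitude(x, 3, C3):.4f}")
```

At x = 10⁶ this comes to roughly 0.033, so the imbalance is still a few percent at computationally accessible scales and shrinks only logarithmically.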
What I Verified
- The theoretical predictions (computed to 6 decimals) match the empirical estimates within bootstrap uncertainty
- The decay law holds for accessible N (10⁶ - 10⁸), not just asymptotically
- Short intervals show threshold behavior at H ~ x^0.6
- The constants can be computed via convergent Euler products
Scope Definition: Keeping It Focused
To maintain feasibility, strict boundaries were set:
✅ Pure number theory presentation: No mention of baryogenesis, antimatter, or ternary sorting in main text
✅ Computational focus: Emphasize verification and reproducibility over deep analytic proofs
✅ Acknowledgment only: Brief note that work originated during physics research
✅ Page limit: 15-20 pages for readability and focus
This framing made the work accessible to mathematicians, physicists, and computer scientists alike.
What the Paper Demonstrates
For Mathematics
- First tabulated values of Selberg-Delange constants C_m
- Novel dyadic shell regression technique
- Bootstrap methods for asymptotic constant estimation
- Reproducible computational framework
For Portfolio
Ability to:
- Recognize unexpected findings worthy of independent pursuit
- Execute rapid publication while maintaining rigor
- Adapt presentation to different mathematical audiences
- Manage multiple research threads simultaneously
Skills demonstrated:
- Analytic number theory (Euler products, asymptotic methods)
- Computational mathematics (numerical verification, error quantification)
- Statistical methods (bootstrap, confidence intervals)
- Software engineering (reproducible code, documentation)
- Scientific writing (mathematical exposition, audience adaptation)
Return to Baryogenesis: The Detour Strengthened the Main Path
Following publication on October 24, 2025, work on ternary structures resumed with enhanced foundation:
Infrastructure: The computational tools built for omega-mod-m are now core components of the baryogenesis verification pipeline
Confidence: Successfully publishing rigorous mathematics validated the research approach
Network effect: The paper reached computational number theorists who might never have seen baryogenesis work, creating unexpected connections
Portfolio breadth: Demonstrates versatility beyond single-domain expertise
The Psychological Benefit
Here’s something not mentioned in the paper: finishing a well-defined, bounded project provided momentum during the longer, more uncertain baryogenesis program.
When you’re working on speculative theoretical physics with uncertain experimental prospects, having a concrete publication—a DOI, a GitHub repo with stars, an interactive tool people actually use—provides tangible evidence of progress.
What I’d Do Differently
Looking back on the 8-month project:
What worked:
- Clear scope boundaries (computational only, defer deep theory)
- AI-assisted review (fast iteration, caught errors early)
- Public reproducibility (GitHub + Zenodo workflow)
- Parallel execution (evenings/weekends, didn’t block main work)
What I’d change:
- Start the literature review EARLIER (I almost reinvented known theory)
- Set up the GitHub repo at the BEGINNING (not after 4 months of local work)
- Create the interactive tool FIRST (would have guided paper development)
- Reach out to number theory community sooner (got valuable feedback only in Oct)
The meta-lesson: Even successful projects have inefficiencies. Document them for next time.
The Bigger Picture: Independent Research in 2025
This case study illustrates what’s now possible for independent researchers:
What you CAN do (2025 tools):
- Rigorous computational mathematics
- AI-assisted peer review and fact-checking
- Immediate publication (Zenodo, arXiv)
- Open-source dissemination
- Building credibility paper-by-paper
What’s still HARD:
- No institutional credibility initially
- No built-in collaborator network
- Self-funded (time = money)
- Higher bar to prove rigor without university affiliation
But the barriers are lower than they were in 2015. Tools matter. AI assistance matters. Open science infrastructure matters.
For career changers (like me, coming from project management into data science/research):
- You can produce publication-quality work in months, not years
- You can learn by doing real research, not just tutorials
- You can build a portfolio that demonstrates actual capabilities
- You don’t need a PhD to contribute rigorously to mathematics
Current Status & Impact
Publication details:
- Title: “Finite-Size Equidistribution of Ω(n) Modulo m: Theory and Computation”
- Author: Oksana Sudoma (2025)
- DOI: 10.5281/zenodo.17432403
- Repository: github.com/boonespacedog/omega-mod-m
- Interactive tool: Prime Factor Distribution Explorer
As of November 2025:
- Published and citeable
- Part of broader 5-paper portfolio demonstrating research breadth
- Foundation for ongoing baryogenesis computational work
- Example case study in opportunistic research management
The Takeaway
Research productivity requires both:
- Planned systematic investigation (the baryogenesis program)
- Opportunistic pursuit of emergent findings (omega-mod-m)
The omega-mod-m case exemplifies the latter: seeing an unexpected pattern, recognizing its value, executing efficiently, and returning to primary work with enhanced capability and broader portfolio.
Key principle: Not every anomaly is worth pursuing. But when you find one that’s:
- Generalizable
- Computationally verifiable
- Self-contained
- Filling a literature gap
…you pursue it rigorously, publish it quickly, and move on stronger.
The baryogenesis program continues. The omega-mod-m detour strengthened it. And the journey from “wait, what?” to “here’s the paper” took one semester of focused work.
That’s the story of how a suspected bug became my first publication. 🎯
Oksana Sudoma
Independent Researcher