ENGINEERING NOTES & FIELD REPORTS

[2026-05-01] TAGS: devops,gitops,cicd,self-hosted

GITOPS PIPELINE SOVEREIGNTY

Every SaaS CI/CD service is a dependency you can’t control, a billing surface you can’t predict, and a trust boundary you can’t verify. After a year of running a fully self-hosted pipeline, I’m convinced that pipeline sovereignty isn’t just an ideological choice — it’s an operational advantage.

What “Fully Self-Hosted” Means

The entire software delivery chain runs on owned infrastructure. Source control, CI/CD runners, container registry, artifact storage, deployment automation — none of it touches a third-party service. A GitHub outage doesn’t affect deployments. A Docker Hub rate limit doesn’t block builds. A Vercel pricing change doesn’t require architecture changes.

This isn’t about distrust of cloud providers. It’s about eliminating variables. When a deployment fails, the entire stack is observable — logs, metrics, network traces, all in infrastructure I control. There’s no “contact support and wait” step in the incident response playbook.

The Pipeline Architecture

The pipeline is two-stage by design. Stage one triggers on push to the main branch: lint, test, build container image, push to private registry with BuildKit inline cache layers. Cached layers mean that a code-only change (no dependency updates) rebuilds in seconds rather than minutes. Stage two triggers on a version tag: pull the built image on the production target, recreate containers with zero-downtime rolling strategy, prune old images.
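The two triggers described above can be sketched as a small routing function. This is a minimal illustration, assuming the default branch is named main and that version tags follow a vMAJOR.MINOR.PATCH pattern; neither detail is stated in the post, and the function name is hypothetical.

```python
import re
from typing import Optional

# Assumed conventions (not from the post): branch name and tag format.
MAIN_REF = "refs/heads/main"
VERSION_TAG = re.compile(r"^refs/tags/v\d+\.\d+\.\d+$")


def stage_for_ref(ref: str) -> Optional[str]:
    """Return the pipeline stage a pushed Git ref should trigger."""
    if ref == MAIN_REF:
        return "build"   # lint, test, build image, push to registry
    if VERSION_TAG.match(ref):
        return "deploy"  # pull image on target, recreate, prune
    return None          # any other ref triggers nothing
```

Keeping the decision in one place makes the "deploy only on version tags" rule easy to test separately from the runner configuration.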

Production servers are stateless. They hold no source code, no build tools, no development dependencies. They pull containers and run them. Server replacement is a three-step process: provision, configure Docker, point at the registry.

What I Gave Up

Convenience. SaaS pipelines handle runner provisioning, secret management UIs, and marketplace integrations. Self-hosted means maintaining runners, managing secrets through environment configuration, and building integrations from scratch.

But the trade-off sharpens focus. Every integration is intentional. Every secret has a known location. Every runner has a known resource profile. There’s no sprawl of marketplace actions with unknown security postures executing code in the build environment.

The Economics

The infrastructure cost of running the pipeline is marginal — it shares compute with other workloads. The human cost is real but bounded: initial setup took a weekend, and maintenance is approximately an hour per month (runner updates, registry cleanup, certificate rotation). Compare that to a SaaS CI/CD bill that scales with usage and a vendor relationship that requires contract negotiation for enterprise features.

For a small team or a solo engineer running multiple products, pipeline sovereignty pays for itself in the first quarter. The breakeven isn’t just financial — it’s operational. When everything runs on your infrastructure, you develop a deep understanding of the system that no managed service can provide.

[2026-05-15] TAGS: security,machine-learning,biometrics,research

BEHAVIORAL BIOMETRICS FOR BOT DETECTION

CAPTCHAs are a tax on legitimate users. Rate limiting is a blunt instrument. The question I wanted to answer was whether it’s possible to distinguish humans from bots using behavioral signals alone — without interrupting the user’s experience.

The Behavioral Signal Space

Human motor control produces characteristic patterns that are extremely difficult to simulate convincingly. I focused on three signal families:

Mouse dynamics: Fitts’s Law predicts that the time a human needs to move the cursor to a target grows logarithmically with the ratio of target distance to target width. Humans overshoot small targets, correct course with micro-adjustments, and exhibit acceleration curves that follow well-studied biomechanical models. Bots typically move in straight lines at constant velocity, or add Gaussian noise that doesn’t match real motor control distributions.

Keystroke timing: The inter-key interval between specific character pairs (digraphs) is consistent for an individual but varies widely across a population. A human typing “th” produces a different timing signature than “qz” because of finger adjacency on the keyboard. Bots either type at constant speed or add random delays that don’t correlate with character pair difficulty.

Motor control jerk: Jerk is the time derivative of acceleration (the third derivative of position), and human movement minimizes it (the “minimum jerk model” from motor neuroscience). Real cursor paths are smooth in their third derivative; synthetic paths almost never are. This signal alone is a strong discriminator because it’s hard to fake without modeling the underlying biomechanics.
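The three signal families above can each be reduced to a small feature function. The sketch below is illustrative only: it assumes the Shannon form of Fitts’s Law, evenly sampled one-dimensional cursor positions, and key events given as (character, timestamp) pairs. All function names and the default constants a and b are hypothetical, not taken from the actual feature extractor.

```python
import math
from collections import defaultdict


def fitts_movement_time(distance, width, a=0.1, b=0.15):
    """Predicted movement time in seconds (Shannon form of Fitts's Law).

    a and b are per-user constants fitted from data; the defaults are
    placeholders, not measured values.
    """
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    return a + b * math.log2(distance / width + 1)


def digraph_intervals(key_events):
    """Group inter-key intervals by character pair.

    key_events: list of (character, press_time_ms) in typing order.
    """
    intervals = defaultdict(list)
    for (first, t1), (second, t2) in zip(key_events, key_events[1:]):
        intervals[first + second].append(t2 - t1)
    return dict(intervals)


def _differences(values, dt):
    """Forward finite difference of evenly sampled values."""
    return [(b - a) / dt for a, b in zip(values, values[1:])]


def mean_squared_jerk(positions, dt):
    """Mean squared jerk of evenly sampled 1-D cursor positions."""
    jerk = _differences(_differences(_differences(positions, dt), dt), dt)
    if not jerk:
        raise ValueError("need at least four samples")
    return sum(j * j for j in jerk) / len(jerk)


def minimum_jerk_path(steps):
    """Unit-distance minimum-jerk profile: 10t^3 - 15t^4 + 6t^5."""
    return [10 * t**3 - 15 * t**4 + 6 * t**5
            for t in (i / steps for i in range(steps + 1))]


def constant_velocity_path(steps):
    """Unit-distance straight ramp, the naive bot movement."""
    return [i / steps for i in range(steps + 1)]


def with_rest(path, pad=3):
    """Hold the cursor still before and after the movement."""
    return [path[0]] * pad + path + [path[-1]] * pad
```

Comparing `mean_squared_jerk(with_rest(minimum_jerk_path(20)), 0.05)` with the same call on `constant_velocity_path(20)` shows why the abrupt start and stop of a constant-velocity movement stand out: the jerk is concentrated at the two ends of the ramp, while the minimum-jerk profile spreads it smoothly.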

The Classifier

A Random Forest trained on 47 features extracted from these three signal families achieves strong separation between human and automated sessions. The model is deliberately kept simple — ensemble methods over hand-crafted features rather than deep learning on raw signals. This makes the model interpretable (I can explain exactly which features triggered a classification) and fast enough to run in-session without noticeable latency.

Privacy by Design

All behavioral signals are processed in-session and never stored as personally identifiable information. The system computes feature vectors from raw signals, feeds them to the classifier, and discards the raw data. Feature vectors are aggregated and anonymized before entering the training pipeline. The design is FADP-compliant by construction, not by afterthought.

What It Doesn’t Solve

Sophisticated adversaries with access to the feature list could build bots that approximate human motor control distributions. The defense against this is continuous model retraining and feature rotation — a signal that’s public for six months gets replaced by one extracted from a different behavioral dimension. It’s an arms race by design, but one where the defender has a significant advantage: real behavioral data is free and abundant, while synthetic behavioral data requires deep domain expertise to produce.

[2026-05-28] TAGS: security,networking,wireguard,dns

ZERO-TRUST NETWORKING AT SCALE

When I first implemented zero-trust networking, it was a mesh VPN with an identity provider. That solved the authentication problem but left gaps in service discovery, threat detection, and automated response. Over the past several months, the network plane has grown into something significantly more capable.

The Foundation: Encrypted Mesh

Every node in the infrastructure communicates through WireGuard encrypted tunnels. There’s no “trusted network” — even nodes sitting on the same physical switch communicate over encrypted channels. Identity is cryptographic, not network-based. A service doesn’t trust another service because it’s on the same subnet; it trusts it because it can verify its identity through a key exchange.

This eliminated an entire class of lateral movement attacks. Compromising one node doesn’t grant access to the network — it grants access to one encrypted endpoint with limited routing permissions.

CoreDNS for Service Discovery

Static IP-based service routing doesn’t scale when services move between hosts. CoreDNS provides internal service discovery with split-horizon resolution: internal queries resolve to mesh addresses, external queries resolve normally. This means services reference each other by name, not address, and redeployment doesn’t require updating every dependent service’s configuration.
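In practice this logic lives in the CoreDNS configuration, but the split-horizon decision itself is simple enough to sketch. The example below assumes one reading of “split-horizon”: the answer depends on whether the client is inside the mesh. The network range, hostnames, and addresses are hypothetical placeholders.

```python
import ipaddress

# Hypothetical values for illustration; none of these come from the post.
MESH_NETWORK = ipaddress.ip_network("100.64.0.0/10")
INTERNAL_RECORDS = {"registry.internal.example": "100.64.0.12"}
EXTERNAL_RECORDS = {"registry.internal.example": "203.0.113.10"}


def resolve(name, client_ip):
    """Answer from the internal view for mesh clients, external otherwise."""
    client = ipaddress.ip_address(client_ip)
    view = INTERNAL_RECORDS if client in MESH_NETWORK else EXTERNAL_RECORDS
    return view.get(name)  # None stands in for "no such record"
```

Because callers only ever use the name, moving the registry to another host means changing one record rather than every dependent service’s configuration.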

The DNS layer also provides a natural point for policy enforcement. Queries for known-malicious domains get blocked at the resolver level. DNS query logs feed into the security operations plane as an early indicator of compromise — an internal service resolving a cryptocurrency mining pool is an immediate alert.

Intrusion Detection and Response

Network-level intrusion detection monitors traffic patterns across the mesh. Signature-based detection handles known attack patterns. Anomaly detection flags unexpected traffic flows — a database node initiating outbound connections to unfamiliar destinations, or a web server suddenly generating DNS queries at 10x its normal rate.

When the IDS triggers, the response is automated through a SOAR (Security Orchestration, Automation, and Response) pattern. Low-confidence alerts create tickets. Medium-confidence alerts isolate the suspicious node’s outbound traffic while preserving inbound monitoring. High-confidence alerts trigger full network isolation and page the on-call operator.
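The three response tiers map naturally onto a threshold function. This is a minimal sketch, assuming the IDS emits a confidence score between 0 and 1; the threshold values and names are assumptions chosen for illustration, not the values used in production.

```python
from enum import Enum


class Action(Enum):
    TICKET = "create ticket"
    ISOLATE_OUTBOUND = "block outbound, keep inbound monitoring"
    FULL_ISOLATION = "isolate node and page on-call"


# Assumed thresholds; the post does not state the real cut-offs.
MEDIUM_THRESHOLD = 0.5
HIGH_THRESHOLD = 0.9


def respond(confidence: float) -> Action:
    """Pick the automated response for an alert confidence score."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence >= HIGH_THRESHOLD:
        return Action.FULL_ISOLATION
    if confidence >= MEDIUM_THRESHOLD:
        return Action.ISOLATE_OUTBOUND
    return Action.TICKET
```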

Self-Healing

The mesh is designed to reconverge after failure. If a node goes offline, traffic reroutes through surviving paths. If a tunnel degrades, the control plane renegotiates the connection. If the DNS resolver becomes unresponsive, nodes fall back to cached records with shortened TTLs.

The goal isn’t to prevent failure — it’s to make failure recoverable without human intervention for the common cases, and to make uncommon failures visible and well-instrumented for the cases that do need a human.

[2026-06-10] TAGS: fintech,tokens,microservices,architecture

TOKEN ECONOMICS AS INFRASTRUCTURE

Building a token system taught me that financial infrastructure is the hardest backend work I’ve done. Not because the math is complex — it’s addition and subtraction — but because the invariants are absolute. A ledger that’s wrong by one unit is completely wrong.

The Double-Entry Constraint

Every token movement in the system is recorded as an immutable debit-credit pair. If a member earns 10 reputation tokens, there’s a debit from the system reserve and a credit to the member’s account. Balances are never stored directly; they’re always derived by summing the ledger. The balance is therefore a view, not a value, and as long as both legs of a pair are committed atomically, the system can’t “lose” tokens through a partial write.

The double-entry pattern isn’t novel — it’s centuries-old accounting. What surprised me was how naturally it maps to distributed systems design. Every entry is append-only (immutable), every state is derived (no mutable cache to invalidate), and every discrepancy is traceable to a specific transaction pair.
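A minimal in-memory sketch of the pattern, assuming integer token amounts and a sign convention where a debit is negative and a credit is positive. The class and account names are hypothetical; a real ledger would persist both legs in a single database transaction.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Entry:
    account: str
    amount: int  # negative = debit, positive = credit (sketch convention)


class Ledger:
    def __init__(self) -> None:
        self._entries: List[Entry] = []  # append-only, never updated

    def transfer(self, source: str, destination: str, amount: int) -> None:
        """Record one movement as a debit-credit pair."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        # Both legs are appended together so the pair is never split.
        self._entries.extend(
            [Entry(source, -amount), Entry(destination, amount)]
        )

    def balance(self, account: str) -> int:
        """Derive a balance by summing the ledger; nothing is cached."""
        return sum(e.amount for e in self._entries if e.account == account)

    def total(self) -> int:
        """Sum of all entries; zero whenever every pair is complete."""
        return sum(e.amount for e in self._entries)
```

The `total()` check is the invariant in code form: if it ever returns anything other than zero, a pair was written incompletely and the offending entry can be located by scanning the log.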

Two Token Types

The system uses two distinct token types. Reputation tokens are soulbound — they can’t be transferred between members. You earn them through verified contributions: shipping code, completing reviews, mentoring other members. They represent your standing in the cooperative, and they can never be bought or sold. Utility tokens are transferable. They function as the ecosystem’s internal medium of exchange — payment for services, marketplace transactions, resource allocation.

This separation solves a problem I saw in every single-token system I studied: when the same token represents both status and currency, people accumulate status by hoarding rather than contributing. Splitting the two makes gaming the reputation system require actual work.

Three Layers

The architecture has three explicit layers. The Truth layer is the immutable ledger — append-only, cryptographically chained, no updates or deletes. The Policy layer sits above it and implements configurable business rules: earning rates, transfer limits, cooling periods, anti-gaming thresholds. Policies can change without touching the ledger. The Operations layer handles human intervention — dispute resolution, manual adjustments, edge cases that no policy anticipated.
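“Cryptographically chained” can be illustrated with a simple hash chain, where each record stores the hash of its predecessor. This is a sketch under assumptions: SHA-256 over a canonical JSON encoding, an all-zero genesis hash, and hypothetical function names. The post does not say which construction the Truth layer actually uses.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed starting value for an empty chain


def entry_hash(prev_hash, payload):
    """Hash the previous hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain, payload):
    """Append a record that commits to everything before it."""
    prev = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append(
        {"payload": payload, "prev": prev, "hash": entry_hash(prev, payload)}
    )


def verify(chain):
    """Recompute every link; any edited or removed record breaks the chain."""
    prev = GENESIS_HASH
    for record in chain:
        expected = entry_hash(prev, record["payload"])
        if record["prev"] != prev or record["hash"] != expected:
            return False
        prev = record["hash"]
    return True
```

Because each hash covers the previous one, changing an old entry invalidates every record after it, which is what makes silent updates or deletes detectable.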

Negative Programming

The test suite is built around adversarial scenarios. What happens when two transfers execute simultaneously against the same balance? What if a member tries to earn reputation tokens from their own activity? What if the policy engine crashes mid-transaction? Every edge case I could think of — and several I couldn’t, until a colleague tried to break the system — became a test before it became a feature. The philosophy is simple: if the happy path works but the failure modes are untested, the system isn’t ready.

[2026-06-20] TAGS: security,architecture,defense-in-depth

MULTI-PLANE SECURITY ARCHITECTURE

I stopped thinking about security as a perimeter months ago. The mental model that replaced it is biological: security as an organism with specialized organs, each responsible for a different survival function. What emerged is a ten-plane architecture where each plane owns a distinct concern and communicates with its neighbors through well-defined interfaces.

The Ten Planes

Network handles encrypted mesh connectivity and DNS-level filtering. Control manages identity, authentication, and policy enforcement. Data covers encryption at rest, backup integrity, and access auditing. Application owns input validation, CSRF protection, CSP headers, and rate limiting. Compute isolates workloads and manages resource boundaries. Build secures the CI/CD pipeline itself — image signing, dependency scanning, and registry access controls.

The interesting planes are the last four. Bot runs adversarial red-team and blue-team agents, both driven by YAML-defined behavioral profiles. Red agents probe for weaknesses using scripted attack patterns; blue agents learn to detect and neutralize those patterns. Security Operations aggregates threat intelligence, correlates events across planes, and executes automated response playbooks. ML Pipeline trains classifiers on behavioral data — mouse dynamics, keystroke timing, scroll patterns — to detect anomalous sessions. Ops provides the human intervention layer for incidents that exceed automated thresholds.

Why Ten Planes Instead of Three

The traditional model (network / application / data) leaves enormous gaps. Build pipeline compromise is one of the most effective supply chain attacks, yet most architectures treat CI/CD as a DevOps concern, not a security surface. Bot traffic detection requires specialized behavioral analysis that doesn’t belong in generic application security. And ML-based detection needs its own pipeline lifecycle — training, validation, deployment, drift monitoring — that would pollute an application plane if mixed in.

Each plane has its own standards document (RCCP-SEC-1 through SEC-7, with more in draft). Each standard defines the plane’s responsibilities, its interfaces with adjacent planes, and its failure modes. When a plane fails, the others continue operating. When an incident spans multiple planes, the Security Operations plane coordinates the response.

The Design Philosophy

The guiding principle is “attack first, train from that.” Every defensive capability was built by first writing the attack that it needs to stop. The red-team bot plane isn’t an afterthought — it’s the primary test harness. If the blue team can’t detect the red team’s latest profile, that’s a failing test, and the defense gets improved before the profile ships.

This approach means security is never “done.” The organism adapts because new attack profiles continuously pressure it to evolve. The alternative — a static firewall and some rate limiting — is how systems get breached while their operators believe they’re protected.

[2026-02-18] TAGS: monitoring,infrastructure,observability

MONITORING WITHOUT AGENTS — THE HYBRID APPROACH

Most monitoring tools (Datadog, New Relic, Prometheus node_exporter) require agents installed on every target system. For a small infrastructure, that’s overhead — more software to maintain, more attack surface, more resource consumption.

Starting Agentless

The initial approach: a central observer queries target nodes using native protocols. SSH for system metrics, SQL for database health, Unix sockets for container status. No software installed on targets.

This worked for basic monitoring. But SSH-based polling is slow — each check opens a connection, runs a command, parses output. At 30-second intervals across multiple nodes, the observer spends most of its time waiting on SSH handshakes.

The Hybrid Evolution

The solution: lightweight HTTP microservices on nodes that benefit from faster polling. A minimal Flask app (< 50 lines) exposes /health and /stats endpoints. The observer hits these endpoints instead of opening SSH sessions.
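A rough sketch of such an agent follows. The original is a Flask app; this version deliberately swaps in Python’s standard-library http.server so it runs without dependencies. The payload fields, port, and function names are assumptions, and `os.getloadavg()` is only available on Unix-like systems.

```python
import json
import os
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

START_TIME = time.time()


def route(path):
    """Return (status_code, payload) for a request path."""
    if path == "/health":
        return 200, {"status": "ok"}
    if path == "/stats":
        load1, load5, load15 = os.getloadavg()  # Unix only
        return 200, {
            "uptime_s": round(time.time() - START_TIME, 1),
            "load": [load1, load5, load15],
        }
    return 404, {"error": "not found"}


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, payload = route(self.path)
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # stay quiet; the central observer keeps the logs


def serve(host="127.0.0.1", port=8080):
    """Run the agent until interrupted (host and port are placeholders)."""
    HTTPServer((host, port), Handler).serve_forever()
```

The agent holds no state beyond its start time, so replacing it is a matter of restarting the process, which fits the “stateless, single-purpose, trivially replaceable” constraint.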

What stayed agentless:

Database health checks (native SQL protocol)
Container status (Docker socket)
Security feeds (existing HTTP APIs)

What got agents:

Edge nodes where SSH latency was noticeable
Nodes needing sub-second response times

The Lesson

“Agentless” is a design preference, not a religion. The right answer is using native protocols where they’re fast enough and adding lightweight agents where they’re not. The key constraint: agents must be stateless, single-purpose, and trivially replaceable.

[2026-03-05] TAGS: performance,caching,redis

TIERED CACHING — REDUCING DATABASE LOAD BY 95%

The media engine was hitting the database on every request. For a content-heavy platform serving images and audio, that’s a problem.

Three Layers of Caching

Layer 1: Browser cache. Immutable assets (processed images, transcoded audio) get Cache-Control: public, max-age=31536000 (one year). Once a browser downloads an asset, it won’t request it again until the entry expires or is evicted. This alone eliminated the majority of repeat requests.

Layer 2: Application cache. Redis sits between the API and the database. Metadata queries (file location, dimensions, format) are cached with TTLs tuned by content type: 24 hours for images, 1 hour for audio metadata, 7 days for HLS manifests. Entries are invalidated explicitly on upload or update rather than waiting for the TTL to expire; the TTL only bounds how long an untouched entry lives.
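The cache-aside pattern with per-type TTLs and explicit invalidation can be sketched as follows. An in-memory dict stands in for Redis so the example is self-contained; the class name, the content-type keys, and the injectable clock are assumptions made for illustration.

```python
import time

# TTLs from the text, in seconds, keyed by hypothetical content-type labels.
TTL_SECONDS = {
    "image": 24 * 3600,
    "audio": 3600,
    "hls_manifest": 7 * 24 * 3600,
}


class MetadataCache:
    def __init__(self, load_from_db, clock=time.time):
        self._load = load_from_db  # called only on a cache miss
        self._clock = clock        # injectable for testing
        self._store = {}           # key -> (expires_at, value)

    def get(self, key, content_type):
        """Return cached metadata, loading from the database on a miss."""
        hit = self._store.get(key)
        if hit is not None and hit[0] > self._clock():
            return hit[1]
        value = self._load(key)
        expires_at = self._clock() + TTL_SECONDS[content_type]
        self._store[key] = (expires_at, value)
        return value

    def invalidate(self, key):
        """Drop a key; every write path calls this after an upload/update."""
        self._store.pop(key, None)
```

With Redis the same shape would use a get, a set with an expiry, and a delete; the point of the sketch is that the database is touched only on a miss or after an explicit invalidation.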

Layer 3: Object storage streaming. Media bytes stream directly from S3-compatible storage to the client. The API handles authentication and routing but never writes media to disk. This keeps the API server stateless and memory-efficient.

The Result

Database queries dropped by approximately 95% on frequently accessed content. The database now handles writes (uploads, metadata updates) and cold-start reads. Everything else is served from cache or streamed from storage.

The Gotcha

Cache invalidation is the hard part. We use explicit invalidation on writes — when a file is uploaded or updated, its cache keys are deleted. No stale data, but it means every write path needs to know what to invalidate. This is manageable at our scale but would need rethinking for a multi-writer system.

[2026-03-22] TAGS: security,networking,tailscale

ZERO-TRUST NETWORKING WITHOUT THE ENTERPRISE PRICE TAG

“Zero trust” usually comes with enterprise sales calls and six-figure contracts. It doesn’t have to.

The Core Idea

No service trusts any other service based on network location. Every connection is authenticated by identity, not IP address. This means:

No firewall rules like “allow traffic from 10.0.0.0/8”
No VPN concentrators that become single points of failure
No port forwarding through NAT

The Stack

Mesh VPN handles encrypted node-to-node connectivity. Every node gets a stable identity. Traffic between nodes is encrypted with WireGuard regardless of the underlying network.

Identity provider handles SSO/OIDC for web applications. Instead of each app managing its own authentication, a central identity provider issues tokens. The reverse proxy validates these tokens before traffic reaches the application.

Reverse proxy terminates TLS and enforces identity checks via forward auth. Applications behind it never see unauthenticated traffic.
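Forward auth boils down to the proxy asking a separate endpoint “is this request allowed?” and acting on the status code. The sketch below shows only that decision, assuming a bearer token in the Authorization header and a caller-supplied validator; both are illustrative assumptions, and a real deployment would verify the identity provider’s token or session instead.

```python
def forward_auth_status(headers, is_valid_token):
    """Return the status the auth endpoint answers the proxy with.

    headers: dict of request headers copied from the original request.
    is_valid_token: hypothetical callable that checks a token with the
    identity provider.
    """
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() == "bearer" and token and is_valid_token(token):
        return 200  # proxy forwards the original request to the app
    return 401      # proxy rejects; the app never sees the request
```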

What It Costs

The mesh VPN has a generous free tier. The identity provider is open-source. The reverse proxy is open-source. Total cost: the compute to run them — which is negligible when they’re containers on existing infrastructure.

What It Doesn’t Solve

This protects the network layer. Application-level security (input validation, CSRF, content security policies) is a separate concern. Zero trust is a network architecture, not a security silver bullet.

[2026-04-15] TAGS: gitops,cicd,self-hosted

WHY I LEFT GITHUB ACTIONS FOR SELF-HOSTED CI/CD

GitHub Actions is convenient. It’s also a dependency you don’t control.

What Triggered the Switch

Three things happened in the same month:

1. A GitHub outage blocked deployments for 4 hours
2. A billing surprise from exceeding free-tier minutes
3. A workflow that worked locally but failed in Actions due to Ubuntu version differences

None of these were catastrophic. But all of them were avoidable.

What I Built Instead

A self-hosted Gitea instance with act_runner executing workflows on bare metal. The entire pipeline — source control, CI/CD, container registry, deployment — runs on infrastructure I own.

Key design decisions:

Host-mode runner: direct Docker access, no nested virtualization overhead
Dormant by design: runners stay idle until a repo includes a workflow file
Two-stage pipeline: build on push to main, deploy only on version tags
BuildKit caching: inline cache layers cut rebuild times significantly

The Trade-Off

I traded convenience for ownership. Setup took a weekend. Maintenance is minimal — the runner is a systemd service that restarts automatically. The registry is a Docker container.

What I gained: zero external dependencies, zero billing surprises, and deployments that work even when GitHub is down.

[2026-04-28] TAGS: devops,docker,deployment

BUILDING A STATELESS DEPLOYMENT TARGET

The goal was simple: production servers should hold no source code, no build artifacts, no secrets baked into images. They pull containers from a private registry and run them. That’s it.

The Problem

Traditional deployment copies code to a server, installs dependencies, and runs the application. This creates state — the server knows what it’s running. If the server dies, you need to rebuild that state. If you deploy a bad version, rollback means remembering what was there before.

The Approach

Containers as the only deployment artifact. The CI pipeline builds an image, pushes it to a private registry. The production server runs:

    docker compose pull
    docker compose up -d --force-recreate --remove-orphans
    docker image prune -f

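Three commands. The server pulls the latest image, recreates the containers, and cleans up old images. Compose itself replaces a container by stopping it and starting a new one, so any zero-downtime behavior has to come from the layer in front of it (a proxy or a second replica), not from Compose. No source code touches the production filesystem.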

What This Gets You

Rollback is trivial — point docker-compose.yml at a previous image tag
Server replacement is fast — new server just needs Docker and compose
No dependency drift — everything is baked into the image at build time
Secrets stay in the pipeline, injected at deploy time and never baked into images or stored on disk

The server is disposable. The image is the truth.

[2026-02-10] TAGS: Cybersecurity, Telemetry, Honeypot, Compliance

Master's Thesis: Psychological Heuristics & Honeypot Telemetry

Currently leveraging the RedCup infrastructure to host 'Project Samosa', the high-interaction honeypot experiment for my Master's thesis in Cybersecurity and Network Defense. The research explores the convergence of psychological heuristics, UI design, and behavioral cybersecurity. By utilizing a "Dutch Auction" pricing algorithm, the application induces cognitive stress to test whether panicking human users exhibit the same biometric telemetry (erratic keystroke timing and rapid mouse movement, captured as X, Y coordinates) as malicious bots. A major component of the engineering involves hardcoding "Compliance by Design." To legally collect this high-fidelity biometric data without violating the Swiss Federal Act on Data Protection (FADP), the system utilizes cryptographic one-way hashing to anonymize all sessions at the code level.
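One way to read “cryptographic one-way hashing” is a keyed hash over the session identifier. The sketch below uses HMAC-SHA-256 with a per-study secret key; that specific construction, the key handling, and the function names are assumptions for illustration, not a description of the thesis code.

```python
import hashlib
import hmac
import secrets


def new_study_key() -> bytes:
    """Generate a random secret key (hypothetical: one per study)."""
    return secrets.token_bytes(32)


def anonymize_session(session_id: str, key: bytes) -> str:
    """Map a session identifier to a stable, non-reversible token."""
    message = session_id.encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()
```

The same session always maps to the same token under one key, so telemetry rows can still be grouped per session, while the original identifier cannot be recovered from the stored value.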

[2025-11-05] TAGS: Redis, C++, MinIO, Microservices

RedCup Cloud Phase 3: The Asynchronous Compute Engine

To prevent heavy tasks (like processing large photo uploads) from freezing the web interface, the system decouples web logic from compute muscle. When a user uploads a file, the Flask frontend immediately pipes the raw data into a MinIO object storage vault and drops a standardized JSON job payload into a Redis message bus. In a separate, isolated container (LXC 1005), a stateless C++ compute engine polls Redis at near-zero latency. It pulls the job, downloads the file directly into a Linux RAM-disk (/dev/shm/), and processes the image using purely shared memory to prevent SSD wear. Once optimized, it fires a webhook back to the database to update the UI instantly.
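The producer/worker handoff can be sketched in a few lines. This is an illustration only: the real engine is written in C++ and reads from Redis, whereas this sketch uses Python’s standard-library queue as a stand-in, and the payload fields (bucket, key, op) are hypothetical.

```python
import json
import queue


def make_job(bucket, object_key, operation="optimize"):
    """Build the JSON job payload the web frontend would enqueue."""
    return json.dumps({"bucket": bucket, "key": object_key, "op": operation})


def worker_step(jobs, process, timeout=0.1):
    """Pop and process one job; return False if the queue stayed empty."""
    try:
        raw = jobs.get(timeout=timeout)  # blocks briefly instead of spinning
    except queue.Empty:
        return False
    process(json.loads(raw))
    return True
```

A worker loop would call `worker_step` repeatedly. Because the job carries only a reference to the stored object, the worker stays stateless and can be restarted or duplicated without coordination.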

[2025-08-22] TAGS: Tailscale, Caddy, Authentik, Cloudflare

RedCup Cloud Phase 2: Defense in Depth & Edge Routing

With the application factory built, the network layer required enterprise-grade segmentation. The infrastructure utilizes a 5-Ring Zero Trust architecture. Public traffic hits a Cloudflare-protected DigitalOcean VPS, which acts merely as a "dumb pipe," blindly forwarding encrypted TCP packets into a Tailscale mesh network. This allows traffic to safely bypass local firewalls and exit directly into the core hypervisor. At the ingress point, a Caddy reverse proxy terminates TLS and acts as the ultimate gatekeeper. Before any application logic is executed, Caddy forces a sub-request to Authentik (OIDC). If a user lacks the correct session identity, the connection is dropped at the edge, ensuring total blast-radius containment.

[2025-06-10] TAGS: Proxmox, Docker, Gitea, CI/CD

RedCup Cloud Phase 1: The GitOps App Factory

To build the RedCup architecture, I abandoned manual server configuration in favor of a strict GitOps methodology hosted on Proxmox bare-metal hypervisors. The development environment is fully ephemeral: utilizing Coder, I provision temporary, isolated Docker workspaces via Terraform. When code is pushed to the self-hosted Gitea repository, it triggers an automated CI/CD pipeline. The Gitea Runner dynamically spawns a build container, compiles the application into an immutable Docker image, and pushes it to a private registry. Deployment to the production App Node (LXC 1002) is handled with zero-downtime container swapping, ensuring the application logic remains entirely stateless and disposable.

[2025-04-18] TAGS: Docker, MinIO, Grafana, InfluxDB

Infrastructure Genesis & The Data Lake Pivot

The first iteration of the new infrastructure began with a headless Ubuntu Raspberry Pi aiming to be a secure audio streamer, but immediate DNS loops caused by a Pi-hole container forced a pivot to Tailscale MagicDNS and Split DNS. The system quickly evolved into an observability stack utilizing Grafana, Loki, and InfluxDB. However, raw log volumes from Zeek and Suricata caused HTTP 429 (Too Many Requests) errors and exposed the Pi's 64GB USB drive as a critical bottleneck. The architecture was shifted to a "Distributed Monolith" by repurposing an HP laptop into a bare-metal MinIO Data Lake. Loki was stabilized to use MinIO as its backend, preserving the Pi's CPU while permanently solving the storage crisis.

[2024-08-15] TAGS: Architecture, Systems Analysis, Event-Driven

Enterprise Endpoint Security & The Event-Driven Pivot

My foundation in enterprise infrastructure was solidified managing large-scale VPN deployments, VLAN security enhancements, and device lifecycles via Microsoft Intune and Azure Active Directory. However, the architectural blueprint for my current cloud platform was inspired outside of traditional IT. While working in the restaurant industry, I observed the operational flow of their delivery system: kitchen prep, packaging, and delivery teams all coordinated asynchronously through a centralized terminal. It was a physical manifestation of an event-driven microservice architecture. Recognizing how optimizing this flow allowed a single restaurant to scale like a tech platform, I began designing the digital equivalent: a highly scalable, decoupled infrastructure that would eventually become RedCup Cloud.

[2023-06-30] TAGS: VMWare, VLAN, Networking, GITI

Academic Foundations & Enterprise Cloud Testing

The journey started with a strong core in the sciences, getting certified in 8 subjects including Math and Physics during high school. Transitioning into the Bachelor of Information Technology at GITI provided the platform to turn theory into practice. During this period, the focus was heavily on network solutions testing, specifically deploying VMware implementations, backup systems, and VLAN configurations. Beyond just learning the stack, hosting practical networking labs for peers solidified a deep understanding of bridging textbook concepts with real-world enterprise cloud solutions.