AI Infrastructure · NVIDIA NCCL
Multi-GPU Communication Explained
Deep-dive into NCCL transport protocols (Simple, LL, LL128), primitives, and collective ops for distributed AI training at scale.
↗ Read Article on LinkedIn
Senior network infrastructure engineer with 15+ years designing, operating, and automating enterprise-grade networks across WAN, LAN, data center, and cloud-connected environments. Deep hands-on background in BGP, OSPF, MPLS, Palo Alto, F5 BIG-IP LTM, and production-grade AIOps tooling built on top of Claude Sonnet.
// Projects & Labs
AI-assisted operational tooling for network infrastructure. LLM-integrated workflows via Claude Sonnet targeting Cisco Catalyst 8000 IOS XE on DevNet. Enforces ITIL change management — requires a change ticket before any config is applied. MCP-based tool dispatch with audit logging.
Active - claude-sonnet-4-6
Stdlib-only Python tool that replaces manual ping sweeps. Reads a CSV device inventory, validates IPs, executes cross-platform pings, and writes timestamped results: the same logic behind SolarWinds NCM pre-change validation, built from first principles.
Week 1 - 100-Day Challenge
Hands-on F5 BIG-IP LTM lab covering virtual servers, pools, health monitors, persistence profiles, SSL offload, SNAT, and traffic engineering. Migration planning workflows, reliability validation runbooks, and production-style documentation aligned with enterprise load-balancing environments.
Active - AWS Lab
OAuth integration and telemetry workflows using Python and secure credential management on macOS. Demonstrates API integration patterns applicable to enterprise infrastructure automation. Sanitized telemetry artifacts published to GitHub Pages.
Lab - OAuth + Telemetry
// Latest Articles & Posts
AI Infrastructure · GPU Clusters
AI Backbone Systems in 2026
Full landscape of AI infrastructure — H100 DGX clusters, rack-scale systems, training vs. inference split, and what's cutting edge today.
↗ Read Article on LinkedIn
ARM + F5 VE don't mix on Mac M2. Pivoted to AWS: VPC, Security Group locked to home IP, F5 BIG-IP VE 30-day trial. Under $40.
↗ View on LinkedIn
Built Optimus, a personal AI agent on OpenClaw TUI with a claude-sonnet-4-6 backend, focused on AI-driven network automation.
↗ View on LinkedIn
Connected AI to the network. First thing it asked: "What's your change ticket #?" ITIL enforcement built in from day one.
↗ View on LinkedIn
15–20 min daily. Real problems. Raw thinking. Brutal AI feedback. Training the mental discipline of a Principal Architect.
↗ View on LinkedIn
// Experience
// Core Competencies
// Education & Certifications