Executive Summary
NVIDIA announced two products at GTC 2026 (March 16) that directly impact our OpenClaw evaluation:
- Nemotron — NVIDIA's open-source LLM family. Hybrid Mamba-Transformer architecture with Mixture-of-Experts. Models range from 4B to ~500B parameters, but only activate 1B–50B at inference time, making them fast and efficient. Free, permissive license. Designed for agentic AI workloads at scale.
- NemoClaw — An enterprise security wrapper that installs OpenClaw + Nemotron models inside a sandboxed runtime with network isolation, filesystem restrictions, and privacy routing. Apache 2.0 license. Currently alpha.
Together, they transform the OpenClaw equation. Our original evaluation flagged OpenClaw as a CRITICAL security risk. NemoClaw addresses the top concerns (network exposure, filesystem access, credential leakage). Nemotron provides free local inference, eliminating API costs for routine tasks.
NemoClaw + Nemotron makes OpenClaw deployable in a way raw OpenClaw never was — but it's alpha software with no third-party audits yet, and our core objections (Jack's allergy safety, prompt-based security) still apply.
NVIDIA Nemotron — The Model Family
What Is It?
Nemotron is NVIDIA's family of open foundation models, spanning four generations since 2024. The current generation (Nemotron 3) introduces a breakthrough hybrid Mamba-Transformer Mixture-of-Experts architecture that delivers frontier-class quality at a fraction of the compute cost.
NVIDIA's strategy is clear: give away the models to sell the hardware. But the models are genuinely good, and the licensing is among the most permissive in the industry.
The Nemotron 3 Lineup
| Model | Total Params | Active Params | Context Window | Target Use Case |
|---|---|---|---|---|
| Nano 4B | 4B | ~1B | 1M tokens | Edge devices, mobile, IoT |
| Nano 30B | 30B | 3B | 1M tokens | Efficient agent tasks, local workstations |
| Super 120B | 120B | 12B | 1M tokens | Multi-agent workflows, complex reasoning |
| Ultra ~500B | ~500B | ~50B | 1M tokens | Frontier reasoning (expected H1 2026) |
Architecture Deep Dive
The Nemotron 3 architecture combines three paradigms that have individually proven successful:
1. Mamba-2 Layers (Linear-Time Sequence Processing)
- Process sequences in O(n) time instead of O(n²) for standard attention
- Excellent for long context windows — 1M tokens becomes practical
- Handle sequential reasoning and pattern matching efficiently
- 23 of 52 layers in Nano 30B are Mamba-2
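The scaling gap is easy to see with back-of-envelope arithmetic. This is an illustrative sketch (pure asymptotics, not Nemotron measurements; constants and hardware effects ignored):

```python
# Per-layer workload growth when the context stretches from 4K to 1M tokens.
def attention_pairs(n: int) -> int:
    return n * n   # every token attends to every token: O(n^2)

def linear_ops(n: int) -> int:
    return n       # a Mamba-style scan touches each token once: O(n)

short_ctx, long_ctx = 4_096, 1_000_000
attn_growth = attention_pairs(long_ctx) / attention_pairs(short_ctx)
mamba_growth = linear_ops(long_ctx) / linear_ops(short_ctx)
print(f"attention: ~{attn_growth:,.0f}x more work; linear layers: ~{mamba_growth:.0f}x")
```

Quadratic attention does roughly 60,000x more pairwise work at 1M tokens versus 4K, while linear-time layers grow only ~244x, which is why the hybrid keeps attention layers sparse.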
2. Transformer Attention Layers (Precise Associative Recall)
- Grouped Query Attention (GQA) with 2 groups for efficiency
- Handle tasks requiring precise retrieval from context (names, numbers, code references)
- 6 of 52 layers in Nano 30B are GQA attention
- Placed strategically where precise recall matters most
3. Mixture-of-Experts (Parameter Efficiency)
- 23 of 52 layers are MoE in Nano 30B
- Each MoE layer has multiple specialist "expert" sub-networks
- Router selects only 1–2 experts per token — rest stay dormant
- Result: 30B total params but only 3B active per token = 10x efficiency
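A toy sketch makes the "active parameters" idea concrete. Everything here is an invented stand-in (real experts are full MLPs, and real routers operate on hidden states, not raw tokens):

```python
# Toy top-k Mixture-of-Experts routing: score every expert, run only the
# best k, and mix their outputs with softmax weights over the selected scores.
import math
import random

def moe_route(token, experts, router, k=2):
    scores = [sum(w * x for w, x in zip(row, token)) for row in router]
    topk = sorted(range(len(experts)), key=scores.__getitem__)[-k:]
    exps = [math.exp(scores[i]) for i in topk]
    total = sum(exps)
    mixed = [0.0] * len(token)
    for weight, i in zip((e / total for e in exps), topk):
        for j, v in enumerate(experts[i](token)):   # only k experts execute
            mixed[j] += weight * v
    return mixed, topk

random.seed(0)
d, n_experts = 8, 16
# Each "expert" is just a per-dimension scale here, to keep the sketch small.
experts = [(lambda x, s=random.random(): [s * v for v in x]) for _ in range(n_experts)]
router = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n_experts)]
mixed, active = moe_route([1.0] * d, experts, router, k=2)
# Only 2 of 16 experts ran: with equally sized experts that is 12.5% of the
# layer's parameters active for this token, which is the 30B-total/3B-active idea.
```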
Novel Innovation — Latent MoE
Compresses token representations before routing to experts. Enables 4x more expert specialists at the same inference cost. Think of it as "expert specialization on a budget."
Multi-Token Prediction (MTP)
Model predicts multiple future tokens simultaneously. Up to 3x wall-clock speedup for structured output (JSON, code, markdown). Particularly valuable for agent tool-calling patterns.
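One way to sanity-check that figure is simple tokens-per-pass arithmetic. The acceptance model below is my own simplification, not NVIDIA's published method:

```python
# If each forward pass commits 1 token plus k extra predicted tokens, of
# which a fraction `accept` verify, expected tokens per pass is 1 + k*accept.
def expected_speedup(k: int, accept: float) -> float:
    return 1 + k * accept

print(expected_speedup(2, 1.0))   # 3.0: the "up to 3x" ceiling for k=2
print(expected_speedup(2, 0.6))   # 2.2: a more typical mixed workload
```

Highly regular output (JSON, code, markdown) is where acceptance rates climb toward 1.0, which matches where the speedup is claimed.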
NVFP4 Native Training
First models trained natively in 4-bit floating point precision. Purpose-built for NVIDIA B200 GPUs. 4x memory and compute efficiency vs FP8 on H100. Means smaller GPUs can run larger models.
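The memory claim follows directly from bytes-per-parameter arithmetic. This sketch counts weights only; KV cache, activations, and optimizer state come on top:

```python
# Weight memory for a given parameter count and precision.
def weight_gb(params_billions: float, bits_per_param: int) -> float:
    return params_billions * bits_per_param / 8   # the 1e9 params cancels 1e9 B/GB

for bits in (16, 8, 4):
    print(f"30B params @ {bits}-bit: ~{weight_gb(30, bits):.0f} GB")
# Each halving of precision halves the weight footprint: 60 GB at 16-bit,
# 30 GB at 8-bit, 15 GB at 4-bit. That is the claimed FP4-vs-FP8 advantage.
```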
Training Pipeline
Phase 1 — Pretraining
- 25 trillion tokens (massive — GPT-4 was reportedly ~13T)
- NVFP4 precision on B200 GPU clusters
- NVIDIA released 3T+ tokens of the pretraining data publicly
Phase 2 — Supervised Fine-Tuning
- 7 million samples from a 40M-sample post-training corpus
- Curated for instruction following, tool use, and agentic behavior
- NVIDIA released 18M samples of this data publicly
Phase 3 — Multi-Environment Reinforcement Learning
- 21 different environment configurations
- 1.2 million environment rollouts
- 10 new training gym environments (open-sourced)
- Optimized for multi-step reasoning and real-world task completion
Benchmark Performance
Nemotron 3 Super (120B / 12B active)
| Benchmark | Result | What It Measures |
|---|---|---|
| PinchBench | 85.6% (best open model in class) | Agent reasoning and planning |
| AIME 2025 | Leading in size class | Advanced mathematics |
| SWE-Bench Verified | Leading in size class | Real-world software engineering |
| Terminal Bench | Leading in size class | Command-line task completion |
| Throughput | 5x previous Nemotron | Raw inference speed |
Historical: Llama-Nemotron 70B vs Competitors
| Benchmark | Nemotron 70B | GPT-4o | Claude 3.5 Sonnet |
|---|---|---|---|
| Arena Hard | 85.0 | 79.3 | 79.2 |
| AlpacaEval 2 LC | 57.6 | — | — |
| MT-Bench | 8.98 | — | — |
| Aider (coding) | 55.0% | 72.9% | — |
Nemotron wins on alignment/chat benchmarks. Claude and GPT-4o still lead on coding and complex reasoning. Nemotron 3 Super is more competitive on coding (SWE-Bench leading in class), but detailed head-to-head vs Claude Opus/Sonnet is not yet published.
Where Nemotron truly excels: Throughput. When you need many parallel agents doing moderate-complexity tasks, Nemotron's MoE architecture delivers more tokens per second per dollar than any competitor.
Specialized Variants
| Variant | Purpose |
|---|---|
| Nemotron 3 Omni | Multimodal — audio + vision + language in one model |
| Nemotron 3 VoiceChat | Real-time simultaneous listen-and-respond |
| Nemotron Nano VL 12B | Vision-language for image understanding |
| Nemotron RAG | Retrieval and embedding (leading ViDoRe, MTEB leaderboards) |
| Nemotron Safety | Content moderation and guardrails |
| Nemotron Speech | Automatic speech recognition and text-to-speech |
Licensing
NVIDIA Open Model License:
- Use, modify, distribute, commercially deploy — all allowed
- Royalty-free, perpetual, worldwide
- No attribution required
- Weights, training data, AND training recipes all published
- One of the most permissive AI model licenses in existence
Availability
| Platform | Access |
|---|---|
| Hugging Face | All models (BF16, FP8 variants) |
| NVIDIA NIM | API via build.nvidia.com |
| Ollama | Nemotron 3 Super for local inference |
| NeMo Framework | Full training and fine-tuning |
| GitHub | Developer assets at NVIDIA-NeMo/Nemotron |
The Nemotron Coalition
Announced at GTC 2026 — a first-of-its-kind global collaboration:
Members: Black Forest Labs, Cursor, LangChain, Mistral AI, Perplexity, Reflection AI, Sarvam, Thinking Machines Lab
Goal: Collaboratively build the next generation of open frontier models across six families:
- Nemotron — Language
- Cosmos — World models / vision
- Isaac GR00T — Robotics
- Alpamayo — Autonomous driving
- BioNeMo — Biology / chemistry
- Earth-2 — Weather / climate
Notable Adopters
Accenture, Cadence, CrowdStrike, Cursor, Deloitte, EY, Oracle Cloud Infrastructure, Palantir, Perplexity, ServiceNow, Siemens, Synopsys, Zoom
NVIDIA NemoClaw — The Security Wrapper
What Is It?
NemoClaw is an open-source software stack that wraps OpenClaw with enterprise-grade security, privacy, and isolation controls. It is not a separate agent — it is OpenClaw running inside NVIDIA's security infrastructure.
Jensen Huang, GTC 2026 keynote: "30 years of NVIDIA computing, distilled into an agent platform."
Peter Steinberger (OpenClaw creator, now at OpenAI): "With NVIDIA and the broader ecosystem, we're building the claws and guardrails that let anyone create powerful, secure AI assistants."
One-Command Install
```shell
curl -fsSL https://nvidia.com/nemoclaw.sh | bash
```
This installs:
- OpenClaw agent
- Nemotron models (default: Nemotron 3 Super 120B)
- OpenShell sandboxed runtime
- NVIDIA Agent Toolkit with pre-configured security policies
Piping an install script from curl straight into bash is a well-known security anti-pattern. Mitigate by downloading and reviewing the script before running it. This alone should not be a blocker, but it's worth flagging.
Architecture
Two-component design:
| Component | Language | Role |
|---|---|---|
| CLI Plugin | TypeScript | Integrates with OpenClaw CLI, user-facing |
| Blueprint | Python | Orchestrates OpenShell resources, manages sandbox |
The Four-Layer Security Model
This is NemoClaw's core value proposition — the direct answer to OpenClaw's CRITICAL security rating.
Layer 1: Network Isolation
- Default deny — all network connections blocked unless explicitly whitelisted
- Policy defined in openclaw-sandbox.yaml (human-readable, version-controlled)
- Unauthorized requests are blocked AND surfaced in a TUI for operator review
- Hot-reloadable — change policies without restarting the sandbox
- What this fixes: OpenClaw's 40K+ exposed instances problem. NemoClaw never listens on 0.0.0.0.
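A default-deny whitelist is simple to sketch. The real openclaw-sandbox.yaml schema is not published, so the policy fields below are invented for illustration:

```python
# Hypothetical default-deny egress check (field names are invented).
from fnmatch import fnmatch

policy = {
    "default": "deny",
    "allow": ["api.anthropic.com", "*.nvidia.com"],
}

def egress_allowed(host: str, policy: dict) -> bool:
    # Anything matching the whitelist passes; everything else hits the default.
    if any(fnmatch(host, pattern) for pattern in policy["allow"]):
        return True
    return policy["default"] == "allow"

assert egress_allowed("api.anthropic.com", policy)
assert egress_allowed("build.nvidia.com", policy)
assert not egress_allowed("evil.example.com", policy)   # default-deny
```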
Layer 2: Filesystem Restrictions
- Agent can write ONLY to /sandbox and /tmp
- All other filesystem paths are read-only
- No access to host filesystem outside the container
- What this fixes: OpenClaw's unrestricted file access (could read SSH keys, credentials, browser data)
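The write-path confinement can be illustrated with a path-resolution check. This is a sketch only; in NemoClaw the actual enforcement is the container's mount configuration, not application code:

```python
# Resolve the path first so "../" hops and symlinks cannot escape the
# writable roots.
from pathlib import Path

WRITABLE_ROOTS = (Path("/sandbox"), Path("/tmp"))

def write_allowed(target: str) -> bool:
    resolved = Path(target).resolve()   # collapses ../ and follows symlinks
    return any(resolved.is_relative_to(root) for root in WRITABLE_ROOTS)

assert write_allowed("/sandbox/output/report.md")
assert not write_allowed("/sandbox/../root/.ssh/id_rsa")   # traversal caught
assert not write_allowed("/home/user/.ssh/id_rsa")
```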
Layer 3: Process Protection
- Agent runs inside OpenShell — a K3s-based container sandbox
- All blueprints are immutable, versioned, and digest-verified
- Executed as subprocesses with restricted capabilities
- No privilege escalation possible from within the sandbox
- What this fixes: OpenClaw's Docker sandbox being OFF by default, arbitrary code execution risk
Layer 4: Inference Routing (Privacy Router)
- All model API calls route through OpenShell — agent cannot call external APIs directly
- Privacy router makes the key decision for each query:
- Sensitive data → routed to local Nemotron models (never leaves the machine)
- Non-sensitive / high-capability needed → routed to frontier cloud models (Claude, GPT)
- Configurable classification rules for what counts as "sensitive"
- What this fixes: OpenClaw's credential and PII leakage to external model providers
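A minimal sketch of the routing decision. The patterns and route labels here are invented; NemoClaw's real classification rules are configurable and unpublished:

```python
# Classify a query as sensitive (stay local) or not (may go to cloud).
import re

SENSITIVE = [
    re.compile(r"\b(ssn|password|api[_ ]?key|diagnosis|medication)\b", re.I),
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),   # SSN-shaped numbers
]

def route(query: str) -> str:
    if any(p.search(query) for p in SENSITIVE):
        return "local:nemotron"   # sensitive text never leaves the machine
    return "cloud:claude"         # everything else may use a frontier model

assert route("refill Jack's medication reminder") == "local:nemotron"
assert route("refactor this TypeScript module") == "cloud:claude"
```

Note the residual risk flagged below: routing quality is only as good as the classifier, and keyword rules like these miss context-dependent sensitivity.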
System Requirements
| Spec | Minimum | Recommended |
|---|---|---|
| CPU | 4 vCPU | 4+ vCPU |
| RAM | 8 GB | 16 GB |
| Disk | 20 GB | 40 GB |
| OS | Ubuntu 22.04 LTS+ | Ubuntu 22.04 LTS+ |
| Runtime | Node.js 20+, Docker | Node.js 20+, Docker |
Hardware agnostic — does not require NVIDIA GPUs (though optimized for them). Supported on: GeForce RTX PCs, RTX PRO workstations, DGX Station, DGX Spark, any Linux machine with Docker.
Release Status
| Detail | Value |
|---|---|
| Announced | March 16, 2026 (GTC keynote) |
| License | Apache 2.0 |
| GitHub | github.com/NVIDIA/NemoClaw |
| Stars | ~6.7K (first 2 days) |
| Forks | 739 |
| Contributors | ~26 |
| Status | Alpha — "Expect rough edges" |
| Tech Stack | TypeScript 37.7%, Shell 30.6%, JS 25.7%, Python 4.9% |
NVIDIA's own docs: "Interfaces, APIs, and behavior may change without notice as the design iterates."
Enterprise Partnerships
Being pursued for NemoClaw integrations: Salesforce, Cisco, Google, Adobe, CrowdStrike, SAP, JFrog (supply chain security)
OpenClaw — Quick Refresher
For full details, see our Deep Research Report on Notion.
| Attribute | Detail |
|---|---|
| What | Open-source autonomous AI agent (TypeScript/Node.js) |
| GitHub Stars | 234K+ |
| License | MIT |
| Creator | Peter Steinberger (now at OpenAI) |
| Governance | Moving to open-source foundation |
| Runtime | Long-lived Gateway daemon on port 18789 |
| Messaging | 22+ platforms (WhatsApp, Signal, Telegram, Discord, iMessage, Slack, Teams, etc.) |
| AI Models | 20+ providers (Claude, GPT, Gemini, DeepSeek, Ollama, etc.) |
| Skills | 10,700+ community skills on ClawHub |
| Integrations | 50+ (chat, smart home, music, productivity, browser, cron) |
| Security Rating | CRITICAL — 512 vulns, 8+ critical CVEs, 20% malicious marketplace skills |
Why We Were Cautious
- 40K+ instances exposed on public internet — Gateway binds to 0.0.0.0
- ClawHavoc attack — 1,184 malicious skills in official marketplace (12–20% compromised)
- Prompt-based security — safety rules are instructions, not architectural boundaries
- Microsoft's warning: "Not appropriate to run on a standard personal or enterprise workstation"
- Jack's allergy rules — cannot safely move from hardcoded logic to prompt-based
The Full Stack: OpenClaw + NemoClaw + Nemotron
How They Fit Together
┌──────────────────────────────────────────────────┐
│                  USER INTERFACE                  │
│  WhatsApp  Signal  Telegram  Discord  iMessage   │
└─────────────────────────┬────────────────────────┘
                          │
┌─────────────────────────▼────────────────────────┐
│                     OPENCLAW                     │
│  Agent Runtime · Skills · Memory · Integrations  │
│  (TypeScript, Gateway daemon, port 18789)        │
└─────────────────────────┬────────────────────────┘
                          │
┌─────────────────────────▼────────────────────────┐
│                     NEMOCLAW                     │
│    Security Wrapper · Sandbox · Policy Engine    │
│                                                  │
│  ┌─────────────┐ ┌──────────────┐ ┌────────────┐ │
│  │ Network     │ │ Filesystem   │ │ Process    │ │
│  │ Isolation   │ │ Restrictions │ │ Protection │ │
│  │ (whitelist) │ │ (/sandbox    │ │ (OpenShell │ │
│  │             │ │  /tmp only)  │ │  K3s)      │ │
│  └─────────────┘ └──────────────┘ └────────────┘ │
│                                                  │
│   ┌────────────────────────────────────────────┐ │
│   │               PRIVACY ROUTER               │ │
│   │  Sensitive → Local   Non-sensitive → Cloud │ │
│   └────────────────────────────────────────────┘ │
└─────────────────────────┬────────────────────────┘
                          │
           ┌──────────────┼──────────────┐
           ▼              ▼              ▼
    ┌──────────────┐ ┌──────────┐ ┌──────────────┐
    │   NEMOTRON   │ │  Claude  │ │  GPT / etc   │
    │ (Local LLM)  │ │ (Cloud)  │ │  (Cloud)     │
    │ Free, fast   │ │ Smart    │ │  Optional    │
    │ Private      │ │ Capable  │ │              │
    └──────────────┘ └──────────┘ └──────────────┘
What Each Layer Provides
| Layer | Provides | Without It |
|---|---|---|
| OpenClaw | Agent brain, messaging, skills, integrations, always-on daemon | No agent — just raw model APIs |
| NemoClaw | Security sandbox, network isolation, filesystem lock, privacy routing | OpenClaw runs naked — CRITICAL risk |
| Nemotron | Free local inference, private data stays local, no API costs for routine tasks | Pay per token to cloud providers for everything |
Why This Combination Matters
Before NemoClaw: Deploying OpenClaw required accepting CRITICAL security risk. Our evaluation said "do not deploy without full isolation" — which meant building your own sandbox, firewall rules, Docker hardening, and credential isolation manually.
With NemoClaw: NVIDIA built exactly the isolation we specified. One command gets you a sandboxed OpenClaw with: network whitelist (no more 40K exposed instances), filesystem jail (no SSH key / credential theft), process isolation (no container escape), and privacy routing (sensitive data stays on-device via Nemotron).
With Nemotron: Routine queries (scheduling, reminders, simple lookups, home automation) run on free local models. Only complex reasoning (coding, analysis, financial) routes to Claude. This dramatically reduces API costs and keeps private data off external servers.
Comparison: Our Stack vs the NVIDIA-OpenClaw Stack
Architecture Comparison
| Dimension | Our Stack (Claude Code + COO) | NVIDIA Stack (OpenClaw + NemoClaw + Nemotron) |
|---|---|---|
| Runtime | Ephemeral CLI sessions | Always-on daemon (24/7) |
| Interface | Terminal + Discord (limited) | 22+ messaging platforms |
| AI Model | Claude only (Anthropic) | Multi-model (Claude + GPT + Gemini + local Nemotron) |
| Security Model | No daemon = minimal attack surface | 4-layer sandbox (NemoClaw) |
| Privacy | All queries go to Anthropic API | Privacy router — sensitive stays local |
| Cost | Pro plan + API usage | Nemotron free locally; API only for complex tasks |
| Coding | Best-in-class (Claude Code) | Weaker — Nemotron trails Claude on coding |
| Orchestration | C-suite agent hierarchy (COO/CTO/CFO/CISO/CMO) | Flat — single agent with skills |
| Memory | File-based + session persistence | SQLite vector + daily logs + MEMORY.md |
| Smart Home | Home Assistant MCP | Home Assistant (same underlying) |
| Network Mgmt | UniFi MCP (direct UDM Pro control) | No equivalent |
| Financial | Monarch Money MCP (real bank data) | No equivalent |
| Food Safety | Hardcoded allergy rules (rosey-bot) | Prompt-based only — UNACCEPTABLE for Jack |
| Voice | None | Wake word, push-to-talk, TTS |
| Music | None | Spotify, Sonos, Shazam |
| Scheduling | Manual (pending items only) | Cron, scheduled automation |
| Browser | Firecrawl (scraping) | Full Chromium CDP automation |
| Messaging | Discord + Mattermost only | WhatsApp, Signal, Telegram, iMessage, Slack, Teams + 16 more |
Where Each Stack Wins
Our Stack Wins
- Software development and coding tasks
- Multi-agent orchestration with domain expertise
- Security review (CISO agent reviews before execution)
- Financial tracking and analysis
- Network infrastructure management
- Food safety (hardcoded rules, not prompt-based)
- Project isolation and memory management
NVIDIA Stack Wins
- Always-on availability (daemon vs CLI sessions)
- Messaging ubiquity (22+ platforms vs 2)
- Voice interaction
- Music control
- Scheduled automation / cron
- Multi-model flexibility
- Privacy (local inference for sensitive data)
- Cost efficiency (free local models for routine tasks)
- Browser automation
Use Cases for Our Household
High-Value Use Cases (NemoClaw + Nemotron + OpenClaw)
1. Family Messaging Hub
Ashley, Valentina, and family members message the AI via WhatsApp or Signal (apps they already use). No need to install Discord or learn terminal commands. Example: Ashley texts "What's for dinner tonight?" → agent checks meal plan, confirms allergen safety, responds. Privacy router keeps family conversations on local Nemotron — never hits external APIs.
2. Always-On Home Automation
"Turn off the lights at 10pm every night." "If the garage door is open after 11pm, close it and tell me." Scheduled tasks that our ephemeral CLI sessions can't do. Integrates with existing Home Assistant setup.
3. Proactive Scheduling
Morning briefing: weather, calendar, commute, school schedule. Automatic reminders for appointments, medications, school events. Cron-based recurring tasks without human initiation.
4. Voice Interface
Wake word activation for hands-free queries while cooking, driving, etc. Push-to-talk for quick questions. TTS responses — useful when hands are busy.
5. Music Control
"Play Jordan's bedtime playlist on the nursery Sonos." Spotify/Sonos integration through natural language.
6. Local AI for Private Tasks
Journal entries, personal reflections, sensitive family discussions. Nemotron processes locally — nothing leaves the house. Medical questions routed to local model, not cloud APIs.
Use Cases Where Our Stack Remains Superior
- Software Development → Claude Code + developer/reviewer agents
- Financial Analysis → CFO agent + Monarch Money MCP
- Network Management → CTO agent + UniFi MCP
- Security Review → CISO agent reviews before execution
- Meal Planning → rosey-bot with hardcoded allergy rules (NEVER move to prompt-based)
The Hybrid Approach
Run both stacks, each doing what it's best at:
| Task Type | Handled By | Why |
|---|---|---|
| Coding, development | Claude Code + COO | Best-in-class coding, agent hierarchy |
| Finance, budgets | CFO agent + Monarch | Real bank data, structured analysis |
| Network, infrastructure | CTO agent + UniFi | Direct hardware control |
| Security review | CISO agent | Architectural review before execution |
| Meal planning | rosey-bot | Hardcoded allergy safety |
| Family messaging | OpenClaw + NemoClaw | 22+ platforms, always-on |
| Home automation | OpenClaw + HA | Scheduled, always-on |
| Voice, music | OpenClaw | No equivalent in our stack |
| Private/sensitive queries | Nemotron (local) | Never leaves the machine |
| Quick lookups, reminders | OpenClaw + Nemotron | Free, fast, local |
Security Analysis
What NemoClaw Fixes
| Original Risk | Rating | NemoClaw Mitigation | Residual Risk |
|---|---|---|---|
| Network exposure (40K+ instances) | CRITICAL | Whitelist-only networking, no 0.0.0.0 binding | LOW — if policy is correctly configured |
| Filesystem access (SSH keys, creds) | CRITICAL | Writes confined to /sandbox and /tmp | LOW — host filesystem isolated |
| Credential leakage to external APIs | HIGH | Privacy router, all API calls through OpenShell | MEDIUM — depends on classification accuracy |
| Arbitrary code execution | HIGH | OpenShell K3s container, digest-verified blueprints | LOW — container escape is hard |
| Prompt injection | HIGH | NOT ADDRESSED — still prompt-based security | HIGH — fundamental architectural flaw |
| Malicious marketplace skills | CRITICAL | PARTIALLY ADDRESSED — JFrog partnership for supply chain | MEDIUM — skill vetting still incomplete |
| Data at rest (memory stores PII) | HIGH | Sandbox isolation limits what's stored | MEDIUM — data in /sandbox still unencrypted |
What NemoClaw Does NOT Fix
- Prompt injection (the fundamental flaw). If a crafted message can hijack the agent's instructions, the sandbox doesn't help because the agent is already authorized to act. NemoClaw limits the blast radius but doesn't prevent the hijack.
- Malicious skills. ClawHub still has vetting issues. The JFrog partnership is announced but not implemented. Installing community skills remains risky.
- Jack's allergy safety. Moving hardcoded allergy rules to prompt-based instructions is STILL unacceptable. A prompt injection could override "never recommend foods containing almonds, sesame, milk, eggs, or peanuts." This is a life-safety issue that sandboxing does not address.
- Audit maturity. There are no third-party security audits. NVIDIA's security claims are design documents, not battle-tested facts.
- Install method. The curl-to-bash installation is a security anti-pattern; reviewing the script before running mitigates it, but it remains concerning.
Security Recommendation
If deploying NemoClaw + OpenClaw:
- Dedicated VM or container host (not EQR1)
- Tailscale-only network access (no public exposure)
- CISO agent review of sandbox policies before going live
- No shared credentials with our main stack
- Allergy-related meal planning stays in rosey-bot — NEVER delegate to OpenClaw
- Monitor NemoClaw GitHub for security advisories
- Revisit in 3–6 months when third-party audits exist
Hardware & Deployment Options
Option A: Docker on EQR2
| Spec | EQR2 Current | Requirement |
|---|---|---|
| CPU | TBD | 4+ vCPU |
| RAM | TBD | 16 GB recommended |
| GPU | None required | Optional (Nemotron runs on CPU, faster on GPU) |
| Disk | TBD | 40 GB for NemoClaw + models |
| Network | Tailscale | Already configured |
Pros: Separate machine from main infrastructure, Tailscale already set up
Cons: May not have GPU for fast Nemotron inference
Option B: Dedicated VM on EQR1
Pros: EQR1 has resources, Docker available
Cons: Shares host with critical infrastructure. Adds attack surface to primary machine.
Isolation is the whole point. Don't put the experiment next to production.
Option C: DGX Spark (New Hardware)
NVIDIA's new personal AI supercomputer. Designed specifically for NemoClaw + Nemotron.
- 128GB unified memory
- Grace Blackwell GPU
- Runs Nemotron 3 Super natively
- MSRP: ~$3,000 (pre-order)
Pros: Purpose-built, maximum Nemotron performance, dedicated hardware
Cons: $3,000, delivery timeline uncertain, may be overkill for evaluation
Option D: Cloud Instance
Spin up a cloud VM (any provider) with: 4 vCPU, 16GB RAM, 40GB disk, Docker pre-installed, Tailscale joined to our tailnet.
Pros: Zero hardware commitment, easy to tear down
Cons: Monthly cost, data leaves our network (partially offset by privacy router)
Nemotron Model Sizing for Local Inference
| Model | VRAM (FP16) | VRAM (Quantized) | CPU-Only? | Speed |
|---|---|---|---|---|
| Nano 4B | ~8 GB | ~2–4 GB | Yes (slow) | Fast on any GPU |
| Nano 30B (3B active) | ~6 GB active | ~2–3 GB active | Yes (usable) | Good on RTX 3060+ |
| Super 120B (12B active) | ~24 GB active | ~8–12 GB active | Slow | Needs RTX 4090 or better |
For our use case: Nano 30B is the sweet spot. 3B active params, runs on modest hardware, handles routine tasks well. Route complex queries to Claude via API.
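The "active" VRAM figures in the table follow from simple arithmetic. A sketch, assuming the table counts the per-token working set; the full 30B weights still need RAM or disk behind it, paged in as the router selects experts:

```python
# Per-token working-set memory for the active parameters.
def working_set_gb(active_params_billions: float, bytes_per_param: float) -> float:
    return active_params_billions * bytes_per_param

print(f"Nano 30B, FP16:   ~{working_set_gb(3, 2.0):.0f} GB")   # the ~6 GB row
print(f"Nano 30B, 4-bit:  ~{working_set_gb(3, 0.5):.1f} GB")   # near the ~2-3 GB row
print(f"Super 120B, FP16: ~{working_set_gb(12, 2.0):.0f} GB")  # the ~24 GB row
```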
Cost Analysis
Current Stack Costs
| Item | Monthly Cost |
|---|---|
| Anthropic Pro Plan | $20/mo |
| API overages (if any) | Variable |
| Total | ~$20/mo |
NemoClaw + Nemotron Added Costs
| Item | Monthly Cost |
|---|---|
| Hardware (if buying DGX Spark) | $3,000 one-time |
| Hardware (if cloud VM) | $20–50/mo |
| Hardware (if existing EQR2) | $0 |
| Nemotron models | Free (open-source) |
| NemoClaw software | Free (Apache 2.0) |
| OpenClaw software | Free (MIT) |
| Claude API for complex routing | Reduced — routine queries go to free Nemotron |
| Total (EQR2 deploy) | ~$0 additional |
| Total (cloud VM) | ~$20–50/mo additional |
Cost Savings from Privacy Router
With Nemotron handling routine queries locally:
- Simple questions, scheduling, reminders → Nemotron (free)
- Home automation commands → Nemotron (free)
- Family chat responses → Nemotron (free, private)
- Only coding, analysis, complex reasoning → Claude API (paid)
60–80% of household queries could run locally on Nemotron, significantly reducing API costs if we move beyond the Pro plan flat rate.
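The savings claim is easy to parameterize. The numbers below are invented for illustration (real Claude pricing is per token and varies with context length):

```python
# Blended monthly API cost: local-routed queries are free, cloud-routed pay.
def monthly_api_cost(queries: int, local_fraction: float, cloud_cost_per_query: float) -> float:
    return queries * (1 - local_fraction) * cloud_cost_per_query

all_cloud = monthly_api_cost(1_000, 0.0, 0.02)
routed = monthly_api_cost(1_000, 0.7, 0.02)   # 70% handled locally for free
print(f"all-cloud: ${all_cloud:.2f}/mo   with privacy router: ${routed:.2f}/mo")
```

At 70% local routing the cloud bill drops proportionally, from $20 to $6 a month in this toy example; the effect only matters once usage exceeds the Pro plan's flat rate.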
Recommendation
Short Term (Now — Next 30 Days)
- Do not deploy yet — NemoClaw is alpha with no third-party security audits
- Monitor GitHub for security advisories and maturity signals
- Continue building claude-auto for our always-on needs
- Track the JFrog supply chain security integration
Medium Term (30–90 Days)
- Install Nemotron Nano 30B via Ollama — zero risk, just a local model
- Benchmark it against Claude for routine household queries
- Evaluate quality for: scheduling, reminders, home automation, simple Q&A
- Determine if it's "good enough" for non-critical tasks
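A minimal harness for that benchmark could look like this. The /api/generate endpoint and its {"model", "prompt", "stream"} payload are Ollama's documented API; the model tag "nemotron-3-nano" is a guess, so check `ollama list` for the real name after pulling the model:

```python
# Time a local Ollama model on household-style prompts.
import json
import time
import urllib.request

def build_request(prompt: str, model: str = "nemotron-3-nano") -> dict:
    return {"model": model, "prompt": prompt, "stream": False}

def ask_ollama(prompt: str, host: str = "http://localhost:11434"):
    data = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(f"{host}/api/generate", data=data,
                                 headers={"Content-Type": "application/json"})
    start = time.monotonic()
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body.get("response", ""), time.monotonic() - start

# With Ollama serving the model, a benchmark loop would be:
#   for p in ["Add milk to the grocery list.", "What time is sunset today?"]:
#       answer, secs = ask_ollama(p)
#       print(f"{secs:5.2f}s  {p!r} -> {answer[:60]!r}")
```

Run the same prompts through Claude and compare answer quality and latency side by side before deciding what counts as "good enough."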
Long Term (90+ Days, After NemoClaw Matures)
Deploy the full messaging/automation stack only once all of the following hold:
- Third-party security audit is published
- JFrog supply chain integration is live
- Ashley confirms messaging (WhatsApp/Signal) is a real pain point
- NemoClaw reaches beta or stable release
- CISO agent review approves the deployment plan
What We Should NEVER Do
- Move Jack's allergy rules from rosey-bot to OpenClaw/NemoClaw
- Deploy NemoClaw on EQR1 alongside production infrastructure
- Install unvetted skills from ClawHub
- Expose any port to the public internet
- Share API credentials between our stack and the NemoClaw stack
The Hybrid Future
The ideal end state is a dual-stack architecture:
┌──────────────────────────────────────────────────────┐
│              DANIEL'S AI INFRASTRUCTURE              │
│                                                      │
│  ┌─────────────────────┐  ┌─────────────────────┐    │
│  │  CLAUDE CODE + COO  │  │ NEMOCLAW + OPENCLAW │    │
│  │                     │  │                     │    │
│  │  Coding             │  │  Family messaging   │    │
│  │  Finance            │  │  Home automation    │    │
│  │  Network mgmt       │  │  Voice / music      │    │
│  │  Security review    │  │  Scheduling / cron  │    │
│  │  Project mgmt       │  │  Quick lookups      │    │
│  │  Complex reasoning  │  │  Private queries    │    │
│  │                     │  │                     │    │
│  │  Model: Claude      │  │  Models: Nemotron   │    │
│  │  Interface: CLI     │  │  (local) + Claude   │    │
│  │                     │  │  (cloud, complex)   │    │
│  │  runs on: EQR1      │  │ Interface: WhatsApp │    │
│  │                     │  │  Signal, Discord    │    │
│  │                     │  │                     │    │
│  │                     │  │  runs on: EQR2 or   │    │
│  │                     │  │  dedicated hardware │    │
│  └─────────────────────┘  └─────────────────────┘    │
│                                                      │
│  ┌─────────────────────┐                             │
│  │      ROSEY-BOT      │  ← Allergy safety STAYS     │
│  │  Hardcoded rules    │    HERE. Never moves.       │
│  │  Meal plans         │                             │
│  │  Discord channels   │                             │
│  └─────────────────────┘                             │
└──────────────────────────────────────────────────────┘
Each component does what it's best at. No single point of failure. Jack's safety rules stay hardcoded. Private data stays local. Complex work uses the best model available.
Appendix: Sources
NVIDIA Nemotron
- NVIDIA Debuts Nemotron 3 Family of Open Models
- Inside Nemotron 3: Techniques, Tools, and Data
- Nemotron 3 Super: Hybrid Mamba-Transformer MoE
- NVIDIA Nemotron Developer Page
- Nemotron Foundation Models
- NVIDIA Open Model License
- Nemotron-4 340B Technical Report (arXiv)
- Nemotron Coalition Launch
- Nemotron 3 Super on Hugging Face
- Nemotron 3 Nano on NVIDIA NIM
- NVIDIA: The Only AI Model Maker That Can Afford to Give It Away
- Nemotron: Open-Source Model Champion
- DataCamp: Nemotron 3 Architecture and Benchmarks
NVIDIA NemoClaw
- NVIDIA Announces NemoClaw
- NemoClaw Product Page
- GitHub: NVIDIA/NemoClaw
- NemoClaw Developer Guide: How It Works
- NemoClaw Developer Guide: Architecture
- TechCrunch: Nvidia's OpenClaw Could Solve Its Biggest Problem
- The New Stack: NemoClaw Is OpenClaw with Guardrails
- VentureBeat: NemoClaw Brings Security, Scale
- HPCwire: Nvidia Introduces NemoClaw
- TechTarget: NemoClaw, JFrog Shore Up OpenClaw Security
OpenClaw (Prior Research)
- Deep Research Report on Notion — 60+ sources documented in research-findings.md