Penetration Testing & Auditing
"Do No Harm": Specialized Red-Teaming for High-Stakes Scientific Computing.
Protecting the Research Lifeline
A standard vulnerability scan running default policies can easily saturate a login node or crash a fragile legacy scheduler, costing research cycles. Our approach is different: we validate both the "Hard Outer Shell" (the perimeter) and the "Soft Center" (the internal cluster) using non-disruptive, performance-aware techniques that protect scientific throughput.
1. The Targeted Strategy
| Testing Zone | Scope | Aggression | Primary Risk |
|---|---|---|---|
| Perimeter | Login Nodes, DTNs, VPNs | High | SSH Exploitation / Brute-force |
| Control Plane | Schedulers (Slurm/PBS) | Low/Manual | Scheduler Crash / Database DoS |
| Data Plane | Lustre / GPFS / NFS | Medium | IOPS Saturation / Corruption |
| Compute Fabric | Nodes, InfiniBand | Low | Latency Spikes in Simulations |
2. Comprehensive Audit Framework (White Box)
Scheduler & Storage Audit
We review the Prolog/Epilog scripts, which the scheduler typically runs as root, for unsafe file handling that could lead to privilege escalation to root. We also verify storage quotas and root_squash settings on NFS exports, so that root on a client node is not treated as root on the file server.
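As an illustration of the root_squash check, the following minimal sketch scans the text of an exports file for entries that disable root squashing. It assumes the common `path client(options)` layout of `/etc/exports` and does not handle quoted paths or line continuations; the function name `find_no_root_squash` is hypothetical, not part of any tool named here.

```python
import re

def find_no_root_squash(exports_text):
    """Return (path, client, options) for every export that sets no_root_squash."""
    findings = []
    for raw in exports_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        path, clients = fields[0], fields[1:]
        for client in clients:
            match = re.match(r"^([^()]*)\(([^()]*)\)$", client)
            if not match:
                # No option list: defaults apply, and root_squash is the default.
                continue
            # An empty host before "(" means the options apply to every client.
            host = match.group(1) or "*"
            options = match.group(2).split(",")
            if "no_root_squash" in options:
                findings.append((path, host, options))
    return findings
```

A reviewer could read the file once on the NFS server and pass its contents to this function; any finding is a prompt for a manual look, since some administrative hosts may need the setting deliberately.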
Network Segmentation Check
We verify the Science DMZ, ensuring that high-speed ports are open for data while management ports remain isolated. We also attempt outbound connections (for example, with curl) from compute nodes to confirm that they are isolated from the public internet.
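The outbound check can be sketched in a few lines. The example below is a rough equivalent of the curl test, using a plain TCP connection attempt with a short timeout so that a blocked route fails quickly instead of hanging a job. The function names and any target hosts passed in are assumptions for illustration.

```python
import socket

def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False

def check_egress(targets, timeout=3.0):
    """Map each 'host:port' target to whether it was reachable from this node."""
    return {f"{host}:{port}": can_reach(host, port, timeout) for host, port in targets}
```

On an isolated compute node, every public target should map to `False`. Running this inside a short, single-node job keeps the test within the scheduler's normal accounting.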
3. Active Penetration Testing (Red Team)
Perimeter Breach
Testing SSH resilience and attempting MFA bypasses on alternative entry points like Open OnDemand portals.
Lateral Movement
Attempting container breakouts (Singularity/Apptainer) and shared-memory snooping between multi-tenant jobs.
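One low-impact way to gauge exposure to shared-memory snooping is to list world-readable files left in a shared location such as `/dev/shm`. The sketch below only reads file metadata and never opens file contents. The function name and the choice of directory are assumptions, not a description of a specific tool.

```python
import os
import stat

def world_readable_files(root, exclude_uid=None):
    """Return (path, owner_uid, mode_string) for regular files readable by 'other'."""
    findings = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is inaccessible; skip it
            if not stat.S_ISREG(info.st_mode):
                continue
            if exclude_uid is not None and info.st_uid == exclude_uid:
                continue
            if info.st_mode & stat.S_IROTH:
                findings.append((path, info.st_uid, stat.filemode(info.st_mode)))
    return findings
```

Calling `world_readable_files("/dev/shm", exclude_uid=os.getuid())` from an unprivileged test account shows what another tenant's job on the same node could read.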
Privilege Escalation
Targeting internal services and unpatched kernels with specific exploits such as "Dirty COW" (CVE-2016-5195) on legacy compute nodes.
HPC Security Toolset
| Category | Tool | Usage |
|---|---|---|
| Mass Auditing | ClusterShell / pdsh | Checking configuration drift across 1,000+ nodes in seconds. |
| Compliance | OpenSCAP | Automated checking against STIGs and NIST baselines for RHEL/CentOS. |
| Analysis | Check_Slurm | Custom fuzzing of sbatch parameters to find scheduler vulnerabilities. |
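To illustrate the configuration-drift check, the sketch below takes per-node command output, of the kind ClusterShell or pdsh would collect, and groups nodes by a hash of that output. It assumes the largest group represents the intended baseline, which may not hold on every cluster. The function names and input format are hypothetical.

```python
import hashlib
from collections import defaultdict

def group_by_digest(node_outputs):
    """Group node names by the SHA-256 digest of their collected output."""
    groups = defaultdict(list)
    for node, text in node_outputs.items():
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        groups[digest].append(node)
    return dict(groups)

def find_drift(node_outputs):
    """Return sorted node names whose output differs from the largest group."""
    groups = group_by_digest(node_outputs)
    if not groups:
        return []
    # Assumption: the most common output is the intended baseline.
    baseline = max(groups, key=lambda digest: len(groups[digest]))
    return sorted(
        node
        for digest, nodes in groups.items()
        if digest != baseline
        for node in nodes
    )
```

Hashing keeps the comparison cheap at scale: only the outliers need a full diff against a baseline node.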
Scientific Impact Reporting
We don't just report CVEs; we report wasted potential. Our findings are framed in terms of research risk.
Stress-Test Your Fabric Safely
Download our "HPC Penetration Testing Scope Template" to define a safe and effective security audit for your cluster.