Penetration Testing & Auditing

"Do No Harm": Specialized Red-Teaming for High-Stakes Scientific Computing.

Protecting the Research Lifeline

A standard vulnerability scan running default policies can easily saturate a login node or crash a fragile legacy scheduler, costing researchers compute cycles. Our approach is fundamentally different: we validate both the "Hard Outer Shell" and the "Soft Center" using non-disruptive, performance-aware techniques that protect scientific throughput.

1. The Targeted Strategy

Testing Zone | Scope | Aggression | Primary Risk
Perimeter | Login Nodes, DTNs, VPNs | High | SSH Exploitation / Brute-force
Control Plane | Schedulers (Slurm/PBS) | Low/Manual | Scheduler Crash / Database DoS
Data Plane | Lustre / GPFS / NFS | Medium | IOPS Saturation / Corruption
Compute Fabric | Nodes, InfiniBand | Low | Latency Spikes in Simulations

2. Comprehensive Audit Framework (White Box)

Scheduler & Storage Audit

We review the Prolog/Epilog scripts, which typically run with root privileges, for unsafe file handling that could let a user escalate to root. We also verify storage quotas and root_squash settings on NFS exports so that root on a client machine is not treated as root on the shared file system.
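As a rough illustration of the NFS part of this audit, the sketch below scans the text of an exports file and reports entries that explicitly set no_root_squash. It relies on root_squash being the default on Linux NFS servers, so only explicit opt-outs are flagged. The function name and the sample paths and hosts are hypothetical, and the parser is deliberately simplified: it ignores line continuations, quoted paths, and default-option blocks.

```python
import re

def find_unsquashed_exports(exports_text):
    """Return (path, client) pairs whose option list contains no_root_squash."""
    findings = []
    for raw in exports_text.splitlines():
        # Drop comments and blank lines.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        path, clients = fields[0], fields[1:]
        for spec in clients:
            # A client spec looks like host(opt1,opt2); options are optional.
            m = re.match(r"^([^()]*)\(([^)]*)\)$", spec)
            if m:
                client = m.group(1) or "*"
                opts = m.group(2).split(",")
            else:
                client, opts = spec, []
            if "no_root_squash" in opts:
                findings.append((path, client))
    return findings

# Hypothetical sample, not taken from any real cluster.
SAMPLE = """\
/home     10.0.0.0/24(rw,root_squash)
/scratch  login*.example.org(rw,no_root_squash) 10.1.0.0/16(ro)
# /old    *(rw,no_root_squash)
"""

if __name__ == "__main__":
    for path, client in find_unsquashed_exports(SAMPLE):
        print(f"no_root_squash: {path} exported to {client}")
```

Reading the file offline keeps the check non-disruptive: nothing is mounted or written on the storage servers.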

Network Segmentation Check

We verify the Science DMZ: high-speed ports must be open for data transfer while management ports remain isolated. We also attempt "outbound curls" from compute nodes to confirm that they are isolated from the public internet.
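The outbound check can be approximated without curl. The sketch below is one possible way to do it, using a plain TCP connection attempt rather than an HTTP request: it tries each target once with a short timeout and records whether the connection succeeded. The function name, the target list, and the injectable connect parameter (there only to make the function testable offline) are assumptions for illustration.

```python
import socket

def check_egress(targets, timeout=3.0, connect=socket.create_connection):
    """Map each (host, port) to True if a TCP connection could be opened."""
    results = {}
    for host, port in targets:
        try:
            conn = connect((host, port), timeout)
            conn.close()
            results[(host, port)] = True   # egress possible: isolation failed
        except OSError:
            # Covers refused connections, timeouts, and DNS failures.
            results[(host, port)] = False
    return results

if __name__ == "__main__":
    # Hypothetical targets; run from a compute node inside a job allocation.
    for target, reachable in check_egress([("example.org", 443)]).items():
        status = "REACHABLE (not isolated)" if reachable else "blocked"
        print(target, status)
```

One connection attempt per target keeps the load negligible. Note that this only tests direct TCP egress; a node that reaches the internet through an HTTP proxy would need a separate check.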

3. Active Penetration Testing (Red Team)

Perimeter Breach

Testing SSH resilience and attempting MFA bypasses on alternative entry points like Open OnDemand portals.

Lateral Movement

Executing container breakouts (Singularity/Apptainer) and snooping on shared memory between multi-tenant jobs.

Privilege Escalation

Targeting internal services and unpatched kernels on legacy compute nodes with specific exploits such as "Dirty COW" (CVE-2016-5195).

HPC Security Toolset

Category | Tool | Usage
Mass Auditing | ClusterShell / pdsh | Checking configuration drift across 1,000+ nodes in seconds.
Compliance | OpenSCAP | Automated checking against STIGs and NIST baselines for RHEL/CentOS.
Analysis | Check_Slurm | Custom fuzzing of sbatch parameters to find scheduler vulnerabilities.
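To make the configuration drift check concrete, here is a minimal sketch of the comparison step only. It assumes the contents of one configuration file have already been collected from every node (for example with ClusterShell or pdsh; the collection itself is not shown) into a dictionary keyed by node name. Nodes are grouped by the SHA-256 digest of the content, and the largest group is treated as the baseline, which is an assumption rather than a guarantee that the majority is correct. All function names and sample values are hypothetical.

```python
import hashlib
from collections import defaultdict

def group_by_digest(node_outputs):
    """Group node names by the SHA-256 digest of their collected output.

    Returns a list of node-name lists, largest group first.
    """
    groups = defaultdict(list)
    for node, text in node_outputs.items():
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        groups[digest].append(node)
    return sorted(
        (sorted(nodes) for nodes in groups.values()),
        key=lambda g: (-len(g), g),
    )

def find_drift(node_outputs):
    """Return the nodes that differ from the largest (assumed baseline) group."""
    groups = group_by_digest(node_outputs)
    return [node for group in groups[1:] for node in group]

if __name__ == "__main__":
    # Hypothetical collected output: one sshd setting per node.
    collected = {
        "node001": "PermitRootLogin no\n",
        "node002": "PermitRootLogin no\n",
        "node003": "PermitRootLogin yes\n",
    }
    print("Drifted nodes:", find_drift(collected))
```

Hashing keeps the comparison cheap even for large files, and because the work happens on the auditor's machine, the compute nodes see only the single read that collected the file.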

Scientific Impact Reporting

We don't just report CVEs; we report wasted potential. Our findings are framed in terms of research risk:

Risk Context: "Unpatched kernel on Node 50 allows users to crash the node, wasting $5,000 in compute credits and ruining a 2-week simulation."

Stress-Test Your Fabric Safely

Download our "HPC Penetration Testing Scope Template" to define a safe and effective security audit for your cluster.
