The Structural Gap GAO Has Now Documented Twice: Reading GAO-26-107777 on NNSA’s $30 Billion Modernization Portfolio
A project controls analysis of what the second biennial GAO assessment of NNSA’s major projects actually says — and what the M&O contractor failure mode reveals about federal capital project oversight broadly.
On February 26, 2026, the Government Accountability Office published its second biennial assessment of the National Nuclear Security Administration’s portfolio of major construction projects. The report, GAO-26-107777, evaluated 28 NNSA major projects collectively estimated to cost more than $30 billion — the research and production infrastructure modernizing the nation’s nuclear weapons stockpile, an area on GAO’s High Risk List.
The headline numbers are the part that will be quoted on the Hill. They are not the most important part of the report.
GAO found that since its first biennial assessment in 2023, NNSA’s cumulative cost overrun for projects in the execution phase grew from $2.1 billion to $4.8 billion as of June 2025. Cumulative schedule delay grew from 9 years to 30 years over the same period. Two projects at the Y-12 National Security Complex in Oak Ridge, the Uranium Processing Facility (UPF) Main Process Building and the UPF Salvage and Accountability Building, account for $3.8 billion of the $4.8 billion in cumulative cost overruns (almost 80 percent) and a combined 12 years of the portfolio’s cumulative schedule delay. Seven other execution-phase projects have incurred or expect cost overruns of more than 20 percent against their originally approved baselines.
This article is a project controls analysis of what GAO-26-107777 actually documents and what its findings reveal about federal capital project oversight broadly. The objective is not commentary on NNSA program management. The agency is operating inside a procurement framework that exists, with the staffing and tooling that exists, and the people inside that framework are working on hard problems. The objective is to read the GAO findings the way a project controls professional reads them: as a description of where the structural failure happens, what it costs, and what the technical capability to close the gap would actually need to do.
What the Report Actually Says About Cause
The GAO report identifies three causal categories that NNSA officials and documents associated with the cost and schedule overruns on projects in the execution phase. The categorization is the agency’s own attribution, recorded by GAO in its findings. Each category is structurally distinct:
First, the management-and-operating (M&O) contractor model places day-to-day project management responsibility on the contractor; weaknesses at this layer cascade through cost and schedule performance.
Second, one layer below the M&O contractor, vendor and subcontractor performance is mediated through the M&O contractor’s oversight rather than directly visible to NNSA.
Third, input cost growth is driven by market conditions, supply chain dynamics, and vendor pricing, which makes it the category most exogenous to project management decisions.
Two of the three categories are layers of the same structural arrangement: the M&O contractor sits between NNSA and the work, and the failures GAO documents accumulate at that boundary. The first category attributes performance issues directly to the M&O contractor’s project management. The second attributes them to entities one layer below the M&O contractor, whose performance NNSA observes only through the M&O contractor’s oversight. The third category is genuine input-cost growth and is the most exogenous of the three.
The structural condition is what matters here. NNSA does not directly manage the construction. The M&O contractor does. The federal agency’s view of project performance is mediated by the same entity whose performance is being evaluated. This is not a criticism of any particular contractor. It is a description of the architecture of federal capital project oversight under the M&O model, which exists for legitimate reasons of operational continuity and institutional knowledge at the national laboratories and weapons complex sites.
The architecture has consequences for what kinds of failures can be detected before they accumulate, and what kinds cannot.
What “Cost Overruns Doubled in Two Years” Means in Project Controls Terms
The progression from $2.1 billion in cumulative cost overrun in 2023 to $4.8 billion in 2025 is not a single event. It is the accumulated result of monthly performance reporting cycles passing through the federal oversight framework without triggering correction at scale.
In project controls terms, every month of the interval between GAO’s two biennial assessments produced an Integrated Program Management Report from each contractor. Each report contained Cost Performance Index, Schedule Performance Index, Estimate at Completion, Variance Analysis Reports for thresholds breached, and the structured cost and schedule data underneath. The reports were submitted, reviewed, and accepted. The cumulative outcome of that two-year cycle is the additional $2.7 billion in cost overrun and the additional 21 years of cumulative schedule delay GAO measured.
This is the project controls observation worth sitting with. The reporting framework was operating. The reviews were happening. The compliance documentation was being filed. And the structural performance still deteriorated by the magnitudes the report documents. The conclusion that follows is not that the framework failed because people stopped doing their jobs. It is that the framework, as currently architected, does not independently verify contractor-provided data in a way that catches baseline drift, cost-data manipulation, or accumulating schedule manipulation before they become measurable in dollars and years.
Eight of GAO’s 21 prior recommendations remain open as of December 2025. Two consecutive biennial reports finding the same structural pattern, with prior recommendations still partially unaddressed, is not a coincidence. It is the signature of a problem the existing oversight framework cannot solve through its current tooling.
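For readers less familiar with the indices each monthly report carries, the arithmetic behind them is small. The sketch below uses the conventional earned value definitions (CPI = EV / AC, SPI = EV / PV) and one common Estimate at Completion formula (EAC = BAC / CPI). The record type, function names, and sample figures are illustrative assumptions, not values from the GAO report or from any contractor submission.

```python
from dataclasses import dataclass


@dataclass
class ControlAccountStatus:
    """Cumulative-to-date values for one control account (hypothetical record)."""
    bac: float  # Budget at Completion
    pv: float   # Planned Value: budgeted cost of work scheduled
    ev: float   # Earned Value: budgeted cost of work performed
    ac: float   # Actual Cost of work performed


def cost_performance_index(s: ControlAccountStatus) -> float:
    # Below 1.0 means each dollar spent earned less than a dollar of planned work.
    return s.ev / s.ac


def schedule_performance_index(s: ControlAccountStatus) -> float:
    # Below 1.0 means less work was earned than was planned to date.
    return s.ev / s.pv


def estimate_at_completion(s: ControlAccountStatus) -> float:
    # One common EAC formula; it assumes cost efficiency to date continues.
    return s.bac / cost_performance_index(s)


# Made-up figures in $ thousands, chosen only to keep the arithmetic easy to follow.
status = ControlAccountStatus(bac=1000.0, pv=500.0, ev=400.0, ac=500.0)
print(round(cost_performance_index(status), 2))      # 0.8
print(round(schedule_performance_index(status), 2))  # 0.8
print(round(estimate_at_completion(status), 1))      # 1250.0
```

The indices are only as good as the EV, PV, and AC values feeding them, which is why the rest of this analysis concentrates on the data underneath the indices rather than on the indices themselves.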
The Three Layers Where the Gap Lives
The structural gap between contractor performance reporting and federal audit findings has three distinct layers. Each one builds on the previous, and each one creates the conditions for the next. The layers correspond to specific points where independent capability could test what is being submitted, but where the current oversight framework relies on the contractor’s interpretation of the contractor’s own data. In order, they are the baseline the contractor builds, the monthly report the contractor submits, and the audit cadence that eventually measures the result.
Federal capital projects subject to DOE Order 413.3B and the EIA-748 EVMS Standard require an integrated cost-and-schedule baseline against which performance is measured. The contractor builds the schedule, builds the cost baseline, and integrates the two into the Performance Measurement Baseline. The contractor’s project controls organization owns the underlying data — the Primavera P6 schedule files, the cost engine outputs, the resource loading, the work breakdown structure, the control account assignments.
The federal program office reviews the contractor’s submission and approves it at Critical Decision 2. External Independent Reviews and Independent Cost Estimates run at the major decision gates, and they produce real findings, but they happen at gates rather than continuously. Between the baseline approval and project completion, the agency relies on the contractor’s monthly performance reporting against the baseline the contractor itself constructed.
The Integrated Program Management Report is the contractor’s monthly submission to the federal program office. Format 5 includes narrative analysis — Variance Analysis Reports authored by control account managers explaining cost and schedule variances and recovery actions. The narrative is in human language. The structured data is in the .xer schedule file and the cost engine export. The two are submitted together but, in practice, often read separately because the federal review staffing does not match the technical demand of testing the narrative claims against the structured data continuously.
This is the layer where claims about project status diverge from the data the project produces. A narrative can describe acceleration that the resource loading does not support. A narrative can describe constraints used to hold an end date when the schedule file shows zero hard constraints. A narrative can describe out-of-sequence work performed to mitigate delay when the schedule shows zero out-of-sequence tasks. The structural data tells the truth; the narrative tells the story. Without independent capability to read both simultaneously, the framework reads the story.
GAO’s biennial cadence is congressionally directed, established by Senate Report 117-130 accompanying the National Defense Authorization Act for Fiscal Year 2023. The cadence suits the strategic-level assessment GAO performs: measuring portfolio performance, identifying patterns, and recommending policy responses. It is not designed to detect monthly manipulation in real time. That is not GAO’s role, and the report does not claim otherwise.
The implication is that the gap between monthly contractor reporting and the GAO biennial audit is a 24-month interval during which the existing oversight framework relies on the contractor’s interpretation of its own data, the External Independent Reviews at decision gates, and the federal program office’s capacity to test what arrives in the IPMR. The accumulation GAO measures — $2.1 billion to $4.8 billion in cost overrun, 9 years to 30 years in schedule delay — is the result of that interval running without continuous independent verification at the schedule-logic and cost-data layers.
What an Independent Detection Layer Would Need to Do
If a federal program office wanted to close the gap between contractor monthly reporting and the GAO biennial audit, the technical requirements are specific. The capability would need to operate at the schedule-logic layer, the cost-data layer, and the narrative-claim layer simultaneously. It would need to operate continuously rather than at decision gates. And it would need to ground its findings in the existing federal compliance framework rather than introduce a parallel evaluation regime that program offices would resist on procurement grounds.
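Five capabilities follow from the structural gap: retroactive baseline comparison, structural float analysis, narrative-versus-data triangulation, resource curve validation, and float ownership analysis of delay claims. None of them requires new regulation. All of them are grounded in existing standards.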
The Performance Measurement Baseline is the standard against which performance is measured. Modifying the baseline retroactively — moving budget between control accounts after the fact, adjusting completion criteria for activities already in progress, changing schedule logic to rewrite the historical critical path — is the most direct form of manipulation. It is also the easiest to perform inside the contractor’s project controls software, and the hardest to detect from any single monthly report. Detection requires comparing successive baseline versions across reporting periods and surfacing mathematical differences as forensic findings rather than letting them accumulate as undocumented baseline drift. AACE 29R-03 Forensic Schedule Analysis describes this methodology for retrospective claim defense; applied prospectively and continuously, it catches baseline manipulation before it accumulates.
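The comparison of successive baseline versions described above can be sketched in a few lines. This is a minimal illustration, assuming each baseline snapshot has already been reduced to a mapping from (control account, period) to budgeted value. The `Baseline` type, the function name, and the sample figures are hypothetical and are not drawn from AACE 29R-03 or from any particular cost engine export.

```python
from typing import Dict, List, Tuple

# (control account, "YYYY-MM" period) -> budgeted value. Hypothetical layout.
Baseline = Dict[Tuple[str, str], float]


def retroactive_changes(previous: Baseline, current: Baseline,
                        reporting_period: str) -> List[tuple]:
    """List budget differences in periods that closed before the reporting period."""
    findings = []
    for key in sorted(set(previous) | set(current)):
        account, period = key
        if period >= reporting_period:
            # Current and future periods are left to normal change control here.
            continue
        old = previous.get(key, 0.0)
        new = current.get(key, 0.0)
        if abs(new - old) > 1e-6:
            findings.append((account, period, old, new))
    return findings


# Made-up snapshots from two consecutive monthly submissions.
last_month = {
    ("CA-100", "2025-01"): 100.0, ("CA-100", "2025-02"): 120.0,
    ("CA-200", "2025-01"): 80.0, ("CA-200", "2025-03"): 50.0,
}
this_month = {
    ("CA-100", "2025-01"): 60.0, ("CA-100", "2025-02"): 120.0,
    ("CA-200", "2025-01"): 120.0, ("CA-200", "2025-03"): 70.0,
}

findings = retroactive_changes(last_month, this_month, reporting_period="2025-03")
for account, period, old, new in findings:
    print(f"{account} {period}: {old} -> {new}")
```

In this made-up data, 40 units of budget move from CA-100 to CA-200 in a period that had already closed. The total is unchanged, so a portfolio-level roll-up would not show it; only the period-by-period comparison does.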
The DCMA 14-Point Schedule Assessment tests for individual schedule integrity violations against specific thresholds. A schedule can pass all 14 tests while still containing structural manipulation distributed under the threshold of any individual test. The most common technique is float distribution — moving total float around the schedule network to obscure which activities are actually on the critical path, so that delay events appear absorbed rather than impactful. The technique requires no individual rule violation and is invisible to threshold-based analytics. Detection requires structural analysis of the schedule as a directed graph object rather than as a list of activities.
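To make "schedule as a directed graph" concrete, the sketch below runs a standard critical path forward and backward pass over a small network and compares total float between two versions of the same schedule. It is a simplified illustration under stated assumptions: finish-to-start links only, no lags or calendars, and a made-up five-activity network. It does not reproduce any specific detection product or the DCMA tests; it only shows that float movement with no duration change is visible once the logic is treated as a graph.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def total_float(durations, predecessors):
    """CPM forward/backward pass; returns (total float per activity, project finish)."""
    order = list(TopologicalSorter(predecessors).static_order())
    successors = {a: [] for a in durations}
    for activity, preds in predecessors.items():
        for p in preds:
            successors[p].append(activity)

    # Forward pass: earliest finish of each activity.
    early_finish = {}
    for a in order:
        early_start = max((early_finish[p] for p in predecessors.get(a, ())), default=0)
        early_finish[a] = early_start + durations[a]
    project_finish = max(early_finish.values())

    # Backward pass: latest start, then total float = late finish - early finish.
    late_start, floats = {}, {}
    for a in reversed(order):
        late_finish = min((late_start[s] for s in successors[a]), default=project_finish)
        late_start[a] = late_finish - durations[a]
        floats[a] = late_finish - early_finish[a]
    return floats, project_finish


def float_shift(before, after):
    """Activities whose total float differs between two schedule versions."""
    return {a: (before[a], after[a]) for a in before if a in after and before[a] != after[a]}


durations = {"A": 5, "B": 10, "C": 4, "D": 3, "E": 5}  # unchanged between versions
logic_v1 = {"A": (), "B": ("A",), "C": ("A",), "D": ("C",), "E": ("B", "D")}
logic_v2 = {"A": (), "B": ("A",), "C": ("A",), "D": ("A",), "E": ("B", "C", "D")}

float_v1, finish_v1 = total_float(durations, logic_v1)
float_v2, finish_v2 = total_float(durations, logic_v2)
shift = float_shift(float_v1, float_v2)
print(finish_v1, finish_v2)  # 20 20
print(shift)                 # C moves from 3 to 6 days of float, D from 3 to 7
```

Both versions finish on day 20 and every activity keeps a predecessor and a successor, so neither an end-date check nor a missing-logic check would flag the change. The float comparison does, because the relinked path now carries more float than before.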
Format 5 narrative analysis in the IPMR is text. The schedule file and cost export are structured data. They are submitted together. They are typically read separately. When a Variance Analysis Report claims that a Must Finish On constraint held the project end date and the schedule file shows zero hard constraints, the narrative contradicts the data. When the narrative claims out-of-sequence work performed to mitigate delay and the schedule shows zero out-of-sequence tasks, the narrative contradicts the data. Independent oversight requires reading both simultaneously and surfacing the contradictions as forensic findings rather than allowing the narrative to stand as the executive summary of the structured data.
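The two contradictions described above can be expressed as a simple cross-check. The sketch below is deliberately naive: it assumes the schedule has already been parsed into a list of activity records, and it finds narrative claims by keyword matching. The record fields, the constraint labels, and the regular expressions are all illustrative assumptions; a real check would need proper parsing of the schedule export and far more careful reading of the narrative.

```python
import re

# Constraint labels treated as "hard" for this illustration (an assumption).
HARD_CONSTRAINTS = {
    "Must Finish On", "Must Start On",
    "Start No Later Than", "Finish No Later Than",
}

# Very rough patterns for claims a Variance Analysis Report might make.
CLAIM_PATTERNS = {
    "hard_constraint": re.compile(r"must (finish|start) on|hard constraint", re.I),
    "out_of_sequence": re.compile(r"out[- ]of[- ]sequence", re.I),
}


def schedule_facts(activities):
    """Count what the structured schedule data actually contains."""
    return {
        "hard_constraint": sum(1 for a in activities
                               if a.get("constraint") in HARD_CONSTRAINTS),
        "out_of_sequence": sum(1 for a in activities if a.get("out_of_sequence")),
    }


def contradictions(narrative, activities):
    """Topics the narrative mentions for which the schedule data shows zero instances."""
    facts = schedule_facts(activities)
    return [topic for topic, pattern in CLAIM_PATTERNS.items()
            if pattern.search(narrative) and facts[topic] == 0]


# Made-up narrative text and activity records.
narrative = ("A Must Finish On constraint held the project end date. "
             "Out-of-sequence work was performed to mitigate delay.")
activities = [
    {"id": "A100", "constraint": None, "out_of_sequence": False},
    {"id": "A110", "constraint": "Start On or After", "out_of_sequence": False},
]

found = contradictions(narrative, activities)
print(found)  # ['hard_constraint', 'out_of_sequence']
```

A hit from a check like this is a prompt for a question to the control account manager, not a finding in itself. Its value is that it puts the narrative and the structured data in front of the same reviewer at the same time.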
Real construction work produces resource loading curves with natural variation. Crews ramp up, hit productive plateaus, demobilize, return for punch list work. The curves are uneven because the work is uneven. When a contractor’s resource loading curves are too smooth — when productivity stays at a perfect constant for months at a time — the curve has been back-fitted to match desired financial outcomes rather than reflecting real construction execution. This pattern is what makes some EVMS data sets pass every threshold check while still being mathematically improbable. Detection requires recognizing the signature of back-fitted data versus measured data.
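One simple way to put a number on "too smooth" is the coefficient of variation of the period-by-period loading. The sketch below is only an illustration of that idea: the 2 percent threshold, the function names, and both sample curves are assumptions chosen for the example, not calibrated values, and a low score would justify a closer look rather than a conclusion that the data was back-fitted.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (0.0 for a flat curve)."""
    avg = mean(values)
    if avg == 0:
        return 0.0
    return pstdev(values) / avg


def is_suspiciously_smooth(monthly_hours, threshold=0.02):
    """Flag a loading curve whose variation falls below an assumed threshold."""
    return coefficient_of_variation(monthly_hours) < threshold


# Made-up monthly crew hours (thousands) for two control accounts.
flat_curve = [40.0] * 12
uneven_curve = [10, 25, 40, 42, 38, 41, 30, 12, 5, 8, 15, 6]

print(is_suspiciously_smooth(flat_curve))    # True
print(is_suspiciously_smooth(uneven_curve))  # False
```

Some work genuinely runs level for long stretches, so a screen like this is best read alongside the schedule logic and the actuals for the same control account.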
When a contractor submits a Time Impact Analysis claiming compensable delay, the federal owner faces a high-dollar entitlement decision. AACE 29R-03 defines the methodology, including the treatment of float ownership: a delay that consumes float on a near-critical path reduces the float remaining on that path, and it extends the project end date only if the delay exceeds the float available. A claim for a 25-day compensable delay on a path with 45 days of float leaves 20 days of float; it does not extend the project completion date. The math is straightforward. Applying it to a specific contractor TIA submission requires running the float ownership analysis against the actual schedule data. Federal owners without independent capability to run this analysis risk paying claims they should reject.
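The 25-day and 45-day example above reduces to two subtractions. The sketch below states them explicitly; the function name is hypothetical, and the calculation covers only the float arithmetic, not the full entitlement analysis a Time Impact Analysis review involves.

```python
def apply_delay_to_path(path_float_days: int, delay_days: int) -> tuple:
    """Return (float remaining on the path, extension to project completion)."""
    remaining_float = max(path_float_days - delay_days, 0)
    completion_extension = max(delay_days - path_float_days, 0)
    return remaining_float, completion_extension


# The case from the text: a 25-day delay on a path with 45 days of float.
print(apply_delay_to_path(45, 25))  # (20, 0)

# A contrasting made-up case: the delay exceeds the float available.
print(apply_delay_to_path(45, 60))  # (0, 15)
```

Only the second case moves the completion date, and only by the 15 days that exceed the float on the path.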
What This Reveals About Federal Capital Project Oversight
GAO-26-107777 is one report on one agency’s portfolio. The structural gap it documents is not unique to NNSA. The same M&O contractor model exists at DOE Office of Environmental Management, where Hanford, Savannah River, Oak Ridge, and the Idaho cleanup sites present comparable oversight challenges. The same baseline-acceptance-without-continuous-independent-verification pattern exists across DCMA-cognizant defense programs subject to DFARS 252.234-7002. The same retrospective audit cadence operates at the GAO level for federal capital project oversight broadly.
Three observations follow from the report’s findings, each one applicable beyond NNSA to the broader federal capital project oversight environment.
The technical capability to close the structural gap exists. It is not a research project. The detection methods — graph-based schedule analysis, narrative-versus-data triangulation, retroactive baseline comparison, mathematical resource curve validation, AACE 29R-03 float ownership analysis — are documented in the existing federal compliance literature. What has not existed until recently is the software architecture to run these analyses continuously at federal capital project scale.
The constraint is not regulation. The DCMA EVMS Compliance Metrics, the DOE data-driven audit metrics, the EIA-748 Guidelines, and the GAO Best Practices already define what continuous oversight should test. The constraint is the gap between what the rules require and what continuous federal capability exists to enforce them. Federal program offices know what they should be testing. The gap is technical apparatus, not policy clarity.
The economic logic of independent detection is favorable. NNSA’s $4.8 billion cumulative cost overrun did not emerge in a single moment. More than half of it, $2.7 billion, accumulated across the roughly 24 months of monthly reporting cycles between GAO’s two assessments, passing through the existing oversight framework without triggering correction at scale. If the framework had detected the manipulation patterns at the schedule-logic and cost-data layers in the first 12 to 24 months of each project, the corrective interventions would plausibly have cost a fraction of the eventual overrun. The investment math for an independent detection layer is not whether it pays for itself; it is how quickly it pays for itself across the portfolio.
The Wider Implication
NNSA is overseeing $30 billion in active major construction projects across the execution phase, with 12 additional projects in the definition phase that have not yet entered execution. DOE Office of Environmental Management oversees comparable portfolios at the legacy cleanup sites. DCMA oversees defense contractor EVMS compliance on programs valued at $50 million or more, with the most recent FAR overhaul raising the formal compliance review threshold to $100 million. The aggregate federal capital project portfolio subject to EIA-748 EVMS requirements is measured in the high hundreds of billions of dollars.
The GAO findings on NNSA describe a failure mode that is structural to the oversight model rather than specific to one agency. Two policy responses are possible. The first is to wait for the next biennial GAO report, which will measure whatever has accumulated in the interval. The second is to invest in the independent detection capability that closes the gap between contractor monthly reporting and federal audit findings.
The technical capability exists. The regulatory authority exists. What has not existed until recently is the software architecture to deploy continuous independent verification at scale, on the data formats federal capital programs actually produce, grounded in the existing compliance framework rather than introducing a parallel regime. The economic case for closing the gap is bounded by the $2.7 billion difference between the $4.8 billion overrun GAO reported in 2026 and the $2.1 billion it reported in 2023, which is the growth that accumulated while no continuous independent detection was operating.
The next biennial GAO report will publish in early 2028. It will measure whatever has accumulated in the interval. Federal program offices that invest in independent detection capability before then will produce different numbers than program offices that do not.
About This Analysis
Peveka Solutions Inc. is a Santa Barbara-based project controls AI company building tools to compress federal capital project oversight from biennial audit cadence to continuous detection. The Forensic Intelligence Engine is grounded in the full federal compliance framework governing Earned Value Management on capital programs — including OMB Circular A-11, FAR Part 34, DOE Order 413.3B, the DOE 413.3 family of Performance Baseline and Integrated Project Management guides, the DoD EVMIG, the DCMA EVMS Metrics framework and Business Practices, the NDIA IPMD Intent Guide and Surveillance Guide, the GAO Cost Estimating Guide (GAO-20-195G) and Schedule Assessment Guide (GAO-16-89G), the NDIA Planning & Scheduling Excellence Guide (PASEG), the NDIA Predictive Measures Guide, the AACE 29R-03 Forensic Schedule Analysis standard, and the IPMDAR Implementation and Tailoring Guide.
The Master Strategist mode is purpose-built for the analytical layer between contractor performance reporting and federal audit findings. It detects the patterns described in this article — retroactive baseline modifications, distributed buffer and float manipulation, narrative-versus-data contradictions, mathematically implausible resource loading, and Time Impact Analysis entitlement failures — at the structural level the conventional oversight framework was not architected to address.
If you lead capital project oversight at a federal program office or a contractor compliance organization and want to discuss how independent detection capability would integrate with your existing surveillance workflow, reach out at jwilliams@pevekasolutions.com.