
📖 Consolidate v6 docs and add implementation plan#4994

Draft
justaugustus wants to merge 5 commits into ossf:main from justaugustus:v6-docs

Conversation

@justaugustus
Member

What kind of change does this PR introduce?

Documentation: consolidate Scorecard v6 documents under docs/v6/ and add a
dependency-ordered implementation plan for Phase 1.

What is the current behavior?

v6 documents are scattered across openspec/changes/osps-baseline-conformance/
and docs/. There is no implementation plan showing dependency ordering
between v6 work items.

What is the new behavior (if this is a feature change)?

All v6 documents are consolidated under docs/v6/.

The implementation plan covers Phase 1 (OSPS Baseline Level 1 conformance
evidence) with 6 steps ordered by dependency:

  • Step 0: OpenFeature feature flag infrastructure
  • Step 1: Framework abstraction (proven with existing checks)
  • Step 2: JSON output for conformance results
  • Step 3: OSPS Baseline as second framework
  • Step 4: Human review of L1 coverage analysis
  • Step 5: Complete L1 coverage (gap probes)

The plan also includes:

  • Feature promotion table for existing flagged features
  • Forge support scope (GitHub in Phase 1; GitLab and Azure DevOps deferred)
  • Codebase reuse map documenting existing infrastructure to extend
  • Recommendations pending approval (marked as such)

Cross-document links in ROADMAP.md, decisions.md, and proposal.md have been
updated to reflect the new file locations.

  • Tests for the changes have been added (for bug fixes/features)

N/A — documentation only.

Which issue(s) this PR fixes

NONE

Special notes for your reviewer

This is a follow-up to PR #4952 (merged). The proposal and decisions documents
are unchanged except for link fixes. The implementation plan (plan.md) is new
content. Several recommendations are marked "pending approval" for Steering
Committee discussion.

Does this PR introduce a user-facing change?

NONE

justaugustus and others added 5 commits March 31, 2026 19:47
Signed-off-by: Stephen Augustus <foo@auggie.dev>
Add dependency-ordered implementation plan for Scorecard v6 and fix broken
links after docs/v6/ consolidation.

Implementation steps:
- Step 0: OpenFeature infrastructure (enables all v6 work)
- Step 1: Evidence model + framework abstraction (core types)
- Step 2: Conformance engine + applicability (core evaluation)
- Step 3: Output formats, staggered (JSON → in-toto → Gemara → OSCAL)
- Step 4: L1 probe coverage + metadata ingestion (parallel with Steps 2-3)
- Step 5: Probe catalog extraction (downstream tool integration)
- Steps 6-8: Phase 2 (release integrity, attestation, evidence bundles)
- Steps 9-11: Phase 3 (enforcement detection, multi-repo, attestation GA)

Key design decisions:
- v6 is a clean, backwards-compatible successor (no parallel v5 maintenance)
- OpenFeature for granular feature gating during v5→v6 transition
- FeatureGate field on checker.Check replaces hard-coded delete list
- Feature flag wrapper at internal/featureflags/ (not public API)
- Explicit phase gates: Phase 1 must prove value before Phase 2 begins

Link fixes:
- docs/ROADMAP.md: update proposal and coverage links to docs/v6/
- docs/v6/decisions.md: update coverage link (now same directory)
- docs/v6/proposal.md: update coverage link (now same directory)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Stephen Augustus <foo@auggie.dev>
Restructure Phase 1 to deliver complete OSPS Baseline Level 1 conformance
evidence using existing infrastructure where possible. Key changes:

**Ordering: Prove abstractions with existing code first**
- Step 0: OpenFeature with existing env vars (SCORECARD_V6, SCORECARD_EXPERIMENTAL)
- Step 1: Framework abstraction proven with existing checks before building OSPS
- Step 2: JSON output extension (use existing format, defer other formats)
- Step 3: OSPS Baseline as second framework (uses proven abstraction)
- Step 4: Complete L1 coverage (all 9 gap controls closed)

**Phase 1 success criteria:**
- Complete L1 control coverage (all 9 gap controls + existing coverage validated)
- Framework abstraction proven with checks before OSPS Baseline
- Production-ready conformance results in extended JSON
- Existing checks, probes, scores unchanged (v6 is additive)

**Key findings from investigation:**
- Probes produce findings (reusable)
- Check evaluation logic produces 0-10 scores (NOT reusable for conformance)
- Pattern is reusable: "take findings, apply rules, produce verdict"
- Don't shoehorn - checks and conformance have different semantics
- Metadata ingestion already exists via checks/fileparser/ (no new infrastructure)
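The reusable pattern noted above (take findings, apply rules, produce a verdict) can be sketched in Go. The types here are stripped-down stand-ins for `finding.Finding` and `finding.Outcome`, not the real Scorecard definitions, and the one-probe-per-control rule is a simplification:

```go
package main

import "fmt"

// Outcome is a sketch-level stand-in for finding.Outcome.
type Outcome int

const (
	OutcomeFalse Outcome = iota
	OutcomeTrue
	OutcomeNotApplicable
)

// Finding is a stripped-down stand-in for finding.Finding.
type Finding struct {
	Probe   string
	Outcome Outcome
}

// Verdict is a conformance result, deliberately distinct from a 0-10 score.
type Verdict string

const (
	Pass    Verdict = "PASS"
	Fail    Verdict = "FAIL"
	Unknown Verdict = "UNKNOWN"
)

// EvaluateControl applies the pattern: take findings, apply a rule,
// produce a verdict. requiredProbe is whichever probe maps to the control.
func EvaluateControl(findings []Finding, requiredProbe string) Verdict {
	for _, f := range findings {
		if f.Probe != requiredProbe {
			continue
		}
		switch f.Outcome {
		case OutcomeTrue:
			return Pass
		case OutcomeFalse:
			return Fail
		}
	}
	// No observation for this control: UNKNOWN, never a false PASS or FAIL.
	return Unknown
}

func main() {
	findings := []Finding{{Probe: "hasLicenseFile", Outcome: OutcomeTrue}}
	fmt.Println(EvaluateControl(findings, "hasLicenseFile"))    // PASS
	fmt.Println(EvaluateControl(findings, "hasSecurityPolicy")) // UNKNOWN
}
```

Note how the same findings feed both paths: the conformance rule is separate from check scoring, matching the "don't shoehorn" finding above.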

**Deferred to Phase 2:**
- Probe catalog extraction (wait for framework abstraction to stabilize)
- Additional output formats (in-toto, Gemara, OSCAL)
- Cron infrastructure (storage/serving cost evaluation needed)
- Level 2/3 controls, attestation, multi-repo

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Stephen Augustus <foo@auggie.dev>
Capture recommendations (pending approval) from review discussion:

Feature flag changes:
- Simplify to two flags: scorecard.experimental and scorecard.v6
- Add feature promotion table for existing flagged features (Webhooks, SBOM,
  raw format, SARIF must be promoted/migrated in v6)
- Remove per-feature granular flags (deferred until actual need arises)
- Add testing strategy recommendation (e2e runs twice: default + v6)

JSON schema:
- Add Option B recommendation: unified evaluations key instead of
  checks + conformance as parallel top-level keys
- Preserve backward compatibility via old schema as default

Control catalog:
- Replace versioned data file with importing security-baseline Go package
- Control definitions from upstream; probe mappings in Scorecard

Coverage validation:
- Add Step 3.5: human review of L1 coverage analysis before writing probes
- Gap probe estimates subject to validated coverage analysis

Forge support:
- Document GitHub (primary), GitLab (where probes work), Azure DevOps
  (deferred), local directory (file-based probes only)
- Controls unsupported on a forge produce UNKNOWN, not FAIL

Baseline levels:
- Document that L1/L2/L3 are one framework with levels, not three frameworks

Housekeeping:
- Remove checkmarks from Phase 1 Complete (not yet done)
- Move resolved questions to Resolved decisions section
- Remove probe catalog from Phase 1 (already moved to Phase 2)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Stephen Augustus <foo@auggie.dev>
Add section documenting existing infrastructure that v6 should extend rather
than duplicate, based on comprehensive codebase review.

- Map execution pipeline integration point (conformance evaluator consumes
  Result.Findings after probes run, no parallel pipeline needed)
- Document 13 reusable components with file locations and how v6 uses each
- Identify 4 duplication risks: Framework Result interface may over-abstract,
  finding.Outcome types may already cover conformance status, applicability
  could use existing NotApplicable outcome, gap probes may overlap existing

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Stephen Augustus <foo@auggie.dev>
@codecov

codecov bot commented Apr 1, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 69.67%. Comparing base (353ed60) to head (7e8b9b0).
⚠️ Report is 337 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4994      +/-   ##
==========================================
+ Coverage   66.80%   69.67%   +2.87%     
==========================================
  Files         230      251      +21     
  Lines       16602    15654     -948     
==========================================
- Hits        11091    10907     -184     
+ Misses       4808     3873     -935     
- Partials      703      874     +171     

Comment on lines 235 to 236
3. **UNKNOWN-first honesty.** If Scorecard cannot observe a control, the
status is UNKNOWN with an explanation — never a false PASS or FAIL.
Contributor

@JamieMagee Apr 4, 2026


I like this principle a lot, and I want to flag something for when non-GitHub platforms come into scope.

Probes operate on RawResults and don't know which platform produced the data. When ListReleases() returns ErrUnsupportedFeature on ADO, the raw check swallows the error, and the probe sees "no releases." It returns NotApplicable instead of UNKNOWN. Those mean different things: "I looked and there's nothing there" vs. "I couldn't look."

Doesn't need solving in Phase 1, but it'd help to say so in the principle text. Something like: "For non-GitHub platforms, distinguishing UNKNOWN from NOT_APPLICABLE requires platform capability metadata, deferred to a later phase."

Comment on lines +27 to +31
**Forge support in Phase 1:**
- **GitHub:** Primary target (full L1 coverage)
- **GitLab:** Deferred to a future phase
- **Azure DevOps:** Deferred to a future phase
- **Local directory:** Conformance results for file-based probes only
Contributor

@JamieMagee Apr 4, 2026


Deferral makes sense. A few things that'd be easier to account for now while the abstractions are still being designed:

Framework.Evaluate(findings) (Step 1) doesn't have a way to say "this probe couldn't run because the platform doesn't support it." Adding that later means changing the interface after other things depend on it.

Also, ErrUnsupportedFeature handling in raw varies: license.go falls back to file detection, security_policy.go silently skips. Picking a canonical pattern before the conformance layer builds on top would save future headaches.

And the enriched JSON schema could include a reason field on UNKNOWN statuses from day one. Cheap now, painful later once consumers depend on the shape.

Comment on lines +167 to +168
// Evaluate takes probe findings and produces framework-specific results
Evaluate(findings []finding.Finding) (Result, error)
Contributor

@JamieMagee Apr 4, 2026


Thinking about how this plays out on ADO: how does the evaluation layer distinguish "finding is absent because the platform can't see it" from "the project just doesn't do this thing"? Right now branch_protection.go catches ErrUnsupportedFeature and the probe downstream sees empty data with no context.

Maybe a PlatformCapabilities input alongside findings, or a new outcome like OutcomeUnobservable with a reason string. Just so the UNKNOWN-first principle can work beyond GitHub when the time comes.
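The PlatformCapabilities idea could be sketched as below. All names here (`PlatformCapabilities`, `EvaluateProbe`, the probe name) are hypothetical illustrations of the suggestion, not existing Scorecard types:

```go
package main

import "fmt"

// Finding is a stripped-down stand-in for finding.Finding.
type Finding struct {
	Probe   string
	Present bool
}

// PlatformCapabilities is the hypothetical extra input: which probes this
// forge can actually support.
type PlatformCapabilities map[string]bool

type Status string

const (
	Pass         Status = "PASS"
	Fail         Status = "FAIL"
	Unobservable Status = "UNKNOWN" // the platform cannot expose the data
)

// EvaluateProbe distinguishes "I couldn't look" from "I looked and there's
// nothing there" by consulting capabilities before interpreting findings.
func EvaluateProbe(findings []Finding, probe string, caps PlatformCapabilities) Status {
	if supported, ok := caps[probe]; ok && !supported {
		return Unobservable // can't see it: never a false FAIL
	}
	for _, f := range findings {
		if f.Probe == probe && f.Present {
			return Pass
		}
	}
	return Fail // observable, and genuinely absent
}

func main() {
	caps := PlatformCapabilities{"releasesAreSigned": false} // e.g. ADO
	fmt.Println(EvaluateProbe(nil, "releasesAreSigned", caps)) // UNKNOWN
}
```

The alternative (an `OutcomeUnobservable` carried on the finding itself) pushes the same information through the existing findings channel instead of a new parameter, avoiding an interface change later.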

- Metadata ingestion layer v1 — Security Insights as first supported source (BR-03.01, BR-03.02, QA-04.01); architecture supports additional metadata sources
- Scorecard control catalog extraction — Extract Scorecard checks into an in-project control framework representation that uses the same unified framework abstraction as OSPS Baseline. This enables checks and OSPS Baseline controls to be treated uniformly within the evaluation layer.

### Phase 2: Release integrity + Level 2 core
Contributor

@JamieMagee Apr 4, 2026


For the Phase 2 release integrity work: six controls depend on "releases" (BR-02.01, BR-04.01, BR-06.01, LE-02.02, LE-03.02, QA-02.02). The concept maps cleanly to GitHub Releases, but ADO doesn't have a real equivalent. ADO teams ship through Azure Artifacts feeds, classic release pipelines, or pipeline artifacts, all of which work differently.

A probe that only understands GitHub Releases would return NotApplicable for an ADO project that ships fine through Azure Artifacts. Probably just worth a note near Phase 2: "release-related probes will need platform-specific implementations."

| Webhooks check | `SCORECARD_EXPERIMENTAL` | "remove this check when v6 is released" | Promote to always-on in Phase 1 |
| SBOM check | `SCORECARD_EXPERIMENTAL` | "remove this check when v6 is released" | Promote to always-on in Phase 1 |
| Raw output format | `SCORECARD_V6` | none | Promote to always-on in Phase 1 |
| Azure DevOps support | `SCORECARD_EXPERIMENTAL` | none | Keep behind `scorecard.experimental` |
Contributor

@JamieMagee Apr 4, 2026


More of a question: what would graduating ADO from scorecard.experimental look like?
