📖 Consolidate v6 docs and add implementation plan #4994

justaugustus wants to merge 5 commits into ossf:main
Conversation
Signed-off-by: Stephen Augustus <foo@auggie.dev>
Add dependency-ordered implementation plan for Scorecard v6 and fix broken links after docs/v6/ consolidation.

Implementation steps:
- Step 0: OpenFeature infrastructure (enables all v6 work)
- Step 1: Evidence model + framework abstraction (core types)
- Step 2: Conformance engine + applicability (core evaluation)
- Step 3: Output formats, staggered (JSON → in-toto → Gemara → OSCAL)
- Step 4: L1 probe coverage + metadata ingestion (parallel with Steps 2-3)
- Step 5: Probe catalog extraction (downstream tool integration)
- Steps 6-8: Phase 2 (release integrity, attestation, evidence bundles)
- Steps 9-11: Phase 3 (enforcement detection, multi-repo, attestation GA)

Key design decisions:
- v6 is a clean, backwards-compatible successor (no parallel v5 maintenance)
- OpenFeature for granular feature gating during v5→v6 transition
- FeatureGate field on checker.Check replaces hard-coded delete list
- Feature flag wrapper at internal/featureflags/ (not public API)
- Explicit phase gates: Phase 1 must prove value before Phase 2 begins

Link fixes:
- docs/ROADMAP.md: update proposal and coverage links to docs/v6/
- docs/v6/decisions.md: update coverage link (now same directory)
- docs/v6/proposal.md: update coverage link (now same directory)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Stephen Augustus <foo@auggie.dev>
Restructure Phase 1 to deliver complete OSPS Baseline Level 1 conformance evidence using existing infrastructure where possible.

Key changes:

**Ordering: Prove abstractions with existing code first**
- Step 0: OpenFeature with existing env vars (SCORECARD_V6, SCORECARD_EXPERIMENTAL)
- Step 1: Framework abstraction proven with existing checks before building OSPS
- Step 2: JSON output extension (use existing format, defer other formats)
- Step 3: OSPS Baseline as second framework (uses proven abstraction)
- Step 4: Complete L1 coverage (all 9 gap controls closed)

**Phase 1 success criteria:**
- Complete L1 control coverage (all 9 gap controls + existing coverage validated)
- Framework abstraction proven with checks before OSPS Baseline
- Production-ready conformance results in extended JSON
- Existing checks, probes, scores unchanged (v6 is additive)

**Key findings from investigation:**
- Probes produce findings (reusable)
- Check evaluation logic produces 0-10 scores (NOT reusable for conformance)
- Pattern is reusable: "take findings, apply rules, produce verdict"
- Don't shoehorn - checks and conformance have different semantics
- Metadata ingestion already exists via checks/fileparser/ (no new infrastructure)

**Deferred to Phase 2:**
- Probe catalog extraction (wait for framework abstraction to stabilize)
- Additional output formats (in-toto, Gemara, OSCAL)
- Cron infrastructure (storage/serving cost evaluation needed)
- Level 2/3 controls, attestation, multi-repo

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Stephen Augustus <foo@auggie.dev>
Capture recommendations (pending approval) from review discussion:

Feature flag changes:
- Simplify to two flags: scorecard.experimental and scorecard.v6
- Add feature promotion table for existing flagged features (Webhooks, SBOM, raw format, SARIF must be promoted/migrated in v6)
- Remove per-feature granular flags (deferred until actual need arises)
- Add testing strategy recommendation (e2e runs twice: default + v6)

JSON schema:
- Add Option B recommendation: unified evaluations key instead of checks + conformance as parallel top-level keys
- Preserve backward compatibility via old schema as default

Control catalog:
- Replace versioned data file with importing security-baseline Go package
- Control definitions from upstream; probe mappings in Scorecard

Coverage validation:
- Add Step 3.5: human review of L1 coverage analysis before writing probes
- Gap probe estimates subject to validated coverage analysis

Forge support:
- Document GitHub (primary), GitLab (where probes work), Azure DevOps (deferred), local directory (file-based probes only)
- Controls unsupported on a forge produce UNKNOWN, not FAIL

Baseline levels:
- Document that L1/L2/L3 are one framework with levels, not three frameworks

Housekeeping:
- Remove checkmarks from Phase 1 Complete (not yet done)
- Move resolved questions to Resolved decisions section
- Remove probe catalog from Phase 1 (already moved to Phase 2)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Stephen Augustus <foo@auggie.dev>
Add section documenting existing infrastructure that v6 should extend rather than duplicate, based on comprehensive codebase review. - Map execution pipeline integration point (conformance evaluator consumes Result.Findings after probes run, no parallel pipeline needed) - Document 13 reusable components with file locations and how v6 uses each - Identify 4 duplication risks: Framework Result interface may over-abstract, finding.Outcome types may already cover conformance status, applicability could use existing NotApplicable outcome, gap probes may overlap existing Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Stephen Augustus <foo@auggie.dev>
**Codecov Report** ✅ All modified and coverable lines are covered by tests.

```
@@            Coverage Diff             @@
##             main    #4994      +/-   ##
==========================================
+ Coverage   66.80%   69.67%   +2.87%
==========================================
  Files         230      251      +21
  Lines       16602    15654     -948
==========================================
- Hits        11091    10907     -184
+ Misses       4808     3873     -935
- Partials      703      874     +171
```
> 3. **UNKNOWN-first honesty.** If Scorecard cannot observe a control, the
> status is UNKNOWN with an explanation — never a false PASS or FAIL.
I like this principle a lot, and I want to flag something for when non-GitHub platforms come into scope.
Probes operate on RawResults and don't know which platform produced the data. When ListReleases() returns ErrUnsupportedFeature on ADO, the raw check swallows the error, and the probe sees "no releases." It returns NotApplicable instead of UNKNOWN. Those mean different things: "I looked and there's nothing there" vs. "I couldn't look."
Doesn't need solving in Phase 1, but it'd help to say so in the principle text. Something like: "For non-GitHub platforms, distinguishing UNKNOWN from NOT_APPLICABLE requires platform capability metadata, deferred to a later phase."
> **Forge support in Phase 1:**
> - **GitHub:** Primary target (full L1 coverage)
> - **GitLab:** Deferred to a future phase
> - **Azure DevOps:** Deferred to a future phase
> - **Local directory:** Conformance results for file-based probes only
Deferral makes sense. A few things that'd be easier to account for now while the abstractions are still being designed:
Framework.Evaluate(findings) (Step 1) doesn't have a way to say "this probe couldn't run because the platform doesn't support it." Adding that later means changing the interface after other things depend on it.
Also, ErrUnsupportedFeature handling in raw varies: license.go falls back to file detection, security_policy.go silently skips. Picking a canonical pattern before the conformance layer builds on top would save future headaches.
And the enriched JSON schema could include a reason field on UNKNOWN statuses from day one. Cheap now, painful later once consumers depend on the shape.
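The canonical pattern plus the `reason` field could look something like the sketch below. `ErrUnsupportedFeature` is a stand-in for the sentinel the Scorecard clients package returns, and `ControlStatus`/`classify` are hypothetical names for this illustration, not existing Scorecard API.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Stand-in for the sentinel error a forge client returns when it does not
// implement an API (named after the error discussed in this thread).
var ErrUnsupportedFeature = errors.New("unsupported feature")

// ControlStatus sketches the enriched JSON shape with a reason field on
// UNKNOWN statuses, as suggested above.
type ControlStatus struct {
	Control string `json:"control"`
	Status  string `json:"status"`           // PASS, FAIL, UNKNOWN, NOT_APPLICABLE
	Reason  string `json:"reason,omitempty"` // populated when Status is UNKNOWN
}

// classify is the canonical pattern: an unsupported-feature error becomes
// UNKNOWN with a reason — never a silent skip and never a false FAIL.
func classify(control string, err error, pass bool) ControlStatus {
	if errors.Is(err, ErrUnsupportedFeature) {
		return ControlStatus{
			Control: control,
			Status:  "UNKNOWN",
			Reason:  "platform does not support the required API",
		}
	}
	if pass {
		return ControlStatus{Control: control, Status: "PASS"}
	}
	return ControlStatus{Control: control, Status: "FAIL"}
}

func main() {
	s := classify("BR-02.01", ErrUnsupportedFeature, false)
	out, _ := json.Marshal(s)
	fmt.Println(string(out))
	// {"control":"BR-02.01","status":"UNKNOWN","reason":"platform does not support the required API"}
}
```

Because `reason` is `omitempty`, PASS/FAIL entries serialize without the field, so existing consumers of the shape see no change.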
> ```go
> // Evaluate takes probe findings and produces framework-specific results
> Evaluate(findings []finding.Finding) (Result, error)
> ```
Thinking about how this plays out on ADO: how does the evaluation layer distinguish "finding is absent because the platform can't see it" from "the project just doesn't do this thing"? Right now branch_protection.go catches ErrUnsupportedFeature and the probe downstream sees empty data with no context.
Maybe a PlatformCapabilities input alongside findings, or a new outcome like OutcomeUnobservable with a reason string. Just so the UNKNOWN-first principle can work beyond GitHub when the time comes.
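The first of those two shapes could be sketched as below. `PlatformCapabilities` and `releaseVerdict` are illustrative names for this comment, not existing Scorecard types; the point is only that the evaluation layer gets enough context to separate "couldn't look" from "looked and found nothing".

```go
package main

import "fmt"

// PlatformCapabilities records what a forge client can actually observe,
// passed to the evaluation layer alongside findings (hypothetical shape).
type PlatformCapabilities struct {
	Releases         bool
	BranchProtection bool
}

// releaseVerdict separates the three cases conflated today:
// the platform cannot list releases (UNKNOWN), the platform can but the
// project has none (NOT_APPLICABLE), and releases exist (evaluate normally).
func releaseVerdict(caps PlatformCapabilities, releaseCount int) string {
	if !caps.Releases {
		return "UNKNOWN" // I couldn't look
	}
	if releaseCount == 0 {
		return "NOT_APPLICABLE" // I looked and there's nothing there
	}
	return "PASS"
}

func main() {
	ado := PlatformCapabilities{Releases: false} // e.g. ADO today
	gh := PlatformCapabilities{Releases: true}
	fmt.Println(releaseVerdict(ado, 0), releaseVerdict(gh, 0), releaseVerdict(gh, 3))
	// UNKNOWN NOT_APPLICABLE PASS
}
```

The alternative (`OutcomeUnobservable` with a reason string) pushes the same distinction down into the finding itself; either way the information has to be captured before the raw layer swallows the error.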
> - Metadata ingestion layer v1 — Security Insights as first supported source (BR-03.01, BR-03.02, QA-04.01); architecture supports additional metadata sources
> - Scorecard control catalog extraction — Extract Scorecard checks into an in-project control framework representation that uses the same unified framework abstraction as OSPS Baseline. This enables checks and OSPS Baseline controls to be treated uniformly within the evaluation layer.
>
> ### Phase 2: Release integrity + Level 2 core
For the Phase 2 release integrity work: six controls depend on "releases" (BR-02.01, BR-04.01, BR-06.01, LE-02.02, LE-03.02, QA-02.02). The concept maps cleanly to GitHub Releases, but ADO doesn't have a real equivalent. ADO teams ship through Azure Artifacts feeds, classic release pipelines, or pipeline artifacts, which all work differently.
A probe that only understands GitHub Releases would return NotApplicable for an ADO project that ships fine through Azure Artifacts. Probably just worth a note near Phase 2: "release-related probes will need platform-specific implementations."
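One way to phrase that note concretely: put a small abstraction between "releases" and any one forge's API. Everything below (`Release`, `ReleaseLister`, the two implementations) is an illustrative sketch for this comment, not Scorecard code.

```go
package main

import "fmt"

// Release abstracts "a shipped artifact set" so release-related probes
// don't bake in GitHub Releases semantics.
type Release struct {
	Tag    string
	Source string // e.g. "github-releases", "azure-artifacts"
}

// ReleaseLister gets a platform-specific implementation per forge.
type ReleaseLister interface {
	ListReleases() ([]Release, error)
}

// githubReleases would wrap the GitHub Releases API.
type githubReleases struct{ tags []string }

func (g githubReleases) ListReleases() ([]Release, error) {
	rs := make([]Release, 0, len(g.tags))
	for _, t := range g.tags {
		rs = append(rs, Release{Tag: t, Source: "github-releases"})
	}
	return rs, nil
}

// azureArtifacts would map feed package versions into the same shape, so an
// ADO project shipping through Azure Artifacts isn't misread as release-less.
type azureArtifacts struct{ versions []string }

func (a azureArtifacts) ListReleases() ([]Release, error) {
	rs := make([]Release, 0, len(a.versions))
	for _, v := range a.versions {
		rs = append(rs, Release{Tag: v, Source: "azure-artifacts"})
	}
	return rs, nil
}

func main() {
	listers := []ReleaseLister{
		githubReleases{tags: []string{"v1.0.0"}},
		azureArtifacts{versions: []string{"1.0.0"}},
	}
	for _, l := range listers {
		rs, _ := l.ListReleases()
		fmt.Println(rs[0].Source, rs[0].Tag)
	}
}
```

The six release-dependent probes would then consume `[]Release` and stay forge-agnostic; only the listers are platform-specific.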
> | Webhooks check | `SCORECARD_EXPERIMENTAL` | "remove this check when v6 is released" | Promote to always-on in Phase 1 |
> | SBOM check | `SCORECARD_EXPERIMENTAL` | "remove this check when v6 is released" | Promote to always-on in Phase 1 |
> | Raw output format | `SCORECARD_V6` | none | Promote to always-on in Phase 1 |
> | Azure DevOps support | `SCORECARD_EXPERIMENTAL` | none | Keep behind `scorecard.experimental` |
More of a question: what would graduating ADO from scorecard.experimental look like?
What kind of change does this PR introduce?
Documentation: consolidate Scorecard v6 documents under `docs/v6/` and add a dependency-ordered implementation plan for Phase 1.
What is the current behavior?
v6 documents are scattered across `openspec/changes/osps-baseline-conformance/` and `docs/`. There is no implementation plan showing dependency ordering between v6 work items.
What is the new behavior (if this is a feature change)?
All v6 documents consolidated under `docs/v6/`:

- `proposal.md` — architecture and vision (from merged PR 📖 Scorecard v6: OSPS Baseline conformance proposal and 2026 roadmap #4952)
- `decisions.md` — reviewer feedback log
- `osps-baseline-coverage.md` — control-by-control coverage analysis
- `plan.md` — new dependency-ordered implementation plan

The implementation plan covers Phase 1 (OSPS Baseline Level 1 conformance evidence) with 6 steps ordered by dependency:
The plan also includes:
Cross-document links in `ROADMAP.md`, `decisions.md`, and `proposal.md` updated to reflect new file locations.
N/A — documentation only.
Which issue(s) this PR fixes
NONE
Special notes for your reviewer
This is a follow-up to PR #4952 (merged). The proposal and decisions documents are unchanged except for link fixes. The implementation plan (`plan.md`) is new content. Several recommendations are marked "pending approval" for Steering Committee discussion.
Does this PR introduce a user-facing change?