You’re Already Being Graded on Your AI Prompts
AI works. Individual developers are faster at drafting, refactoring, and exploring.
What's changed -- quietly -- is what companies are now recording, correlating, and evaluating about that work.
Starting in 2026, Meta will formally grade every employee on "AI‑driven impact." Similar internal programs already exist at Google and Microsoft. A new measurement layer now sits between your prompts and your performance review.
What exists today:
- Every Copilot prompt can be logged and retained via Microsoft Purview (180 days by default)
- AI‑generated code can be classified at the commit level
- PR review delays, rework, and acceptance rates are tracked separately for AI vs. human code
- Tool usage, prompt context, and accessed resources are auditable by compliance teams
- These signals roll up into manager dashboards and performance frameworks
Most engineers know, in the abstract, that their tools log something. Far fewer realize how completely that usage is now visible, and how directly it is being tied to evaluation, budget decisions, and headcount planning.
The receipts are public. Sundar Pichai told Lex Fridman that the most important metric Google tracks is how much AI increases engineering velocity at the company level (Lex Fridman Podcast). Microsoft's UK CEO said Copilot is writing roughly 40% of code internally (Financial Times). And a growing ecosystem of analytics platforms (LinearB, Exceeds AI, Plandek, Waydev) now makes this measurable end to end.
Your AI scorecard already exists.
You just haven't seen it yet.
This isn't an argument about whether AI works. It does. This is a map of the measurement infrastructure that now exists around AI‑assisted engineering work -- how it's being deployed, and what companies are doing with it, often without engineers fully realizing it.
The Other Thing Meta Did After AI Rolled Out: Layoffs
Between 2022 and 2024, Meta cut over 21,000 employees, framing the reduction as part of a shift toward higher‑leverage, AI‑enabled teams (Meta earnings call coverage). By early 2025, leadership explicitly emphasized fewer engineers, higher output expectations, and AI as a force multiplier.
This matters because it establishes a baseline: once AI is embedded, leadership expectations reset.
The Prompt Harvesting Economy No One Wants to Name
Prompt extraction is already happening -- quietly, by default, and under the banner of compliance. But compliance is only part of the incentive structure.
Enterprise Microsoft 365 tenants automatically log Copilot interactions via Purview audit logs (a retrieval sketch follows this list), including:
- Which user prompted Copilot
- When and where the interaction occurred
- Every file, email, or document Copilot accessed
- Sensitivity labels on accessed data
- The Copilot app, context, and plugins involved
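To make that concrete, here is a minimal sketch of pulling those records through the Office 365 Management Activity API, the programmatic surface where Purview audit data (including Copilot events) is exposed. It assumes an Azure AD app registration with the ActivityFeed.Read permission, a bearer token already acquired via MSAL, and an active Audit.General subscription; the CopilotInteraction operation name and the CopilotEventData fields follow Microsoft's published audit schema, but verify them against your own tenant before trusting the output.

```python
# Sketch only: pull Copilot interaction audit records via the
# Office 365 Management Activity API. Assumes the Audit.General
# subscription is already started and TOKEN was acquired via MSAL
# for an app with the ActivityFeed.Read permission.
import requests

TENANT_ID = "<your-tenant-guid>"  # placeholder
TOKEN = "<bearer-token>"          # placeholder

BASE = f"https://manage.office.com/api/v1.0/{TENANT_ID}/activity/feed"
HEADERS = {"Authorization": f"Bearer {TOKEN}"}

def copilot_interactions(start: str, end: str):
    """Yield Copilot audit records created in [start, end) (ISO 8601 UTC)."""
    # The listing endpoint returns pointers to blobs of raw audit records;
    # Copilot events travel in the Audit.General content type.
    listing = requests.get(
        f"{BASE}/subscriptions/content",
        params={"contentType": "Audit.General", "startTime": start, "endTime": end},
        headers=HEADERS,
        timeout=30,
    )
    listing.raise_for_status()
    for blob in listing.json():
        records = requests.get(blob["contentUri"], headers=HEADERS, timeout=30)
        records.raise_for_status()
        for rec in records.json():
            if rec.get("Operation") == "CopilotInteraction":
                yield rec

for rec in copilot_interactions("2025-06-01T00:00:00Z", "2025-06-02T00:00:00Z"):
    event = rec.get("CopilotEventData", {})  # schema per Microsoft docs; verify per tenant
    print(
        rec.get("UserId"),        # which user prompted
        rec.get("CreationTime"),  # when
        event.get("AppHost"),     # which Copilot surface (Word, Teams, ...)
        [r.get("Name") for r in event.get("AccessedResources", [])],  # files touched
    )
```

Every field in that print statement maps to a bullet in the list above. The logging is not hypothetical; it is an API call away for anyone with tenant admin rights.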
On the tooling side:
- GitHub Copilot Free / Individual tiers may use prompts for model training (GitHub Copilot privacy FAQ)
- GitHub Copilot Business and Enterprise tiers exclude prompts from model training by default -- in effect, privacy is part of what the higher tiers are selling
The incentives are aligned, and the track record is already public:
- Free and low‑cost tiers subsidize model improvement
- Fewer than 0.5% of users opt out of training on consumer AI tools (OpenAI usage disclosures)
- Prompts encode domain expertise, workflows, and proprietary context
- Samsung engineers leaked confidential chip designs into ChatGPT in 2023 (Bloomberg)
- Google disclosed that human reviewers read Bard conversations (Google AI blog)
- Stanford researchers showed that roughly 50 anonymized prompts are enough to re‑identify a user (Stanford Internet Observatory)
The Tooling Stack Behind the Scorecard
This measurement layer already exists off‑the‑shelf:
- Microsoft Purview -- prompt and access logging
- GitHub Copilot Enterprise -- user‑level audit logs
- GuageAI / Codemetrics / DevSpy -- AI vs. human code classification (a toy heuristic is sketched after this list)
- LinearB / Plandek / Waydev -- PR throughput, rework, and acceptance tracking
- Microsoft Defender + Purview DLP -- compliance correlation
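None of the classification vendors publish their methods, so the following is a toy stand-in, not how GuageAI or its peers actually work. It assumes the only available signal is a commit-message convention (an AI co-author trailer or tag that some Copilot workflows and teams already apply), which is the crudest possible version of commit-level classification.

```python
# Toy commit-level AI classifier. The marker list is an assumption:
# some Copilot workflows add a co-author trailer, and some teams tag
# AI-assisted commits by convention. Real products likely use richer
# signals (IDE telemetry, diff fingerprints).
import subprocess

AI_MARKERS = (
    "co-authored-by: copilot",        # trailer used by some Copilot workflows
    "co-authored-by: github copilot",
    "[ai-assisted]",                  # hypothetical team convention
)

def classify_commits(repo_path: str) -> dict[str, bool]:
    """Map commit SHA -> True if the message carries an AI marker."""
    # %x1f / %x1e emit unit/record separators, so multi-line bodies parse safely.
    log = subprocess.run(
        ["git", "-C", repo_path, "log", "--format=%H%x1f%B%x1e"],
        capture_output=True, text=True, check=True,
    ).stdout
    labels = {}
    for entry in log.split("\x1e"):
        if not entry.strip():
            continue
        sha, _, message = entry.partition("\x1f")
        labels[sha.strip()] = any(m in message.lower() for m in AI_MARKERS)
    return labels

if __name__ == "__main__":
    labels = classify_commits(".")
    flagged = sum(labels.values())
    print(f"{flagged}/{len(labels)} commits carry an AI-assistance marker")
```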
Chain those feeds together and the per‑engineer record covers:
- What you prompted
- What AI generated
- What shipped
- What was rewritten
- What sensitive systems were touched
That's a forensic trail of AI‑assisted labor, and correlating it takes remarkably little code. A toy version of the join follows.
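This sketch is built on heavy assumptions: the PullRequest shape and the ai_labels feed are inventions you would populate from your git host's API and a commit classifier like the toy above. It is not how LinearB or Waydev build their dashboards; it just shows how little code the join requires once the feeds exist.

```python
# Toy correlation step: split PR review metrics by AI vs. human commits.
# PullRequest and ai_labels are assumed inputs -- build them from your
# git host's API and a commit classifier like the one above.
from dataclasses import dataclass
from statistics import mean

@dataclass
class PullRequest:
    commit_shas: list[str]
    review_hours: float   # open -> first approval
    rework_commits: int   # commits pushed after the first review

def split_by_ai(prs: list[PullRequest], ai_labels: dict[str, bool]) -> None:
    """Bucket PRs by AI involvement, then compare review delay and rework."""
    buckets: dict[str, list[PullRequest]] = {"ai": [], "human": []}
    for pr in prs:
        key = "ai" if any(ai_labels.get(sha, False) for sha in pr.commit_shas) else "human"
        buckets[key].append(pr)
    for key, group in buckets.items():
        if group:
            print(
                f"{key}: n={len(group)}, "
                f"avg review {mean(p.review_hours for p in group):.1f}h, "
                f"avg rework {mean(p.rework_commits for p in group):.1f} commits"
            )
```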

Diagram key: prompts are logged via Microsoft Purview; AI‑generated code is classified by tools like GuageAI and Codemetrics; pull request impact is measured by platforms such as LinearB, Plandek, and Waydev; outputs roll up into internal manager dashboards and performance reviews.
Most engineers will never see this map.
But their companies already have it.
And that's the point.