$ initializing eval pipeline...
$ loading 8 production scenarios
$ scoring rubric: 5 dimensions × 5-point scale
$ judge model: calibrated LLM panel
$ mode: open-ended structured response evaluation
Ready. Awaiting candidate input. ▊
The AI PM Eval
This isn't a quiz. There are no multiple-choice answers to guess.
8 production scenarios. Each one breaks down into focused sub-questions that scaffold your thinking — exactly the way a senior AI PM would structure a response. Then an LLM judge panel scores you across 5 dimensions, the same way you'd evaluate an AI agent's output.
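Under the hood, the judging loop is simple. Here's a minimal Python sketch of what a judge panel like this might look like: the dimension wording paraphrases the rubric below, and the prompt format and `judge` callable interface are illustrative assumptions, not the eval's actual implementation.

```python
import statistics

# Placeholder dimension descriptions, paraphrased from the rubric below.
# The eval's actual labels and prompts may differ.
DIMENSIONS = [
    "sees the whole picture, not just the immediate problem",
    "understands the underlying architecture",
    "acknowledges what it gives up",
    "is concrete enough for an engineer to build from",
    "anticipates what goes wrong",
]

def judge_response(response: str, judge) -> dict[str, int]:
    """Ask one judge model to score a response 1-5 on each dimension."""
    scores = {}
    for dim in DIMENSIONS:
        prompt = (
            f"Score the following PM response from 1 to 5 on whether it {dim}. "
            f"Reply with a single integer.\n\nResponse:\n{response}"
        )
        # `judge` is any callable mapping a prompt string to a text reply.
        scores[dim] = int(judge(prompt))
    return scores

def panel_score(response: str, judges) -> dict[str, float]:
    """Average each dimension's score across the judge panel."""
    per_judge = [judge_response(response, j) for j in judges]
    return {dim: statistics.mean(s[dim] for s in per_judge) for dim in DIMENSIONS}

if __name__ == "__main__":
    # Stub judges so the sketch runs without any API key.
    stub_judges = [lambda prompt: "3", lambda prompt: "4"]
    print(panel_score("Ship the guardrail layer first, then widen the rollout.", stub_judges))
```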
Scoring Rubric — 5 Dimensions × 5-Point Scale
1. Do you see the whole picture or just the immediate problem?
2. Do you understand the underlying architecture?
3. Do you acknowledge what you're giving up?
4. Could an engineer build from this?
5. Do you think about what goes wrong?
Each response is scored 1-5 on every dimension: 8 scenarios × 5 dimensions = 40 individual scores. The average is calibrated to land around 2.8-3.2 for a typical PM.
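To make that arithmetic concrete, here's a short sketch of how the 40 raw scores roll up. The score matrix is made-up example data, not real eval output.

```python
import statistics

# Example score matrix: 8 scenarios × 5 dimensions = 40 individual scores.
# Illustrative numbers only; a real run would come from the judge panel.
scores = [
    [3, 2, 4, 3, 2],
    [3, 3, 3, 2, 3],
    [4, 3, 3, 3, 2],
    [2, 3, 3, 3, 3],
    [3, 4, 2, 3, 3],
    [3, 3, 3, 4, 2],
    [2, 3, 4, 3, 3],
    [3, 3, 3, 3, 4],
]
assert len(scores) == 8 and all(len(row) == 5 for row in scores)

flat = [s for row in scores for s in row]            # all 40 scores
overall = statistics.mean(flat)                      # the headline number
by_dimension = [statistics.mean(col) for col in zip(*scores)]

print(f"overall: {overall:.2f}")                     # 2.95, inside the 2.8-3.2 band
print("per-dimension:", [f"{d:.2f}" for d in by_dimension])
```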