The Deep Dive

So... What Exactly Do You Do Here?

Great question. Let us walk you through exactly what happens when you sit down for a DoYouPrompt assessment. Spoiler: it’s not a quiz.

Imagine you just got hired. It’s your first day. Your boss walks in and says: “Here’s a pile of messy data, a confused stakeholder email, and an AI assistant. Make sense of it by 5pm.” That’s what we simulate. Except we also watch how you do it.
The Walkthrough

A Day in the Lab

Here’s exactly what happens from the moment you click “Start” to the moment you get your results.

1. Pick Your Track

First, you choose what kind of AI professional you are (or want to be). Each track has different tasks tailored to that role. A Prompt Engineer gets different challenges than a Research Analyst.

Prompt Engineer
AI Product Manager
AI Ops Specialist
Content & Marketing
Research Analyst
Support AI Designer
Developer Using LLMs

Each track has 3–5 tasks at calibrated difficulty levels. Pick the one closest to your role.

2. The Workspace Opens

You enter a professional 3-pane workspace. Think of it as your AI command center. Everything you need is right there — no alt-tabbing, no distractions.

DoYouPrompt Workspace — Task 2 of 5

Left pane (Task Brief): your mission ("Your mission, should you choose to accept it...") plus the source materials: vendor_proposals.csv, requirements_doc.md, stakeholder_email.txt.

Center pane (AI Assistant): the live conversation. For example:

  You: Compare the three vendor proposals on cost, timeline, and SOC2 compliance. Flag any contradictions.

  AI: Based on my analysis of the three proposals:
  Vendor A: Lowest cost ($42k), but 6-week timeline exceeds your deadline. SOC2 Type II certified.
  Vendor B: Mid-range ($58k), 3-week delivery. Claims SOC2 but no certificate provided…

  You: Good. Now weight compliance more heavily — we’re fintech, SOC2 is non-negotiable.

Right pane (Scratchpad): your working notes. For example:

  Key decision factors: SOC2 compliance (must-have), timeline < 4 weeks, integration w/ existing stack.
  Note: Vendor B’s claim needs independent verification…
  TODO: Ask AI about add-on costs

Status bar: 12:34 remaining · 1,847 / 4,000 tokens used · Confidence: Low / Med / High.
3. You Get a Mission

Each task is a realistic business scenario. Not trivia. Not “what temperature should I use?” Real work that real people face every day.

Debugging

A broken prompt is causing your company’s product descriptions to hallucinate features. Fix it before the clients revolt.

Research

Three vendors sent proposals. They all contradict each other. Your VP needs a recommendation by Friday.

Trust & Safety

The AI chatbot just told a customer something wildly inaccurate. Figure out what went wrong and design guardrails.

4. You Prompt, Iterate, Refine

Here’s where the magic happens. You write a prompt. The AI responds. You read the response. Think about it. Then you decide: is this good enough? Or can I do better?

Live Conversation — Vendor Analysis Task

You (Iteration 1): Summarize the three vendor proposals, comparing them on cost, timeline, and compliance status.

AI Assistant: Here’s a comparison of the three proposals: Vendor A offers the lowest price at $42k with a 6-week timeline. Vendor B comes in at $58k with 3-week delivery. Vendor C is $61k but includes ongoing support. All three claim compliance readiness…

You notice something off.

You (Iteration 2): Wait — Vendor B’s pricing doesn’t include implementation add-ons. Factor those in and recalculate the total cost of ownership for each vendor over 12 months.

AI Assistant: Good catch. With add-ons factored in, the 12-month TCO changes significantly: Vendor A: $42k + $8k support = $50k. Vendor B: $58k + $22k add-ons = $80k. Vendor C: $61k all-inclusive = $61k. This reshuffles the ranking entirely…

Getting closer.

You (Final Iteration): Now add a risk section. Highlight the contradictions between Vendor B’s compliance claims and the actual documentation. Format the final output as an executive memo.

AI Assistant: EXECUTIVE MEMO: Vendor Evaluation. Recommendation: Vendor C (with conditions). Risk Flag: Vendor B claims SOC2 Type II but provided no certificate. Their add-on costs inflate TCO by 38%…
See that? Three iterations. Each one better than the last. That’s what we measure.
5. We Watch Everything (In a Good Way)

While you work, the platform quietly records your process. Not to spy — to understand your thinking. Here’s what we’re looking at:

Planning: How long you think before your first prompt. Do you read the brief first, or fire immediately?

Diligence: Whether you actually read the source materials. The data is right there. Did you use it?

Resilience: How you react when the AI gives you garbage. Do you accept it, or push back and iterate?

Trustworthiness: Whether you verify facts or just trust the AI blindly. The AI makes things up sometimes. Do you catch it?

Efficiency: How many attempts it takes you to converge. Fewer, more deliberate iterations beat scattered guessing.

Calibration: Whether your confidence matches your accuracy. Knowing what you don’t know is a superpower.
6. Submit & Get Scored

When you’re done, you submit your answer with a confidence rating: High, Medium, or Low. Fun fact: saying “High” when you’re wrong hurts more than saying “Low” when you’re right. We reward honest self-assessment.

Performance (35%): Is your final output accurate, complete, and useful?

Process (30%): Did you plan, iterate, and recover from failures?

Trustworthiness (20%): Did you verify claims and flag uncertainty?

Consistency (15%): Can you perform reliably across different tasks?
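To make the weighting concrete, here is a minimal sketch of how the four dimensions could combine into one composite number. The weights come from the rubric above; the function, the 0–100 scale, and the sample sub-scores are hypothetical illustrations, not DoYouPrompt’s actual scoring code.

```python
# Hypothetical illustration of the four-dimension weighting.
# Weights are from the rubric above; sub-scores below are made up.
WEIGHTS = {
    "performance": 0.35,
    "process": 0.30,
    "trustworthiness": 0.20,
    "consistency": 0.15,
}

def composite_score(subscores: dict) -> float:
    """Weighted average of sub-scores on a 0-100 scale."""
    return sum(WEIGHTS[dim] * subscores[dim] for dim in WEIGHTS)

example = {
    "performance": 82,
    "process": 74,
    "trustworthiness": 90,
    "consistency": 68,
}
print(round(composite_score(example), 1))  # 79.1
```

Note how a strong Trustworthiness score (90) lifts the total even with a weaker Consistency score: verification habits carry real weight here.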

The Truth

Spoiler: It’s Not Who Writes the Fanciest Prompt

We see the same three archetypes over and over again. Guess which one gets hired.

The Lucky Guesser

  • Fires one prompt, gets lucky
  • Submits immediately without reading the response carefully
  • No verification, no source checking
  • Says “High confidence” every time
Score: Mediocre

The Template Robot

  • Pastes the same elaborate template every time
  • Doesn’t adapt when the task changes
  • Ignores source materials, relies on template
  • Output looks fancy but misses the point
Score: Below Average

The Thoughtful Operator

  • Reads the brief carefully, plans approach
  • Iterates deliberately, each prompt building on the last
  • Verifies claims against source materials
  • Adapts strategy when things go wrong
Score: Excellent
We’re looking for Column 3. Are you Column 3?

Think You’re Good? Prove It.

Every account gets one free assessment. No credit card. No commitment. Just you, an AI, and some real tasks.

Take the Challenge
Average completion time: 45–90 minutes. Best done with coffee.
22 Unique Tasks
7 Tracks
4 Scoring Dimensions
0 Multiple Choice Questions
100% Real Work Scenarios
For Everyone

But What If I’m Not a “Prompt Engineer”?

You don’t need to be. If you use ChatGPT, Claude, or any AI tool at work, you have prompting skills. The question is: how good are they?

Product managers, marketers, researchers, customer support leads, developers — everyone who works with AI can benefit from knowing where they stand.

FAQ

Frequently Asked Questions

Is this a test?
Yes, but it’s designed to feel like real work, not a school exam. You get realistic business scenarios, a proper workspace, and an AI assistant. Think of it as a simulated first day on the job, not a pop quiz.
How long does it take?
45–90 minutes depending on the track and how thorough you are. Each task has its own timer, so you can manage your pace. Most people find it goes faster than expected because the work is engaging.
Can I use my own AI tools?
No — you use the built-in AI assistant. This ensures fairness: everyone gets the same model (Claude by Anthropic), the same token budget, and the same starting conditions. It’s about your skill, not your tool subscription.
What if I fail?
There’s no pass/fail. You get a skill profile showing your strengths and areas to improve across four dimensions: Performance, Process, Trustworthiness, and Consistency. Even a low score gives you actionable insight into how to get better.
Is my data shared?
Only if you’re invited by a recruiter. If you register on your own, your data stays private. Recruiter-invited candidates’ results are shared with the recruiter who invited them — but you’ll always know exactly who can see your results.
Can I retake it?
Your first run is free. Additional runs require a subscription or an invitation from a recruiter who has granted you extra runs. This keeps the assessment meaningful — it’s measuring real skill, not test-prep.
What AI model do you use?
Claude by Anthropic. The same model for everyone, ensuring a level playing field. We chose Claude for its strong reasoning capabilities, consistent behavior, and safety characteristics. No one gets an unfair advantage from using a different or more powerful model.