We don’t need a “gotcha” culture. We need assessment that makes thinking visible.
AI is already embedded in student life, and not just by or for the students themselves.
Surveys show that most educators have now tried AI tools themselves, a sign the shift is already well underway (see Microsoft’s 2025 AI in Education Report). But relying on plagiarism detectors is a dead end. Accuracy is inconsistent, false positives disproportionately harm multilingual students, and “humanizing” tactics bypass most tools. Rather than policing, higher education needs to redesign assessments to make thinking visible.
That means oral defenses, iterative drafts, AI-dialogue reflections, authentic problem scenarios, and portfolios. This isn’t just a defensive move; it’s a chance to bring curiosity, creativity, and real-time reasoning back into classrooms. As one of our EDU AI Mixer panelists put it: “Cheap detectors are not the answer. They are unreliable, ineffective, and only likely to get worse rather than better.” Instead, let’s look at platforms like Hyperspace that are enabling faculty to trial new approaches: viva-style assessments with AI avatars, dynamic skill challenges, and process logs that make learning transparent. Integrity isn’t about banning AI; it’s about designing work where deep thinking can’t be faked.
Why detectors can’t carry integrity
Are you an educator or a surveillance officer?
It’s an uncomfortable question, but it’s one detectors force onto faculty. Instead of mentoring or guiding, we end up policing, and with tools that, in many cases, don’t even hold up under scrutiny: accuracy swings wildly, false positives stack up against multilingual students, and clever prompt tweaks slip past the filters.
And even if detection rates do climb, the math still fails students. As one panelist put it: “Even if we get the accuracy up to 80 or 90%, that would mean unfairly accusing students 10–20% of the time, and that is totally unacceptable.” — Mark Jacobs, Clark University (EDU AI Mixer panelist)
Surveys echo this shift: plagiarism and cheating remain a top concern, but the conversation is moving toward AI literacy, clear policy, and training (see Microsoft’s 2025 AI in Education Report).
Ultimately, integrity is about building trust, not playing cat-and-mouse with flawed software.
So here’s the key evidence:
- Variable accuracy, vulnerable to “AI washing” tactics
- Higher false positives for ESL students
- Risk of turning faculty into police instead of mentors
Recent sector research and peer-reviewed evaluations confirm this:
“Peer-reviewed studies find detector success rates range from 20% to 90%, with false positives disproportionately affecting non-native writers. Once ‘humanizing’ edits are applied, accuracy can drop below 20%.”

Oral defenses: the old idea that works better than ever
The viva isn’t new. What’s new is that AI makes it essential. A student can paste a prompt into ChatGPT and get back a passable essay. But can they explain their choices, defend their sources, and adjust under questioning? That’s where the value lies. A two-page brief followed by a three-minute oral defense, live or even with an AI avatar, forces students to show their thinking. It also turns assessment back into conversation, and that is where deep learning happens.
In short, here are six assessment patterns that work:
1. Oral defense (“brief + 3-min viva”)
Students submit a 1–2 page brief, then defend it orally in a timed session — live or with an AI avatar. Follow with a short “what I changed and why” addendum.
Where this approach shines: capstones, policy, ethics, design decisions, lit reviews.
As an educator, you get an immediate, clear view of whether the student owns the argument or just the prose.
2. AI-dialogue log + reflection
Make AI use visible and, most importantly, accountable. Require the AI conversation history (edited for clarity) plus a reflective memo on what was validated, corrected, or rejected, and why.
Here’s an assignment sketch:
Part A: Submit the cleaned prompt chain (≤ 2 pages).
Part B: 400-word memo: (1) where the AI was right; (2) where it hallucinated or oversimplified; (3) what you did to verify; (4) what you wrote yourself.
Cite the AI: name, model, date, core prompts used.
What this signals to students: we’re grading method and judgment, not just outcomes.
And here’s the bonus: You’ll instantly spot “one-and-done” prompt dumps vs. iterative, critical use.
3. Staged iteration with checkpoints
From draft to feedback to revision, each step timestamped. It shows growth while blocking last-minute AI substitutions.
💡Pro tip: how to run it without drowning
- Week 2 (10%): One-page outline + 3 sources
- Week 4 (20%): Draft 1 + margin comments answering your questions
- Week 6 (20%): Peer review receipts (two peers, two questions answered)
- Week 8 (50%): Final + 200-word change log (“what changed & why”)
In practice, you will grade fewer “mystery finals” and more visible thinking.
4. Problem-based scenario tasks
Pose real-world dilemmas, for example policy A vs. policy B under specific constraints. Grade the reasoning, not the regurgitation.
💡Example prompt you can use
“You’re advising a school board on adopting an AI-assisted writing coach.
- Option A: District-wide rollout next term (equity boost, training costs).
- Option B: Two-school pilot (lower cost, slower equity).
- Deliver: a 700-word recommendation + a 3-minute verbal rationale.
Must include: budget implications, data privacy guardrails, teacher workload impact, and a contingency if outcomes disappoint.”
This works because when you force a real choice under real constraints, generic chat output falls apart. It tends to smooth over the hard bits, dodge local details, and offer ‘both-sides’ summaries instead of committing to a position with evidence. Students must both choose and justify.
5. Portfolio + mini-vivas
Students build portfolios across a term (think drafts, artifacts, and missteps), capped by short oral checks to confirm authorship and understanding.
💡Pro tip to keep it light
- Mini-vivas are 90 seconds. Ask one “why this, not that?” question per artifact
- Use a simple 3-point scale: demonstrates understanding / partial / not yet
- Let students nominate one artifact they want you to probe. It invites ownership
This actually reduces the anxiety of getting caught and encourages openness: the point is no longer hiding the process, but showing it.
6. Two-lane policy assessments
Mix the two lanes: combine invigilated, no-AI tasks for certain outcomes with AI-permitted assignments that require full disclosure and citation of tools.
💡Pro tip on design: Tie each lane to a learning outcome. Example: closed-AI for recall and formal proofs; open-AI for critique, application, and synthesis.
And here’s a disclosure mini-policy:
“When AI tools are permitted, cite them like a source: tool + model + date + your key prompts + what parts of the work they shaped. Undeclared use where disclosure is required counts as a policy violation.”
What changes for classroom climate
The thing nobody says out loud.
Students are already using AI, and this isn’t hypothetical: a 2025 UK survey found 92% of undergraduates report using AI tools in their studies (HEPI/Kortext). Pretending they aren’t just punishes the honest ones. Microsoft’s 2025 AI in Education Report likewise finds usage is widespread across students, educators, and leaders, with concerns shifting from “if” to “how”.
When we build assignments that expect and shape AI use, the energy shifts. Students talk more openly about process. They ask better questions. You get fewer “Is this allowed?” emails and more “Here’s how I used it—does that meet the brief?”

💡A word on equity
Integrity policies can accidentally penalize multilingual writers and students with differing access to tools. Two guardrails help:
- Clarity: spell out what’s allowed, always with examples
- Competence, not confession: when AI use is permitted and disclosed, don’t grade students down for using it; grade them on how they used it (validation, synthesis, judgment).
What about students who don’t use AI? (Though, let’s be fair, that group keeps shrinking: we’re talking about digital natives, and soon enough AI natives.)
If a student chooses not to use AI, they’re not penalized. Our rubrics grade outcomes and reasoning the same way in both lanes. In the open-AI lane, students who do use AI must disclose how they used it (tool/model, prompts, what it influenced). In the closed/proctored lane, students complete the task without AI. Either path can earn full credit: the measure is quality of argument, evidence, and judgment, not tool choice.
In Hyperspace, these formats are easy to pilot: viva-style defenses with AI avatars, automatic capture of dialogue logs, and time-stamped checkpoints, all in one place.
Hyperspace in action
Theory is nice, but faculty need places to test it without re-platforming an entire course; you can’t launch something just because it sounds good in theory, not when it comes to educating young minds. That’s where platforms like Hyperspace prove incredibly useful: AI avatars for viva-style defenses, dynamic scenarios that adapt to a learner’s level, and automatic capture of dialogue logs and checkpoints in one space.

At our EDU AI Mixer, Danny Stefanic, CEO of Hyperspace, summarized it: “You can build challenges dynamically to meet people’s skill sets… empathy boosted, for example. And each challenge can be different, never boring.”
An example is the work we’re doing with one of our university clients. The project is essentially a language-assessment pilot: students talk with AI avatars in immersive environments that set the scene, discussing their own interests and learning topics, while the system quietly evaluates comprehension and reasoning.
And, as they say, the proof is in the pudding: less fear of cheating, more proof of learning, and more enjoyment in the process.
💡Here are three fast ways to try it in Hyperspace:
- Viva rehearsal pod: Students practice a 3-minute defense with an AI avatar that asks one factual and one judgment question. Transcript auto-saves.
- Scenario room: Two policy options on the wall, constraints on the floor, timer above the screen. Students pick, then record a short rationale inside the space.
- Process kiosk: Students drop in their AI prompt chains and tag them to the artifact they influenced. You skim method at a glance.
Implementation checklist (print-friendly)
- Add an AI-use policy to your syllabus (what’s allowed, where disclosure is required)
- Convert one major written task into brief + 3-min defense
- Add iteration checkpoints (outline, draft, peer review, change log)
- Pilot an AI-dialogue + reflection assignment
- If you must run detectors, document that they’re advisory only, never sole evidence
- Add one portfolio mini-viva at term’s end (90 seconds per student)
A tiny rubric you can adapt (for defenses)
10 points total
- Thesis clarity (2): crisp claim with scope.
- Evidence (3): credible sources, correctly represented.
- Reasoning (3): handles counter-arguments; explains trade-offs.
- Revision note (2): concrete changes post-defense, not vague platitudes.
FAQ
As an educator, should I still run detectors?
If your institution requires it, you could treat the output as one signal among many artifacts (drafts, logs, viva). Don’t sanction on a score alone. Ever.
How should students cite AI?
Pretty much like any source: Tool/Model, Date. Include a brief note: the prompts used and which sections it influenced (e.g., outline only; generated search terms; rephrased transitions).
Isn’t this more work for faculty?
Yes, at first: expect extra work in weeks 1–2. After that, checkpoints and short vivas reduce last-minute panics and grade appeals. In fact, many faculty report fewer repetitive questions and a lighter planning and grading load once AI handles routine queries and drafts first passes (lesson plans, rubrics).
Large-scale surveys back this up: Gallup and Walton Family Foundation surveys in 2024–25 found that teachers who use AI weekly save roughly 5.9 hours per week on planning, grading, and admin.
What if a student freezes in a viva?
Offer practice slots. Or go a step further and let them record asynchronously with an AI avatar; Hyperspace supports this out of the box. One thing to remember: we’re assessing reasoning, not stage presence.
Do you have any questions this article doesn’t answer? Feel free to reach out; we’re working with schools and universities that have already embraced AI successfully, and we can help.
Closing thought
Integrity in the AI era isn’t about catching cheats. It’s about designing assessments that demand judgment, defense, and curiosity. Platforms like Hyperspace let institutions trial these models today, from AI-assisted vivas to empathy simulations, and build a culture where learning is visible, not assumed.
Ready to explore authentic AI-era assessment? Watch our EDU AI Mixer on-demand or connect with us to see how Hyperspace, through our dedicated Immersive Learning Platform, supports faculty pilots.





