Advancing a Language Learning Platform With Real-Time AI Conversations

Unpacking the project

Who’s the client?

A global language learning platform offering multi-language courses to millions of users worldwide.

What they were looking for

An AI conversation assistant that would boost retention, increase conversions, and sharpen their competitive edge.

What Oxagile delivered

A solution that lets learners engage in natural, AI-driven dialogue simulations and receive personalized feedback powered by LLMs.

Numerical narrative

of conversations reach a logical goal-oriented close — proving the strength of the dialogue design

of users complete their speaking tasks, showing high engagement and content relevance

system latency ensures near real-time interactions that keep users fully immersed across the full conversation loop

Strategic goals behind the initiative

The project aimed to deliver meaningful educational value, while simultaneously supporting key business objectives:

Enhance users' confidence in speaking and ease their fear of errors
Boost conversion rates from free users to subscribers
Increase retention by adding meaningful value to existing subscriptions
Strengthen the product’s positioning with AI-driven learning tools

What we built — the core feature set

AI-powered speaking simulation

Delivered a feature for realistic, scenario-based speaking practice with live, adaptive AI responses and instant feedback.

Custom guardrails for AI safety

Implemented proprietary guardrails to restrict off-topic or inappropriate LLM responses, tailored for age-appropriate and educational use.

Context-aware speech recognition

Enhanced audio-to-text accuracy by passing scenario-specific keywords and pronunciation data to the recognition model.

Near real-time multi-step pipeline

A modular streaming pipeline processes user audio through microservices for real-time speech-to-text, AI response, and feedback.

Obstacles crushed for a flawless experience

Mispronunciations impact transcription accuracy

Early learners often mispronounce words, causing transcription errors.

Solution #1

Context-driven speech recognition

Oxagile passed scenario-specific keywords and pronunciation data to the speech-to-text API as contextual parameters, improving recognition accuracy by biasing the model toward relevant terms even when the audio is imperfect.
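
As a rough illustration of the idea, the Python sketch below assembles a hint payload from a scenario's vocabulary before it would be sent along with the audio. The case study does not name the speech-to-text API or its request format, so `Scenario`, `build_recognition_hints`, and the payload keys are hypothetical; a real provider would expect its own field names.

```python
from dataclasses import dataclass, field


@dataclass
class Scenario:
    """Hypothetical container for one speaking scenario's vocabulary."""
    title: str
    keywords: list[str]
    # Optional pronunciation hints keyed by keyword (format is assumed).
    pronunciations: dict[str, str] = field(default_factory=dict)


def build_recognition_hints(scenario: Scenario, max_phrases: int = 50) -> dict:
    """Build a de-duplicated, size-capped hint payload for a recognizer."""
    seen: set[str] = set()
    phrases: list[str] = []
    for raw in scenario.keywords:
        phrase = raw.strip()
        key = phrase.lower()
        # Skip empty entries and case-insensitive duplicates.
        if phrase and key not in seen:
            seen.add(key)
            phrases.append(phrase)
    phrases = phrases[:max_phrases]
    kept = {p.lower() for p in phrases}
    return {
        "phrases": phrases,
        # Only keep pronunciation hints for phrases that survived the cap.
        "pronunciations": {
            word: hint
            for word, hint in scenario.pronunciations.items()
            if word.lower() in kept
        },
    }


cafe = Scenario(
    title="Ordering at a cafe",
    keywords=["croissant", "Croissant ", "espresso", "the bill"],
    pronunciations={"croissant": "kwah-SAHN"},
)
hints = build_recognition_hints(cafe)
```

Capping and de-duplicating the list reflects a common constraint of keyword-boosting features, which tend to accept a limited number of phrases per request; the cap of 50 here is an arbitrary placeholder.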

Dialogues that drag or lack focus

Without a clear direction, some conversations run longer than needed and fail to move learners toward defined learning objectives or meaningful outcomes.

Solution #2

Dynamic dialogue management

Each user input triggers a fresh LLM prompt, enabling dynamic, unscripted AI responses. To avoid endless conversations, Oxagile set hard limits (e.g., a maximum number of exchanges per scenario), developed supporting analytics, and embedded logic to recognize when learning goals are met, guiding dialogues to purposeful, satisfying conclusions.
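
The turn loop described above might look something like the following Python sketch. Everything here is an assumption made for illustration: the `DialogueSession` and `next_turn` names, the default cap of 8 exchanges, and especially the goal check, which uses simple phrase matching as a stand-in for whatever logic the production system applies. The `llm` argument is any callable that maps a prompt string to a reply, so no specific provider SDK is implied.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class DialogueSession:
    scenario: str
    goal_phrases: list[str]   # stand-in for real goal detection
    max_exchanges: int = 8    # hard cap per scenario (value assumed)
    history: list[tuple[str, str]] = field(default_factory=list)
    closed: bool = False

    def goal_met(self, latest_user_text: str) -> bool:
        """True once every goal phrase has appeared in the learner's turns."""
        turns = [user for user, _ in self.history] + [latest_user_text]
        said = " ".join(turns).lower()
        return all(p.lower() in said for p in self.goal_phrases)


def next_turn(session: DialogueSession, user_text: str,
              llm: Callable[[str], str]) -> str:
    """Build a fresh prompt for this turn and decide whether to wrap up."""
    if session.closed:
        raise ValueError("dialogue already closed")
    turn = len(session.history) + 1
    closing = session.goal_met(user_text) or turn >= session.max_exchanges
    lines = [f"Scenario: {session.scenario}"]
    for user, tutor in session.history:
        lines += [f"Learner: {user}", f"Tutor: {tutor}"]
    lines.append(f"Learner: {user_text}")
    lines.append(
        "Wrap up the conversation politely."
        if closing
        else "Continue the conversation naturally."
    )
    reply = llm("\n".join(lines))
    session.history.append((user_text, reply))
    session.closed = closing
    return reply


def fake_llm(prompt: str) -> str:
    """Test double; a real deployment would call an LLM API here."""
    return "Goodbye!" if "Wrap up" in prompt else "Tell me more."


session = DialogueSession("Ordering at a cafe",
                          goal_phrases=["coffee", "pay"], max_exchanges=3)
first = next_turn(session, "I would like a coffee", fake_llm)
second = next_turn(session, "Can I pay by card?", fake_llm)
```

In this sketch the dialogue closes on the second turn because both goal phrases have been used; a learner who never reaches the goal would still be steered to a close once the exchange cap is hit.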

Complex AI pipeline causing delays

The end-to-end feature chains several sequential steps: capturing audio, transcribing it, processing the result through an LLM, and returning a contextual reply. Even so, the response had to feel instant.

Solution #3

Optimized real-time pipeline

Oxagile tested several providers and chose OpenAI for the best latency and throughput, delivering replies within 2–3 seconds to ensure smooth, real-time interaction.
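
Comparing providers on latency generally comes down to timing each stage of the loop against a budget. The Python sketch below shows one way to do that, assuming a simple sequential chain; the stage names, the `run_timed` helper, and the stub stages are hypothetical, and the 3-second default only mirrors the upper bound quoted above.

```python
import time
from typing import Any, Callable

Stage = tuple[str, Callable[[Any], Any]]


def run_timed(payload: Any, stages: list[Stage], budget_s: float = 3.0):
    """Run stages in order, recording wall-clock seconds spent in each."""
    timings: dict[str, float] = {}
    for name, stage in stages:
        start = time.perf_counter()
        payload = stage(payload)
        timings[name] = time.perf_counter() - start
    total = sum(timings.values())
    # The boolean reports whether the whole loop stayed within the budget.
    return payload, timings, total <= budget_s


# Stub stages standing in for real speech-to-text and LLM calls.
stages: list[Stage] = [
    ("transcribe", lambda audio: "hello"),
    ("generate_reply", lambda text: f"You said: {text}"),
]
reply, timings, within_budget = run_timed(b"\x00\x01", stages)
```

Swapping a stub for a real provider call and logging `timings` per request would show which stage dominates the total, which is the kind of evidence a provider comparison needs.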

Insufficient default guardrails

Off-the-shelf LLM guardrails were not sufficient: topics and phrases acceptable in general use cases could still be inappropriate in an educational environment. False positives were a major concern as well, since overly restrictive filtering risked blocking valid, paid content.

Solution #4

Custom guardrails for safe learning

We built proprietary guardrails with scenario-specific rules and ethical filters, allowing precise LLM output control. Continuous monitoring and feedback loops keep safeguards aligned with evolving user behavior and product needs.
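
To make the false-positive trade-off concrete, here is a minimal Python sketch of a scenario-aware output check. It is not the proprietary guardrail itself: the `GuardrailRules` structure, the `check_reply` function, and the word lists are invented for illustration, and plain term matching stands in for whatever rules and filters the real system uses.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class GuardrailRules:
    # Terms blocked by default across the platform (example values only).
    blocked_terms: frozenset[str]
    # Terms a specific scenario explicitly permits, to avoid false positives.
    scenario_allowed: frozenset[str] = frozenset()


def check_reply(reply: str, rules: GuardrailRules) -> tuple[bool, list[str]]:
    """Return (is_allowed, offending_terms) for a candidate LLM reply."""
    words = set(re.findall(r"[a-z']+", reply.lower()))
    hits = sorted((words & rules.blocked_terms) - rules.scenario_allowed)
    return (not hits, hits)


default_rules = GuardrailRules(blocked_terms=frozenset({"shot", "drug"}))
# A "doctor's visit" scenario legitimately needs the word "shot".
clinic_rules = GuardrailRules(
    blocked_terms=default_rules.blocked_terms,
    scenario_allowed=frozenset({"shot"}),
)

reply = "The nurse will give you a flu shot today."
default_result = check_reply(reply, default_rules)
clinic_result = check_reply(reply, clinic_rules)
```

Under the default rules the reply is rejected because of "shot", while the clinic scenario's allowlist lets the same sentence through, which is the kind of per-scenario override that keeps valid lesson content from being blocked.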

Oxagile’s role: Not just hands, but a partner with a product mindset

“From the start, we stepped in as true product partners. We collaborated directly with the client’s product managers and content creators, helping shape the strategy and execution. When our analytics revealed content gaps or spotted conversational dead-ends without logical closure, we fed insights back to the content team. Such feedback loops between tech and content ensured both sides evolved in sync. From strategy through deployment, we shared ownership and impact, driving the product forward together.”

The same client, different project

What was the task

The client required a solution to enable learning designers to efficiently create, organize, localize, and publish language‑learning content for an online platform.

What was done

Oxagile delivered a custom CMS featuring multi‑language support, structured course creation (levels, chapters, activities), placement testing tools, grammar review capabilities, user roles, content reuse/tagging, previews across devices, and reporting features to streamline content management.

Your learners deserve more than static scripts

A simulated dialogue shouldn’t feel robotic. With the right LLM orchestration and contextual intelligence, AI can talk with nuance, purpose, and presence.
