Pearson - Education
Personalizing Education for 50,000+ Students
How we built an adaptive learning engine that personalizes curriculum for every student on Pearson's online platform, increasing engagement by 35%.
Duration
12 weeks
Team
3 engineers, 1 data scientist, 1 PM
Tech Stack
The Challenge
Pearson's online learning platform serves 50,000+ K-12 students across 120 school districts. Despite having a library of 15,000 learning modules (videos, interactive exercises, quizzes, and reading materials), every student in a given grade received the same content sequence. Teachers could manually adjust assignments for individual students, but with class sizes of 25-35, meaningful personalization was impossible. The result was predictable: advanced students were bored, struggling students fell further behind, and engagement metrics showed a steady decline - average session time had dropped 18% year-over-year, and course completion rates sat at 61%.
Pearson had previously tried two approaches to personalization. The first was a simple branching logic system where quiz scores above or below certain thresholds would route students to harder or easier content. This was too coarse - it treated all wrong answers the same and couldn't distinguish between a student who didn't understand the concept and one who made a careless arithmetic mistake. The second attempt was an off-the-shelf adaptive learning plugin from an ed-tech vendor, but it required content to be tagged in a proprietary format, couldn't integrate with Pearson's existing content management system, and only supported multiple-choice assessments. After 6 months of integration work, it was used by fewer than 5% of teachers and was quietly deprecated.
Our Approach
We started by working with Pearson's curriculum team and a panel of 12 teachers to build a knowledge graph mapping the prerequisite relationships between 2,200 discrete learning objectives across math, science, and ELA for grades 3-8. This graph became the backbone of the adaptive engine - rather than just adjusting difficulty, the system could identify specific knowledge gaps and route students to the exact prerequisite content they needed.
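The gap-diagnosis idea above can be sketched in a few lines. This is a minimal illustration, not Pearson's production code: the prerequisite edges, mastery threshold, and objective names are all hypothetical, and the real system queries a 2,200-node graph in Neo4j rather than a Python dict.

```python
# Hypothetical prerequisite edges: objective -> list of prerequisites.
PREREQS = {
    "ratios": ["fractions", "multiplication"],
    "fractions": ["division"],
    "division": ["multiplication"],
    "multiplication": [],
}

def knowledge_gaps(objective, mastery, threshold=0.8, seen=None):
    """Return the deepest unmastered prerequisites for an objective.

    mastery maps objective -> estimated mastery probability in [0, 1].
    A gap is reported only at the lowest unmastered node, so the
    student is routed to the earliest missing prerequisite first.
    """
    if seen is None:
        seen = set()
    gaps = []
    for prereq in PREREQS.get(objective, []):
        if prereq in seen:
            continue
        seen.add(prereq)
        deeper = knowledge_gaps(prereq, mastery, threshold, seen)
        if deeper:
            gaps.extend(deeper)          # remediate upstream gaps first
        elif mastery.get(prereq, 0.0) < threshold:
            gaps.append(prereq)          # this prerequisite itself is the gap
    return gaps

mastery = {"multiplication": 0.95, "division": 0.9, "fractions": 0.4}
print(knowledge_gaps("ratios", mastery))  # -> ['fractions']
```

Because the search recurses before testing mastery, a student struggling with ratios who is also weak on fractions is sent to fractions content first, which is exactly the "diagnose, then remediate" behavior the graph enables.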
For the student modeling layer, we evaluated several approaches: Bayesian Knowledge Tracing (BKT), Deep Knowledge Tracing (DKT) using LSTMs, and a custom model we called Attentive Knowledge Tracing (AKT) that used transformer-based attention over a student's interaction history. After testing on 18 months of historical student interaction data (approximately 42 million learning events), the AKT model outperformed BKT by 23% and DKT by 11% on predicting next-question correctness (AUC 0.847). The key advantage was its ability to model forgetting curves and capture long-range dependencies - for example, recognizing that a student who struggled with fractions in October might need a brief refresher before tackling ratios in February.
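To make the attention-with-forgetting intuition concrete, here is a deliberately simplified sketch. The real AKT model learns its attention weights and forgetting behavior from 42 million events with transformers; this toy version hard-codes a skill-relevance score and an exponential time decay (the half-life and relevance values are assumptions, not learned parameters).

```python
import math

def attentive_mastery(history, query_skill, now, half_life_days=30.0):
    """Toy attention over a student's interaction history.

    history: list of (skill, correct, timestamp_days) tuples.
    Each past event gets an attention score combining skill relevance
    with an exponential time decay standing in for a forgetting curve;
    a softmax over scores weights the observed correctness values.
    """
    scores, values = [], []
    for skill, correct, t in history:
        relevance = 1.0 if skill == query_skill else 0.2  # assumed similarity
        decay = math.exp(-(now - t) * math.log(2) / half_life_days)
        scores.append(relevance * decay)
        values.append(1.0 if correct else 0.0)
    if not scores:
        return 0.5                        # uninformed prior
    z = sum(math.exp(s) for s in scores)  # softmax normalization
    weights = [math.exp(s) / z for s in scores]
    return sum(w * v for w, v in zip(weights, values))
```

Under this weighting, an October struggle with fractions still lowers a February mastery estimate for fractions, but far less than a recent miss would, mirroring the "brief refresher before ratios" behavior described above.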
The recommendation engine uses a multi-armed bandit approach (Thompson Sampling) to balance exploitation (serving content predicted to be at the student's optimal difficulty) with exploration (occasionally testing harder or different-format content to refine the student model). We tuned the exploration rate through A/B testing with a cohort of 3,000 students over 4 weeks, finding that a 15% exploration rate maximized both learning gains and engagement.
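Thompson Sampling itself is compact enough to sketch. The version below keeps a Beta posterior per candidate "arm" (here, a content choice) and samples from each posterior to pick the next arm; uncertain arms win the sample more often, which is the exploration behavior the A/B test tuned. The arm names and reward definition (1 = successful engagement) are illustrative assumptions.

```python
import random

class ThompsonBandit:
    """Thompson Sampling over content arms with Beta posteriors."""

    def __init__(self, arms):
        # Beta(1, 1) uniform prior per arm.
        self.posteriors = {arm: [1.0, 1.0] for arm in arms}

    def select(self):
        # Sample a plausible success rate per arm; play the best sample.
        samples = {arm: random.betavariate(a, b)
                   for arm, (a, b) in self.posteriors.items()}
        return max(samples, key=samples.get)

    def update(self, arm, reward):
        # reward is 1 (success) or 0 (failure); update the Beta posterior.
        a, b = self.posteriors[arm]
        self.posteriors[arm] = [a + reward, b + (1 - reward)]

# Simulated usage with assumed true success rates per content type.
random.seed(0)
bandit = ThompsonBandit(["easy", "target", "stretch"])
true_p = {"easy": 0.4, "target": 0.8, "stretch": 0.3}
counts = {arm: 0 for arm in true_p}
for _ in range(2000):
    arm = bandit.select()
    counts[arm] += 1
    bandit.update(arm, 1 if random.random() < true_p[arm] else 0)
```

After a few hundred pulls the posterior for the best arm concentrates and exploitation dominates; an explicit exploration floor like the 15% rate described above can be layered on top by occasionally forcing a uniform-random arm.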
The Solution
The adaptive engine is deployed as a set of microservices on Google Cloud Platform. The knowledge graph is stored in Neo4j and queried via a GraphQL API. The AKT student model runs on Cloud Run with GPU support, updating each student's mastery estimates in real time as they interact with content. The recommendation service (Python, FastAPI) pulls from the student model and knowledge graph to generate a ranked list of next-best learning activities, which is served to the React frontend via a REST API.

A teacher dashboard built with Next.js provides class-level and individual student views: mastery heatmaps across the knowledge graph, predicted areas of difficulty, and suggested intervention points. All student data is encrypted, and the system is FERPA and COPPA compliant.

The content tagging system uses a semi-automated pipeline: a fine-tuned BERT model tags new content with learning objectives at 91% accuracy, and curriculum specialists review and correct tags before content enters the adaptive pool.
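The recommendation service's core scoring step can be sketched as follows. This is a simplified stand-in for the production logic: the 0.8 mastery cutoff, the linear success-probability model, and the ~70% target difficulty are illustrative assumptions (the real service scores with AKT predictions), and all names are hypothetical.

```python
def rank_activities(activities, mastery, prereqs, target=0.7, k=3):
    """Rank next-best learning activities for one student.

    activities: list of (activity_id, objective, difficulty) tuples,
    with difficulty in [0, 1]. An activity is eligible only if every
    prerequisite of its objective is mastered; eligible activities
    are scored by how close the student's estimated success
    probability sits to the target 'desirable difficulty'.
    """
    ranked = []
    for activity_id, objective, difficulty in activities:
        if any(mastery.get(p, 0.0) < 0.8 for p in prereqs.get(objective, [])):
            continue  # prerequisite gap: remediate before recommending this
        # Crude linear proxy for predicted success probability.
        p_success = max(0.0, min(1.0, mastery.get(objective, 0.5) - difficulty + 0.5))
        ranked.append((abs(p_success - target), activity_id))
    return [activity_id for _, activity_id in sorted(ranked)[:k]]

activities = [("v1", "fractions", 0.2), ("q1", "ratios", 0.5), ("q2", "fractions", 0.9)]
mastery = {"fractions": 0.6, "division": 0.9}
prereqs = {"ratios": ["fractions"]}
print(rank_activities(activities, mastery, prereqs))  # -> ['v1', 'q2']
```

Note how the ratios quiz is withheld entirely while fractions mastery is below threshold, which is the graph-driven gating described above, and the remaining activities are ordered by proximity to the target difficulty.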
Results
- 35% increase in student engagement (average session time from 22 min to 29.7 min), measured across 50,000+ students over the first semester post-launch
- 28% improvement in course completion rates (61% to 78%), with the largest gains among students who were previously in the bottom quartile
- Teachers save an average of 10.4 hours per week on personalization and progress monitoring, based on self-reported surveys from 340 teachers
- Student satisfaction increased from 3.4 to 4.6 out of 5, driven primarily by the share of students rating content difficulty as 'just right' rising from 34% to 71%
Key Insight
The knowledge graph of prerequisite relationships was more valuable than the ML model itself - without it, the system could adjust difficulty but couldn't diagnose why a student was struggling or prescribe the right remediation.
“The completion rate jump from 61% to 78% is the number our board cares about, but what actually changed the product was the teacher dashboard. Teachers went from feeling like the AI was a black box to using it as their primary planning tool. Three districts renewed specifically because of that feature.”
Dave Treat
CTO at Pearson