
Work case

Academic Data Platform (Japan Learning Center)

Architected pipelines and analytics surfaces that processed 10K+ student records monthly, combining NestJS and Next.js services with Kafka, Airflow, PySpark, and LangChain on OpenAI.

Role
Software Architect
Published
Tags
edtech · data-pipeline · etl · ai · analytics

Student records

10K+ / month

Monthly academic data processing volume

Architecture team

12

Cross-functional delivery for pipelines and BI

[Figure: architecture diagram placeholder for the Japan EdTech data pipeline platform]

Problem

Academic data was fragmented, and teams lacked a reliable way to turn operational records into usable insights. Manual reporting slowed decision-making and made it hard to understand learning patterns across classrooms.

Solution

Academic data platform architecture

As software architect, I shaped a platform that combined NestJS services and a Next.js analytics experience with Kafka, Airflow, and PySpark pipelines. Processed datasets landed in AWS S3 and MySQL-backed stores for reporting, while LangChain on OpenAI supported classification and automated insight summaries surfaced in BI dashboards.
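To make the batch step concrete, here is a minimal sketch of what "standardize, then aggregate for reporting" can look like. It uses plain Python as a stand-in for the PySpark jobs, and the record fields (`student_id`, `classroom`, `score`) are hypothetical, since the real schema is not described here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RawRecord:
    # Hypothetical fields; the platform's real schema is not shown in this case study.
    student_id: str
    classroom: str
    score: str  # raw operational exports often carry numbers as text


def standardize(raw):
    """Drop records with non-numeric scores and normalize classroom labels."""
    clean = []
    for rec in raw:
        try:
            score = float(rec.score)
        except ValueError:
            continue  # a real pipeline would likely quarantine these rows instead
        clean.append((rec.student_id, rec.classroom.strip().upper(), score))
    return clean


def classroom_averages(clean):
    """Aggregate standardized rows into one reporting value per classroom."""
    totals = defaultdict(lambda: [0.0, 0])
    for _student, classroom, score in clean:
        totals[classroom][0] += score
        totals[classroom][1] += 1
    return {room: total / count for room, (total, count) in totals.items()}


# Stand-in for a batch read from the landing zone.
raw_batch = [
    RawRecord("s1", " a1 ", "80"),
    RawRecord("s2", "A1", "90"),
    RawRecord("s3", "b2", "n/a"),
    RawRecord("s4", "B2", "70"),
]
report = classroom_averages(standardize(raw_batch))
```

In this sketch the unparseable row is dropped and the two spellings of each classroom collapse into one key, which is the kind of cleanup that lets a reporting store hold one consistent row per classroom.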

Architecture decisions

  • Kafka decoupled ingestion from batch and serving paths so upstream changes did not destabilize downstream consumers.
  • Airflow orchestrated PySpark workloads and dependencies with explicit retries and scheduling.
  • S3 acted as a durable lake-style landing zone before relational serving in MySQL, keeping heavy transforms off transactional paths.
  • LangChain structured the prompts and tooling around OpenAI, making classification repeatable and producing narrative insights that fed the dashboards.
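The orchestration decision above (explicit dependencies plus bounded retries) can be illustrated with a small standard-library sketch. This is not Airflow code and not the platform's actual DAG; the task names `extract`, `transform`, and `load` and the `run_pipeline` helper are assumptions made for illustration.

```python
from graphlib import TopologicalSorter


def run_pipeline(tasks, upstream, max_retries=2):
    """Run tasks in dependency order, retrying each failed task a bounded number of times.

    tasks: name -> zero-argument callable
    upstream: name -> set of task names that must finish first
    """
    order = list(TopologicalSorter(upstream).static_order())
    attempts = {}
    for name in order:
        for attempt in range(1, max_retries + 2):
            attempts[name] = attempt
            try:
                tasks[name]()
                break
            except Exception:
                if attempt == max_retries + 1:
                    raise  # retries exhausted: stop before downstream tasks run


    return order, attempts


calls = {"transform": 0}


def extract():
    pass


def transform():
    # Fails once, then succeeds, to mimic a transient error.
    calls["transform"] += 1
    if calls["transform"] < 2:
        raise RuntimeError("transient failure")


def load():
    pass


tasks = {"extract": extract, "transform": transform, "load": load}
upstream = {"extract": set(), "transform": {"extract"}, "load": {"transform"}}
order, attempts = run_pipeline(tasks, upstream)
```

Here `transform` needs a second attempt while `load` only starts once `transform` has succeeded. A scheduler such as Airflow provides the same two guarantees declaratively, along with scheduling and monitoring that this sketch leaves out.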

Impact

  • Processed 10K+ student records monthly for the learning center client.
  • Delivered BI dashboards plus AI-assisted classification and insights on top of standardized pipelines.
  • Coordinated delivery with an architecture-focused team of 12 across data, backend, and analytics surfaces.