Shuntaro Yada

矢田 竣太郎


I am an associate professor working on library and information science and medical language processing at the University of Tsukuba in Japan, where I lead the Knowledge and Language Computing Laboratory (KaLC Lab).


Notion

yadaslis.tsukuba.ac.jp

contactshuntaroy.com


About

Research Theme

He specialises in social computing on social media and applications of natural language processing, and is currently engaged in medical language processing, drawing on that experience. In particular, he is researching and developing systems that extract medical information from clinical documents (e.g. electronic health records; EHRs) as graphs, to facilitate information sharing among healthcare professionals and between professionals and patients. Examples include a system that searches for radiology reports with similar diagnostic content, and the HeaRT system (Japanese patent pending), which creates a timeline of a patient's treatment history from written text in EHRs. The language processing models behind these systems depend on annotated training data (corpora), whose annotations he has designed and built over several projects.

Within library and information science, his major field, he is interested in creating digital environments where people can encounter valuable information even passively. Serendy, a word-of-mouth book recommendation system that inspires those who are unsure of what to read, promotes chance encounters by delivering information on books mentioned by acquaintances and friends on social media such as Twitter. BookReach, a system that helps school librarians support classes, lets them search for related materials by class unit, providing opportunities to encounter unexpectedly useful books. Since 2023, he and his colleagues have also been developing and operating a book search engine in Japan for finding books that provide emotional support for people with cancer and other intractable diseases.

In 2025, as principal investigator, he established a new lab, the Knowledge and Language Computing Laboratory (KaLC Lab), at the University of Tsukuba. Building on his expertise in the domains of medicine and education, he aims to expand his research into other knowledge areas such as science, industry, law, economics, culture, and art, leveraging language processing technology as a core methodology.

Education

Graduate School of Education, the University of Tokyo

PhD (Education) | April 2020

Graduate School of Education, the University of Tokyo

Master of Arts (Education) | March 2016

The University of Tokyo

Bachelor of Arts (Education) | March 2014

Employment

National Diet Library Digital Library Division (Kansai-kan)

Part-time Researcher | April 2025–Present

University of Tsukuba Institute of Library, Information and Media Science

Associate Professor | Oct 2024–Present

University of Tsukuba Office of Online International Education

Vice Director | Oct 2024–Present

Nara Institute of Science and Technology Social Computing Laboratory

Affiliate Associate Professor | Oct 2024–Present

Assistant Professor | Nov 2020–Sep 2024

Postdoctoral Fellow | May–Oct 2020

Researcher | September 2019–April 2020

CSIRO (Australia) Data61 Language and Social Computing Team

Visiting Scientist | May–August 2019

The University of Tokyo Library and Information Science Laboratory

Researcher | April–August 2019

CSIRO (Australia) Data61 Language and Social Computing Team

Visiting Scientist | March–May 2018

KDDI Research (Japan)

Student Intern | October 2016–June 2017

Skills

Natural Languages

Japanese
Native speaker
English
Academic level (able to guide PhD students and give lectures in English)

Programming Languages

Python 3
My main programming language, in constant use since 2014, with experience in:
  • Natural Language Processing (Japanese and English)
  • Machine Learning (sklearn, tensorflow, pytorch, & transformers)
  • Data Analysis (polars, pandas, numpy, & scipy)
  • Data Visualisation (matplotlib, seaborn, & plotly)
  • Web API (Flask & FastAPI)
Elm
Web front-end development, such as components and single-page applications, since 2020 (e.g. BookReach UI)
JavaScript
D3.js and jQuery (e.g. for building a dashboard UI that visualises user statistics)
Ruby
Used intermittently until 2015 for building simple web applications (with Ruby on Rails or Sinatra) and pre-processing textual data
R
Statistical hypothesis testing (e.g. t-test and ANOVA) and statistical modelling (including generalised linear mixed models)
Others
I have written small programs in Nim, Rust, Go, Haskell, and PureScript

Markup Languages

HTML (and CSS)
Building web sites like this page (which is based only on a CSS framework, without using any fancy CV templates)
LaTeX
Typesetting academic articles

Tools

Dev tools
VS Code, Vim (Neovim), tmux, Git, & Docker
Server ops
Web servers (Apache, Nginx, & Caddy), basic web security (e.g. SSL/TLS), & computing resource management (for HPC) on Debian (Ubuntu) Linux
Databases
MongoDB, SQLite3, MySQL, & PostgreSQL (with Hasura GraphQL)
CMS
WordPress (building and operating the web sites of labs and workshops) & DokuWiki
Cloud
Google Cloud Platform (Compute Engine, Firebase, & AI Platform)
Video editing
  • Premiere Pro: experience in creating a workshop promotional video and short independent films
  • iMovie: for home movies
Vector graphics
Illustrator & Affinity Designer: Designing academic posters and business cards for myself and others