Jasivan Sivakumar


About me

I am a PhD candidate in Natural Language Processing (NLP) at the University of Sheffield. I started at the Speech and Language Technology (SLT) Centre for Doctoral Training (CDT) in September 2021. My focus lies in numerical reasoning, especially how to evaluate and improve smaller, accessible general-purpose models to enhance their numerical understanding in downstream tasks. I'm also interested in encoding numbers, tokenisation and NLP for low-resource languages. My background is in mathematics, and I was previously a mathematics teacher in a secondary school. I also taught French and English in Colombia.

Outside of work, I enjoy salsa dancing and other Latin-style dances. I have also started to knit. But above all, I love travelling and experiencing different cultures.


  • Jan 24 - Emergency Reviewer: Reviewed for LREC-COLING 2024 in the tracks "Less-Resourced/Endangered/Less-studied Languages", "Lexicon and Semantics" and "Social Media Processing".

  • Nov 23 - Hugging Face Workshop: Delivered my first workshop to undergraduates, explaining how to train a model on a custom dataset using Hugging Face.

  • Nov 23 - Supervision: Started jointly supervising two undergraduate students on multilingual numerical reasoning and number decoding.

  • Jul 23 - YouTube fame: Featured in the AI Coffee Break YouTube channel's ACL video (at 8 min 51 s) from the ACL main conference in Toronto, Canada.

  • Jul 23 - Conference: Attended ACL 2023 in Toronto where I also presented FERMAT at the main conference.

  • Jun 23 - Quizmaster: Entertained the attendees of UKSpeech 2023 at their social event with a quiz.

  • Jun 23 - Invited talk: Presented FERMAT at the third annual conference of the Centre for Doctoral Training in Speech and Language Technologies in Sheffield, UK.

  • May 23 - Field trip: Travelled to San Basilio de Palenque, Colombia, to meet the Afro-Colombian Palenquero people and collect data in their native language.

  • Feb 23 - Examiner: Marked third-year undergraduate scripts for the Professional Issues course.

  • Jun 22 - Conference Organiser: Student organiser of a panel discussion at the Sheffield CDT second annual conference.


Curriculum Vitae

Find a full CV here.

Research Activities

NLP for Endangered Language Revitalisation

Sep 21 - Present
  • Collect and digitise text data using OCR for Palenquero, an endangered Afro-Colombian language.
  • Deploy low-resource language NLP research to develop pedagogical resources.

Speech Technology and NLP to improve Oral History Search Functionality

Nov 21 - Jun 22
  • Worked with an oral history video archive to train ASR systems for transcription.
  • Adapted NER systems to identify naval domain-specific terms and improve transcription search.
  • Pipelined ASR, diarisation and NER systems to produce annotated transcripts, which are used to retrieve the desired timestamps in the videos.


PhD Computer Science (Natural Language Processing)

Sep 21 - Sep 25
University of Sheffield, UK

MA Computational Linguistics - Distinction

Sep 20 - Sep 21
University of Wolverhampton, UK

PGCE Secondary Education (Mathematics)

Sep 17 - Sep 18
University of Cambridge, UK

Bachelors (MMath) Mathematics - First Class

Oct 12 - Jul 16
University of Warwick, UK
(Year abroad: Université Pierre et Marie Curie, France)

Collaborators and Affiliations



Department of Computer Science, Regent Court (DCS), 211 Portobello, Sheffield, S1 4DP, United Kingdom