Jasivan Sivakumar

About me

I am a PhD candidate in Natural Language Processing (NLP) at the University of Sheffield. I started at the Speech and Language Technology (SLT) Centre for Doctoral Training (CDT) in September 2021. My focus lies in numerical reasoning, especially how to evaluate and improve smaller, accessible general-purpose models to enhance their numerical understanding in downstream tasks. I'm also interested in encoding numbers, tokenisation and NLP for low-resource languages. My background is in mathematics, and I was previously a mathematics teacher in a secondary school. I also taught French and English in Colombia.



Outside my work, I enjoy salsa dancing and other Latin-style dances. I have also started to knit. But above all, I love travelling and getting to experience different cultures.

News

  • Nov 24 - Invited Talk: Presented a high-level overview of how LLMs function in chatbots to undergraduate law students, discussing risks and benefits and highlighting ethical and legal considerations.

  • Jan 24 - Emergency Reviewer: Reviewed for LREC-Coling 2024, tracks "Less-Resourced/Endangered/Less-studied Languages", "Lexicon and Semantics" and "Social Media Processing".

  • Nov 23 - Hugging Face Workshop: Delivered my first workshop to undergraduates, explaining how to train a model on a custom dataset using Hugging Face.

  • Nov 23 - Supervision: Started jointly supervising two undergraduate students on multilingual numerical reasoning and number decoding.

  • Jul 23 - YouTube fame: Featured on the AI Coffee Break YouTube channel in its ACL video (at 8 mins 51), filmed at the ACL main conference in Toronto, Canada.

  • Jul 23 - Conference: Attended ACL 2023 in Toronto where I also presented FERMAT at the main conference.

  • Jun 23 - Quizmaster: Entertained the attendees of UKSpeech 2023 at their social event with a quiz.

  • Jun 23 - Invited Talk: Presented FERMAT at the third annual conference of the Centre for Doctoral Training in Speech and Language Technologies in Sheffield, UK.

  • May 23 - Field trip: Travelled to San Basilio de Palenque, Colombia, to meet the Afro-Colombian Palenquero people and collect data on their native language.

  • Feb 23 - Examiner: Marked third-year undergraduate scripts for the Professional Issues course.

  • Jun 22 - Conference Organiser: Student organiser of a panel discussion at the Sheffield CDT second annual conference.

Publications

Curriculum Vitae

Find a full CV here.

Education

PhD Computer Science (Natural Language Processing)

Sep 21 - Sep 25
University of Sheffield, UK

MA Computational Linguistics - Distinction

Sep 20 - Sep 21
University of Wolverhampton, UK

PGCE Secondary Education (Mathematics)

Sep 17 - Sep 18
University of Cambridge, UK

Bachelors (MMath) Mathematics - First Class

Oct 12 - Jul 16
University of Warwick, UK
(Year abroad: Université Pierre et Marie Curie, France)

Research Activities

NLP for Endangered Language Revitalisation

Sep 21 - Present
  • Collect and digitise text data using OCR for an endangered Afro-Colombian language, Palenquero.
  • Deploy low-resource language NLP research to develop pedagogical resources.

Speech Technology and NLP to improve Oral History Search Functionality

Nov 21 - Jun 22
  • Worked with an oral history video archive to train ASR systems for transcription.
  • Adapted NER systems to identify naval domain-specific terms and improve transcription search.
  • Pipelined ASR, diarisation and NER systems to produce annotated transcripts that let a search return the matching timestamps in the videos.

Professional Experience

Applied Scientist Intern (Amazon - Alexa International)

Jul 24 - Oct 24

Bellevue, Washington, USA

A benchmark was created using AWS tools with a human-in-the-loop approach to assess relevancy. Prompts for foundation models, such as Claude, Llama, and Mistral, were optimised for specific tasks. Documentation was presented to cross-functional partners to explain the need for benchmarking. Code was reviewed and updates were deployed to production in collaboration with software engineers.

Consultant (University of Sheffield - Student Marketing / Elevate)

Jan 24 - Jun 24

Sheffield, South Yorkshire, UK

A Retrieval-Augmented Generation (RAG) solution was researched and explained to automate responses for high-volume queries. A topic modelling solution was delivered to analyse large and complex document sources. Clients were advised on factors such as cost, efficiency, scalability, stability, and the technical skills required.

Teacher of Mathematics (Abbey College)

Sep 18 - Aug 21

Ramsey, Cambridgeshire, UK

Lessons for students aged 11 to 18 were prepared and taught, with pupils' well-being also cared for as part of the role. Key Stage 5 courses were led by developing assessments and resources, with a focus on preparing students for university, including entrance exams. Support was provided to teachers through upskilling initiatives, departmental training in teaching and learning, and mentoring of trainee teachers.

Language Teacher (Universidad de Sucre)

Aug 17 - Jun 18

Sincelejo, Sucre, Colombia

Modules on French and English culture and civilisation, as well as language skills courses ranging from A1 to C1 levels, were taught. Student teachers were assisted with the action-research projects for their practice programme. Students were also prepared for certification in both French and English.

Collaborators and Affiliations

Contact

Location:

Department of Computer Science, Regent Court (DCS), 211 Portobello, Sheffield, S1 4DP, United Kingdom