Blaise Cruz
Samsung Research Philippines
Mabuhay! 👋
I’m a researcher at Samsung Research Philippines where I specialize in problems at the intersection of Multilinguality and Low-resource Languages.
Particularly, I am interested in understanding the behavior of models when constrained under low-resource multilingual domains. I’ve collaborated with many talented colleagues on various topics under this umbrella, including:
- Code Switching – Multilingual speakers naturally code-switch between two or more languages when speaking to peers, but multilingual models still struggle to understand and produce code-switched text.
- Resources & Evaluation – More data is often the best remedy to “very little data”. In addition to working on 🇵🇭 Filipino resources, I have also done work for Southeast Asian Languages and beyond.
- Applications in Low-resource – Employing creative techniques to improve performance in tasks such as Multilingual Translation, Question Generation, Fake News Detection, and more – all constrained under low-resource settings.
Previously, I’ve also been affiliated with UP Diliman, DLSU CeLT, and Senti AI.
If you’re interested in collaborating or if you want to chat about low-resource languages, feel free to get in touch! You may reach me through my email: me (at) blaisecruz (dot) com.
News
Jun 17, 2024 | The preprint for our paper SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages is out! |
---|---|
Jun 12, 2024 | The preprint for our paper CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark is out! |
May 15, 2024 | I’ll be joining the Mohammed bin Zayed University of Artificial Intelligence as a PhD student this Fall 2024! |
Mar 06, 2024 | The SEACrowd Data Catalogue – the main consolidated repository for all datasets collected by the SEACrowd Project – is now live! |
Latest Posts
Jun 12, 2024 | Welcome! |
---|
Selected Publications
- [WMT] Data Processing Matters: SRPH-Konvergen AI’s Machine Translation System for WMT’21. In Proceedings of the Sixth Conference on Machine Translation, 2021