Technologies Used: Pure C#, .NET Core (no external dependencies)
This library was originally made as part of the NUS Ripper project, where part of the work was to clean and normalize the often mismatched and messy titles from game publishers for submission to the No-Intro database.
This cleaning involved romanizing Japanese, Chinese, and Korean text. Romanization is the process of converting text written in another writing system into the Latin alphabet - the kind you're reading right now. A good example of this would be converting “どうぶつのもり” into “Dōbutsu no Mori” - it's not translation, since that clearly didn't become English, but the process makes the text readable and pronounceable for people who only know the Latin alphabet.
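To make the idea concrete, here is a minimal sketch of table-based romanization in C#. It is not this library's actual API: the class name `KanaSketch`, the method `Romanize`, and the tiny lookup table are all hypothetical, and the table only covers the hiragana in the example above. The long-vowel step is also a simplification, since "ou" is not always a long vowel in real Japanese.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration only; not the library's real API.
public static class KanaSketch
{
    // Assumption: a tiny subset of the Hepburn table, just enough for the example.
    private static readonly Dictionary<char, string> Hepburn = new Dictionary<char, string>
    {
        ['ど'] = "do",
        ['う'] = "u",
        ['ぶ'] = "bu",
        ['つ'] = "tsu",
        ['の'] = "no",
        ['も'] = "mo",
        ['り'] = "ri",
    };

    public static string Romanize(string text)
    {
        var sb = new StringBuilder();
        foreach (char c in text)
        {
            // Characters missing from the table pass through unchanged.
            sb.Append(Hepburn.TryGetValue(c, out var romaji) ? romaji : c.ToString());
        }

        // Simplified long-vowel handling: mark "ou" and "uu" with a macron.
        // Real text needs word boundaries to do this correctly.
        return sb.ToString().Replace("ou", "ō").Replace("uu", "ū");
    }
}
```

Calling `KanaSketch.Romanize("どうぶつのもり")` returns "dōbutsunomori". Getting from there to “Dōbutsu no Mori” additionally requires word segmentation and capitalization, which this sketch does not attempt.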
I could perhaps have gone with someone else's library for the task, or written some quick code for it inside the NUS Ripper project, but instead I opted to make it its own specialized library, and this is the result. I still plan to add more languages and romanization systems in the future, but at the moment it already supports Japanese katakana, hiragana, and kanji, Chinese Hànzì, and Korean Hangeul and Hanja.
For each romanization system and language, I researched the language and the system (for days in some cases) to make sure I had a solid understanding of exactly how everything worked, so that the romanizations would be as accurate as possible.
I love this project - it gives me an excuse to learn more about linguistics and individual languages, and it's really satisfying inputting some text in a language you can't even pronounce, and getting readable, pronounceable text out the other side.
Another really nice thing about working on this project was that it gave me an opportunity to get some solid practice with CI/CD, and with setting it up efficiently.