Preserving the Maltese Language
One audio clip at a time
The Maltese language is dying. AI can help preserve it for us.
To achieve this, more data is needed. We need the voice of the people, il-VUĊI tal-poplu, to make this possible.
This way, the language will live on forever.
This model will also unlock multiple technological capabilities for the language. Notes and studies can be transcribed, and recordings of oral stories, speeches or sermons can be transcribed in real time with high accuracy.
About the Project
The Mission
Bringing the Maltese language into the digital age by creating a high-quality, open-source dataset of Maltese speech to enable better language technology.
Current Status
Recruiting a diverse set of volunteers to provide voice recordings and to transcribe various existing voice clips.
Open Source
All data collected will be available for the purpose of training AI models. All models derived from this dataset will be freely available for research and commercial purposes.
The Progress so far
Initial technical implementation
A purpose-built transcription platform is being developed for this project.
It will be open-source once completed.
Drafting the dataset licence
Your voice will remain yours. A custom dataset licence is being drafted that will allow you to contribute while retaining copyright.
The data can only be used to train AI models, and any models trained on this dataset must be available for both research and commercial purposes.
Raising awareness
Active communication is ongoing with Maltese content creators, archivists and academics, and the word is being spread on social media and other platforms.
How to Contribute
Register
Register your interest by filling in the form here. You will receive an email once the transcription platform is live.
Spread the word
Tell your friends, family and colleagues. The more data, the better the model will perform.
Contribute (coming soon)
When the transcription platform is live, you can either record short clips of your voice using any smartphone or transcribe existing audio clips.