Models Overview
At Kid Smart AI, we believe:
- Transparency is the key to building trust with our partners as well as with parents and guardians
- Child development experts must be involved from the beginning
- Continuous improvement is essential, so feedback from our partners is invaluable
We use both pre-trained and proprietary models to ensure the best possible outcome for each application.
We currently leverage the following open-source pre-trained models:
| Purpose | Model Name | Pre-trained Model Developer | Model Host | Model Card |
|---|---|---|---|---|
| Speech Recognition | Whisper V3 | OpenAI | Kid Smart AI | model card |
| Text Generation | Llama 3 | Meta | Kid Smart AI | model card |
| Image Generation | Stable Diffusion XL (SDXL) | Stability AI | Kid Smart AI | paper |
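To illustrate how an open-source checkpoint like these is consumed, here is a minimal sketch that loads a Whisper model for speech recognition with the Hugging Face transformers library. The model ID and audio file name are illustrative assumptions; our hosted, modified versions differ.

```python
from transformers import pipeline

# Load an open-source Whisper checkpoint from the Hugging Face Hub.
# (Model ID is illustrative; we host and serve our own adapted versions.)
asr = pipeline("automatic-speech-recognition", model="openai/whisper-large-v3")

# Transcribe a short clip (hypothetical file name; 16 kHz mono audio works best).
result = asr("child_reading_sample.wav")
print(result["text"])
```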
We leverage pre-trained models rather than building every model from scratch because, for instance, teaching a model to read and write English is very expensive: it can cost millions of dollars to train such a model. Because of this training cost, some companies "open-source" their models so that individuals and businesses can adapt them for their own needs.
For each of these pre-trained models, we modify them heavily for the young-child use case. This includes all of the following:
- Use-case-specific modification of the model inputs or representations (e.g., tokenizer modification or ReFT)
- Adjustment of the model weights via techniques like RLHF, continued pre-training, or LoRA (a minimal sketch follows this list)
- Addition or removal of model layers
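As a concrete example of the weight-adjustment bullet above, the sketch below applies LoRA to an open-source causal language model using Hugging Face's peft library. The checkpoint ID and hyperparameters are illustrative assumptions, not our production configuration.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Start from an open-source pre-trained checkpoint (ID is illustrative).
base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")

# LoRA trains small low-rank adapter matrices instead of all model weights,
# which makes use-case-specific fine-tuning far cheaper than full training.
lora = LoraConfig(
    r=8,                                  # adapter rank
    lora_alpha=16,                        # adapter scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()  # only a small fraction is trainable
```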
When no suitable pre-trained alternative exists, we train our own proprietary models, such as those below.
| Purpose | Model Name | Model Developer | Model Host | Model Card |
|---|---|---|---|---|
| Fluency Error Classification | fluency_v1 | Kid Smart AI | Kid Smart AI | In progress |
| Pronunciation | pronunciation_v1 | Kid Smart AI | Kid Smart AI | In progress |
| Word Recognition | recognition_v1 | Kid Smart AI | Kid Smart AI | In progress |
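For a sense of what a proprietary model in this table might look like, here is a purely hypothetical sketch of a word-recognition-style classifier: a frozen open-source speech encoder with a small trainable classification head. The encoder choice, class count, and pooling strategy are all assumptions for illustration, not the actual architecture of recognition_v1.

```python
import torch
import torch.nn as nn
from transformers import Wav2Vec2Model

class WordRecognitionHead(nn.Module):
    """Hypothetical sketch: pooled speech embeddings -> word classes."""

    def __init__(self, num_classes: int = 500):  # class count is assumed
        super().__init__()
        # Pre-trained open-source speech encoder, frozen for simplicity.
        self.encoder = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")
        for p in self.encoder.parameters():
            p.requires_grad = False
        self.classifier = nn.Linear(self.encoder.config.hidden_size, num_classes)

    def forward(self, waveform: torch.Tensor) -> torch.Tensor:
        # waveform: (batch, samples) of 16 kHz mono audio.
        hidden = self.encoder(waveform).last_hidden_state  # (batch, frames, dim)
        pooled = hidden.mean(dim=1)                        # average over time
        return self.classifier(pooled)                     # (batch, num_classes)
```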
We use other models, such as OpenAI's GPT-4, Anthropic's Claude, Meta's Wav2Vec, and OpenAI's Whisper V2, for internal purposes.