Word Recognition API

Word Recognition Assessment

  • Please see playground examples to see the differences.

Recognition API Request (POST)

curl https://api.kidsmart.ai/v1/audio/recognition \
  -H "x-api-key: $KIDSMART_API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@$AUDIO_FILE" \
  -F "user_token=$USER_ID" \
  -F "reference_text=@$REFERENCE_TEXT" \
  -F "model_id=$MODEL_ID"

Name            Type    Description
x-api-key       header  Your API authentication key
file            field   Audio file (WAV format, max 30 seconds duration)
reference_text  field   Expected word or phrase for recognition
user_token      field   Unique identifier for the speaker
model_id        field   Model ID (from Kid Smart AI)
webhook_url     field   (Optional) URL to receive results via webhook
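
For callers who are not using curl, the request above can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the helper names encode_multipart and build_request are hypothetical, the audio part is assumed to be sent as audio/wav, and reference_text is assumed to be accepted as a plain text field (the curl example uploads it from a file with @).

```python
import os
import uuid
import urllib.request

API_URL = "https://api.kidsmart.ai/v1/audio/recognition"


def encode_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file part as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        header = f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n'
        parts.append(header.encode() + str(value).encode() + b"\r\n")
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
        "Content-Type: audio/wav\r\n\r\n"  # assumption: WAV audio, per the table above
    )
    parts.append(file_header.encode() + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_request(api_key, audio_path, reference_text, user_token, model_id,
                  webhook_url=None):
    """Build (but do not send) a POST request for the recognition endpoint."""
    fields = {
        "user_token": user_token,
        "reference_text": reference_text,  # assumption: sent as plain text
        "model_id": model_id,
    }
    if webhook_url:
        fields["webhook_url"] = webhook_url  # optional parameter
    with open(audio_path, "rb") as f:
        audio = f.read()
    body, content_type = encode_multipart(
        fields, "file", os.path.basename(audio_path), audio
    )
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={"x-api-key": api_key, "Content-Type": content_type},
    )
```

The returned request can be sent with urllib.request.urlopen(req) and the body decoded with json.load. Building the request separately from sending it keeps the encoding step easy to inspect and test without network access.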

Recognition API Response Structure

The API returns a JSON response containing the recognition analysis:

{
  "assessment_id": "b2e4df18-fdee-4b07-a687-3ef56abad050",
  "reference_text": "plume",
  "feedback": "feedback feature coming soon",
  "audio_duration": 1.7066666666666668,
  "model_id": "continuous_medium",
  "user_id": "USER_ID",
  "prediction": "plum",
  "phoneme_details": [
    {"phoneme": "P", "original": null, "timestamp": 1.18},
    {"phoneme": "L", "original": null, "timestamp": 1.37},
    {"phoneme": "AH", "original": null, "timestamp": 1.43},
    {"phoneme": "M", "original": null, "timestamp": 1.63}
  ],
  "correct": false,
  "confidence": "High"
}

Field            Description
assessment_id    Unique identifier for this assessment
reference_text   The expected word/phrase that was tested against
prediction       The word/phrase that was recognized in the audio
correct          Boolean indicating if the pronunciation was correct
confidence       Confidence level of the recognition (High/Medium/Low)
phoneme_details  If the child uttered the word or phrase incorrectly, the detailed breakdown of recognized phonemes and timing
audio_duration   Length of the audio file in seconds
feedback         Additional feedback about the recognition (if any, coming soon)

If correct is true (the child uttered the word or phrase correctly), phoneme_details is not returned.
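
Since phoneme_details may be absent, client code should not assume the key exists. The following is a minimal sketch of reading a response defensively; the function name summarize and the output wording are illustrative only, and the field names are taken from the sample response above.

```python
def summarize(result):
    """Return a one-line summary of a recognition response (a parsed JSON dict)."""
    ref = result["reference_text"]
    confidence = result["confidence"]
    if result["correct"]:
        return f"'{ref}' recognized correctly (confidence: {confidence})"
    # phoneme_details is only expected when the utterance was incorrect,
    # so fall back to an empty list if it is missing or null.
    phonemes = result.get("phoneme_details") or []
    sequence = " ".join(p["phoneme"] for p in phonemes)
    heard = result["prediction"]
    return f"expected '{ref}', heard '{heard}' [{sequence}] (confidence: {confidence})"
```

Applied to the sample response above, this returns: expected 'plume', heard 'plum' [P L AH M] (confidence: High)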

New Feature: Webhooks

All audio endpoints now support webhooks for asynchronous result delivery. Add the optional webhook_url parameter to receive results via POST callback instead of polling:

-F "webhook_url=https://your-domain.com/webhook-endpoint"

When a webhook URL is provided, the API response will include a webhook notification:

{
  "id": "e09ecf55-36b5-4936-83f4-ff3439223ed4",
  "webhook_notification": "Results will be sent to the provided webhook URL upon completion",
  "url": "https://api.kidsmart.ai/v1/audio/recognition/result/e09ecf55-36b5-4936-83f4-ff3439223ed4/"
}
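
Because the endpoint can answer with either a full recognition result or the acknowledgement shown here, a client may want to tell the two apart before reading fields such as prediction. A minimal sketch, assuming the acknowledgement always carries the webhook_notification key as in the sample (the helper name is_webhook_ack is hypothetical):

```python
def is_webhook_ack(body):
    """True if a parsed response is the webhook acknowledgement, not a full result."""
    # Assumption: only acknowledgements contain "webhook_notification",
    # and only full results contain "prediction".
    return "webhook_notification" in body and "prediction" not in body
```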

Webhook responses are typically delivered within 30 seconds of the initial request. See the webhooks documentation for more details.
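
To receive the callback, the webhook URL must point at an HTTP endpoint that accepts POST requests. The sketch below uses Python's standard http.server module. It assumes the callback body is JSON (the exact payload shape is described in the webhooks documentation, not here), stores results in memory for illustration only, and uses hypothetical names (WebhookHandler, make_server, received).

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # parsed callback payloads, kept in memory for illustration


class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length))
        except ValueError:
            # Body was not valid JSON: reject it.
            self.send_response(400)
            self.end_headers()
            return
        received.append(payload)
        # Acknowledge receipt with an empty 200 response.
        self.send_response(200)
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # silence the default per-request logging


def make_server(host="127.0.0.1", port=8000):
    """Create (but do not start) a server that collects webhook callbacks."""
    return HTTPServer((host, port), WebhookHandler)
```

Calling make_server().serve_forever() starts the receiver. A production endpoint would need to be reachable over HTTPS at the URL passed as webhook_url and should persist results rather than hold them in a list.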