Build a Custom Voice AI Recruiting System
The best voice AI recruiting solutions are custom systems for automated phone screens. They ask role-specific questions and score candidate responses against a predefined rubric.
The build complexity depends on the number of roles and the depth of screening questions. A single-role screen with 5 questions is a 2-week build. Screening for 10 distinct roles with branching logic is a 4-week build, because each role needs its own scoring rubric.
We built a screening system for a 12-person recruiting firm that processed 400 applicants monthly. Recruiters spent 15 minutes per candidate on initial phone calls. The automated system completes the screen in 5 minutes and delivers a scored transcript, reducing recruiter time on initial screens by 90%.
What Problem Does This Solve?
Recruiting teams often try off-the-shelf applicant tracking systems (ATS) like Greenhouse or Lever that have 'AI' features. These tools are good for keyword matching on resumes but cannot conduct or evaluate a live phone conversation. They identify candidates who used the right words, not those who can articulate their experience.
Consider a 15-person staffing agency screening for a warehouse associate role. They use a tool like MyInterview, which provides pre-recorded video questions. The problem is that candidates can re-record answers, and recruiters still have to manually watch every 3-minute video. With 100 applicants, that is 5 hours of video review (100 × 3 minutes) that still falls on a recruiter.
Another approach is using API-driven services like AssemblyAI for transcription and then feeding the text to a GPT model. This breaks down because the raw transcript lacks context. The system cannot distinguish between a candidate's confident explanation and a hesitant, rambling answer. Without a fine-tuned model for conversational analysis, the scores are meaningless and recruiters do not trust the output.
How Does It Work?
We start by analyzing your 3-5 key screening questions for a high-volume role. We use the Claude API to generate a structured interview script that includes follow-up probes for incomplete answers. All candidate audio files and their transcriptions are stored in a Supabase database, providing a single source for data and retraining.
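To make the idea of a structured script with follow-up probes concrete, here is a minimal sketch of one way to represent it. The `Question` class, the `expected_topics` keyword check, and the sample question are all hypothetical stand-ins for illustration; the source does not describe how completeness is actually judged.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Question:
    """One screening question plus probes to use when the answer is incomplete."""
    prompt: str
    # Hypothetical completeness check: topics a full answer is expected to mention.
    expected_topics: list[str]
    follow_ups: list[str] = field(default_factory=list)


def next_prompt(question: Question, answer: str, probes_used: int) -> str | None:
    """Return the next follow-up probe if the answer looks incomplete, else None."""
    text = answer.lower()
    missing = [t for t in question.expected_topics if t.lower() not in text]
    if missing and probes_used < len(question.follow_ups):
        return question.follow_ups[probes_used]
    return None


# Illustrative question for a warehouse role (not taken from a real script).
forklift = Question(
    prompt="Tell me about your experience operating a forklift.",
    expected_topics=["forklift", "years"],
    follow_ups=["How many years have you operated one?"],
)
```

Keeping the script as plain data like this means a question or probe can be changed without touching the call-handling code.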
The core system is a FastAPI application deployed on AWS Lambda. When a candidate calls the dedicated number, the service records their responses. We use an audio processing library to clean the recording before sending it to a transcription service. The full transcript, which takes about 90 seconds to generate for a 5-minute call, is then passed to our scoring module.
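The flow described above (clean the recording, transcribe it, pass the transcript to scoring) can be sketched as a small pipeline function. This is an illustration only: `run_screen` and its three stage parameters are hypothetical names, and the lambdas are stand-ins so the sketch runs without any audio library or transcription service.

```python
from typing import Callable


def run_screen(
    raw_audio: bytes,
    clean: Callable[[bytes], bytes],
    transcribe: Callable[[bytes], str],
    score: Callable[[str], dict],
) -> dict:
    """Run one recording through clean -> transcribe -> score and return a report."""
    cleaned = clean(raw_audio)
    transcript = transcribe(cleaned)
    return {"transcript": transcript, "scores": score(transcript)}


# Stand-in stages: real ones would call an audio library and external APIs.
report = run_screen(
    b"  hello  ",
    clean=lambda audio: audio.strip(),
    transcribe=lambda audio: audio.decode("utf-8"),
    score=lambda text: {"clarity": 3},
)
```

Passing each stage in as a callable keeps the orchestration testable and lets a transcription or scoring provider be swapped without changing the flow.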
The scoring module is a Python script that uses the Claude API with a detailed rubric prompt. It scores each answer from 1 to 5 on criteria like clarity, relevance, and experience. The final report, delivered as a PDF and sent to your ATS via webhook, includes the full transcript, individual scores, and a summary score. Total processing time from call end to report delivery is under 3 minutes.
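As a rough illustration of rubric-based scoring, the sketch below builds a rubric prompt, validates a model reply, and aggregates a summary score. It does not call the Claude API; the prompt wording, the JSON reply format, and the use of a plain mean for the summary score are all assumptions, since the source does not specify them.

```python
import json
from statistics import mean

CRITERIA = ("clarity", "relevance", "experience")  # criteria named in the text


def build_prompt(question: str, answer: str) -> str:
    """Assemble a rubric prompt asking for one integer score per criterion."""
    keys = ", ".join(CRITERIA)
    return (
        "Score the candidate's answer from 1 to 5 on each of: "
        f"{keys}. Reply with a JSON object using exactly those keys.\n"
        f"Question: {question}\nAnswer: {answer}"
    )


def parse_scores(reply: str) -> dict[str, int]:
    """Validate a model reply; raise ValueError if it does not fit the rubric."""
    data = json.loads(reply)  # malformed JSON raises a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    scores = {}
    for name in CRITERIA:
        value = data.get(name)
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if not is_int or not 1 <= value <= 5:
            raise ValueError(f"invalid score for {name}: {value!r}")
        scores[name] = value
    return scores


def summary_score(per_answer: list[dict[str, int]]) -> float:
    """Assumed aggregation: mean of every criterion score across all answers."""
    return round(float(mean(v for s in per_answer for v in s.values())), 2)
```

Validating the reply before it reaches the report matters because a language model can return out-of-range or malformed output; rejecting it early lets the call be retried or flagged.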
This entire system runs for under $40 per month in usage costs for up to 500 screenings. Since we use serverless functions via AWS Lambda, you only pay for compute time when a screening is actively being processed. This avoids the high monthly fees of SaaS platforms that charge per seat, regardless of usage.
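For a sense of scale, the stated ceiling can be turned into a per-screening figure. The two constants come from the paragraph above; the helper name is hypothetical, and the result is an upper bound at full volume, not a measured cost.

```python
MONTHLY_USAGE_CEILING_USD = 40.0  # "under $40 per month" from the text
MAX_SCREENINGS = 500              # "up to 500 screenings" from the text


def cost_per_screening(monthly_cost: float, screenings: int) -> float:
    """Average usage cost per screening for a given month."""
    if screenings <= 0:
        raise ValueError("screenings must be positive")
    return monthly_cost / screenings


# At 500 screenings the usage cost stays below this per-screening ceiling.
ceiling = cost_per_screening(MONTHLY_USAGE_CEILING_USD, MAX_SCREENINGS)
```

At the stated limits this works out to a ceiling of 8 cents per screening.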
What Are the Key Benefits?
From Kickoff to Live in 3 Weeks
We deploy a production-ready screening system in 15 business days. Your recruiters can start sending candidates to the automated line immediately, not next quarter.
Pay for Usage, Not for Seats
A one-time build fee and low monthly API costs. Your cost scales with applicant volume, not your number of recruiters.
You Get the Full Source Code
The complete Python codebase is delivered to your company's GitHub repository. You are never locked into a proprietary platform.
Alerts When a Transcript Fails
We use structlog for structured logging. If a transcription or scoring API call fails, the error is logged and triggers an immediate alert so we can re-process the audio file manually.
Integrates Directly with Your ATS
A webhook sends the final score and transcript link directly to your existing ATS, like Greenhouse or Ashby. No new dashboard for your team to check.
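Two of the benefits above, the ATS webhook and failure alerts, can be sketched together. This is a hedged illustration: the payload field names are hypothetical (each ATS defines its own webhook schema), the `send` callable stands in for the real HTTP call, and the standard-library `logging` module is used here in place of structlog so the sketch has no dependencies.

```python
import json
import logging
from typing import Callable

log = logging.getLogger("screening")


def build_payload(candidate_id: str, summary: float, transcript_url: str) -> bytes:
    """Serialize the final score and transcript link for the ATS webhook."""
    body = {
        "candidate_id": candidate_id,      # hypothetical field name
        "summary_score": summary,          # hypothetical field name
        "transcript_url": transcript_url,  # hypothetical field name
    }
    return json.dumps(body).encode("utf-8")


def deliver(payload: bytes, send: Callable[[bytes], None]) -> bool:
    """Try to send the payload; on failure, log a structured event for alerting."""
    try:
        send(payload)
        return True
    except Exception as exc:
        # A JSON log line is easy for an alerting rule to match on "event".
        log.error(json.dumps({"event": "webhook_failed", "error": str(exc)}))
        return False
```

Returning a boolean instead of raising lets the caller queue the report for a later retry while the logged event drives the alert.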
What Does the Process Look Like?
Scoping & Scripting (Week 1)
You provide your current screening questions and access to your ATS. We draft the automated script and define the scoring rubric for your approval.
Core System Build (Week 2)
We build the FastAPI service, set up the Supabase database, and configure the transcription and scoring logic. You receive a private endpoint for testing.
Integration & Deployment (Week 3)
We connect the system to your ATS via webhook and deploy it to your AWS account. You receive the full source code in your GitHub.
Monitoring & Handoff (Weeks 4-8)
We monitor the first 100 live candidates for accuracy and performance. You receive a runbook detailing how to update scripts and manage the system.
Frequently Asked Questions
- How much does a custom voice screening system cost?
- The cost depends on the number of unique roles to screen and the complexity of the scoring rubric. A single-role system is a straightforward build. A system for five roles with branching interview questions requires more development. We provide a fixed-price quote after a 30-minute discovery call where we map out the exact requirements.
- What happens if a candidate's audio is unclear or the transcription is wrong?
- The system flags transcripts with a confidence score below 90% for manual review. A notification is sent with the audio file attached, allowing a recruiter to listen and override the score. This ensures poor audio quality does not unfairly penalize a good candidate. The process is documented in the runbook we deliver.
- How is this different from a platform like Talkpush?
- Talkpush is a full-featured conversational recruiting platform with a monthly subscription fee. We build a specific component for one part of your workflow: the initial phone screen. Our solution has no user interface to learn and plugs directly into your ATS. You own the code and only pay for API usage, not for recruiter seats.
- Can this system handle different languages?
- Yes, the transcription and analysis models we use support dozens of languages. We can configure the system to detect the language from the initial prompt or have separate phone numbers for different languages. The scoring rubric would need to be translated, which we scope as part of the initial build.
- What technical skills are needed to maintain this system?
- You do not need a dedicated AI engineer. The system is deployed on serverless infrastructure that requires minimal oversight. The runbook covers common tasks like changing an interview question, which involves editing a text file. For any code-level changes, you can engage us on a flat monthly maintenance plan.
- Is the candidate's data secure?
- Yes. All data is processed within your own cloud environment (AWS). Audio files are stored temporarily and can be set to auto-delete after a specified period, like 30 days, to comply with data retention policies. We use Supabase's row-level security to restrict which records each user or service can access.
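Two rules from the answers above, flagging low-confidence transcripts and expiring old audio, are simple enough to sketch directly. The 90% threshold and the 30-day period come from the text; the function names are hypothetical, and how the transcription service reports confidence is an assumption (a single value between 0 and 1).

```python
from datetime import datetime, timedelta, timezone

CONFIDENCE_THRESHOLD = 0.90      # "below 90%" from the text
RETENTION = timedelta(days=30)   # example retention period from the text


def needs_manual_review(confidence: float) -> bool:
    """Flag a transcript whose confidence falls below the threshold."""
    return confidence < CONFIDENCE_THRESHOLD


def is_expired(stored_at: datetime, now: datetime) -> bool:
    """True once an audio file has been stored longer than the retention period."""
    return now - stored_at > RETENTION
```

Passing `now` in explicitly, for example `datetime.now(timezone.utc)`, keeps the retention check easy to test with fixed dates.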