
Ends in 20 hours
Paid on delivery
We are looking for high-quality pre-recorded call center conversation datasets for AI/ASR model training purposes.

Languages Required:
• English – 500 Hours
• Hindi – 500 Hours
• Spanish – 500 Hours

Dataset Requirements:
• Call center conversation audio
• Clean audio without background noise
• No long silence segments
• WAV audio format
• 16 kHz, 16-bit or higher
• Unidirectional audio preferred
• Transcription text required

Please provide the following information with your proposal:

1. Transcription Details
• Is transcription text available?
• Is the transcription AI-generated or human-annotated?
• If AI-generated, can it be manually reviewed/refined?

2. Metadata & Accuracy Information
Please confirm whether the dataset includes:
• Speaker tags
• Timestamp information
• Speaker diarization details
Also share accuracy metrics for:
• Speaker diarization
• WER (Word Error Rate)
• Timestamp alignment

3. Audio Specifications
• Confirm sampling rate and audio quality details.

4. Data Privacy & Compliance
• Has all personal or sensitive information been de-identified/anonymized?
• Please explain the de-identification process used.

5. Usage Scope
• The dataset will be used strictly for internal AI/model training purposes only.

Important Notes:
• We are only looking for pre-recorded/off-the-shelf datasets.
• Fresh recordings or newly collected data are not required for this project.
• Please share sample files, pricing, delivery timeline, and licensing/commercial usage terms if available.

If you have relevant datasets available, feel free to send details via message or proposal.
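Sample files can be sanity-checked against the audio requirements above with a short script. The sketch below uses Python's standard `wave` module. The function name `check_wav_spec` and its threshold parameters are illustrative assumptions, not part of the posting, and it reads "16 kHz, 16-bit or higher" as a minimum for both values. It only handles PCM WAV files that the `wave` module can open, and it does not measure noise or silence.

```python
import wave


def check_wav_spec(path, min_rate_hz=16000, min_bits=16):
    """Read a WAV header and compare it with an assumed minimum spec."""
    with wave.open(path, "rb") as wf:
        rate_hz = wf.getframerate()          # samples per second
        bits = wf.getsampwidth() * 8         # bytes per sample -> bits
        channels = wf.getnchannels()
        duration_s = wf.getnframes() / rate_hz
    return {
        "rate_hz": rate_hz,
        "bits": bits,
        "channels": channels,
        "duration_s": duration_s,
        # Assumption: "or higher" applies to both sampling rate and bit depth.
        "meets_spec": rate_hz >= min_rate_hz and bits >= min_bits,
    }
```

Calling `check_wav_spec("sample_call.wav")` (a hypothetical file name) returns a dictionary whose `meets_spec` flag can be used to filter a sample package before listening to it.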
Project ID: 40475844
8 proposals
Open for bidding
Remote project
Active 3 days ago
8 freelancers are bidding on average $2,601 USD for this job

I can help source and evaluate pre-recorded call-center datasets for English, Hindi, and Spanish that meet ASR/AI training requirements.

My focus would be on validating:
• Audio quality and formatting
• Transcription availability and quality
• Speaker diarization and timestamps
• WER accuracy metrics
• PII anonymization/compliance
• Licensing suitability for internal AI training

I also understand the common quality problems in commercial datasets (poor diarization, noisy recordings, unusable silent segments, or weak transcription alignment) and can help assess datasets before commitment. Happy to review available sources and help identify the strongest options for your training pipeline.
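For reference, WER is conventionally the word-level edit distance between a reference transcript and a hypothesis transcript (substitutions, deletions, and insertions), divided by the number of words in the reference. The sketch below is a minimal illustration of that definition, not a tool named by the bidder or the client. The function name `word_error_rate` is an assumption, and the code splits on whitespace only, with no case or punctuation normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance from i reference words to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference. Reported WER figures are only comparable when both sides were normalized the same way before scoring.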
$2,100 USD in 5 days
1.5

⭐ I handled a similar project ⭐ and am happy to show you what works before you commit. I developed a high-quality pre-recorded call center conversation dataset for AI training, which aligns with your need for pre-recorded call center conversation datasets for AI/ASR model training. I understand the importance of clean audio, accurate transcriptions, and speaker diarization, and I specialize in creating datasets with a focus on performance, security, and user experience. Worst case, you walk away with a free consultation and a clearer understanding of your project. Kind regards, Foraxis
$3,700 USD in 7 days
0.0

Hello, I can help provide and prepare pre-recorded call center conversation datasets for AI and ASR model training in English, Hindi, and Spanish. I understand the main concern is not only total hours, but also clean audio quality, accurate transcripts, metadata, speaker tags, timestamps, diarization details, privacy compliance, and clear commercial usage terms. I can share sample files first, then provide full dataset details including WAV format, sampling rate, transcription method, annotation quality, WER, timestamp alignment, anonymization process, licensing scope, pricing, and delivery timeline. I can also help organize the dataset properly for model training, with a clean folder structure, audio files, transcripts, metadata sheets, and documentation so your team can review and ingest the data easily. Once you confirm the required quality level and licensing terms, I can send the sample package for validation. Best regards, Ankit
$1,000 USD in 15 days
1.0

With a track record of over 9 years in web and mobile development, my team and I are well equipped to undertake this project. Our skill set does not stop at app and web development; we are also proficient in AI development, which makes us well suited to curating datasets tailored to your specific needs. Over the years, we have honed our abilities in creating accurate, high-quality, and well-organized data resources that strictly adhere to the specified requirements. To address key aspects of your project: our transcriptions are human-annotated rather than AI-generated, which gives higher accuracy and avoids the errors an AI-based system might introduce. We can provide speaker tags, timestamp information, and the necessary speaker diarization details as per your requirement. We also guarantee that all personal and sensitive information will be effectively anonymized for data privacy and compliance. Moreover, I assure you that our dataset collection will be well suited to internal AI/model training purposes only. We understand the importance of pricing, timely delivery, sample files for evaluation, and licensing/commercial usage terms in a project like this, and we are glad to assist you on those fronts too. Don't wait any longer; let our proven expertise take your AI model-training aspirations to new heights.
$2,505 USD in 7 days
0.0

Zhob, Pakistan
Payment method verified
Member since Apr 4, 2022