Data Services — Arabic.AI

/ 01 What we deliver

Arabic data, across every modality.

Text, audio, image, video, and fine-tuning datasets — all with native-speaker review, dialect coverage, and QA pipelines that match the rigor of the biggest Western labeling programs.

Core annotation services

Text

Text annotation & labeling.

Named Entity Recognition (general + domain), text classification and taxonomy mapping, sentiment and stance, intent detection, keyword/phrase extraction, document span labeling, passage relevance, summarization QA.

Multilingual

Multilingual & cross-dialect.

Arabic dialect coverage across MSA, Gulf, Levantine, Egyptian, and Maghrebi; English and other target languages; cultural localization; parallel corpus creation; code-switching handling.

Audio

Audio & speech.

Transcription (verbatim or clean), diarization, speaker ID, acoustic event labeling, emotion and tone tags, domain lexicon normalization, QA with WER/CER metrics.
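
As a rough illustration of the WER/CER metrics mentioned above, the sketch below computes both from a reference and a hypothesis transcript using plain Levenshtein distance. The function names are hypothetical, and two conventions are assumptions rather than anything stated here: words are split on whitespace, and CER ignores spaces.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitution, insertion, deletion)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate, ignoring spaces (an assumed convention)."""
    ref_chars = list(reference.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    hyp_chars = list(hypothesis.replace(" ", ""))
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)
```

In practice the transcripts would usually be normalized first (for example, diacritics and punctuation handled according to the project guideline) so that the scores reflect real transcription errors rather than formatting differences.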

Vision

Image & video.

Bounding boxes, polygons, semantic segmentation, keypoints and landmarks, frame-level action and event labels, object tracking, geospatial overlays for satellite and drone imagery.

Advanced fine-tuning services

Instruction tuning

Instruction-tuning datasets.

Prompt-to-ideal-response pairs; task-specific instructions for customer support, legal, finance, healthcare, public sector; multi-turn dialogue authoring with context continuity.

RLHF / RLAIF

Human preference data.

Pairwise comparisons, Likert ratings, ranking across helpfulness, harmlessness, truthfulness, and cultural appropriateness; structured qualitative feedback for reward model training.

Red teaming

Red teaming & safety evaluation.

Adversarial prompt sets, jailbreak testing, bias and fairness probes, hallucination assessment, PII leakage checks, safety policy calibration against your compliance standard.

Synthetic

Synthetic data generation.

Rule-based generators, model-assisted augmentation, privacy-preserving synthesis, scenario simulation for rare or long-tail events you can't collect naturally.

Corpus curation

Domain-specific corpus curation.

Content sourcing (public or proprietary), cleansing and normalization, deduplication, decontamination, semantic clustering and topical coverage analysis for pre-training.
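
To make the deduplication and decontamination steps concrete, here is a minimal sketch: exact-duplicate removal by hashing normalized text, and a simple n-gram overlap check against benchmark texts. The function names, the NFKC/lowercase normalization, and the default n-gram size are all illustrative assumptions, not a description of the actual pipeline.

```python
import hashlib
import unicodedata


def normalize(text: str) -> str:
    """Unicode-normalize, lowercase, and collapse whitespace."""
    return " ".join(unicodedata.normalize("NFKC", text).lower().split())


def deduplicate(docs):
    """Keep the first occurrence of each document, comparing normalized text."""
    seen, unique = set(), []
    for doc in docs:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


def ngrams(text: str, n: int):
    """Set of word n-grams in the normalized text (empty if the text is shorter than n)."""
    tokens = normalize(text).split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def is_contaminated(doc: str, benchmark_texts, n: int = 8) -> bool:
    """True if the document shares any word n-gram with a benchmark text."""
    doc_grams = ngrams(doc, n)
    return any(doc_grams & ngrams(b, n) for b in benchmark_texts)
```

Exact hashing only catches identical documents after normalization; near-duplicate detection would need a fuzzier technique, which this sketch does not attempt.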

Conversational

Conversational AI data.

Intent and entity schemas, slot-filling, persona-based dialogues, escalation flows, knowledge-grounded responses, and retrieval checks for RAG evaluation.

/ 02 Arabic-specific differentiators

Labeling Arabic correctly, not translating first.

Most labeling vendors handle Arabic by translating to English, annotating in English, and translating back. We don't. Native annotators. Native tooling. Native QA.

Morphology & diacritization.

Specialized tasks for tokenization alignment, root extraction, and diacritics restoration or verification. Arabic's non-concatenative morphology requires annotators who understand the difference between surface form and morpheme.
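
One small piece of diacritics verification can be sketched in code: checking that a diacritized text differs from its source only by added marks, with no letters changed. The helper names are hypothetical, and the set of code points treated as diacritics (U+064B to U+0652 plus U+0670) is an assumption; a real guideline may cover more marks.

```python
import re

# Arabic tashkeel: fathatan through sukun (U+064B-U+0652) plus superscript alef (U+0670).
# This range is an illustrative assumption, not a complete inventory.
TASHKEEL = re.compile("[\u064B-\u0652\u0670]")


def strip_diacritics(text: str) -> str:
    """Remove the diacritic marks listed above, leaving base letters untouched."""
    return TASHKEEL.sub("", text)


def restoration_preserves_letters(source: str, diacritized: str) -> bool:
    """True if the diacritized text differs from the source only by diacritics."""
    return strip_diacritics(diacritized) == strip_diacritics(source)
```

A check like this only confirms that the letter skeleton is intact; whether the chosen vowels are correct still needs a native reviewer.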

Regional & cultural context.

Islamic content expertise, local business and legal conventions, sensitive-content handling frameworks. Annotation guidelines are authored by domain experts from the target market, not generic offshore teams.

Code-switching realism.

Gulf and Levantine Arabic intermixed with English, plus Arabizi and romanized Arabic normalization. The dataset reflects how people actually write in WhatsApp, Slack, and customer tickets, not how grammar books say they should.

Right-to-left tooling.

Native RTL annotation interfaces and QA procedures. No bidirectional text bugs, no mirrored shortcuts, no layout shifts between annotator view and reviewer view.

/ 03 Quality & governance

Every dataset, traceable to the annotator and the rule.

A labeling program is only as good as its QA. Ours was built for customers who need to defend every decision in front of a regulator or an auditor — not just pass a vendor scorecard.

Guideline authoring.

Illustrated manuals with edge cases, decision trees, and do/don't examples. Every annotator is trained against the same rubric, and guidelines are versioned and auditable.

Training & calibration.

Annotator onboarding, gold-set calibration, and periodic drift checks. Anyone whose inter-annotator agreement falls below the project threshold is retrained or rotated off the project.

Multi-annotator validation.

Majority vote or consensus scoring, with expert arbitration whenever inter-annotator agreement (IAA) falls below a configurable threshold per task.
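
The majority-vote-with-arbitration rule can be sketched as follows. This is a simplified stand-in under stated assumptions: agreement is measured per item as the share of annotators who chose the most common label, the function name and the 0.7 default threshold are hypothetical, and a production setup might use a chance-corrected IAA statistic instead.

```python
from collections import Counter


def resolve_item(labels, threshold=0.7):
    """Return (label, agreement, needs_arbitration) for one item's annotator labels.

    agreement is the fraction of annotators who chose the most common label.
    If it falls below the threshold, no label is assigned and the item is
    flagged for expert arbitration.
    """
    if not labels:
        raise ValueError("at least one label is required")
    top_label, top_count = Counter(labels).most_common(1)[0]
    agreement = top_count / len(labels)
    needs_arbitration = agreement < threshold
    return (None if needs_arbitration else top_label), agreement, needs_arbitration
```

With a threshold above 0.5, ties are always sent to arbitration, so the order in which labels arrive never decides the outcome.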

Feedback loops.

Continuous error analysis, guideline updates, and model-in-the-loop improvements. Your evaluation feedback feeds directly into the next batch, not a report nobody reads.

/ 04 Delivery models

Three ways to work together.

From self-service portal access for teams that want to own the schema, to fully managed programs with dedicated project leads. Pick what matches your team's capacity.

Self-service (T-Portal).

Spin up projects, configure schemas, invite reviewers, track KPIs, and export via API. For teams that want to drive their own labeling program.

Managed programs.

Dedicated PM, scalable teams from dozens to thousands of annotators, weekly reporting, SLA commitments, risk logs. For enterprise programs where delivery is non-negotiable.

Hybrid human-AI.

Pre-annotation and active learning, automated quality checks, human arbitration for edge cases. Faster turnaround without sacrificing the audit trail.

/ 05 Engagement flow

From scoping to handover.

Scoping & success criteria.

Use cases, acceptance thresholds, and privacy constraints nailed down before a single row is labeled.

Schema & guideline design.

Pilot annotation, gold sets, and calibration. Annotators trained against explicit edge cases, not vibes.

Production at scale.

Batches, QA gates, dashboards. Per-batch quality reports with IAA scores and common error categories.

Handover & integration.

Exports, lineage docs, and the final evaluation report. Optional fine-tuning support with your modeling team.

Indicative timelines: pilot, 2–4 weeks; scale-up, ongoing in sprints.

/ 06 Packages

Four tiers. Pick what fits your scale.

Entry

Best for

Pilots and proofs of concept.

Text annotation (NER, sentiment, intent)
Basic QA and weekly reports
T-Portal access
Optional small audio set

Standard

Best for

Multi-modal programs that need breadth.

Entry package plus
Audio / video annotation
Multilingual & dialect coverage
Advanced QA & dashboards
Guideline design included
Model-in-the-loop pre-annotation

Enterprise

Best for

Regulated and high-scale operations.

Standard package plus
RLHF / RLAIF preference data
Red teaming & safety evals
Corpus curation
Dedicated team & SLAs
Data residency & on-prem options

Consulting

Best for

Strategy and enablement for in-house teams.

Data strategy
Taxonomy & ontology design
Evaluator & rubric design
Annotator training programs
Optional on-prem / private VPC

Security, privacy, and compliance.

Data handling is the reason most regulated teams avoid labeling vendors. It's the reason they move to us.

Data minimization, role-based access, least privilege, and complete audit trails.

Encryption in transit and at rest. Segregated VPCs for sensitive projects.

Regional data residency options (MENA / EU) and client-managed keys on request.

GDPR-aligned processing. HIPAA-like safeguards for PHI projects on request.

Background-checked staff and secure facilities with controlled physical access.

Redaction pipelines for PII, with PII-aware schemas built into the guideline.
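
As a hedged illustration of what a PII redaction step might look like, the sketch below masks email addresses and phone-like digit runs with placeholder tags. The patterns and the `redact` helper are deliberately simple assumptions; a real pipeline would cover more PII types (names, national IDs, addresses) and would be tuned to the project's guideline.

```python
import re

# Simple illustrative patterns, not an exhaustive PII definition.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# For str patterns, \d matches any Unicode decimal digit, including Arabic-Indic digits.
PHONE = re.compile(r"\+?\d[\d\s-]{7,}\d")


def redact(text: str) -> str:
    """Replace emails and phone-like numbers with placeholder tags."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)
```

Keeping typed placeholders such as `[EMAIL]` rather than deleting the span lets annotators still label the surrounding text against a PII-aware schema.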

End-to-end Arabic data, safety-checked by 40k+ experts.

Build your Arabic dataset.