From training to production — end to end

Arabic Complaint Intelligence

I fine-tuned three MARBERTv2 models on Arabic complaints, one each for sentiment, topic, and intent, then wired them into a deterministic pipeline that routes multi-dialect input to the right action.

3 MARBERTv2 models fine-tuned

0.77 · 0.99 · 0.99 Macro F1 · sentiment · topic · intent (test set, mean over 3 seeds)

68% Sentiment OOD accuracy · Arabic tweets (n=500)

96 ms p50 inference · CPU, 3 classifiers

01 — The challenge

Arabic complaints are hard to classify

Arabic spans dozens of dialects, and complaint text is informal, emotionally charged, and messy. Generic models trained on Modern Standard Arabic fail badly on real-world complaints, so I set out to build something that actually works on them.

🗺️

Multiple dialects, one model

Egyptian, Gulf, Levantine, Moroccan: MARBERTv2 was pre-trained on massive multi-dialect Arabic corpora, which makes it the right base to fine-tune on real complaints.

🎯

Three tasks, not one

Understanding a complaint means knowing how the user feels (sentiment), what it's about (topic), and what they want (intent). Each required its own fine-tuned model.

02 — The models

Three models, fine-tuned on MARBERTv2

I fine-tuned UBC-NLP/MARBERTv2 three separate times, once per task. Each model was trained for up to 20 epochs with early stopping (patience 2) and evaluated on held-out test data across three seeds (42, 123, and 2024). The models are published on Hugging Face, and the seed-42 checkpoints are the ones deployed in production.
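
The early-stopping rule can be sketched in plain Python. This is a minimal illustration, not the actual training script: the function name `epochs_run`, the choice of a validation score as the monitored metric, and the example scores are all assumptions. Only the 20-epoch cap and the patience of 2 come from the text.

```python
def epochs_run(val_scores, max_epochs=20, patience=2):
    """Return (epochs_trained, best_epoch) under early stopping.

    val_scores: per-epoch validation scores, higher is better (hypothetical).
    Training stops once `patience` consecutive epochs fail to beat the best score.
    """
    best_score = float("-inf")
    best_epoch = 0
    stale = 0  # epochs since the last improvement
    epochs_trained = 0
    for epoch, score in enumerate(val_scores[:max_epochs], start=1):
        epochs_trained = epoch
        if score > best_score:
            best_score, best_epoch, stale = score, epoch, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return epochs_trained, best_epoch


# Made-up validation scores: the last improvement is at epoch 3,
# so training halts two epochs later.
print(epochs_run([0.61, 0.70, 0.74, 0.73, 0.74, 0.75]))  # -> (5, 3)
```

With this rule the checkpoint kept for evaluation would be the one from `best_epoch`, not the last epoch trained.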

MODEL 01 / SENTIMENT

Sentiment

Detects how the user feels — negative, neutral, or positive. The first signal the pipeline reads.

NEG NEU POS

MODEL 02 / TOPIC

Topic

Classifies what the complaint is about — routes it to the right department before a human reads it.

CONTENT TECHNICAL POLICY_SECURITY FINANCIAL

MODEL 03 / INTENT

Intent (action classifier)

Understands what the user wants — a bug fix, a content change, or logging a note. Trained as the action task; exposed as intent in the API.

REPORT_BUG USER_REQUEST GENERAL_NOTE

Macro F1 on held-out test sets (80/10/10 split), averaged over seeds 42, 123, and 2024: 0.77 for sentiment, 0.99 for topic, and 0.99 for intent. The production API loads the seed-42 Hugging Face checkpoints (sentiment test F1 0.76). On an out-of-distribution (OOD) sentiment benchmark of Arabic tweets (n=500), accuracy is 68%.
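
For readers unfamiliar with the metric, here is a small sketch of how a macro F1 averaged over seeds can be computed. The gold labels and per-seed predictions are toy data invented for illustration and are unrelated to the real test sets. The helper `macro_f1` is a plain-Python stand-in, not the evaluation code used for the numbers above.

```python
from statistics import mean


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return mean(scores)


LABELS = ["NEG", "NEU", "POS"]  # sentiment labels listed on the page

# Toy data only: one gold list and one made-up prediction list per seed.
gold = ["NEG", "NEG", "NEU", "POS", "POS", "NEG"]
preds_by_seed = {
    42: ["NEG", "NEG", "NEU", "POS", "NEG", "NEG"],
    123: ["NEG", "NEU", "NEU", "POS", "POS", "NEG"],
    2024: ["NEG", "NEG", "NEU", "POS", "POS", "NEG"],
}

per_seed = {seed: macro_f1(gold, preds, LABELS) for seed, preds in preds_by_seed.items()}
mean_over_seeds = mean(list(per_seed.values()))  # the "mean over seeds" style of figure
print(per_seed, round(mean_over_seeds, 3))
```

Reporting the mean across seeds alongside the single deployed checkpoint makes it visible when one seed is slightly better or worse than typical.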

03 — The pipeline

From complaint to action

The three models feed a deterministic rule engine that combines their outputs into a single routed action. A confidence guard catches uncertain predictions. An optional LLM explains the decision — but never changes it.

STEP 01

📝

Arabic Input

Multi-dialect Arabic, validated and cleaned before inference

Pydantic
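
The page names Pydantic for validation but does not describe the cleaning rules, so the following is only a guess at what such a step might look like, written with the standard library. The length limit, the tatweel stripping, and the function name `clean_complaint` are assumptions. In a Pydantic model, a function like this could be called from a field validator.

```python
import re
import unicodedata

MAX_CHARS = 1000  # hypothetical length limit
ARABIC_LETTER = re.compile(r"[\u0621-\u064A]")
TATWEEL = "\u0640"  # elongation character, common in informal Arabic text


def clean_complaint(text: str) -> str:
    """Normalize whitespace and strip tatweel; reject empty or non-Arabic input."""
    text = unicodedata.normalize("NFC", text).replace(TATWEEL, "")
    text = re.sub(r"\s+", " ", text).strip()
    if not text:
        raise ValueError("complaint text is empty")
    if len(text) > MAX_CHARS:
        raise ValueError("complaint text is too long")
    if not ARABIC_LETTER.search(text):
        raise ValueError("complaint text contains no Arabic letters")
    return text
```

Rejecting bad input before inference keeps the classifiers from producing confident-looking labels for text they were never meant to handle.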

STEP 02

🧠

3× MARBERTv2

Sentiment, topic, and intent run in sequence — each returns a label + confidence

Fine-tuned
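
A label plus confidence is commonly derived from a classifier's raw logits with a softmax. The sketch below assumes that convention, which the page does not confirm. The logit values and the index-to-label order are invented for illustration.

```python
import math

SENTIMENT_LABELS = ["NEG", "NEU", "POS"]  # label order is an assumption


def label_and_confidence(logits, labels):
    """Softmax the logits and return the top label with its probability."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs[best]


# Made-up logits standing in for one model's output on one complaint.
label, confidence = label_and_confidence([2.0, 0.5, -1.0], SENTIMENT_LABELS)
print(label, round(confidence, 3))
```

Running the same helper once per classifier gives three (label, confidence) pairs, which is the shape of input the later steps work with.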

STEP 03

⚙️

Rule Engine

Deterministic logic maps the 3 labels to one final action

Deterministic
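
One way to express such a mapping is an ordered rule table where the first matching pattern wins. The input labels below are the ones listed on the page, but every rule and every action name is hypothetical, since the actual rule set is not shown.

```python
# (intent, topic, sentiment) patterns; None matches any label. First match wins.
# All action names here are invented for illustration.
RULES = [
    (("REPORT_BUG", "TECHNICAL", None), "OPEN_ENGINEERING_TICKET"),
    (("REPORT_BUG", None, None), "ESCALATE_TO_SUPPORT"),
    (("USER_REQUEST", "FINANCIAL", None), "ROUTE_TO_BILLING"),
    (("USER_REQUEST", None, None), "ROUTE_TO_CONTENT_TEAM"),
    (("GENERAL_NOTE", None, "NEG"), "FLAG_FOR_FOLLOW_UP"),
]
DEFAULT_ACTION = "LOG_NOTE"


def route(sentiment: str, topic: str, intent: str) -> str:
    """Map the three predicted labels to exactly one action."""
    observed = (intent, topic, sentiment)
    for pattern, action in RULES:
        if all(want is None or want == got for want, got in zip(pattern, observed)):
            return action
    return DEFAULT_ACTION


print(route("NEG", "TECHNICAL", "REPORT_BUG"))
```

Because the table is plain data and the lookup has no randomness, the same three labels always produce the same action, which is what makes the routing auditable.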

STEP 04

🛡️

Confidence Guard

Any model below threshold → MANUAL_REVIEW instead of silent failure

Threshold guard
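
The guard amounts to a single check over the three confidences. In this sketch the threshold value of 0.6 and the function name `guard` are assumptions. The page states only that any model below the threshold triggers MANUAL_REVIEW.

```python
THRESHOLD = 0.6  # hypothetical; the page does not state the value


def guard(action: str, confidences: dict[str, float], threshold: float = THRESHOLD) -> str:
    """Return MANUAL_REVIEW if any classifier is below the threshold, else the action."""
    if any(score < threshold for score in confidences.values()):
        return "MANUAL_REVIEW"
    return action


# One weak prediction is enough to send the complaint to a human.
print(guard("ROUTE_TO_BILLING", {"sentiment": 0.91, "topic": 0.48, "intent": 0.97}))
```

Using `any` rather than an average means a single uncertain classifier cannot be masked by two confident ones.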

STEP 05

✅

Action + Explanation

Routed action returned with optional LLM explanation in JSON

FastAPI
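
The idea that the explanation is attached to the decision without being able to change it can be sketched as follows. The field names, the function `build_response`, and the `explain` callback are hypothetical, and the real API schema is not shown on the page. A FastAPI endpoint would normally return the dictionary and let the framework serialize it.

```python
import json
from typing import Callable, Optional


def build_response(labels: dict, action: str,
                   explain: Optional[Callable[[dict, str], str]] = None) -> str:
    """Serialize the routed action; the explanation is attached but cannot alter it."""
    payload = {"labels": labels, "action": action, "explanation": None}
    if explain is not None:
        # The explainer receives a copy of the labels and only contributes text.
        payload["explanation"] = explain(dict(labels), action)
    return json.dumps(payload, ensure_ascii=False)


labels = {"sentiment": "NEG", "topic": "TECHNICAL", "intent": "REPORT_BUG"}
print(build_response(labels, "MANUAL_REVIEW"))
```

Since the action is fixed before the explainer runs, turning the explanation on or off never changes how a complaint is routed.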

04 — Try it

Live demo

Type any Arabic complaint — or pick one of the examples below — and the pipeline classifies it in real time. Classification is always free and deterministic. One free AI explanation per visitor (optional).
