We’re working with historians digitizing Arabic manuscript records. The transcription data contains multiple formatting inconsistencies, and we need a cleaned dataset to support further linguistic analysis.
Requirements:
Familiarity with handling historical or textual data
Experience working with non-Latin scripts is a plus
Strong Excel and text manipulation skills
To Apply:
Send a sample of past work with textual or cultural datasets, especially with diverse languages or historical formats.
Job Stages
Project Deliverables
Tasks Required:
Normalize manuscript IDs
Standardize language labels
Clean and align the Arabic text field (no merged or partial entries)
Format manuscript dates consistently and keep both systems (Gregorian + Hijri if present)
Flag incomplete rows
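As a rough illustration of the ID, language-label, Arabic-text, and flagging tasks above, here is a minimal Python sketch. The column names (`manuscript_id`, `language`, `arabic_text`), the `MS-0001` ID scheme, and the label variants are all hypothetical, since the actual sheet layout is not described here. NFKC normalization and tatweel removal are assumptions as well and should be confirmed with the historians before any original spelling is altered.

```python
import re
import unicodedata

# Hypothetical column names; the real sheet layout is not specified.
REQUIRED_FIELDS = ("manuscript_id", "language", "arabic_text")

# Hypothetical label variants mapped to one canonical label.
LANGUAGE_LABELS = {"ar": "Arabic", "ara": "Arabic", "arabic": "Arabic", "عربي": "Arabic"}

TATWEEL = "\u0640"  # Arabic elongation mark, used only for visual stretching


def normalize_id(raw):
    """Assumed ID scheme: 'MS-' plus a zero-padded four-digit number."""
    digits = re.sub(r"\D", "", raw or "")
    return f"MS-{int(digits):04d}" if digits else ""


def clean_arabic(text):
    """Unify presentation forms (NFKC), drop tatweel, collapse whitespace."""
    text = unicodedata.normalize("NFKC", text or "")
    text = text.replace(TATWEEL, "")
    return " ".join(text.split())


def clean_row(row):
    """Return a cleaned copy of one row plus an 'incomplete' flag and a note."""
    label_key = (row.get("language") or "").strip().lower()
    cleaned = {
        "manuscript_id": normalize_id(row.get("manuscript_id")),
        # Unrecognized labels become empty so the row is flagged for review.
        "language": LANGUAGE_LABELS.get(label_key, ""),
        "arabic_text": clean_arabic(row.get("arabic_text")),
    }
    missing = [name for name in REQUIRED_FIELDS if not cleaned[name]]
    cleaned["incomplete"] = bool(missing)
    cleaned["notes"] = (
        "missing or unrecognized: " + ", ".join(missing) if missing else ""
    )
    return cleaned
```

Rows with `incomplete` set to `True` could then be copied, together with their `notes` text, to the separate notes tab. The sketch leaves diacritics and dates untouched on purpose, because both may carry information needed for later linguistic analysis.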
Deliverables:
Clean Excel file with consistent fields
Tab for notes on flagged/incomplete rows