Arabic Election Results Table Extraction from Mednine PDF
This PDF contains a table of election results from the Mednine region of Tunisia. The text is in Arabic script, and extraction is complicated by spanning header cells and rotated headers.
# Remove spaces from numbers and convert to int
numeric_cols = df.columns[0:4]
df[numeric_cols] = df[numeric_cols].replace(r"\s+", "", regex=True).astype(int)
df
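The cleanup step above can be tried on its own with a small stand-in table. This is only a sketch: the column names and values below are invented for illustration and are not taken from the Mednine PDF, and it assumes, as the snippet does, that the first four columns hold the numeric data.

```python
import pandas as pd

# Hypothetical stand-in for the extracted table; names and values are made up.
df = pd.DataFrame({
    "votes_a": ["1 234", "567"],
    "votes_b": ["12 345", "8 901"],
    "votes_c": ["0", "1 000 000"],
    "votes_d": ["42", "3\u00a0210"],  # non-breaking space as thousands separator
    "district": ["A", "B"],
})

# Assumption carried over from the snippet: the first four columns are numeric.
numeric_cols = df.columns[0:4]

# Strip every run of whitespace inside the cells, then convert to integers.
cleaned = df[numeric_cols].replace(r"\s+", "", regex=True)
df[numeric_cols] = cleaned.astype(int)

print(df)
```

Note that `astype(int)` raises a `ValueError` if any cell is empty or still contains non-digit characters after the whitespace is removed, so a failure here usually points at a cell that was extracted incorrectly.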