ICE Detention Facilities Compliance Report Extraction
This PDF is an ICE report on compliance among detention facilities over the last 20 to 30 years. Our aim is to extract facility statuses and contract signatories' names and dates. Challenges include strange redactions, blobby text, poor contrast, and ineffective OCR. It also contains handwritten signatures and dates that have been redacted.
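from natural_pdf import Judge
# Create a judge that sorts checkbox regions into two labels
judge = Judge("checkboxes", labels=["checked", "unchecked"])
# Reset the judge's stored examples before adding new ones
judge.forget(delete=True)
# Teach it with labeled examples; region1 and region2 stand for checkbox regions already selected from the PDF
judge.add(region1, "checked")
judge.add(region2, "unchecked")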
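To give a feel for what teaching a classifier with a few labeled examples involves, here is a minimal, self-contained sketch. It is not how natural_pdf implements Judge. The names `ToyCheckboxJudge` and `dark_fraction` are hypothetical, and the approach (compare the share of dark pixels to the average of each label's examples) is an assumption chosen for illustration only.

```python
from statistics import mean


def dark_fraction(pixels, cutoff=128):
    """Share of pixels darker than `cutoff` in a grayscale grid (0 = black, 255 = white)."""
    flat = [value for row in pixels for value in row]
    return sum(value < cutoff for value in flat) / len(flat)


class ToyCheckboxJudge:
    """Hypothetical nearest-mean classifier; an illustration, not the library's Judge."""

    def __init__(self, labels=("checked", "unchecked")):
        # One list of dark-pixel scores per label
        self.examples = {label: [] for label in labels}

    def add(self, pixels, label):
        # Store the example's dark-pixel share under its label
        self.examples[label].append(dark_fraction(pixels))

    def decide(self, pixels):
        # Every label needs at least one example before a decision is possible
        if any(not scores for scores in self.examples.values()):
            raise ValueError("add at least one example per label first")
        score = dark_fraction(pixels)
        # Pick the label whose average example score is closest
        return min(self.examples, key=lambda label: abs(score - mean(self.examples[label])))


# Tiny 4x4 grayscale grids standing in for cropped checkbox images
empty = [[255] * 4 for _ in range(4)]
marked = [
    [0, 255, 255, 0],
    [255, 0, 0, 255],
    [255, 0, 0, 255],
    [0, 255, 255, 0],
]

toy = ToyCheckboxJudge()
toy.add(marked, "checked")
toy.add(empty, "unchecked")

# A box with a single stray dark pixel, and one with a heavy mark
speck = [[255] * 4, [255, 0, 255, 255], [255] * 4, [255] * 4]
heavy = [[0, 0, 255, 255], [0, 0, 255, 255], [0, 0, 255, 255], [255] * 4]
print(toy.decide(speck), toy.decide(heavy))
```

A real scanned page with poor contrast and blobby text would need far more robust features than a single pixel count, which is the reason to rely on a trained judge rather than a fixed rule.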