trim

1 usage across 1 PDF

ICE Detention Facilities Compliance Report Extraction

This PDF is an ICE report on compliance among detention facilities over the last 20-30 years. Our aim is to extract each facility's status, along with the names of contract signatories and the dates they signed. Challenges include strange redactions, blobby text, poor contrast, and ineffective OCR. The document also contains handwritten signatures and dates that are redacted.

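with left_col.within() as col:
    # Locate the text closest to "Previous Rating", then take the area below it
    # up to the next text element
    label = left_col.find("text:closest(Previous Rating)")
    answer = label.below(until='text')
# Pad the answer area by 5 units, then trim it back to its visible content
checkbox_region = answer.expand(5).trim()
# Render only the cropped region for a visual check
checkbox_region.show(crop=True)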
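To make the expand-then-trim idea concrete, here is a minimal, library-independent sketch. It is not the library's implementation: the helper names `expand_bbox` and `trim_bbox`, the grayscale pixel grid, and the darkness threshold of 250 are all assumptions made for illustration. It pads a rough bounding box and then shrinks it to the tightest box that still contains dark pixels.

```python
from typing import List, Optional, Tuple

# (x0, top, x1, bottom); x1 and bottom are exclusive. Hypothetical convention.
BBox = Tuple[int, int, int, int]


def expand_bbox(bbox: BBox, amount: int, width: int, height: int) -> BBox:
    """Pad a box by `amount` on every side, clamped to the page bounds."""
    x0, top, x1, bottom = bbox
    return (
        max(0, x0 - amount),
        max(0, top - amount),
        min(width, x1 + amount),
        min(height, bottom + amount),
    )


def trim_bbox(
    pixels: List[List[int]], bbox: BBox, threshold: int = 250
) -> Optional[BBox]:
    """Shrink a box to the tightest box around pixels darker than `threshold`.

    Returns None when the box contains no dark pixels at all.
    """
    x0, top, x1, bottom = bbox
    # Rows and columns inside the box that hold at least one dark pixel.
    rows = [
        y for y in range(top, bottom)
        if any(pixels[y][x] < threshold for x in range(x0, x1))
    ]
    cols = [
        x for x in range(x0, x1)
        if any(pixels[y][x] < threshold for y in range(top, bottom))
    ]
    if not rows or not cols:
        return None
    return (cols[0], rows[0], cols[-1] + 1, rows[-1] + 1)


# An 8x8 white "page" (255) with a dark 2x2 mark standing in for a checkbox.
page = [[255] * 8 for _ in range(8)]
for y in (3, 4):
    for x in (2, 3):
        page[y][x] = 0

rough = (1, 2, 5, 6)                     # loose region around the mark
padded = expand_bbox(rough, 1, 8, 8)     # pad first so no edge is clipped
tight = trim_bbox(page, padded)          # then shrink to the visible content
print(tight)
```

Padding before trimming is the useful part of the pattern: a region derived from nearby text can clip the edge of a checkbox, so growing it slightly and then trimming recovers the full mark without keeping the extra whitespace.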