Extracting Economic Data from Brazil's Central Bank PDF
This PDF is the weekly “Focus” report from Brazil’s central bank, containing economic projections and statistics. Challenges include decimal commas in place of decimal points, projection changes that are shown as images rather than text, and borderless tables whose columns merge during extraction.
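.extract_table('stream')      # 'stream' mode, used here because the tables have no border lines
.to_df(header=False)
.dropna(axis=0, how='all')    # drop rows that are entirely empty
.assign(
    # raw string, so that \d reaches the regex engine as written
    year=section.find(r'text[size~=10]:regex(\d\d\d\d)').extract_text(),
    value=headers
)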
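The decimal-comma challenge mentioned above is typically handled after extraction, once the cells are plain strings. The following is a minimal sketch in standard-library Python; `parse_br_number` is a hypothetical helper, not part of the original example or of any library, and it assumes every cell uses Brazilian formatting (dot for thousands, comma for decimals).

```python
from typing import Optional


def parse_br_number(text: str) -> Optional[float]:
    """Convert a Brazilian-formatted number such as '1.234,56' to a float.

    Hypothetical helper for illustration. Assumes dots are thousands
    separators and the comma is the decimal separator.
    """
    cleaned = text.strip().replace("%", "")
    if not cleaned:
        return None
    # Remove thousands separators first, then turn the decimal comma into a point.
    normalized = cleaned.replace(".", "").replace(",", ".")
    try:
        return float(normalized)
    except ValueError:
        # Non-numeric cell (for example a label or a merged column).
        return None


if __name__ == "__main__":
    for cell in ["4,25", "1.234,56", "-0,5", "", "IPCA"]:
        print(repr(cell), "->", parse_br_number(cell))
```

Applied to a DataFrame column, this could be used as `df[col].map(parse_br_number)`. Note that a cell already written with a decimal point, such as `4.25`, would be misread as 425 under this assumption, so the helper should only be applied to columns known to use decimal commas.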