Geocoding Brazilian data with {geocodebr}

Fast, open-source geocoding for Brazil

I’m super glad to share {geocodebr}, our new R package for geocoding Brazilian data.

Two key contributions

  1. Truly open and public-sector ready: {geocodebr} is the first fully free and open-source geocoder built entirely on official Brazilian address data. Because all source data and code are public, results can be audited, reproduced, and improved by anyone—an essential feature for government workflows and academic research.

  2. Blazing performance: The package is written in R, with high-performance data streaming through Arrow and DuckDB backends. In our benchmarks, we geocoded the entire Cadastro Único (CadÚnico) register, over 43 million addresses, in about 65 minutes, or roughly 11,000 addresses per second. That is orders of magnitude faster than traditional approaches based on Google Maps or ArcGIS, and it comes with zero per-request fees.

Why another geocoder when one can use Google Maps, ArcGIS, or Nominatim? Brazil already has a rich, official address register (CNEFE), but there was no free, programmatic tool that could leverage it at scale. Existing commercial APIs are costly, impose usage limits, and often lack transparency about how results are produced. We needed something better for our research at Ipea’s Access to Opportunities Lab (AOP-Lab), where we routinely process tens of millions of Brazilian administrative records.

An invitation to the community

Our initial motivation was to accelerate our own research: geocoding Brazil’s administrative records so we can study how location shapes access to jobs, schools, and public services. By releasing {geocodebr} on CRAN, we hope to empower other government institutions, researchers, and civic-tech practitioners, who will be able to use the package to address (pun intended) many other questions that require geocoded information.

The package is live on CRAN, the source is on GitHub, and extensive examples are in the vignette. Give it a spin, file issues, and let us know what you build. Hopefully, we can help raise the bar for spatial data quality across Brazil.

Happy geocoding!
