When I joined Mechanismic as a data intern, the team spent 6 hours every Monday manually pulling data from 500+ sources. I built a Python pipeline that cut that weekly pull to 12 minutes. Here's the architecture, the gotchas, and the code that made it happen.