- A leading management consulting company based in the USA wanted to collect US county-level legal case data on a monthly basis.
- The firm had been manually mining data from multiple court websites and was looking for an automated, optimized solution.
- Mining a single case required visiting multiple web pages and tabs: approximately nine tabs per case, each with its own complexities and formats.
- The expected output was a ready-to-upload Excel file that could feed the customer's in-house database.
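The "ready-to-upload" deliverable described above could be sketched as follows. This is a minimal illustration, not the team's actual code: the template columns and function names are assumptions, and the stdlib `csv` module stands in for the Excel output (in practice `pandas.DataFrame.to_excel` would produce an `.xlsx` file directly).

```python
import csv

# Hypothetical template columns (assumption: the customer's real template
# is not shown in the case study)
TEMPLATE_COLUMNS = ["case_number", "county", "filing_date", "case_type", "status"]

def normalize_case(raw):
    """Map a scraped case record onto the template columns, blanking missing fields."""
    return {col: raw.get(col, "") for col in TEMPLATE_COLUMNS}

def write_upload_file(cases, path):
    """Write normalized cases to a delimited file ready for database upload.

    The case study's deliverable was an Excel file; CSV is used here only as a
    dependency-free stand-in.
    """
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=TEMPLATE_COLUMNS)
        writer.writeheader()
        for raw in cases:
            writer.writerow(normalize_case(raw))
```

Normalizing every scraped record onto one fixed column set is what makes the output "ready to upload": the database loader can rely on a stable schema regardless of how each court site formats its pages.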
Together, process automation and web data mining substantially reduced time and cost.
APPROACH TO SOLUTION
- The AIMLEAP – Outsource Big Data team built custom bots for web data scraping from multiple sources
- The web sources served case data only to US IP addresses, so we deployed US-based IPs to access the data
- We scheduled a monthly scraping process to scrape new cases added to the court websites
- The custom-built web crawler mined the legal case data in a format close to the customer's template, substantially reducing manual intervention
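The US-IP access and monthly incremental scraping steps above could look like the following minimal sketch. The proxy endpoint and function names are illustrative assumptions; the case study does not name the actual proxy service or tooling used.

```python
import urllib.request

# Hypothetical US-based proxy endpoint (assumption: the real provider
# is not named in the case study)
US_PROXY = "http://us-proxy.example.com:8080"

def build_us_opener(proxy_url=US_PROXY):
    """Route requests through a US IP, since the court sites only serve
    case data to US addresses."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)

def new_cases(previously_seen, scraped_ids):
    """Monthly incremental run: keep only case IDs not captured in earlier runs."""
    return sorted(set(scraped_ids) - set(previously_seen))
```

Tracking previously seen case IDs between runs is what turns a full crawl into a monthly incremental one: each scheduled run fetches only the cases added since the last pass.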
Consistently above 98% data quality
Approximately 75% cost savings