Process automation and web data mining together substantially reduced time and cost
Client Background
- A leading management consulting company based in the USA wanted to collect US county-level legal case data on a monthly basis.
- It was manually mining data from multiple court websites and was looking for an optimized solution.
- To mine a single case, the customer had to visit multiple web pages and tabs: approximately nine tabs per case, each with its own complexities and formats.
- The expected output was a ready-to-upload Excel file that supports building their in-house database.
Approach To Solution
- The AIMLEAP – Outsource Big Data team built custom bots to scrape web data from multiple sources.
- The web source served case data only to US IP (Internet Protocol) addresses, so we deployed US IP addresses to access the data.
- We scheduled a monthly scraping process to scrape new cases added to the court websites.
- The custom-built web crawler mined the legal case data in a format very close to the template provided by the customer, which substantially reduced manual intervention.
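As a rough illustration of the last two steps, the sketch below merges the fields scraped from each tab of a case into one template row, keeps only cases not collected in an earlier monthly run, and writes the result to a file. It is a minimal sketch, not the bots described above: the column names, the tab names, and the helper functions are all hypothetical, since the customer's template is not shown here, and it writes CSV (which Excel opens directly) rather than a native Excel file so that it needs only the Python standard library.

```python
import csv
import io

# Hypothetical template columns; the customer's real template is not given here.
TEMPLATE_COLUMNS = ["case_number", "county", "filing_date", "case_type", "status"]


def flatten_case(tabs):
    """Merge the per-tab field dictionaries for one case into a single template row."""
    merged = {}
    for tab_fields in tabs.values():
        merged.update(tab_fields)
    # Keep only template columns; missing values stay blank for manual review.
    return {col: merged.get(col, "") for col in TEMPLATE_COLUMNS}


def select_new_cases(rows, seen_case_numbers):
    """Return rows whose case number was not collected in an earlier monthly run."""
    return [row for row in rows if row["case_number"] not in seen_case_numbers]


def write_template_csv(rows, handle):
    """Write rows to an open text handle in template column order."""
    writer = csv.DictWriter(handle, fieldnames=TEMPLATE_COLUMNS)
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    # Made-up sample data standing in for two of the roughly nine tabs per case.
    scraped_tabs = {
        "summary": {"case_number": "A-1", "county": "Example County"},
        "status": {"status": "Open"},
    }
    rows = select_new_cases([flatten_case(scraped_tabs)], seen_case_numbers=set())
    buffer = io.StringIO()
    write_template_csv(rows, buffer)
    print(buffer.getvalue())
```

Leaving unmatched template columns blank, rather than guessing values, keeps any remaining manual intervention limited to the cells the crawler could not fill.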
Above 98% data quality consistently
Approximately 75% cost savings
Quick turnaround