Automated pipeline that standardizes, deduplicates, and validates 247 customer records in seconds.
| Date | Name | Phone | Email | State | Revenue |
|---|---|---|---|---|---|
| 3/15/2024 | john smith | 5551234567 | john@ | California | $4,200 |
| March 15, 2024 | JANE DOE | (555) 123 4567 | jane.doe@gmail | calif. | N/A |
| 2024-03-15 | Robert jones | 555.123.4567 | [email protected] | CA | $8,750 |
| 15-Mar-24 | sarah WILLIAMS | 555-123-4567 | not-an-email | Ca | $3,100 |
| 4/2/2024 | Michael Brown | 5559876543 | [email protected] | New York | $6,400 |
| 4/2/2024 (DUP) | Michael Brown | 5559876543 | [email protected] | New York | $6,400 |
| 05/20/2024 | LISA CHEN | - | [email protected] | texas | $12,300 |
| null | David Kim | (555) 222-3344 | [email protected] | Florida | |
| 05/20/2024 (DUP) | LISA CHEN | - | [email protected] | texas | $12,300 |
| June 1 2024 | AMY GARCIA | 555 444 5566 | amy.garcia@ | IL | $5,900 |

| Date | Name | Phone | Email | State | Revenue |
|---|---|---|---|---|---|
| 2024-03-15 | John Smith | (555) 123-4567 | [email protected] | CA | $4,200 |
| 2024-03-15 | Jane Doe | (555) 123-4567 | [email protected] | CA | $5,680 |
| 2024-03-15 | Robert Jones | (555) 123-4567 | [email protected] | CA | $8,750 |
| 2024-03-15 | Sarah Williams | (555) 123-4567 | [email protected] | CA | $3,100 |
| 2024-04-02 | Michael Brown | (555) 987-6543 | [email protected] | NY | $6,400 |
| 2024-05-20 | Lisa Chen | (555) 333-7788 | [email protected] | TX | $12,300 |
| 2024-05-28 | David Kim | (555) 222-3344 | [email protected] | FL | $7,250 |
| 2024-06-01 | Amy Garcia | (555) 444-5566 | [email protected] | IL | $5,900 |
Seven automated operations applied to the raw dataset:
Parsed 4 inconsistent date formats and normalized all entries to ISO 8601.
Converted ALL CAPS, lowercase, and mixed-case names to consistent Title Case.
Identified and removed exact-match duplicate rows for Michael Brown and Lisa Chen.
Resolved N/A, null, empty, and placeholder entries via CRM lookup and median imputation.
Reformatted 5 non-standard phone numbers to the (XXX) XXX-XXXX pattern.
Flagged 3 invalid emails (missing domain/TLD) and resolved via customer lookup.
Mapped full names, partial abbreviations, and casing variants to 2-letter USPS codes.
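
The operations above could be sketched in plain Python roughly as follows. This is an illustrative sketch, not the pipeline's actual implementation: every function name, the list of date formats, the partial state mapping, and the email pattern are assumptions chosen to match the sample rows, and the CRM lookup step is omitted because it depends on an external system.

```python
import re
from datetime import datetime
from statistics import median

# Assumed formats, inferred from the sample rows; a real dataset may need more.
DATE_FORMATS = ["%m/%d/%Y", "%B %d, %Y", "%Y-%m-%d", "%d-%b-%y", "%B %d %Y"]

# Partial, hypothetical mapping; a full version would cover all USPS codes.
STATE_CODES = {
    "california": "CA", "calif": "CA", "ca": "CA",
    "new york": "NY", "ny": "NY",
    "texas": "TX", "tx": "TX",
    "florida": "FL", "fl": "FL",
    "illinois": "IL", "il": "IL",
}

# Deliberately simple check: something@domain.tld (not full RFC validation).
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[A-Za-z]{2,}$")


def normalize_date(raw):
    """Try each known format; return an ISO 8601 date string or None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def normalize_name(raw):
    """Title-case each word (naive: mishandles names like 'McDonald')."""
    return " ".join(part.capitalize() for part in raw.split())


def normalize_phone(raw):
    """Format a 10-digit number as (XXX) XXX-XXXX; return None otherwise."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) != 10:
        return None
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"


def normalize_state(raw):
    """Map a full name, abbreviation, or casing variant to a 2-letter code."""
    return STATE_CODES.get(raw.strip().lower().rstrip("."))


def is_valid_email(raw):
    """Flag addresses that lack a domain or a top-level domain."""
    return bool(EMAIL_RE.match(raw.strip()))


def parse_revenue(raw):
    """Turn '$4,200' into 4200.0; return None for 'N/A', empty, or null."""
    cleaned = re.sub(r"[$,\s]", "", raw or "")
    try:
        return float(cleaned)
    except ValueError:
        return None


def impute_median(values):
    """Replace None entries with the median of the known values."""
    known = [v for v in values if v is not None]
    fill = median(known) if known else None
    return [fill if v is None else v for v in values]


def dedupe(rows):
    """Drop exact-match duplicate rows (dicts), keeping the first occurrence."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```

For example, `normalize_date("15-Mar-24")` returns `"2024-03-15"` and `normalize_phone("555.123.4567")` returns `"(555) 123-4567"`. Running `dedupe` after the normalization steps, rather than before, also catches rows that differ only in formatting.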