Data Engineer IV

Metro Systems Inc

Job Overview

MSI is working with one of the world's largest, fastest-growing, and best companies to hire top-tier individuals to join their team. The Data Engineer will work as part of a dynamic team and will play a critical role at a best-in-class, Fortune Top 10 company.

Overview

- Validate that all Datanet jobs are operating optimally, follow naming/code standards, are associated with the correct user accounts, and are checked into source control (2 weeks)
- Consolidate the G2GBIS code to remove duplication, splitting it into individual jobs that run in cascading fashion to make troubleshooting, restarting, and partial runs easier (2 weeks)
- Automate the Brady G2GBIS code to consume the S3 CSV file (uploaded manually) instead of requiring manual ad-hoc runs on demand (1 week); see the first sketch after the skills lists
- Clean and consolidate the "holding" section of Datanet: catalog old jobs that have been deprecated, verify they are in source control, confirm documentation of their configuration, and describe their last known functionality (1 week)
- Clean up/remove unused objects in the Redshift cluster; move backup data into backup schemas, leaving only production data in the main schemas (1 week)
- Clean up groups/LDAP/Hammerstone/privileges/etc. to automate as much as possible; remove redundant/unused items and users and document their current use (1 week)
- Create Tableau visualizations to make security audits easier: show all users who have access to confidential and diversity data, whether they are using it, and whether they are still in the org (2 weeks); see the second sketch after the skills lists
- Keep up with the Phoenix V2 beta changes; consume their beta datasets as quickly as possible to determine whether the rollout will cause any issues so they can be addressed promptly, and integrate the new V2 when it goes to production (12 weeks)
- Clean up our core datasets in Tableau to minimize duplicate columns and standardize all column names/types (12 weeks)
- Clean up our core Redshift objects to minimize duplicate columns and standardize all column names/types, removing any out-of-date or duplicate objects or replacing them with a view to the new sources (12 weeks)
- Clean up our code base to use a standardized format and make like code easier to compare; many copy/pasted sections of code in differing formatting styles make it difficult to spot similarities, bugs, and duplication (4 weeks)
- Identify duplicated code in Datanet jobs that could be removed because it is unnecessary, already done elsewhere, or inconsistent; use this to document logic that should be moved into earlier pre-processing and to separate business logic from display logic (12 weeks)
- Identify logical inconsistencies in our code base (e.g., requiring that an mcchcmap row exist vs. a direct join on key columns, or using point-in-time values vs. current values) and document the issues along with other areas that share the same or similar logic (12 weeks); see the third sketch after the skills lists

Required Skills

- Experience with complex SQL/ETL
- Experience with (technical) requirements gathering
- Experience with data transformation pipelining
- Experience with big data / large data warehouses
- Experience optimizing the performance of business-critical queries
- Experience joining and cleaning data

Preferred Skills

- Tableau
- Domain experience in the HR/recruiting business and/or with clients in those segments
- Experience working with data scientists, BIEs, and business analysts to design and develop data infrastructure
- Experience with AWS technologies including Redshift, RDS, S3, or similar solutions
- Strong analytical skills
- Strong communication (written and verbal)
- Experience working with remote co-workers
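First sketch: for the S3 automation item above, one common pattern is a small job that polls the upload prefix and issues a Redshift COPY, replacing the manual ad-hoc runs. This is a rough illustration only; the bucket, prefix, table, IAM role, and connection details are invented placeholders, not details from this posting.

```python
"""Hypothetical poll-and-COPY sketch: find the newest manually uploaded
CSV under an S3 prefix and COPY it into a Redshift staging table."""
import boto3
import psycopg2

BUCKET = "example-g2gbis-bucket"   # assumed bucket name
PREFIX = "incoming/"               # assumed upload prefix
IAM_ROLE = "arn:aws:iam::123456789012:role/example-redshift-copy"  # assumed

def latest_csv_key(s3):
    """Return the most recently modified CSV under the prefix, if any."""
    resp = s3.list_objects_v2(Bucket=BUCKET, Prefix=PREFIX)
    csvs = [o for o in resp.get("Contents", []) if o["Key"].endswith(".csv")]
    if not csvs:
        return None
    return max(csvs, key=lambda o: o["LastModified"])["Key"]

def load_into_redshift(key):
    """COPY the CSV into a staging table; Redshift reads directly from S3,
    so no data is streamed through this process."""
    conn = psycopg2.connect(host="example-cluster.redshift.amazonaws.com",
                            port=5439, dbname="analytics",
                            user="loader", password="...")  # assumed DSN
    with conn, conn.cursor() as cur:
        # The key comes from our own listing above, not user input.
        cur.execute(
            f"COPY staging.g2gbis_upload "
            f"FROM 's3://{BUCKET}/{key}' "
            f"IAM_ROLE '{IAM_ROLE}' "
            f"FORMAT CSV IGNOREHEADER 1;"
        )
    conn.close()

if __name__ == "__main__":
    s3 = boto3.client("s3")
    key = latest_csv_key(s3)
    if key:
        load_into_redshift(key)
```

In practice this would run on a schedule (or be wired to an S3 event notification) and track which keys have already been loaded.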
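Second sketch: for the security-audit visualizations item, the Tableau views would typically sit on top of a query against Redshift's catalog. The minimal, hypothetical starting point below (the schema and table name are invented) only covers who can read a sensitive table; usage and org membership would come from joining query logs and HR data.

```python
"""Hypothetical source query for security-audit Tableau views: list which
database users can SELECT a sensitive table."""

AUDIT_QUERY = """
SELECT u.usename AS user_name,
       has_table_privilege(u.usename, 'confidential.diversity_data',
                           'select') AS can_select
FROM pg_user u
ORDER BY can_select DESC, user_name;
"""
```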
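Third sketch: for the point-in-time vs. current-values inconsistency item, the two join styles below show why mixing them across jobs produces conflicting numbers. Table and column names are invented, assuming a type-2 slowly changing dimension.

```python
"""Illustration of the point-in-time vs. current-values inconsistency,
using invented names (employee_dim as a type-2 dimension, headcount_fact
as the fact table)."""

# Point-in-time join: pick the dimension row that was effective on the
# fact's snapshot date, so historical reports show what was true then.
POINT_IN_TIME = """
SELECT f.snapshot_date, d.org_name, COUNT(*) AS headcount
FROM headcount_fact f
JOIN employee_dim d
  ON d.employee_id = f.employee_id
 AND f.snapshot_date BETWEEN d.effective_from AND d.effective_to
GROUP BY 1, 2;
"""

# Current-values join: always pick the latest dimension row, restating
# history under today's org structure. Jobs that silently mix this style
# with the one above are the inconsistency the task asks to document.
CURRENT_VALUES = """
SELECT f.snapshot_date, d.org_name, COUNT(*) AS headcount
FROM headcount_fact f
JOIN employee_dim d
  ON d.employee_id = f.employee_id
 AND d.is_current
GROUP BY 1, 2;
"""
```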
