Parse unstructured and semi-structured data such as XML.
Design and develop efficient mappings and workflows to load data into data marts.
Map XML DTD schemas to customized table definitions in Python.
Write efficient queries and reports in Hive or Impala to extract data on an ad hoc basis for data analysis.
Identify performance bottlenecks in ETL jobs and tune their performance by enhancing or redesigning them.
Responsible for performance tuning of ETL mappings and queries.
Import tables and all necessary lookup tables to facilitate the ETL process required to process daily XML files, in addition to processing the very large (multi-terabyte) historical XML data files.
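
To make the XML requirements above more concrete, here is a minimal sketch of one way to stream records out of a large XML file in Python and map them to table columns. It is an illustration only, not the client's actual schema: the `record` tag, the `COLUMN_MAP` dictionary, and the `iter_rows` helper are all hypothetical names, and the sketch assumes each record is a direct child of the root element.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical mapping from XML child tags to table column names.
COLUMN_MAP = {"id": "record_id", "name": "record_name", "amount": "amount"}


def iter_rows(source, record_tag="record", column_map=COLUMN_MAP):
    """Yield one dict per record element without loading the whole file.

    Assumes every record element is a direct child of the root element.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            # Missing child tags come back as None.
            yield {col: elem.findtext(tag) for tag, col in column_map.items()}
            root.clear()  # release already-processed records to bound memory


# Small in-memory sample standing in for a daily XML file.
sample = io.BytesIO(
    b"<records>"
    b"<record><id>1</id><name>a</name><amount>9.5</amount></record>"
    b"<record><id>2</id><name>b</name></record>"
    b"</records>"
)
rows = list(iter_rows(sample))
```

Because the parser is consumed incrementally and processed records are cleared, memory use stays roughly constant regardless of file size, which is the property that matters for the multi-terabyte historical files. The resulting dictionaries could then be batched into whatever loader the ETL tool provides.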
14 freelancers are bidding an average of $546 for this job
Hi - This job looks like a good fit for my skill set and experience. I hold a Bachelor of Computer Science and a Master of Data Science. Please see my profile and reviews for references.