For projects involving customer data analysis, how do you balance data transformation steps (such as aggregation, enrichment, and deduplication) so that you maintain both data quality and model performance, especially when using tools like Alteryx to prepare the data before feeding it into machine learning models?
