The Tracker Harmonizer merges multiple waves of survey data into a single dataset, ensuring that variable labels and value codes align across waves.
1. Schema Alignment
We compare the column headers and value maps (code-to-label mappings, e.g., 1=Male, 2=Female) across all uploaded files.
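As a rough illustration of this comparison, the sketch below assumes each file has already been parsed into a dictionary mapping column names to value maps. The `Schema` type and the `compare_schemas` function are hypothetical names introduced here, not part of the tool itself.

```python
from typing import Dict, List

# Hypothetical representation: column name -> value map (code -> label).
Schema = Dict[str, Dict[int, str]]


def compare_schemas(schemas: Dict[str, Schema]) -> Dict[str, List[str]]:
    """Report columns absent from some files and columns whose value maps differ."""
    all_columns = set()
    for schema in schemas.values():
        all_columns.update(schema)

    missing: List[str] = []
    conflicting: List[str] = []
    for column in sorted(all_columns):
        # Files that actually contain this column.
        present = [name for name, schema in schemas.items() if column in schema]
        if len(present) < len(schemas):
            missing.append(column)
        # Compare every value map for this column against the first one found.
        value_maps = [schemas[name][column] for name in present]
        if any(value_map != value_maps[0] for value_map in value_maps[1:]):
            conflicting.append(column)
    return {"missing": missing, "conflicting": conflicting}


# Illustrative data only: the Gender codes are swapped between the two waves.
wave_1 = {"Gender": {1: "Male", 2: "Female"}, "Age": {}}
wave_2 = {"Gender": {1: "Female", 2: "Male"}, "Region": {1: "North", 2: "South"}}
report = compare_schemas({"wave_1": wave_1, "wave_2": wave_2})
print(report)
```

Here `Age` and `Region` are reported as missing from at least one file, and `Gender` is reported as conflicting because the same codes carry different labels.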
2. Fuzzy Matching
If labels don't match exactly (e.g., "Q1. Gender" vs "Gender"), we use Levenshtein distance to suggest potential matches to the user.
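The suggestion step might look something like the following minimal sketch. It implements the standard dynamic-programming Levenshtein distance; the `suggest_matches` helper, the case-insensitive comparison, and the `max_suggestions` cutoff are assumptions made for illustration, since the text does not say how candidates are ranked or limited.

```python
from typing import List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def suggest_matches(
    label: str, candidates: List[str], max_suggestions: int = 3
) -> List[Tuple[str, int]]:
    """Rank candidate labels by case-insensitive edit distance, closest first."""
    scored = [(c, levenshtein(label.lower(), c.lower())) for c in candidates]
    scored.sort(key=lambda pair: pair[1])
    return scored[:max_suggestions]


suggestions = suggest_matches("Q1. Gender", ["Gender", "Age", "Region"])
print(suggestions[0])  # ('Gender', 4)
```

"Gender" ranks first because removing the four-character prefix "Q1. " is all that separates the two labels. Raw edit distance penalizes such prefixes, so stripping question numbers before comparing may give cleaner suggestions.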
3. Stacking
Once the schema is harmonized, we stack the datasets vertically (row-binding), adding a "Wave" or "Source" variable to distinguish the origin of each row.
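A simple way to picture the stacking step is the sketch below, which uses plain lists of dictionaries rather than any particular data-frame library. The `stack_waves` function and the sample data are hypothetical; an implementation built on pandas would more likely use `pandas.concat` for the same purpose.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[object]]


def stack_waves(waves: Dict[str, List[Row]], source_column: str = "Wave") -> List[Row]:
    """Row-bind all waves, tagging each row with the wave it came from."""
    # Collect the union of columns, preserving first-seen order.
    columns: List[str] = []
    for rows in waves.values():
        for row in rows:
            for column in row:
                if column not in columns:
                    columns.append(column)

    stacked: List[Row] = []
    for wave_name, rows in waves.items():
        for row in rows:
            # Columns absent from a row are filled with None.
            combined: Row = {column: row.get(column) for column in columns}
            combined[source_column] = wave_name
            stacked.append(combined)
    return stacked


# Illustrative data only.
waves = {
    "2023": [{"Gender": 1, "Age": 34}],
    "2024": [{"Gender": 2, "Age": 41}, {"Gender": 1, "Age": 29}],
}
stacked = stack_waves(waves)
for row in stacked:
    print(row)
```

Each output row carries a `Wave` value naming its source, so the combined dataset can still be filtered or grouped by wave afterwards.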