Inconsistent city names in datasets can lead to confusion and errors in data analysis. By using AI-driven standardization, you can ensure uniformity in city names across your entire dataset.
Ensuring uniform city names in survey responses for accurate geographical analysis.
Standardizing city names in historical data for consistency in longitudinal studies.
Business Intelligence
Unifying city names in customer databases for more precise regional sales analysis.
Standardizing city names in supply chain data to improve logistics planning.
Government and Public Services
Ensuring consistent city names in census data for better policy-making and resource allocation.
Standardizing city names in emergency response databases to enhance coordination.
Step-by-step instructions
Step 1. Set Input Dataset
Consider a dataset containing various entries of city names that need to be standardized. Here is a sample dataset before standardization:
Step 2. Define the AI Prompt
The most crucial aspect of leveraging AI effectively is crafting a precise and relevant prompt. A well-defined prompt ensures the AI understands the task clearly, leading to accurate and useful outputs. This involves being specific about the desired outcome, providing necessary context, and avoiding ambiguity.
Prompt Example
Standardize the city name in the @City_Name column to its full and proper form. Ensure consistent capitalization and correct any abbreviations or informal versions of the city names.
Why This Prompt Is Good
Clearly states the task (standardization) and the target column (City_Name).
Emphasizes consistency in capitalization, ensuring uniformity across the dataset.
Addresses the need to correct abbreviations and informal versions, enhancing data quality.
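To make the role of the column reference concrete, here is a minimal Python sketch of how an instruction containing @City_Name could be combined with one row's values before being sent to a model. This is an illustration only: the helper names (referenced_columns, build_row_prompt) and the sample value are hypothetical, and the tool's actual substitution mechanism is not described in this guide.

```python
import re

INSTRUCTION = (
    "Standardize the city name in the @City_Name column to its full and proper form. "
    "Ensure consistent capitalization and correct any abbreviations or informal "
    "versions of the city names."
)

def referenced_columns(instruction: str) -> list[str]:
    """Return the column names mentioned as @Column in the instruction."""
    return re.findall(r"@(\w+)", instruction)

def build_row_prompt(instruction: str, row: dict) -> str:
    """Append the values of the referenced columns for a single row.

    Hypothetical helper: raises KeyError if a referenced column is missing,
    so a typo in the column name surfaces immediately.
    """
    lines = [f"{column}: {row[column]}" for column in referenced_columns(instruction)]
    return instruction + "\n\nRow values:\n" + "\n".join(lines)

if __name__ == "__main__":
    # Hypothetical sample row, used only to show the shape of the final prompt.
    print(build_row_prompt(INSTRUCTION, {"City_Name": "nyc"}))
```

Naming the column explicitly in the prompt is what lets a check like this fail fast when the column name is misspelled, instead of silently producing an unrelated answer.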
Step 3. Configure the Flow Designer
Add the input dataset to the flow designer.
Select the AI Column node from the tools panel and enter the prompt.
Start with a row-by-row execution to fine-tune your prompt.
Adjust your prompt as needed, then regenerate any single row or remove all previous results.
Once you are satisfied with the prompt, apply the AI Column node to all rows (it will be applied only to empty cells).
For very large datasets (more than 10,000 rows), run the flow for runtime processing over the whole dataset. Be aware that this can be costly for large amounts of data.
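The "only for empty cells" behaviour in the steps above can be sketched in plain Python. This is a rough illustration under stated assumptions, not the tool's implementation: apply_to_empty_cells and stand_in_model are hypothetical names, the output column City_Standardized and the sample rows are made up, and the stand-in function is a tiny lookup table used in place of a real AI call.

```python
from typing import Callable

def apply_to_empty_cells(
    rows: list[dict],
    source: str,
    target: str,
    generate: Callable[[str], str],
) -> int:
    """Fill `target` only where it is empty; return how many rows were processed."""
    processed = 0
    for row in rows:
        if row.get(target) in (None, ""):
            row[target] = generate(row[source])
            processed += 1
    return processed

def stand_in_model(city: str) -> str:
    """Placeholder for the AI call: a small lookup with a title-case fallback."""
    known = {"nyc": "New York City", "la": "Los Angeles"}
    cleaned = city.strip()
    return known.get(cleaned.lower(), cleaned.title())

# Hypothetical rows: the first was already generated during prompt tuning.
rows = [
    {"City_Name": "nyc", "City_Standardized": "New York City"},
    {"City_Name": "LA", "City_Standardized": ""},
    {"City_Name": " san francisco", "City_Standardized": None},
]

count = apply_to_empty_cells(rows, "City_Name", "City_Standardized", stand_in_model)

if __name__ == "__main__":
    print(f"Processed {count} of {len(rows)} rows")
    for row in rows:
        print(row)
```

Skipping cells that already have a value means rows tuned during the row-by-row phase are not regenerated, which also keeps the number of model calls, and therefore the cost, proportional to the rows that still need a result.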
Step 4. Get Final Result
Here is the dataset after using the AI Column Node to standardize the city names: