Problem
When importing data files into MoEngage, you see the error message: "Please save the file with UTF-8 encoding and remove any unsupported characters."
This error occurs because the file was not saved with UTF-8 encoding, which is required for the MoEngage platform. It commonly happens when your data includes special characters, accents, emojis, or symbols from various languages.
Explanation
What is UTF-8?
UTF-8 is a universal character encoding standard that can represent almost any character from any language. It is the dominant encoding for the World Wide Web. MoEngage requires data files to be in UTF-8 to ensure that all user attributes, regardless of language or special characters, are processed and stored correctly.
Files saved in other encodings, such as ASCII or ISO-8859-1, have a more limited set of characters and cannot properly handle the diverse data you might have for your users. Using the wrong encoding can lead to data corruption, where characters appear as strange symbols (for example, "" or "é").
How to Identify and Check File Encoding
If you are unsure of your file's encoding or suspect it contains unsupported characters, you can use the following methods to check.
A. Identify the File's Current Encoding
Using a modern text editor is the most reliable way to check a file's encoding.
- Using a Text Editor (for example, VSCode, Notepad++): Open your data file in an editor like Visual Studio Code or Notepad++. The current file encoding is typically displayed in the status bar at the bottom-right corner of the window.
B. Check for Non-UTF-8 Characters
- Visual Inspection in a Text Editor: The easiest way to find problematic characters is to open the file in a text editor that is set to UTF-8 encoding. Any characters that are not valid in UTF-8 will often appear as a black diamond with a question mark () or other garbled symbols.
- Using an Online Validator: You can use online tools to check for non-UTF-8 characters. Simply copy the contents of your file and paste them into an online ASCII or UTF-8 checker. The tool will highlight any characters that are not compliant.
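If you prefer to check the file locally, the same validation can be sketched in a few lines of Python. The function names below are hypothetical helpers, not MoEngage tooling. The idea is to decode the raw bytes strictly as UTF-8 and report where the first invalid sequence sits.

```python
def find_invalid_utf8(data: bytes):
    """Return (offset, offending_bytes) for the first invalid sequence, or None."""
    try:
        data.decode("utf-8")  # strict decoding raises on any invalid byte
    except UnicodeDecodeError as err:
        return err.start, data[err.start:err.end]
    return None


def check_file(path: str):
    """Read a file as raw bytes and validate it as UTF-8."""
    with open(path, "rb") as f:
        return find_invalid_utf8(f.read())


print(find_invalid_utf8("José,😊\n".encode("utf-8")))  # None
print(find_invalid_utf8("José\n".encode("cp1252")))    # (3, b'\xe9')
```

A result of None means every byte decodes cleanly. Otherwise, the offset tells you where in the file to look for the problematic character.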
Solution: How to Save a File with UTF-8 Encoding
You can fix this issue by re-saving your file with UTF-8 encoding using common spreadsheet software or a text editor.
A. Use Microsoft Excel
There are two reliable ways to save a file as UTF-8 in Microsoft Excel.
Method 1: Save as CSV UTF-8 (Recommended for newer Microsoft Excel versions)
- Open your file in Microsoft Excel.
- Go to File > Save As.
- In the Save as type drop-down menu, select CSV UTF-8 (Comma delimited) (*.csv).
- Click Save.
Method 2: Using the Text Import Wizard (For older Microsoft Excel versions)
If your version of Microsoft Excel doesn't have the CSV UTF-8 option, you can use this method.
- Open a new, blank workbook in Microsoft Excel.
- Go to the Data tab and click From Text/CSV.
- Select your CSV file and click Import.
- In the dialog box that appears, Microsoft Excel will show a preview of your data. In the File Origin drop-down list, select an encoding that makes your text appear correctly in the preview (for example, 1252: Western European (Windows)).
- Click Load to import the data into the blank sheet.
- Now, go to File > Save As.
- Click the Tools drop-down next to the Save button and select Web Options.
- Navigate to the Encoding tab.
- Under Save this document as:, select Unicode (UTF-8) from the list.
- Click OK, and then Save your file.
B. Use Google Sheets (Easiest Method)
Google Sheets automatically handles UTF-8 encoding, making it a very simple and reliable option.
- Open Google Sheets.
- Go to File > Import and upload your file.
- Once the file is open and you've verified the data looks correct, go to File > Download.
- Select Comma Separated Values (.csv). The file will automatically be downloaded in UTF-8 format.
C. Use Notepad (for Windows)
- Open your file with the Notepad application.
- Go to File > Save As.
- At the bottom of the Save As dialog box, you will see an "Encoding" drop-down menu.
- Select UTF-8 from the list and click Save.
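If the file is too large to open comfortably in a spreadsheet or editor, the same re-save can be scripted. The sketch below is one possible approach in Python, assuming you already know the file's current encoding. The function name, the file names in the comment, and the cp1252 default are illustrative assumptions, so replace them with the encoding you identified earlier.

```python
def reencode_to_utf8(src_path, dst_path, source_encoding="cp1252"):
    """Read a text file in its current encoding and write it back out as UTF-8.

    source_encoding is an assumption: pass the encoding your file actually uses.
    """
    # newline="" keeps the line endings exactly as they are in the source file.
    with open(src_path, "r", encoding=source_encoding, newline="") as src:
        text = src.read()
    with open(dst_path, "w", encoding="utf-8", newline="") as dst:
        dst.write(text)


# Example call (hypothetical file names):
# reencode_to_utf8("users.csv", "users_utf8.csv", source_encoding="cp1252")
```

Writing to a separate output file keeps the original intact, so you can retry with a different source encoding if the result still looks garbled.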
Supported and Unsupported Characters
The most important step is ensuring your file is UTF-8 encoded. However, it's also helpful to know what types of data and characters are supported.
Supported Characters
With a correctly formatted UTF-8 file, MoEngage supports:
- Standard alphanumeric characters: a-z, A-Z, 0-9.
- Common symbols: @, ., _, -, +, ! and others typically found in names, emails, and addresses.
- International characters and accents: ñ, é, ü, ç.
- Symbols and Emojis: €, ₹, 😊, ✅.
Unsupported Formats and Characters
The following can cause import errors even if the file is UTF-8 encoded:
- Complex Attribute Types: You cannot import complex data like JSON objects, dictionaries, or lists directly into a single CSV field.
- Incorrect Newline Characters: The only supported newline character is \n (Line Feed). Files created on Windows sometimes use \r\n (Carriage Return + Line Feed), which can cause parsing issues. Saving through Google Sheets or a text editor often resolves this.
- Hidden Control Characters: Non-printable characters (like null characters or other control characters) can corrupt the file and cause the import to fail. These are often introduced by exporting data from certain systems.
- Very Small Files: If your file contains fewer than five rows of data, you may encounter an error. If this happens, add some dummy rows to meet the minimum threshold for import.
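The newline, control-character, and row-count checks above can also be scripted. The following Python sketch is illustrative only: the helper names are hypothetical, the set of characters treated as "control characters" is an assumption (everything below the space character except tab and line feed, plus DEL), and the row count assumes the first row is a header.

```python
import csv
import io
import re

# Control characters other than tab (\x09) and line feed (\x0a), plus DEL (\x7f).
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b-\x1f\x7f]")


def clean_text(text: str) -> str:
    # Convert \r\n and bare \r line endings to \n.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Remove the remaining non-printable control characters.
    return CONTROL_CHARS.sub("", text)


def count_data_rows(text: str) -> int:
    # Count non-empty CSV rows, assuming the first row is a header.
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    return max(len(rows) - 1, 0)


sample = "id,name\r\n1,Jo\x00sé\r\n"
cleaned = clean_text(sample)
print(repr(cleaned))             # 'id,name\n1,José\n'
print(count_data_rows(cleaned))  # 1
```

Here the sample has only one data row, so it would fall below the five-row minimum mentioned above.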