MediaBeacon University

Character Encoding

This article details character encoding as it relates to Import/Export Wizard functions.

Character encodings are the standards used to convert text characters to binary data and back again. Older standards tend to define smaller sets of characters, focused on the letters, numbers, and symbols used primarily in English. Modern encodings, on the other hand, encompass much larger sets of characters and are standardized across computer platforms, national borders, and languages.

Text in a less-compliant encoding may contain "extended" characters that become damaged when metadata is extracted from the original source. Such an encoding may also be unable to properly represent characters from a more compliant encoding.
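As a minimal illustration (using Python's built-in codecs, not anything specific to MediaBeacon), the sketch below shows how one character is stored as different bytes under different encodings, and how an older encoding such as ASCII cannot store it at all. The variable names are chosen only for this example.

```python
# One "extended" character, encoded three different ways.
text = "é"

encoded = {}
for encoding in ("utf-8", "latin-1", "mac_roman"):
    encoded[encoding] = text.encode(encoding)
    print(encoding, encoded[encoding].hex())
# utf-8 c3a9
# latin-1 e9
# mac_roman 8e

# ASCII has no code for this character, so strict encoding fails.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print("representable in ASCII:", ascii_ok)  # False
```

Because the same character maps to different bytes in each encoding, software that reads the bytes with the wrong encoding will display the wrong characters.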

Examples of Extended Characters

  • Characters outside of the standard ASCII character set (which defines only 128 characters)
  • Characters with diacritics (ÀàÂâÇçÈèÉéÊêËëÎîÏïÔôÙùÛûÜüŸÿ)
  • Latin Extended characters (þÞðÐƐƑƒƓƔ)
  • Characters in non-Latin writing systems (한글, لْعَرَبِيَّة, 漢字, 平仮名, кириллица, עברית, Ελληνικά)
  • Curly quotes (‘ ’ “ ” )
  • Symbols (€¢£¥§©®°¶)

MediaBeacon is UTF-8 compliant, which means it supports the full range of characters in the Unicode specification. Maintaining that encoding through every step of data processing is our best-practice standard. Older character encodings cannot accurately record all Unicode characters. Here are some examples of the issues that can arise when text containing extended characters is saved in one encoding and then opened or imported as if it were in another:

  • UTF-8 Characters: åç∂éîøü
  • Opened as ASCII: åç∂éîøü
  • Opened as MacRoman: √•√ß‚àÇ√©√Æ√∏√º
  • Opened as ANSI (Windows-1252): Ã¥Ã§âˆ‚Ã©Ã®Ã¸Ã¼
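The kind of damage listed above can be reproduced with a short Python sketch. It assumes that "ANSI" refers to Windows-1252 (Python codec `cp1252`) and that "MacRoman" corresponds to Python's `mac_roman` codec. It illustrates the mechanism only and is not how the Import/Export Wizard itself processes text.

```python
original = "åç∂éîøü"
utf8_bytes = original.encode("utf-8")  # how the text is stored as UTF-8

# Decode the same bytes with the wrong encoding.
as_mac_roman = utf8_bytes.decode("mac_roman")
as_cp1252 = utf8_bytes.decode("cp1252")
# ASCII defines no bytes above 0x7F; each one becomes a replacement character.
as_ascii = utf8_bytes.decode("ascii", errors="replace")

print(as_mac_roman)  # √•√ß‚àÇ√©√Æ√∏√º
print(as_cp1252)     # Ã¥Ã§âˆ‚Ã©Ã®Ã¸Ã¼
print(as_ascii)

# While no bytes have been lost, the mistake can be undone by reversing it.
repaired = as_mac_roman.encode("mac_roman").decode("utf-8")
print(repaired == original)  # True
```

The repair step works only when the wrongly decoded text has not been saved or altered in a way that discards bytes. The ASCII result, for example, cannot be recovered, because every extended byte has been replaced with the same placeholder.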

Various Diacritics and Character Sets

French Diacritics: ÀàÂâÇçÈèÉéÊêËëÎîÏïÔôÙùÛûÜüŸÿ

Spanish Diacritics: ÁáÉéÍíÑñÓóÚúÜü

Italian Diacritics: ÈèÉéÌìÎîÒòÙù

Portuguese Diacritics: ÁáÂâÃãÀàÇçÉéÊêÍíÓóÔôÕõÚú

Greek Characters & Diacritics:
ΑαΒβΓγΔδΕεΖζΗηΘθΙιΚκΛλΜμΝνΞξΟοΠπΡρΣσ/ςΤτΥυΦφΧχΨψΩωΆάΈέΉήΊίΌόΎύΏώΐΰΪϊΫϋ

Cyrillic Characters & Diacritics:
АаӐӑӒӓӘәӚӛӔӕБбВвГгҐґЃѓҒғӶӷҔҕДдЂђЕеЀѐЁёӖӗҼҽҾҿЄєЖжӁӂҖҗӜӝЗзЗ́з́ҘҙӞӟӠӡЅѕИиЍѝӤӥӢӣІіЇїӀӏЙйҊҋЈјКкҚқҞҟҠҡӃӄҜҝЛлӅӆЉљМмӍӎНнӉӊҢңӇӈҤҥЊњОоӦӧӨөӪӫҨҩПпҦҧРрҎҏСсС́с́ҪҫТтҬҭЋћЌќУуЎўӲӳӰӱӮӯҮүҰұФфХхҲҳҺһЦцҴҵЧчӴӵҶҷӋӌҸҹЏџШшЩщЪъЫыӸӹЬьҌҍЭэӬӭЮюЯя
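The character sets above make convenient test data. As a rough sketch, the Python below checks whether a block of bytes decodes cleanly as UTF-8 before it is handed to an import step. The helper name `is_valid_utf8` is hypothetical and is not part of MediaBeacon.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as UTF-8 without errors."""
    try:
        data.decode("utf-8")  # strict error handling is the default
    except UnicodeDecodeError:
        return False
    return True


spanish = "ÁáÉéÍíÑñÓóÚúÜü"
greek = "ΑαΒβΓγΔδ"

print(is_valid_utf8(spanish.encode("utf-8")))    # True
print(is_valid_utf8(spanish.encode("latin-1")))  # False
print(is_valid_utf8(greek.encode("iso8859_7")))  # False
```

A passing check is necessary but not sufficient. Plain ASCII text always passes, and some byte sequences from older encodings happen to form valid UTF-8, so a visual spot check of extended characters after import is still worthwhile.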
