devunicode

discussion now on this forum

links

For Unicode-capable replacements for TLabel, TEdit and TMemo, see:

    http://www.tntware.com/delphicontrols/unicode/downloads.htm
    http://www.lmdinnovative.com/products/lmdelpack/

Another Unicode library:

    http://www.lischke-online.de/UnicodeLibrary.php

BOM option for Clean

A good encoding converter will also offer options for adding or removing the BOM:

  • Unconditionally prefix the output text with U+FEFF.
  • Prefix the output text with U+FEFF unless it is already there.
  • Remove the first character if it is U+FEFF.
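
The three BOM options above can be sketched as follows. This is a minimal illustration in Python (not Delphi, and not taken from any of the libraries linked here); the function names are hypothetical and the text is assumed to be already decoded to Unicode.

```python
BOM = "\ufeff"  # U+FEFF, the byte order mark as a Unicode character


def add_bom(text: str) -> str:
    """Option 1: unconditionally prefix the text with U+FEFF."""
    return BOM + text


def ensure_bom(text: str) -> str:
    """Option 2: prefix the text with U+FEFF unless it is already there."""
    return text if text.startswith(BOM) else BOM + text


def strip_bom(text: str) -> str:
    """Option 3: remove the first character if it is U+FEFF."""
    return text[1:] if text.startswith(BOM) else text
```

Note that option 1 can produce a doubled BOM when the input already starts with one, which is why a converter would usually offer option 2 as well.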

Encode

Nodes for converting between various 7-bit and 8-bit encodings and Unicode, in both directions. This gives some historical perspective and solves most Mac/Unix/PC issues with text files.

Note that most of the code needed below is already implemented in Indy and the Open XML Utility Library.

Desirable Encodings

  • ASCII-1963 X3.4 (7bit), see http://www.wps.com/projects/code
  • ASCII-1967 (7bit, in UnicodeConv.pas)
  • ISO 646 national variants (7bit)
  • ISO 8859 1-15 (8bit, in UnicodeConv.pas)
  • MAC 10000-10081 (8bit, in UnicodeConv.pas)
  • KOI8_R (8bit, russian, in UnicodeConv.pas)
  • JIS_X0201
  • Various MS-DOS code pages (8bit, in UnicodeConv.pas) including IBM PC and EBCDIC, see http://www.sferyx.com/htmleditor/supportedencodings.htm
  • NextStep
  • PETSCII
  • UTF-8 (see also JCL)
  • UTF-8 / BOM
  • UTF-16 BE
  • UTF-16 BE / BOM
  • UTF-16 LE
  • UTF-16 LE / BOM
  • UTF-7
  • XML
  • TikiWiki (see also kalle patches)
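
Why the node needs an explicit encoding choice can be shown by decoding the same byte under several of the legacy encodings listed above. This is only a sketch in Python; the codec names are Python's own and are assumed to correspond to the encodings named in this list, and `decode_all` and `roundtrip_ok` are hypothetical helpers.

```python
# One byte, interpreted under several of the legacy encodings listed above.
RAW = b"\xe4"

# Python codec names (assumed equivalents of the encodings in the list).
CODECS = ["iso8859_1", "cp437", "koi8_r", "mac_roman", "cp037"]


def decode_all(raw, codecs=CODECS):
    """Return {codec name: decoded text} for the same raw bytes."""
    return {name: raw.decode(name) for name in codecs}


def roundtrip_ok(text, codec):
    """True if text survives encode -> decode in the given codec."""
    return text.encode(codec).decode(codec) == text


if __name__ == "__main__":
    for name, text in decode_all(RAW).items():
        print(f"{name:10s} 0xE4 -> U+{ord(text):04X} {text}")
```

The same byte comes out as a Latin letter, a Greek letter, a Cyrillic letter or a plain ASCII letter depending on the table used, so the source encoding can not be guessed from the bytes alone.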

Pins

  • DEFAULT_CHAR for all nonrepresentable chars
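
A DEFAULT_CHAR pin could behave roughly like the sketch below: every character the target encoding cannot represent is replaced by a caller-supplied default. This is an assumption about the intended behaviour, written in Python; `encode_with_default` is a hypothetical name, and the per-character loop is only valid for stateless single-byte encodings such as the ISO 8859 and DOS code pages.

```python
def encode_with_default(text, encoding, default_char="?"):
    """Encode text, substituting default_char for unrepresentable characters.

    default_char itself must be representable in the target encoding.
    """
    fallback = default_char.encode(encoding)
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(encoding)
        except UnicodeEncodeError:
            # Character has no code point in the target encoding.
            out += fallback
    return bytes(out)
```

For example, `encode_with_default("a€b", "iso8859_1")` replaces the euro sign, which Latin1 lacks, with a question mark.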

Linefeed converter node (independent of UTF-8 and Unicode)

  • CP/M, Microsoft DOS and Windows use the sequence 0D 0A (CR LF), familiar from the days of the teletype.
  • Apple / classic Mac OS (before Mac OS X) uses 0D (CR).
  • Unix and Linux use 0A (LF) as the standard line break.
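
A linefeed converter for the three conventions above can be sketched in a few lines. This is an illustrative Python version with a hypothetical function name; it works on already decoded text and treats CR LF as a single line break before looking for lone CR or LF.

```python
import re

# CR LF must come first in the alternation so it is matched as one break.
_ANY_NEWLINE = re.compile(r"\r\n|\r|\n")

CRLF = "\r\n"  # CP/M, DOS, Windows
CR = "\r"      # classic Mac OS
LF = "\n"      # Unix, Linux


def convert_linefeeds(text, target=LF):
    """Replace every CR LF, CR or LF in text with the target line break."""
    return _ANY_NEWLINE.sub(lambda match: target, text)
```

Because the input may mix conventions, normalising everything in one pass avoids turning CR LF into two line breaks.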

Encoding links

Software Defect Patterns which Break Text Integrity

In internationalization software engineering, one of the most common defect patterns is garbage text or garbled text. It is also called mojibake, a transliteration of the Japanese term for garbled characters. A closer look at garbage text reveals several distinct defect sub-patterns, see http://people.netscape.com/ftang/paper/unicode25/a302_v1.htm
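
One typical sub-pattern, UTF-8 bytes read as if they were a Windows code page, can be reproduced in a few lines. This is a minimal Python sketch with hypothetical function names, not an example taken from the linked paper.

```python
def garble(text, wrong_codec="cp1252"):
    """Simulate mojibake: encode as UTF-8, then decode with the wrong codec."""
    return text.encode("utf-8").decode(wrong_codec)


def repair(garbled, wrong_codec="cp1252"):
    """Undo garble(), assuming no bytes were lost along the way."""
    return garbled.encode(wrong_codec).decode("utf-8")
```

Each non-ASCII character turns into two or more unrelated characters, for instance "é" becomes "Ã©". The repair only works while the garbled text still contains every original byte.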

UTF-8 (RFC 3629)

http://www.cl.cam.ac.uk/~mgk25/unicode.html
http://www.cl.cam.ac.uk/~mgk25/ucs/utf-8-history.txt
http://acspro.atari.org/KeyTab/Normal/006024.html
Async Pro: see Adxbase.pas

EBCDIC

http://www.legacyj.com/cobol/ebcdic.html

PETSCII

http://www.df.lth.se/~triad/krad/recode/petscii.html

UTF-7

http://www.zeitungsjunge.de/delphi/unicode/ (also handles UTF-7)
http://www.faqs.org/rfcs/rfc2152.html
http://acspro.atari.org/KeyTab/Normal/006027.html

ISO 646-DE etc.

http://www.soziologie.uni-halle.de/unger/scripts/workshop_internet/ref_char_646.html
http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-006.pdf

Iso8859

  • Iso8859-1 Latin1 (West European)
  • Iso8859-2 Latin2 (East European)
  • Iso8859-3 Latin3 (South European)
  • Iso8859-4 Latin4 (North European)
  • Iso8859-5 Cyrillic
  • Iso8859-6 Arabic
  • Iso8859-7 Greek
  • Iso8859-8 Hebrew
  • Iso8859-9 Latin5 (Turkish)
  • Iso8859-10 Latin6 (Nordic)
  • Iso8859-13 Latin7 (Baltic Rim)
  • Iso8859-14 Latin8 (Celtic: Gaelic and Welsh)
  • Iso8859-15 Latin9 (Latin1 with the rarely used symbols ¦ ¨ ´ ¸ ¼ ½ ¾ replaced by missing French and Finnish letters, and ¤ replaced by the euro sign €)
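
The difference between Latin1 and Latin9 can be listed mechanically by decoding every byte value with both tables. This is a small Python sketch using Python's built-in `iso8859_1` and `iso8859_15` codecs; `differing_bytes` is a hypothetical helper name.

```python
def differing_bytes(codec_a="iso8859_1", codec_b="iso8859_15"):
    """Return (byte value, char in codec_a, char in codec_b) where they differ."""
    diffs = []
    for value in range(256):
        raw = bytes([value])
        # errors="replace" keeps the loop going if a codec leaves a byte undefined.
        char_a = raw.decode(codec_a, errors="replace")
        char_b = raw.decode(codec_b, errors="replace")
        if char_a != char_b:
            diffs.append((value, char_a, char_b))
    return diffs


if __name__ == "__main__":
    for value, char_a, char_b in differing_bytes():
        print(f"0x{value:02X}: {char_a} -> {char_b}")
```

Only eight byte values differ, so text that avoids those positions reads the same under both encodings.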

function US_ASCIIToUTF16Str(const S: string): WideString;
function Iso8859_1ToUTF16Str(const S: string): WideString;
function Iso8859_2ToUTF16Str(const S: string): WideString;
function Iso8859_3ToUTF16Str(const S: string): WideString;
function Iso8859_4ToUTF16Str(const S: string): WideString;
function Iso8859_5ToUTF16Str(const S: string): WideString;
function Iso8859_6ToUTF16Str(const S: string): WideString;
function Iso8859_7ToUTF16Str(const S: string): WideString;
function Iso8859_8ToUTF16Str(const S: string): WideString;
function Iso8859_9ToUTF16Str(const S: string): WideString;
function Iso8859_10ToUTF16Str(const S: string): WideString;
function Iso8859_13ToUTF16Str(const S: string): WideString;
function Iso8859_14ToUTF16Str(const S: string): WideString;
function Iso8859_15ToUTF16Str(const S: string): WideString;

function KOI8_RToUTF16Str(const S: string): WideString; // Russian
function JIS_X0201ToUTF16Str(const S: string): WideString;
function nextStepToUTF16Str(const S: string): WideString;

function cp10000_MacRomanToUTF16Str(const S: string): WideString;
function cp10006_MacGreekToUTF16Str(const S: string): WideString;
function cp10007_MacCyrillicToUTF16Str(const S: string): WideString;
function cp10029_MacLatin2ToUTF16Str(const S: string): WideString;
function cp10079_MacIcelandicToUTF16Str(const S: string): WideString;
function cp10081_MacTurkishToUTF16Str(const S: string): WideString;

function cp037ToUTF16Str(const S: string): WideString; // ebcdic-cp-us
function cp424ToUTF16Str(const S: string): WideString; // x-EBCDIC-Hebrew
function cp437ToUTF16Str(const S: string): WideString; // original IBMPC with box chars
function cp437_DOSLatinUSToUTF16Str(const S: string): WideString;
function cp500ToUTF16Str(const S: string): WideString; // EBCDIC 500V1
function cp737_DOSGreekToUTF16Str(const S: string): WideString; // PC Greek
function cp775_DOSBaltRimToUTF16Str(const S: string): WideString; // PC Baltic
function cp850ToUTF16Str(const S: string): WideString; // MS-DOS Latin-1
function cp850_DOSLatin1ToUTF16Str(const S: string): WideString;
function cp852ToUTF16Str(const S: string): WideString; // MS-DOS Latin-2
function cp852_DOSLatin2ToUTF16Str(const S: string): WideString;
function cp855ToUTF16Str(const S: string): WideString; // MS-DOS Cyrillic
function cp855_DOSCyrillicToUTF16Str(const S: string): WideString;
function cp856_Hebrew_PCToUTF16Str(const S: string): WideString;
function cp857ToUTF16Str(const S: string): WideString; // IBM Turkish
function cp857_DOSTurkishToUTF16Str(const S: string): WideString;
function cp860ToUTF16Str(const S: string): WideString; // MS-DOS Portuguese
function cp860_DOSPortugueseToUTF16Str(const S: string): WideString;
function cp861ToUTF16Str(const S: string): WideString;
function cp861_DOSIcelandicToUTF16Str(const S: string): WideString;
function cp862ToUTF16Str(const S: string): WideString;
function cp862_DOSHebrewToUTF16Str(const S: string): WideString;
function cp863ToUTF16Str(const S: string): WideString;
function cp863_DOSCanadaFToUTF16Str(const S: string): WideString;
function cp864ToUTF16Str(const S: string): WideString;
function cp864_DOSArabicToUTF16Str(const S: string): WideString;
function cp865ToUTF16Str(const S: string): WideString;
function cp865_DOSNordicToUTF16Str(const S: string): WideString;
function cp866ToUTF16Str(const S: string): WideString;
function cp866_DOSCyrillicRussianToUTF16Str(const S: string): WideString;
function cp869ToUTF16Str(const S: string): WideString;
function cp869_DOSGreek2ToUTF16Str(const S: string): WideString;

function cp874ToUTF16Str(const S: string): WideString; // Thai
function cp875ToUTF16Str(const S: string): WideString;
function cp932ToUTF16Str(const S: string): WideString;
function cp936ToUTF16Str(const S: string): WideString;
function cp949ToUTF16Str(const S: string): WideString;
function cp950ToUTF16Str(const S: string): WideString;
function cp1006ToUTF16Str(const S: string): WideString;
function cp1026ToUTF16Str(const S: string): WideString;
function cp1250ToUTF16Str(const S: string): WideString;
function cp1251ToUTF16Str(const S: string): WideString;
function cp1252ToUTF16Str(const S: string): WideString;
function cp1253ToUTF16Str(const S: string): WideString;
function cp1254ToUTF16Str(const S: string): WideString;
function cp1255ToUTF16Str(const S: string): WideString;
function cp1256ToUTF16Str(const S: string): WideString;
function cp1257ToUTF16Str(const S: string): WideString;
function cp1258ToUTF16Str(const S: string): WideString;

function UTF8ToUTF16BEStr(const S: string): WideString;
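
What a function like UTF8ToUTF16BEStr has to produce can be sketched as below. This is an illustration in Python, not the Delphi implementation from UnicodeConv.pas; the function name is hypothetical, and the optional BOM parameter is an assumption based on the BOM options discussed above.

```python
import codecs


def utf8_to_utf16be(data: bytes, with_bom: bool = False) -> bytes:
    """Convert UTF-8 encoded bytes to UTF-16 big-endian bytes."""
    # A leading UTF-8 BOM is dropped by the "utf-8-sig" codec if present.
    text = data.decode("utf-8-sig")
    encoded = text.encode("utf-16-be")
    return codecs.BOM_UTF16_BE + encoded if with_bom else encoded
```

Characters outside the Basic Multilingual Plane come out as surrogate pairs, so one UTF-8 character can occupy four bytes in the UTF-16 result.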
