Sophie

Sophie

distrib > Mageia > 5 > x86_64 > media > core-release > by-pkgid > 13eec89779171a321fe518ddb0e0fec6 > files > 523

freetds-doc-0.91-8.mga5.x86_64.rpm

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<HTML
><HEAD
><TITLE
>Localization and TDS 7.0</TITLE
><META
NAME="GENERATOR"
CONTENT="Modular DocBook HTML Stylesheet Version 1.79"><LINK
REL="HOME"
TITLE="FreeTDS User Guide"
HREF="index.htm"><LINK
REL="UP"
TITLE="Advanced Configurations"
HREF="configs.htm"><LINK
REL="PREVIOUS"
TITLE="Advanced Configurations"
HREF="configs.htm"><LINK
REL="NEXT"
TITLE="Domain Logins"
HREF="domains.htm"><LINK
REL="STYLESHEET"
TYPE="text/css"
HREF="userguide.css"><META
HTTP-EQUIV="Content-Type"
CONTENT="text/html; charset=utf-8"></HEAD
><BODY
CLASS="SECT1"
BGCOLOR="#FFFFFF"
TEXT="#000000"
LINK="#0000FF"
VLINK="#840084"
ALINK="#0000FF"
><DIV
CLASS="NAVHEADER"
><TABLE
SUMMARY="Header navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TH
COLSPAN="3"
ALIGN="center"
><SPAN
CLASS="PRODUCTNAME"
>FreeTDS</SPAN
> User Guide: A Guide to Installing, Configuring, and Running <SPAN
CLASS="PRODUCTNAME"
>FreeTDS</SPAN
></TH
></TR
><TR
><TD
WIDTH="10%"
ALIGN="left"
VALIGN="bottom"
><A
HREF="configs.htm"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="80%"
ALIGN="center"
VALIGN="bottom"
>Chapter 5. Advanced Configurations</TD
><TD
WIDTH="10%"
ALIGN="right"
VALIGN="bottom"
><A
HREF="domains.htm"
ACCESSKEY="N"
>Next</A
></TD
></TR
></TABLE
><HR
ALIGN="LEFT"
WIDTH="100%"></DIV
><DIV
CLASS="SECT1"
><H1
CLASS="SECT1"
><A
NAME="LOCALIZATION"
>Localization and <ACRONYM
CLASS="ACRONYM"
>TDS</ACRONYM
> 7.0</A
></H1
><P
><ACRONYM
CLASS="ACRONYM"
>TDS</ACRONYM
> 7.0 uses 2-byte Unicode (technically, <ACRONYM
CLASS="ACRONYM"
>UCS-2</ACRONYM
>) to transfer character data between servers and clients.  Included in <SPAN
CLASS="QUOTE"
>"character data"</SPAN
> are query text (i.e., <ACRONYM
CLASS="ACRONYM"
>SQL</ACRONYM
>), metadata (table names and such), and <I
CLASS="FOREIGNPHRASE"
>bona fide</I
> data of datatypes <TT
CLASS="LITERAL"
>nchar</TT
>, <TT
CLASS="LITERAL"
>nvarchar</TT
>, and <TT
CLASS="LITERAL"
>ntext</TT
>.  (Background information on Unicode and how it affects <SPAN
CLASS="PRODUCTNAME"
>FreeTDS</SPAN
> can be found in the <A
HREF="unicode.htm"
>appendix</A
>.)</P
><P
>Because most Unix tools and environments do not support <ACRONYM
CLASS="ACRONYM"
>UCS-2</ACRONYM
>, <SPAN
CLASS="PRODUCTNAME"
>FreeTDS</SPAN
> provides for conversion by the client to other character sets.  The mechanism used is determined by the <TT
CLASS="FILENAME"
>configure</TT
> script, which looks for a <CODE
CLASS="FUNCTION"
>iconv(3)</CODE
> function, an implementation of the <A
HREF="http://www.opengroup.org/onlinepubs/7908799/xsh/iconv.html"
TARGET="_top"
>iconv</A
> standard.    If no <CODE
CLASS="FUNCTION"
>iconv</CODE
> library is found, or if it is explicitly disabled, <SPAN
CLASS="PRODUCTNAME"
>FreeTDS</SPAN
> will use its built-in <CODE
CLASS="FUNCTION"
>iconv</CODE
> substitute, and will be capable of converting among only <ACRONYM
CLASS="ACRONYM"
>ISO 8859-1</ACRONYM
>, <ACRONYM
CLASS="ACRONYM"
>UTF-8</ACRONYM
>, and <ACRONYM
CLASS="ACRONYM"
>UCS-2</ACRONYM
>.</P
><P
>To learn what character set the client wants, <SPAN
CLASS="PRODUCTNAME"
>FreeTDS</SPAN
> prefers the applicable <A
HREF="freetdsconf.htm#CLIENTCHARSET"
><TT
CLASS="FILENAME"
>freetds.conf</TT
></A
> <TT
CLASS="LITERAL"
>client charset</TT
> property.  If that is not set, it parses the <CODE
CLASS="ENVAR"
>LANG</CODE
> environment variable.  In either case, the found string is passed to <CODE
CLASS="FUNCTION"
>iconv</CODE
>(3) (or its built-in replacment).  <A
NAME="AEN2150"
HREF="#FTN.AEN2150"
><SPAN
CLASS="footnote"
>[1]</SPAN
></A
>.  If neither is found, <ACRONYM
CLASS="ACRONYM"
>UCS-2</ACRONYM
> data are converted to <ACRONYM
CLASS="ACRONYM"
>ISO 8859-1</ACRONYM
>.</P
><P
>To list all supported iconv character sets try <B
CLASS="COMMAND"
>iconv</B
>(1).  GNU's does:</P
><PRE
CLASS="SCREEN"
>	<SAMP
CLASS="PROMPT"
>$ </SAMP
><KBD
CLASS="USERINPUT"
>iconv --list</KBD
></PRE
><P
>For other systems, consult your documentation (most likely <B
CLASS="COMMAND"
>man iconv</B
> will give you some hints).</P
><P
>In this example a server named <TT
CLASS="LITERAL"
>mssql</TT
> will return data encoded in the GREEK character set.</P
><DIV
CLASS="EXAMPLE"
><A
NAME="E.G.GREEK"
></A
><P
><B
>Example 5-2. Configuring for GREEK <TT
CLASS="FILENAME"
>freetds.conf</TT
> setting</B
></P
><PRE
CLASS="PROGRAMLISTING"
>	[mssql]
	host = ntbox.mydomain.com
	port = 1433
	tds version = 7.0
	client charset = GREEK
	</PRE
></DIV
><P
>If <SPAN
CLASS="PRODUCTNAME"
>FreeTDS</SPAN
> runs into a character it can not convert, its behavior varies according to the severity of the problem.  On retrieving data from the server, <SPAN
CLASS="PRODUCTNAME"
>FreeTDS</SPAN
> substitutes an <ACRONYM
CLASS="ACRONYM"
>ASCII</ACRONYM
> '?' in the character's place, and emits a warning message stating that some characters could not be converted.  On sending data to the server, <SPAN
CLASS="PRODUCTNAME"
>FreeTDS</SPAN
> aborts the query and emits an error message.  It is well to ensure that the data contained in the database is representable in the client's character set.</P
><P
>If you have a mix of character data that can not be contained in a single-byte character set, you may wish to use <ACRONYM
CLASS="ACRONYM"
>UTF-8</ACRONYM
>.  <ACRONYM
CLASS="ACRONYM"
>UTF-8</ACRONYM
> is a variable length unicode encoding that is compatible with <ACRONYM
CLASS="ACRONYM"
>ASCII</ACRONYM
> in the range 0 to 127.  With <ACRONYM
CLASS="ACRONYM"
>UTF-8</ACRONYM
>, you are guaranteed to never have an unconvertible character.</P
><DIV
CLASS="IMPORTANT"
><P
></P
><TABLE
CLASS="IMPORTANT"
WIDTH="100%"
BORDER="0"
><TR
><TD
WIDTH="25"
ALIGN="CENTER"
VALIGN="TOP"
><IMG
SRC="../images/important.gif"
HSPACE="5"
ALT="Important"></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
><P
><SPAN
CLASS="PRODUCTNAME"
>FreeTDS</SPAN
> is not fully compatible with multi-byte character sets such as <ACRONYM
CLASS="ACRONYM"
>UCS-2</ACRONYM
>.  You must use an ASCII-extension charset (e.g., UTF-8, ISO-8859-*)<A
NAME="AEN2184"
HREF="#FTN.AEN2184"
><SPAN
CLASS="footnote"
>[2]</SPAN
></A
>. Great care should be taken testing applications using these encodings. Specifically, many applications do not expect the number of characters returned to exceed the column size (in bytes).</P
></TD
></TR
></TABLE
></DIV
><P
>In the following example, a server named <TT
CLASS="LITERAL"
>mssql</TT
> will return data encoded in the <ACRONYM
CLASS="ACRONYM"
>UTF-8</ACRONYM
> character set.</P
><DIV
CLASS="EXAMPLE"
><A
NAME="E.G.UTF8"
></A
><P
><B
>Example 5-3. Configuring for <ACRONYM
CLASS="ACRONYM"
>UTF-8</ACRONYM
> <TT
CLASS="FILENAME"
>freetds.conf</TT
> setting</B
></P
><PRE
CLASS="PROGRAMLISTING"
>	[mssql]
	host = ntbox.mydomain.com
	port = 1433
	tds version = 7.0
	client charset = UTF-8
	</PRE
></DIV
><P
>It is also worth clarifying that <ACRONYM
CLASS="ACRONYM"
>TDS 7.0</ACRONYM
> and above do not accept any specified character set during login, as 4.2 does.  A <ACRONYM
CLASS="ACRONYM"
>TDS 7.0</ACRONYM
> login packet uses <ACRONYM
CLASS="ACRONYM"
>UCS-2</ACRONYM
>.</P
><DIV
CLASS="SECT2"
><H2
CLASS="SECT2"
><A
NAME="LOCALIZATION.SERVERNOTE"
>Microsoft Server Note</A
></H2
><P
>String literals in SQL must be prefixed with 'N' unless the enclosed string can be represented in the server's <SPAN
CLASS="emphasis"
><I
CLASS="EMPHASIS"
>single-byte</I
></SPAN
> character set, irrespective of the column's datatype.  For example, in the SQL statement
				
<DIV
CLASS="INFORMALEXAMPLE"
><P
></P
><A
NAME="AEN2202"
></A
><PRE
CLASS="SCREEN"
>	INSERT INTO tablename (greeting) VALUES ('Hall&aring;')</PRE
><P
></P
></DIV
>
	
				the string is subject to somewhat surprising treatment by the server.</P
><P
>When the server parses the SQL, it extracts the data values for insertion (or update, or comparison, etc.)  Unprefixed strings are converted to the single-byte character set of the server/database.<A
NAME="AEN2205"
HREF="#FTN.AEN2205"
><SPAN
CLASS="footnote"
>[3]</SPAN
></A
>   Inserted data are then of course stored in the column.  In the case of UCS-2 columns &mdash; <TT
CLASS="LITERAL"
>nchar</TT
>, <TT
CLASS="LITERAL"
>nvarchar</TT
>, and <TT
CLASS="LITERAL"
>ntext</TT
> &mdash; the value stored is that which results from a <SPAN
CLASS="emphasis"
><I
CLASS="EMPHASIS"
>second</I
></SPAN
> conversion: from the single-byte form to the <ACRONYM
CLASS="ACRONYM"
>UCS-2</ACRONYM
> form.</P
><P
>The <SPAN
CLASS="emphasis"
><I
CLASS="EMPHASIS"
>only</I
></SPAN
> safe way to enclose strings in SQL text is with an 'N' prefix:
				
<DIV
CLASS="INFORMALEXAMPLE"
><P
></P
><A
NAME="AEN2214"
></A
><PRE
CLASS="SCREEN"
>	INSERT INTO tablename (greeting) VALUES (N'Hall&aring;')</PRE
><P
></P
></DIV
>				</P
><H3
CLASS="BRIDGEHEAD"
><A
NAME="AEN2216"
></A
>Commentary</H3
><P
>What's surprising about this?  Versions 7.0 and later of the TDS protocol use UCS-2 to send SQL text.  No matter how your local client is configured &mdash; with <ACRONYM
CLASS="ACRONYM"
>UCS-2</ACRONYM
> or <ACRONYM
CLASS="ACRONYM"
>ISO 8859-1</ACRONYM
> or anything else &mdash; it's converted to <ACRONYM
CLASS="ACRONYM"
>UCS-2</ACRONYM
> before it's sent to the server.  And obviously arrives at the server as <ACRONYM
CLASS="ACRONYM"
>UCS-2</ACRONYM
>.  If the column into which it's being inserted is also <ACRONYM
CLASS="ACRONYM"
>UCS-2</ACRONYM
>, there's no need of <SPAN
CLASS="emphasis"
><I
CLASS="EMPHASIS"
>any</I
></SPAN
> conversion, much less two, and <SPAN
CLASS="emphasis"
><I
CLASS="EMPHASIS"
>certainly</I
></SPAN
> no need to lose infomation.</P
><P
>Why this happens is anyone's guess.  Here's one: it makes the datatype of the column unimportant.  Regardless of whether you use char/varchar/text or nchar/nvarchar/ntext or a mixture of the two, the arriving SQL (if na&iuml;vely written) will store exactly the same characters.</P
></DIV
></DIV
><H3
CLASS="FOOTNOTES"
>Notes</H3
><TABLE
BORDER="0"
CLASS="FOOTNOTES"
WIDTH="100%"
><TR
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="5%"
><A
NAME="FTN.AEN2150"
HREF="localization.htm#AEN2150"
><SPAN
CLASS="footnote"
>[1]</SPAN
></A
></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="95%"
><P
>The built-in replacement expects GNU iconv names: <TT
CLASS="LITERAL"
>ISO-8859-1</TT
>, <TT
CLASS="LITERAL"
>US-ASCII</TT
>, or <TT
CLASS="LITERAL"
>UTF-8</TT
>.</P
></TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="5%"
><A
NAME="FTN.AEN2184"
HREF="localization.htm#AEN2184"
><SPAN
CLASS="footnote"
>[2]</SPAN
></A
></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="95%"
><P
>not EBCDIC or other weird charsets</P
></TD
></TR
><TR
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="5%"
><A
NAME="FTN.AEN2205"
HREF="localization.htm#AEN2205"
><SPAN
CLASS="footnote"
>[3]</SPAN
></A
></TD
><TD
ALIGN="LEFT"
VALIGN="TOP"
WIDTH="95%"
><P
>The precise rules are unknown to the author.</P
></TD
></TR
></TABLE
><DIV
CLASS="NAVFOOTER"
><HR
ALIGN="LEFT"
WIDTH="100%"><TABLE
SUMMARY="Footer navigation table"
WIDTH="100%"
BORDER="0"
CELLPADDING="0"
CELLSPACING="0"
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
><A
HREF="configs.htm"
ACCESSKEY="P"
>Prev</A
></TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="index.htm"
ACCESSKEY="H"
>Home</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
><A
HREF="domains.htm"
ACCESSKEY="N"
>Next</A
></TD
></TR
><TR
><TD
WIDTH="33%"
ALIGN="left"
VALIGN="top"
>Advanced Configurations</TD
><TD
WIDTH="34%"
ALIGN="center"
VALIGN="top"
><A
HREF="configs.htm"
ACCESSKEY="U"
>Up</A
></TD
><TD
WIDTH="33%"
ALIGN="right"
VALIGN="top"
>Domain Logins</TD
></TR
></TABLE
></DIV
></BODY
></HTML
>