{"id":36,"date":"2015-05-23T17:20:00","date_gmt":"2015-05-23T17:20:00","guid":{"rendered":"https:\/\/ahm.basfinans.com\/index.php\/2015\/05\/23\/java-and-arabic-support\/"},"modified":"2015-05-23T17:20:00","modified_gmt":"2015-05-23T17:20:00","slug":"java-and-arabic-support","status":"publish","type":"post","link":"https:\/\/ahm.basfinans.com\/index.php\/2015\/05\/23\/java-and-arabic-support\/","title":{"rendered":"Java and Arabic Support"},"content":{"rendered":"<div itemprop=\"description articleBody\" style=\"font-size: 15px; line-height: 1.4; position: relative; width: 578px;\">\n[THIS IS AN OLD ARTICLE, republished for the benefit of friends]<\/div>\n<div itemprop=\"description articleBody\" style=\"font-size: 15px; line-height: 1.4; position: relative; width: 578px;\">\n<br \/>\n<strong><\/strong>Java uses Unicode as its native character encoding, so all text must be converted to Unicode before Java can handle it properly. Java already supports almost all known encodings; see:&nbsp;<a href=\"http:\/\/java.sun.com\/products\/jdk\/1.1\/docs\/guide\/intl\/encoding.doc.html.\" rel=\"nofollow\" style=\"color: #6699cc; text-decoration: none;\">http:\/\/java.sun.com\/products\/jdk\/1.1\/docs\/guide\/intl\/encoding.doc.html.<\/a><\/p>\n<p>Our concern is how to handle the input and the output. The input comes from request parameters (in web development), files, Properties, and JDBC. 
The output goes to the browser through the HttpServletResponse object, or to a file, &#8230;<\/p>\n<p><span><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><strong style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">Converting text strings:<\/strong><\/span><\/p>\n<ul style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px; list-style-image: initial; list-style-position: initial; margin: 0.5em 0px; padding: 0px 2.5em;\"><span><\/p>\n<li style=\"border: none; margin: 0px 0px 0.25em; padding: 0px;\">The String class already supports conversion to Unicode; see the String constructors that take an encoding as a parameter.<\/li>\n<li style=\"border: none; margin: 0px 0px 0.25em; padding: 0px;\">The String class can convert to any supported encoding; see the getBytes() method that takes an encoding parameter.<\/li>\n<li style=\"border: none; margin: 0px 0px 0.25em; padding: 0px;\">The content of a String is always Unicode, so it can convert non-Unicode input to Unicode, or give you the non-Unicode bytes on request through getBytes().<\/li>\n<li style=\"border: none; margin: 0px 0px 0.25em; padding: 0px;\">A running example is available at&nbsp;<a href=\"http:\/\/java.sun.com\/docs\/books\/tutorial\/i18n\/text\/string.html\" rel=\"nofollow\" style=\"color: #6699cc; text-decoration: none;\">http:\/\/java.sun.com\/docs\/books\/tutorial\/i18n\/text\/string.html<\/a><\/li>\n<li style=\"border: none; margin: 0px 0px 0.25em; padding: 0px;\">You can also use Charset, CharsetDecoder and CharsetEncoder.<\/li>\n<p><\/span><\/ul>\n<p><span><br \/>\n<br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><strong style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">Accessing files: (Input\/Output)<\/strong><br style=\"color: #333333; 
font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><span style=\"background-color: white; color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">You can access files using the normal Java classes, but be aware that if you do not specify an encoding, the file input classes use the system property file.encoding and convert the file content to Unicode based on it.<\/span><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><span style=\"background-color: white; color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">To find the system file encoding, read System.getProperty(&quot;file.encoding&quot;)<\/span><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><span style=\"background-color: white; color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">So if you are writing I18n (internationalized) applications, you should specify the encoding whenever you read from or write to files:<\/span><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><span style=\"background-color: white; color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">Use InputStreamReader and OutputStreamWriter to specify the encoding you want; see Character and Byte Streams 
at:&nbsp;<\/span><a href=\"http:\/\/java.sun.com\/docs\/books\/tutorial\/i18n\/text\/stream.html\" rel=\"nofollow\" style=\"color: #6699cc; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px; text-decoration: none;\">http:\/\/java.sun.com\/docs\/books\/tutorial\/i18n\/text\/stream.html<\/a><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><strong style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">Request Input Arabic Parameters:<\/strong><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><span style=\"background-color: white; color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">If you pass Arabic text in URLs, the parameter is decoded using the default system encoding (Cp1252 here), so you receive a garbled Unicode string. To recover the text, convert the parameter back to Cp1252 bytes, then decode those bytes to Unicode as Cp1256; see the example:<\/span><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><span style=\"background-color: white; color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">String aParUnicode = request.getParameter(&quot;apar&quot;);<\/span><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><span style=\"background-color: white; color: 
#333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">byte[] by1252 = aParUnicode.getBytes(&quot;Cp1252&quot;);<\/span><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><span style=\"background-color: white; color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">aParUnicode = new String(by1252, &quot;Cp1256&quot;);<\/span><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><span style=\"background-color: white; color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">Since Servlet 2.3, you can set the encoding of the request before reading any parameters and get the parameters as correct Unicode.<\/span><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><span style=\"background-color: white; color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">request.setCharacterEncoding(&quot;Cp1256&quot;);<\/span><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><strong style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">Response and Locale Object:<\/strong><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br 
style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><span style=\"background-color: white; color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">Every running application already has a Locale object. Server applications should specify the Locale that matches their language; this enables automatic conversion of strings based on that locale (see the Locale class). HttpServletResponse already has a method for setting the locale, called setLocale.<\/span><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><span style=\"background-color: white; color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">By default the Locale is set to en (English), even if your default Windows language is Arabic, so all Unicode will be converted to 8859_1 (Latin1). 
When you set the Locale to ar (Arabic), all Unicode will be converted to 8859_6 (ISO Latin\/Arabic Alphabet) and displayed correctly as Arabic (ISO).<\/span><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><span style=\"background-color: white; color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">ISO8859_6 is preferable to Cp1256, as it is compatible with Unix\/Mac and Windows, not just Windows.<\/span><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><strong style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">Properties class:<\/strong><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><span style=\"background-color: white; color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">The Properties class is hard-coded to read files as 8859_1 when loading from an InputStream; any other characters should be written as Unicode escape sequences. 
You can still write Arabic in key values, but it will be decoded to Unicode as if it were 8859_1, so you should convert the Unicode string back to 8859_1 bytes and then decode them to Unicode as Cp1256, or whatever the file's real encoding is.<\/span><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><span style=\"background-color: white; color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">String arabic1256 = new String(latin8859_1.getBytes(&quot;8859_1&quot;), &quot;Cp1256&quot;);<\/span><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><span style=\"background-color: white; color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">The above example converts the string from Unicode back to 8859_1 bytes and then decodes them as Cp1256; it assumes the properties file was written using ANSI Cp1256.<\/span><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><\/span><\/p>\n<ul style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px; list-style-image: initial; list-style-position: initial; margin: 0.5em 0px; padding: 0px 2.5em;\"><span><\/p>\n<li style=\"border: none; margin: 0px 0px 0.25em; padding: 0px;\">An alternative solution is to use the NetBeans IDE to write Arabic in keys: NetBeans automatically converts it to and from Unicode escape sequences, so you always see Arabic in the editor while the properties file always stores Unicode escape sequences.<\/li>\n<p><\/span><\/ul>\n<p><span><br \/>\n<br style=\"color: #333333; font-family: Arial, 
Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><strong style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">JDBC:<\/strong><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><span style=\"background-color: white; color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">Once you receive the characters as a String, they have already been converted to Unicode, so if the conversion was wrong you can convert back to the original bytes and decode them to Unicode correctly. Some JDBC drivers, such as the MySQL driver, let you set the charset so that the driver converts properly. The SQL Server driver converts to Unicode correctly based on the database encoding selected when you created the database.<\/span><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><span style=\"background-color: white; color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">Be careful if you are using the JDBC-ODBC bridge with an Access database: the driver usually fails to convert Arabic characters to Unicode if your default Windows language is not Arabic.<\/span><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><span style=\"background-color: white; color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">References<\/span><strong style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 
20px;\">:<\/strong><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><span style=\"background-color: white; color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">Java Tutorial Internationalization:<\/span><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><a href=\"http:\/\/java.sun.com\/docs\/books\/tutorial\/i18n\/index.html\" rel=\"nofollow\" style=\"color: #6699cc; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px; text-decoration: none;\">http:\/\/java.sun.com\/docs\/books\/tutorial\/i18n\/index.html<\/a><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><span style=\"background-color: white; color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">Good ole&#8217; ASCII :<\/span><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><a href=\"http:\/\/czyborra.com\/charsets\/iso646.html\" rel=\"nofollow\" style=\"color: #6699cc; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px; text-decoration: none;\">http:\/\/czyborra.com\/charsets\/iso646.html<\/a><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><span style=\"background-color: white; color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">The ISO 8859 Alphabet Soup:<\/span><br 
style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><a href=\"http:\/\/czyborra.com\/charsets\/iso8859.html\" rel=\"nofollow\" style=\"color: #6699cc; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px; text-decoration: none;\">http:\/\/czyborra.com\/charsets\/iso8859.html<\/a><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><span style=\"background-color: white; color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">J2sdk1.4.0 documentation and API doc:<\/span><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><a href=\"http:\/\/java.sun.com\/j2se\/1.4\/docs\/guide\/intl\/index.html\" rel=\"nofollow\" style=\"color: #6699cc; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px; text-decoration: none;\">http:\/\/java.sun.com\/j2se\/1.4\/docs\/guide\/intl\/index.html<\/a><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><br style=\"color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\" \/><span style=\"background-color: white; color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\">Originally written on 22\/5\/2002<\/span><\/span><\/div>\n<div>\n<span><span style=\"background-color: white; color: #333333; font-family: Arial, Tahoma, Helvetica, FreeSans, sans-serif; line-height: 20px;\"><br \/><\/span><\/span><\/div>\n<div>From ahm507.blogspot.com<\/div>\n","protected":false},"excerpt":{"rendered":"<p>[THIS IS AN OLD ARTICLE, I republish for the sake of benefit to friends] Java uses Unicode as native encoding, so any text will be 
converted to Unicode for proper handling. Java already has support almost to all known encodings, see:&nbsp;http:\/\/java.sun.com\/products\/jdk\/1.1\/docs\/guide\/intl\/encoding.doc.html. Our involvement is how to adjust the input and the output; the Input will [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[8],"tags":[],"_links":{"self":[{"href":"https:\/\/ahm.basfinans.com\/index.php\/wp-json\/wp\/v2\/posts\/36"}],"collection":[{"href":"https:\/\/ahm.basfinans.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/ahm.basfinans.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/ahm.basfinans.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/ahm.basfinans.com\/index.php\/wp-json\/wp\/v2\/comments?post=36"}],"version-history":[{"count":0,"href":"https:\/\/ahm.basfinans.com\/index.php\/wp-json\/wp\/v2\/posts\/36\/revisions"}],"wp:attachment":[{"href":"https:\/\/ahm.basfinans.com\/index.php\/wp-json\/wp\/v2\/media?parent=36"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/ahm.basfinans.com\/index.php\/wp-json\/wp\/v2\/categories?post=36"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/ahm.basfinans.com\/index.php\/wp-json\/wp\/v2\/tags?post=36"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}