From owner-man-jp@jp.freebsd.org  Sat Mar 11 15:11:09 2000
Received: (from daemon@localhost)
	by castle.jp.freebsd.org (8.9.3+3.2W/8.7.3) id PAA81184;
	Sat, 11 Mar 2000 15:11:09 +0900 (JST)
	(envelope-from owner-man-jp@jp.FreeBSD.org)
Received: from mailgw1.be.to (mailgw1.be.to [210.235.212.5])
	by castle.jp.freebsd.org (8.9.3+3.2W/8.7.3) with ESMTP id PAA81179
	for <man-jp@jp.freebsd.org>; Sat, 11 Mar 2000 15:11:08 +0900 (JST)
	(envelope-from okazaki@be.to)
Received: from mail1.be.to (point1.be.to [210.235.212.29])
	by mailgw1.be.to (8.9.3+3.2W/BETO.2.1-2000030802000035) with ESMTP id PAA16783
	for <man-jp@jp.freebsd.org>; Sat, 11 Mar 2000 15:11:05 +0900
Received: from acidrain (ppp32-Mobara1.mtci.ne.jp [210.172.1.234])
	by mail1.be.to (8.8.8+3.0Wbeta13/BETO.2.0-1999110714000000) with SMTP id PAA10114
	for <man-jp@jp.freebsd.org>; Sat, 11 Mar 2000 15:10:56 +0900
Received: (qmail 4176 invoked from network); 11 Mar 2000 06:10:35 -0000
Received: from localhost (HELO acidrain.localnet) (127.0.0.1)
  by localhost with SMTP; 11 Mar 2000 06:10:35 -0000
Date: Sat, 11 Mar 2000 15:10:34 +0900
Message-ID: <86wvnayr39.wl@dolphin.be.to>
From: OKAZAKI Tetsurou <okazaki@be.to>
To: tech-jp@jp.freebsd.org
CC: man-jp@jp.freebsd.org
User-Agent: Wanderlust/1.1.0 (Overjoyed-pre3) REMI/1.14.1
 (=?ISO-8859-4?Q?Mushigawa=F2sugi?=) Chao/1.14.0 (Momoyama) APEL/10.2
 Emacs/20.5 (i386--freebsd) MULE/4.0 (HANANOEN)
Organization: Unknown
MIME-Version: 1.0 (generated by REMI 1.14.1 - =?ISO-8859-4?Q?=22Mushigawa=F2?=
 =?ISO-8859-4?Q?sugi=22?=)
Content-Type: multipart/mixed;
 boundary="Multipart_Sat_Mar_11_15:10:34_2000-1"
Reply-To: man-jp@jp.freebsd.org
Precedence: list
X-Distribute: distribute version 2.1 (Alpha) patchlevel 24e+990727
X-Sequence: man-jp 2199
Subject: [man-jp 2199] Forward: [Groff] Unicode, EBCDIC, Latin-2, JIS for groff
Errors-To: owner-man-jp@jp.freebsd.org
Sender: owner-man-jp@jp.freebsd.org
X-Originator: okazaki@be.to

--Multipart_Sat_Mar_11_15:10:34_2000-1
Content-Type: text/plain; charset=ISO-2022-JP

This is Okazaki, the ja-groff port maintainer.

This message is addressed as follows:

To: tech-jp@jp.freebsd.org
CC: man-jp@jp.freebsd.org

The GNU troff discussion list <groff@ffii.org> is
soliciting opinions on I18N for groff.
If you are an I18N expert, please do send them your comments. _o_



--Multipart_Sat_Mar_11_15:10:34_2000-1
Content-Type: message/rfc822

Return-Path: <groff-admin@ffii.org>
To: enf@pobox.com, S.Barany@infosys.tuwien.ac.at,
        Yoshiaki Yanagihara
 <yochi@debian.or.jp>,
        "S. Ciszewski" <grciszew@kinga.cyf-kr.edu.pl>
Cc: groff@ffii.org
From: Werner LEMBERG <sx0005@sx2.hrz.uni-dortmund.de>
In-Reply-To: <20000229224216W.sx0005@sx2.hrz.uni-dortmund.de>
References: <200002291920.NAA66959@216-80-13-65.d.enteract.com>
	<20000229224216W.sx0005@sx2.hrz.uni-dortmund.de>
X-Mailer: Mew version 1.94 on Emacs 20.6 / Mule 4.0 (HANANOEN)
Reply-To: Werner LEMBERG <wl@gnu.org>
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-Id: <20000310183900D.sx0005@sx2.hrz.uni-dortmund.de>
Date: Fri, 10 Mar 2000 18:39:00 GMT
X-Dispatcher: imput version 990905(IM130)
Lines: 35
Subject: [Groff] Unicode, EBCDIC, Latin-2, JIS for groff
Sender: groff-admin@ffii.org
Errors-To: groff-admin@ffii.org
X-Mailman-Version: 1.1
Precedence: bulk
List-Id: GNU troff discussion list <groff.ffii.org>
X-BeenThere: groff@ffii.org


It's amazing to see that people are interested in having Unicode
or EBCDIC input within gtroff.

Other people want Latin-2, and still others want Japanese...

How can this best be handled?

My suggestion is to enlarge gtroff so that it can handle arbitrary
31-bit characters (this covers ISO 10646).  Values with the 32nd
bit set (i.e. negative numbers in a signed 32-bit integer) can then be
used for special gtroff `characters' like `ESCAPE_c'.
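
A minimal sketch of that split, written in Python purely for
illustration (it is not gtroff code).  Only the name `ESCAPE_c' comes
from the text above; its value of -1 and the helper names are
assumptions made for this example.

```python
# Largest 31-bit value: the upper end of the ISO 10646 code space
# referred to in the message.
MAX_UCS = 0x7FFFFFFF

# Hypothetical special token.  Only the name appears in the message;
# the value -1 is an assumption for this sketch.
ESCAPE_c = -1


def to_int32(word: int) -> int:
    """Interpret a 32-bit word as a signed two's-complement integer."""
    word &= 0xFFFFFFFF
    return word - 0x100000000 if word & 0x80000000 else word


def is_character(value: int) -> bool:
    """True for ordinary characters: 32nd bit clear, so non-negative."""
    return 0 <= value <= MAX_UCS


def is_special(value: int) -> bool:
    """True for special tokens: 32nd bit set, so negative."""
    return value < 0
```

With this layout one sign test separates ordinary characters from
special tokens, and no real character value is used up by the tokens.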

gtroff should use Unicode (or ISO 10646) as the internal encoding and
nothing else.

Question: How far along is the Unicode input project?

Additionally, I suggest using UTF-8 exclusively as the external
encoding if, say, the command-line option `-u' is given.
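
To illustrate how 31-bit values fit into UTF-8, here is a rough Python
sketch of the encoder as UTF-8 was defined in RFC 2279, which allowed
sequences of up to six bytes.  The function name is made up for this
example, and the sketch does not reject surrogate code points.  Note
that present-day UTF-8 (RFC 3629) stops at U+10FFFF, so Python's
built-in codec will not encode the larger values.

```python
def utf8_encode_31bit(cp: int) -> bytes:
    """Encode one 31-bit code point using the RFC 2279 form of UTF-8."""
    if not 0 <= cp <= 0x7FFFFFFF:
        raise ValueError("code point outside the 31-bit range")
    if cp < 0x80:
        return bytes([cp])  # plain ASCII: one byte, unchanged
    # (exclusive upper limit, lead-byte prefix, continuation bytes)
    for limit, lead, ncont in (
        (0x800, 0xC0, 1),
        (0x10000, 0xE0, 2),
        (0x200000, 0xF0, 3),
        (0x4000000, 0xF8, 4),
        (0x80000000, 0xFC, 5),
    ):
        if cp < limit:
            break
    # The lead byte carries the topmost bits; each continuation byte
    # carries the next six bits, prefixed with binary 10.
    out = [lead | (cp >> (6 * ncont))]
    for i in range(ncont - 1, -1, -1):
        out.append(0x80 | ((cp >> (6 * i)) & 0x3F))
    return bytes(out)
```

For code points up to U+10FFFF (surrogates aside) this agrees with
Python's own "utf-8" codec; the largest 31-bit value, 0x7FFFFFFF,
comes out as the six bytes FD BF BF BF BF BF.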

Groff should then come with a character set conversion tool (as a
preprocessor; maybe with heuristics to recognize the proper encoding?)
to map every input encoding (e.g. Latin-2, JIS, and the EBCDIC
charsets) to Unicode in UTF-8 representation.
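
The core of such a conversion step might look like the Python sketch
below.  The table and function names are invented for this example,
the codec names are those shipped with CPython, and cp037 stands in
for just one of the several EBCDIC code pages.  No encoding detection
heuristics are attempted; the caller names the input charset.

```python
# Map the charsets named in the message to Python codec names.
# This table is an assumption for the sketch, not part of groff.
SOURCE_CODECS = {
    "latin-2": "iso8859_2",
    "jis": "iso2022_jp",
    "ebcdic": "cp037",  # one EBCDIC code page among several
}


def to_utf8(data: bytes, charset: str) -> bytes:
    """Decode data from the named charset and re-encode it as UTF-8."""
    codec = SOURCE_CODECS[charset.lower()]
    return data.decode(codec).encode("utf-8")
```

A filter built around this function would sit in front of gtroff in
the pipeline, so that gtroff itself only ever sees UTF-8 input.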

On the output side, I think that no essential changes are necessary
(except better support for very large fonts, since gtroff's font handling
mechanism isn't very efficient here).  Of course, grops, for example,
should be extended to support CID-keyed PS fonts.

Comments please.


    Werner

_______________________________________________
Groff maillist  -  Groff@ffii.org
http://ffii.org/mailman/listinfo/groff

--Multipart_Sat_Mar_11_15:10:34_2000-1--
