[rt-users] Migration from MySQL to PostgreSQL - corrupt characters (German umlauts)

Maik Nergert maik.nergert at uni-hamburg.de
Thu Sep 17 08:12:27 EDT 2015


Hi RT Users,

I'm testing the migration from MySQL to PostgreSQL and I'm experiencing 
problems with LATIN1 characters (particularly German umlauts) after the 
migration. They look like Ã¼ instead of ü.

Hex code of ü in UTF-8 → c3 bc
After the migration each of these bytes is encoded again → c3 83 (Ã) and c2 bc (¼)
(http://www.utf8-zeichentabelle.de/)
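
To make the byte values above easier to check, here is a minimal Python sketch that reproduces them. It assumes the corruption comes from the UTF-8 bytes being read as LATIN1 at some point and then encoded to UTF-8 a second time; where exactly that happens in the import is not known. The variable names are only for illustration.

```python
# Reproduce the double encoding with the standard library only.
original = "ü"
utf8_bytes = original.encode("utf-8")        # correct UTF-8 bytes for ü

# Assumption: the UTF-8 bytes are misread as LATIN1 text somewhere in the import ...
misread = utf8_bytes.decode("latin-1")       # two characters: Ã and ¼
# ... and that text is then encoded to UTF-8 a second time.
double_encoded = misread.encode("utf-8")

print(utf8_bytes.hex(" "))       # c3 bc
print(double_encoded.hex(" "))   # c3 83 c2 bc
```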


First I upgraded RT from 3.8 to 4.2 on the MySQL DB (utf8), and 
everything went smoothly.

The new server has RT, MySQL and PostgreSQL freshly installed from the package manager.

System (Debian Jessie)
request-tracker: 4.2.8-3+deb8u1
mysql-server: 5.5.44-0+deb8u1
postgresql: 9.4.4-0+deb8u1
apache: 2.4.10-10+deb8u3
php: 5.6.13+dfsg-0+deb8u1


Then I followed this tutorial to migrate the DB:
http://requesttracker.wikia.com/wiki/MigrateMysql2PostgresqlV4


The binary files, generated by rt-serializer --clone, include characters 
like öäü so I suppose that there is something going wrong while 
importing to Pg.

'rt-setup-database' creates the Pg DB as UTF8.


postgres=# \l
                              List of databases
   Name    |  Owner   | Encoding  | Collate | Ctype |   Access privileges
-----------+----------+-----------+---------+-------+-----------------------
 postgres  | postgres | SQL_ASCII | C       | C     |
 rt4       | rt_user  | UTF8      | C       | C     |
 template0 | postgres | SQL_ASCII | C       | C     | =c/postgres          +
           |          |           |         |       | postgres=CTc/postgres
 template1 | postgres | SQL_ASCII | C       | C     | =c/postgres          +
           |          |           |         |       | postgres=CTc/postgres



postgres=# show server_encoding;
   server_encoding
-----------------
   SQL_ASCII
(1 row)


postgres=# show client_encoding;
   client_encoding
-----------------
   UTF8
(1 row)

rt4=# select subject from tickets where id=82527;
                           subject
----------------------------------------------------------
   Hardware prÃ¼fen
(1 row)


Newly created tickets with umlauts are displayed correctly; only the 
imported ones are corrupt!



The imported subject is displayed correctly once client_encoding is 
switched to LATIN1, either via “set client_encoding='LATIN1';” or 
directly in /etc/postgresql/9.4/main/postgresql.conf:
client_encoding = latin1    #(default would be the database encoding utf8)


rt4=# set client_encoding='LATIN1';
SET
rt4=# select subject from tickets where id=82527;
                           subject
----------------------------------------------------------
Hardware prüfen
(1 row)
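
My understanding of why LATIN1 helps here: the server converts the stored characters Ã and ¼ into the single bytes c3 and bc, and a UTF-8 terminal renders those two bytes as ü. The same step can be sketched in Python. This is only an illustration under the assumption that the data went through exactly one extra round of encoding; undo_double_encoding is a made-up helper name, not something from RT or PostgreSQL.

```python
def undo_double_encoding(text: str) -> str:
    """Reverse one round of 'UTF-8 bytes read as LATIN1' corruption.

    Text that does not fit the pattern (already correct, or containing
    characters outside LATIN1) is returned unchanged.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# Corrupt form of the subject (Ã¼ in place of ü), written with escapes.
corrupt = "Hardware pr\u00c3\u00bcfen"
print(undo_double_encoding(corrupt))
```

Correctly stored values such as newly created tickets pass through untouched, because their LATIN1 bytes are not valid UTF-8 and the helper falls back to the input.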



Can someone help me to migrate the db with a full utf8 setup?



Best,
Maik

