'PHP script to retrieve data from mysql and write to mssql: syntax error when accented and special characters

I’m running a php script that takes data from a mysql db and writes into a mssql db which I have no rights on except writing. Tables in first DB, where I get data from, uses utf8_general_ci. Tables in the second DB, where I write to, uses Latin1_General_CI_AS.

I didn’t write this code but I’m asked to solve this problem. The problem is that ANY accented character or symbol (like °, or even ‘) breaks the query and gives “SQLSTATE[HY000]: General error: 20018 Incorrect syntax near…”

The script uses PDO to fetch data and to write and its purpose is to write into a warehouse database to keep it updated basing on what happens in the database of the online shop (new users, new sells, user updates, etc.). The best would be to copy data as they are, keeping all the accented and special characters to keep data consistent especially for invoice info (names, addresses, etc.).

I’ve set up a custom log error

ini_set("error_log", DIR."/php-error.log");

Overwhelmed by desperation, I already tried to convert the strings before inserting them in the queries like this

  $string = mb_convert_encoding($string, 'Windows-1252', 'UTF-8'); 
  $string = mb_convert_encoding($string, 'ISO-8859-1'); 
  $string = utf8_decode($string);
  $string = mb_convert_encoding($string, 'UTF-8', 'UTF-8'); // yes, a friend of mine told me to try that  because in some cases it inexplicably helped

I tried to set the charset after the connection

  $conn = new PDO($dsn, $user, $password);
  $conn->exec("SET NAMES 'latin1'; SET CHARACTER SET latin1;");
  $conn->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

I also tried to prefix the strings in the query with the N character -> No result.

I even tried something dirty like setting an array of unwanted accented characters and replacing them with strtr(). No result. I don’t know why but in this case the characters seems to stay as they are like the accented characters are not “detected” in the strtr() but they hell are in the query (?!?!?!?).

The even stranger thing is that it seems that if I run this code from browser I have no errors in the log and query works. If the script is ran through cronjob it does. Can anyone help pointing me in the right direction?



Solution 1:[1]

  • Don't use any encode/decode functions.
  • SET NAMES 'latin1'; is telling MySQL that the client (PHP) code is encoded in latin1.
  • SHOW CREATE TABLE to see what encoding will be stored in the column (in MySQL). I do not know how to specify MSSql's charset.
  • When both parts are MySQL, those two CHARACTER SETs do not need to be the same; the encoding will be converted as needed during INSERT/SELECT.
  • To help further, please provide hex of some string that is having trouble. Try to do the hex in the database if relevant, else do it in PHP with bin2hex().
  • MySQL: SELECT col, HEX(col) FROM ...

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Rick James