Difference between revisions of "User:Alvonruff/The Charset Problem"

From ISFDB

Revision as of 07:45, 29 April 2022

XXX

  • Browser - The browser simply follows the charset given by the HTML content-type indicator and by the <meta> tag. This definitely affects how the text appears: changing it was one of the first hacks attempted on ISFDB2.
  • Apache - Apache now has a configurable default charset. It is UTF-8, based on this entry in the config file: AddDefaultCharset UTF-8
  • ISFDB Scripts - Whatever is stored in the UNICODE variable in localdefs.py, which is currently ISO-8859-1 (latin1).
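  • Python 2.7 - In a stock install the default string encoding (sys.getdefaultencoding()) is ASCII, not UTF-8; UTF-8 is the default only in Python 3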
  • MySQLdb - ??
  • MySQL - Set to latin1
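
When two of these layers disagree, the same bytes end up being interpreted under different charsets. A minimal Python sketch of that effect follows; the sample string is illustrative (not taken from the ISFDB data) and misread is a hypothetical helper name.

```python
def misread(text, written_as, read_as):
    """Encode text with one charset, then decode the bytes with another.

    Undecodable bytes are replaced with U+FFFD instead of raising.
    """
    return text.encode(written_as).decode(read_as, "replace")


name = u"Jos\u00e9"  # "Jose" with an acute accent on the e

# Writer and reader agree: the text survives unchanged.
same = misread(name, "latin-1", "latin-1")

# Written as UTF-8, read as latin1: each UTF-8 byte becomes its own character.
mojibake = misread(name, "utf-8", "latin-1")

# Written as latin1, read as UTF-8: the lone 0xE9 byte is not valid UTF-8.
damaged = misread(name, "latin-1", "utf-8")
```

The second case is the familiar two-characters-per-accent garbling; the third is the replacement-character (or question mark) symptom.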

MySQLdb

The Connection() function takes optional arguments named use_unicode and charset (these only work with MySQL 4.1 and newer).

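import MySQLdb as mysql

# charset selects the connection character set;
# use_unicode=True returns text columns as unicode objects.
conn = mysql.connect(host='127.0.0.1',
                     user='user',
                     passwd='passwd',
                     db='db',
                     charset='utf8',
                     use_unicode=True)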
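
The charset argument expects a MySQL charset name (latin1, utf8), while the ISFDB scripts store a Python-style name such as ISO-8859-1. The sketch below shows one way to translate between the two before connecting. The helper connect_kwargs and the mapping table are hypothetical, not part of the ISFDB code or of MySQLdb, and the table only covers the two charsets discussed here.

```python
import codecs

# Assumed mapping from Python's canonical codec names to MySQL charset names.
_MYSQL_NAMES = {
    "iso8859-1": "latin1",
    "utf-8": "utf8",
}


def connect_kwargs(python_charset, **base):
    """Return keyword arguments for MySQLdb.connect() with charset filled in."""
    # codecs.lookup() normalises aliases such as "latin1" or "ISO-8859-1".
    canonical = codecs.lookup(python_charset).name
    if canonical not in _MYSQL_NAMES:
        raise ValueError("no MySQL charset known for %r" % (python_charset,))
    kwargs = dict(base)
    kwargs.update(charset=_MYSQL_NAMES[canonical], use_unicode=True)
    return kwargs
```

It could then be used as MySQLdb.connect(**connect_kwargs("ISO-8859-1", host='127.0.0.1', user='user', passwd='passwd', db='db')), so the scripts and the connection are guaranteed to name the same charset.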

MySQL

The current ISFDB character set of the MySQL database is latin1 (ISO-8859-1):

mysql> select default_character_set_name, default_collation_name from information_schema.schemata where schema_name='isfdb';
+----------------------------+------------------------+
| DEFAULT_CHARACTER_SET_NAME | DEFAULT_COLLATION_NAME |
+----------------------------+------------------------+
| latin1                     | latin1_swedish_ci      |
+----------------------------+------------------------+

That said, there are other MySQL charset variables to look at. On ISFDB1, we have:

mysql> show variables like '%character_set%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     | 
| character_set_connection | latin1                     | 
| character_set_database   | latin1                     | 
| character_set_filesystem | binary                     | 
| character_set_results    | latin1                     | 
| character_set_server     | latin1                     | 
| character_set_system     | utf8                       | 
| character_sets_dir       | /usr/share/mysql/charsets/ | 
+--------------------------+----------------------------+

While on ISFDB2, MySQL defaulted these variables to:

mysql> show variables like '%character_set%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8mb4                    |
| character_set_connection | utf8mb4                    |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | utf8mb4                    |
| character_set_server     | utf8mb4                    |
| character_set_system     | utf8mb3                    |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
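
To make the differences between the two servers easier to see, the two listings above can be compared mechanically. This is a small sketch with the values copied from the tables; the function name differing is hypothetical.

```python
ISFDB1 = {
    "character_set_client": "latin1",
    "character_set_connection": "latin1",
    "character_set_database": "latin1",
    "character_set_filesystem": "binary",
    "character_set_results": "latin1",
    "character_set_server": "latin1",
    "character_set_system": "utf8",
    "character_sets_dir": "/usr/share/mysql/charsets/",
}

ISFDB2 = {
    "character_set_client": "utf8mb4",
    "character_set_connection": "utf8mb4",
    "character_set_database": "latin1",
    "character_set_filesystem": "binary",
    "character_set_results": "utf8mb4",
    "character_set_server": "utf8mb4",
    "character_set_system": "utf8mb3",
    "character_sets_dir": "/usr/share/mysql/charsets/",
}


def differing(old, new):
    """Map each variable whose value changed to its (old, new) pair."""
    return {
        name: (old[name], new.get(name))
        for name in sorted(old)
        if old[name] != new.get(name)
    }


changed = differing(ISFDB1, ISFDB2)
```

Five variables differ: client, connection, results and server moved from latin1 to utf8mb4, while character_set_database stayed at latin1. The character_set_system change is probably only cosmetic, since newer MySQL versions report the old utf8 charset under the name utf8mb3.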

These variables can be set for the current session from the mysql command-line client by issuing the following commands:

  • set character_set_results = 'latin1';
  • set character_set_server = 'latin1';
  • set character_set_client = 'latin1';
  • set character_set_connection = 'latin1';

character_set_system is a read-only variable and cannot be changed at runtime. Changing the four variables above had no observable effect on the issue.
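
For repeating this experiment from a script rather than by hand, note that MySQL's SET NAMES statement sets character_set_client, character_set_connection and character_set_results together. The sketch below issues it through any DB-API cursor. The function name and the allow-list are assumptions for illustration, and since the manual changes had no observable effect, this is a way to reproduce the test rather than a fix.

```python
# Charset names cannot be passed as query parameters, so restrict them
# to a known set before building the statement (assumed list).
ALLOWED_CHARSETS = frozenset(["latin1", "utf8", "utf8mb4"])


def set_session_charset(cursor, charset):
    """Apply the four session-level charset settings to an open cursor."""
    if charset not in ALLOWED_CHARSETS:
        raise ValueError("unexpected charset: %r" % (charset,))
    # Covers character_set_client, character_set_connection
    # and character_set_results in one statement.
    cursor.execute("SET NAMES '%s'" % charset)
    # character_set_server is not touched by SET NAMES.
    cursor.execute("SET character_set_server = '%s'" % charset)
```

With a MySQLdb connection this would be called as set_session_charset(conn.cursor(), 'latin1') right after connecting.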