Difference between revisions of "User:Alvonruff/ISFDB2 Notes"

From ISFDB

Revision as of 06:48, 28 April 2022

The isfdb2 staging system is a minimal system, with few packages installed, which uses dnf instead of apt-get.

Prerequisites

The staging system is a minimal-configuration AlmaLinux system, which is a community rebuild of Red Hat Enterprise Linux (a Fedora relative). It's really intended for tight cloud installations, so almost everything is missing, and packages are installed with yum/dnf.

  • dnf install gcc
  • dnf install make
  • dnf install tar
  • dnf install zip.x86_64
  • dnf install bzip2.x86_64
  • dnf install wget

Apache

  • dnf install httpd
  • firewall-cmd --add-service=http --add-service=https --permanent
  • service httpd start

MySQL

  • dnf update
  • dnf module enable mysql:8.0
  • dnf install @mysql
  • systemctl enable mysqld
  • systemctl start mysqld
  • Issue: mysql
  • While in mysql, issue the command: create database isfdb;
  • While in mysql, issue the command: use isfdb;
  • While in mysql, issue the command: alter database isfdb character set latin1 collate latin1_swedish_ci;
  • While in mysql, issue the command: source <<backupfile>>;
  • GRANT ALL PRIVILEGES ON isfdb.* TO 'isfdb1'@'localhost';

Python 2.7.18

  • dnf install python2.x86_64
  • dnf install python2-devel.x86_64
  • dnf install mysql-devel.x86_64
  • pip2 install mysqlclient

Versions

  • Linux: 4.18.0-240.15.1.el8_3.x86_64 x86_64
  • Apache: Apache/2.4.37 (AlmaLinux)
  • MySQL: 8.0.26
  • Python: 2.7.18

Charset Experiments

I have a python script for generating Wikipedia article stubs from the ISFDB tables in MySQL. It was run on both isfdb.org and isfdb2.org. Running diff on the outputs shows:

4c4
< | name        = Philip José Farmer
---
> | name        = Philip José Farmer

If, however, the files are brought up in the vim text editor, they both appear to be correct. If I pull the name string out of each file and run od -X --endian=big STRING_FILE, the results are (with hand annotation):

0000000 5068696c 6970204a 6f73e920 4661726d         Phil   ip J   osé   Farm
0000020 65720a00

0000000 5068696c 6970204a 6f73c3a9 20466172         Phil   ip J   osé   Far
0000020 6d65720a
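
The two dumps differ only in how the é is encoded: one byte (e9) in the first file, two bytes (c3 a9) in the second. As a rough check, the sketch below encodes the same name as Latin-1 and as UTF-8 and prints the bytes in 4-byte groups similar to od -X. The hexdump helper and variable names are made up for illustration, and the code is written to run under both Python 2.7 and Python 3.

```python
from __future__ import print_function
import binascii

# U+00E9 is e-acute; written as an escape so the source stays pure ASCII.
name = u"Philip Jos\u00e9 Farmer\n"

latin1_bytes = name.encode("latin-1")  # e-acute becomes the single byte 0xE9
utf8_bytes = name.encode("utf-8")      # e-acute becomes the two bytes 0xC3 0xA9

def hexdump(data):
    """Return the bytes as hex text, grouped into 4-byte words."""
    hexed = binascii.hexlify(data).decode("ascii")
    return " ".join(hexed[i:i + 8] for i in range(0, len(hexed), 8))

print(hexdump(latin1_bytes))  # 5068696c 6970204a 6f73e920 4661726d 65720a
print(hexdump(utf8_bytes))    # 5068696c 6970204a 6f73c3a9 20466172 6d65720a
```

Apart from the trailing 00 in the first dump, which is not in the string and looks like od padding the last word, the output matches the first file as Latin-1 and the second as UTF-8.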

This generates 2 questions:

  1. Why is the output different on the two systems (Python is clearly generating more, and different, bytes on one of them), and
  2. Why does the file appear correct when viewed inside a vim session?

The answer to the first question is highlighted by the answer to the second question: vim uses UTF-8 as its default charset. I made the following changes to common/isfdb.py:
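
One way to picture how both files can look right in an editor: a reader that tries UTF-8 first and falls back to Latin-1 recovers the same text from either byte sequence. The sketch below only illustrates that idea. decode_guess is a hypothetical helper, and this is not a description of vim's actual detection logic.

```python
from __future__ import print_function

def decode_guess(data):
    """Decode bytes as UTF-8 if they are valid UTF-8, otherwise as Latin-1.

    Returns (text, encoding_used). The Latin-1 fallback cannot fail,
    because every byte value maps to a code point.
    """
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return data.decode("latin-1"), "latin-1"

latin1_name = b"Philip Jos\xe9 Farmer"    # bytes as in the first dump
utf8_name = b"Philip Jos\xc3\xa9 Farmer"  # bytes as in the second dump

text_a, enc_a = decode_guess(latin1_name)
text_b, enc_b = decode_guess(utf8_name)

print(enc_a, enc_b)      # latin-1 utf-8
print(text_a == text_b)  # True
```

A browser does no such guessing when the page declares its charset, so the declared charset has to match the bytes actually sent.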

[1] Altered the html content type output from:

print 'Content-type: text/html; charset=%s\n' % UNICODE

to

print 'Content-type: text/html; charset=%s\n' % "UTF-8"

[2] Altered the meta tag string from:

print '<meta http-equiv="content-type" content="text/html; charset=%s" >' % UNICODE

to:

print '<meta charset="UTF-8"/>'
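
The two edits above hard-code "UTF-8" in two places. A minimal sketch of an alternative, assuming one wanted a single setting to drive both the HTTP header and the meta tag: PAGE_CHARSET and the three helper functions below are hypothetical names, not anything that exists in common/isfdb.py, and the code is written to run under both Python 2.7 and Python 3.

```python
from __future__ import print_function

# Hypothetical constant for this sketch; not the UNICODE value used in isfdb.py.
PAGE_CHARSET = "UTF-8"

def content_type_header(charset=PAGE_CHARSET):
    """Build the CGI Content-type header, including the blank line after it."""
    return "Content-type: text/html; charset=%s\n\n" % charset

def meta_charset_tag(charset=PAGE_CHARSET):
    """Build the HTML meta tag declaring the same charset."""
    return '<meta charset="%s"/>' % charset

def encode_body(text, charset=PAGE_CHARSET):
    """Encode page text into bytes in the declared charset."""
    return text.encode(charset)

print(content_type_header(), end="")
print(meta_charset_tag())
```

Whatever value is declared, the bytes written to the page need to be in that same encoding. Declaring UTF-8 while emitting Latin-1 bytes (or the reverse) would bring back the kind of mismatch seen in the diff.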

The output now appears normal. I need to run with this for a while to see if there are any untoward side effects. This also does not answer the question of why the unmodified code runs fine on the original isfdb.org.