User/Alvonruff/2to3

From ISFDB
Jump to navigation Jump to search

Python3 Conversion

  • Use 2to3 in a manner the same as futurize. This produces results suitable for python3.
  • Update to the official MySQL mysql.connector. There is no python3 support for MySQLdb, which is our current connector. Moving to a new connector is a prerequisite.
  • Fix the following issues:

DATE Field With Month or Day of Zero

Probably the most significant issue found so far is that the mysql.connector does not support datetime values where either the month or day are zero. As stated in the MySQL documentation:

   “Zero” date or time values used through Connector/ODBC are converted automatically to NULL because ODBC cannot handle such values.

This is a fundamental change to Python3 (See https://docs.python.org/3/library/datetime.html):

   class datetime.date(year, month, day)
       All arguments are required. Arguments must be integers, in the following ranges:
       MINYEAR <= year <= MAXYEAR
       1 <= month <= 12
       1 <= day <= number of days in the given month and year
       datetime.MINYEAR - The smallest year number allowed in a date or datetime object. MINYEAR is 1.
       datetime.MAXYEAR - The largest year number allowed in a date or datetime object. MAXYEAR is 9999.

So the following date constructions are no longer allowed in a datetime variable:

  • 0000-00-00
  • 1984-00-00
  • 1984-10-00
  • 8888-00-00
  • 9999-00-00

This means that many dates in the ISFDB are returned as NULL by mysql.connector, which is converted to None in Python, which are then displayed as 'unknown' by the ISFDB. The least invasive workaround would be to use DATE_FORMAT to convert the datetime field before it ever leaves MySQL:

    select *, DATE_FORMAT(title_copyright, '%Y-%m-%d') as title_date  from titles where title_id=92392;

This has the side effect of adding the stringified date as a new field appended to the end of the record. There would then need to be code to patch up the NULL title_copyright field:

   for record in records:
       index = len(record)-1
       record[TITLE_YEAR] = record[index]

This will make the TITLE_YEAR field a string instead of a datetime. Note that this requires the records array to be constructed with the list() operator:

       while record:
               records.append(list(record[0]))
               record = cursor.fetchmany()

MD5 is Not a Distinct Module Under Python3

Use of md5 needs to be replaced with hashlib:

   import hashlib
   result = hashlib.md5(something)

DateTime Variables Cannot be Subindexed

We have a lot of date manipulation code that looks like this:

   year = title[TITLE_YEAR][:4]

This generates an error under python3. The subfields need to be extracted with the appropriate method:

   year = title[TITLE_YEAR].year
   month = title[TITLE_YEAR].month
   day = title[TITLE_YEAR].day

This situation is complicated, since some fields are in datetime, and some are strings (due to the mysql.connector). Most of the effort in converting in a file to Python3 tends to be datetime issues. We will likely need to create a custom datetime class that supports existing year, month, and day values that a now invalid in Python3.

String Methods Cannot Be Accessed via 'string'

Python3 doesn't allow the following syntax:

   string.replace()

It instead should be:

   str.replace()
   str.strip()
   str.rstrip()

mysql.connector Doesn't Have a db.escape_string() Method

Since mysql.connector is DB API v2.0 compliant interface, the API automatically performs the escape without asking.

Surprise - Division Works Differently on Python3

The following code generates different results on Python2 vs Python3:

   checksum = 112
   remainder = (checksum/10)*10

On python2 it returns the integer 110, while on python3 it returns the float 112.0. Under python3 the division operator always returns a float. This obviously breaks the ISBN checksum calculator. The corrected code would be:

   checksum = 112
   remainder = int(checksum/10)*10