User:Alvonruff


Founder of the ISFDB.

Python3 Notes

The primary difficulty with a Python3 conversion project is avoiding a massive rewrite of the website followed by checking in all of those changes in a single big-bang integration. The primary function of the scripts is to read data from MySQL and then organize and print that information to the browser. Unfortunately, the two things that change with Python3 are the MySQL connector (which requires a rewrite of all code interfacing with MySQL) and the way print statements work (which requires a rewrite of all code outputting information). So the first goal is to find a way for the ISFDB to exist simultaneously in Python2 and Python3 format. General outline of steps to move to Python3:

  1. Python3 rejects indentation that mixes tabs and spaces inconsistently (it raises TabError). There needs to be a project to convert the tabs in all files to 8 spaces. This change works on both Python2 and Python3.
  2. Introduce a MySQL connector class, which hides the differences between MySQLdb (which only works on Python2) and mysql.connector (which only works on Python3).
  3. Convert all MySQL code to use the new connector class. This is a large project in itself, with details listed here.
  4. Futurize all print statements. This can be done by running futurize and then keeping only the print() syntax changes. These will run fine under Python2.
  5. Perform the Python3 Conversion, and follow up on the numerous porting issues.
  6. Update all character sets. Final procedure still TBD.
  7. Change the default charset in MySQL
  8. Repair strings which have URL encodings in MySQL
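Step 2 could be sketched roughly as follows. This is a minimal sketch, not the ISFDB's actual connector class: the names SQLConnection, fetch_all, and default_connect are hypothetical, and the wrapper assumes only the DB-API 2.0 interface (connect, cursor, execute, fetchall) that both MySQLdb and mysql.connector implement.

```python
import sys


class SQLConnection(object):
    """Hypothetical wrapper that hides which MySQL driver is in use."""

    def __init__(self, connect, **kwargs):
        # `connect` is any DB-API 2.0 connect function, e.g. MySQLdb.connect
        # on Python2 or mysql.connector.connect on Python3.
        self._conn = connect(**kwargs)

    def fetch_all(self, query, params=None):
        """Run a query and return all rows as a list of plain tuples."""
        cursor = self._conn.cursor()
        try:
            if params is None:
                cursor.execute(query)
            else:
                cursor.execute(query, params)
            return [tuple(row) for row in cursor.fetchall()]
        finally:
            cursor.close()


def default_connect():
    """Return the connect function of the driver for this interpreter."""
    if sys.version_info[0] >= 3:
        import mysql.connector
        return mysql.connector.connect
    import MySQLdb
    return MySQLdb.connect
```

Because the driver is passed in rather than imported at module level, calling code depends only on SQLConnection, and the wrapper can be exercised with any DB-API driver while the MySQL code is converted one script at a time.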

Remarks on Debugging the Current Code Base

A number of issues in the current code base make debugging slow, which in turn slows down the porting effort.

  • SESSION Arguments. Since we introduced the SESSION variable, many ISFDB scripts can no longer be executed from the command line. The Session class pulls its arguments from the environment variable QUERY_STRING, which is not set when executing from the command line. When command line execution is not possible, the new script must be installed on the web server to be tested, and if the file contains even a simple syntax error, the browser just returns "Internal Server Error" with no clue as to where the issue is. Errors of this type are easy to spot when running from the command line. To re-enable command line execution, I added the following to the end of ParseParameters in the Session class:
   # Allow for command line invocation
   if cgi_path is None and self.query_string is None:
       # Treat every command line argument after the script name as a parameter
       self.parameters.extend(sys.argv[1:])
  • Try/Except Usage. We have a tendency to use a single try/except to cover many potential issues within a large code block. Here's an actual example from se.py, which contains one error when run under Python3:
       try:
               type = form['type'].value
               # Save the double-quote-escaped version of the original search value
               # to be re-displayed in the search box
               search_value = form['arg'].value.replace('"','&quot;')
               # Replace asterisks with % to facilitate wild cards
               arg = str.replace(normalizeInput(form['arg'].value), '*', '%')
               # Double escape backslashes, which is required by the SQL syntax
               arg = string.replace(arg, '\\', '\\\\')
               user = User()
               user.load()
               if not user.keep_spaces_in_searches:
                       arg = str.strip(arg)
               if not arg:
                       raise
       except:
               PrintHeader("ISFDB Search Error")
               PrintNavbar('search', '', 0, 'se.cgi', '')
               print("No search value specified")
               PrintTrailer('search', '', 0)
               sys.exit(0)
When this runs, all we see is "ISFDB Search Error, No search value specified", with no clue as to which statement caused the error. The typical approach is then to copy all of the code in the try block to a position above the try statement and re-indent the lines. Rather than getting into a philosophical debate about the overuse of try/except, we can make these blocks more debuggable. For instance, if we change the except clause above to:
   # 'import traceback' goes with the other imports at the top of the file
   except Exception:
       error = traceback.format_exc()
       PrintHeader("ISFDB Search Error")
       PrintNavbar('search', '', 0, 'se.cgi', '')
       print("No search value specified")
       print('Error: ', error)
       PrintTrailer('search', '', 0)
       sys.exit(0)
We now see the following error message:
   Error: Traceback (most recent call last): File "/var/www/cgi-bin/se.cgi", line 263, in arg = string.replace(arg, '\\', '\\\\') AttributeError: module 'string' has no attribute 'replace'
This shows that we forgot to replace 'string' with 'str'.
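The same idea can be packaged as a small helper so that each except clause does not need to repeat it. A minimal sketch, assuming Python3; run_with_report, escape_backslashes, and escape_backslashes_py3 are hypothetical names used for illustration, not ISFDB functions:

```python
import string
import traceback


def escape_backslashes(arg):
    # Python2 idiom: fails on Python3, where string.replace no longer exists
    return string.replace(arg, '\\', '\\\\')


def escape_backslashes_py3(arg):
    # Equivalent call that works on both Python2 and Python3
    return arg.replace('\\', '\\\\')


def run_with_report(func, *args):
    """Call func(*args); return (result, None) or (None, traceback text)."""
    try:
        return func(*args), None
    except Exception:
        # format_exc() names the failing frame, which a bare except hides
        return None, traceback.format_exc()


result, report = run_with_report(escape_backslashes, 'a\\b')
if report is not None:
    print('Error: ', report)
```

Under Python3 the report contains the AttributeError and the name of the function that raised it, so the failing statement can be found without copying code out of the try block.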
  • SQL Debugging. As documented elsewhere, there is an issue with mysql.connector in extracting DATE values from MySQL when the month or day is zero. We have an SQL syntax fix for that, but there are many hundreds of SQL statements in the ISFDB, so only a small percentage have been addressed so far. When debugging a script (adv_search_results.py comes to mind, since that is my current problem area), one simply sees the following error:
   TypeError: must be str, not datetime.date
       args = ('must be str, not datetime.date',)
       with_traceback = <built-in method with_traceback of TypeError object>
So we know where the final problem occurred, but not which SQL method needs to be altered. Sometimes it is trivial to find, but on other occasions it requires looking through library.py, common.py, and some number of *Class.py files, which takes a fair amount of time. Additionally, I would like to write a set of SQL unit tests, but I don't know what a typical valid set of arguments looks like.
As such, I've added an SQL logging feature to the Session class, which can be enabled in SQLparsing.py by setting SQLlogging to 1. It records the SQL function calls made by a particular script and outputs them as a bulleted list in a new section added by PrintTrailer(). For scripts that are generating faults, a call to SQLoutputLog() can be temporarily inserted just above the fault point in the code.
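For readers who want to try the idea outside the ISFDB code base, call logging of this kind can be approximated with a decorator. This is only a sketch under stated assumptions, not the actual Session implementation: it keeps the log in a module-level list rather than in the Session class, and sql_logged, format_sql_log, and SQLexampleLookup are hypothetical names.

```python
import functools

SQLlogging = 1   # same flag name as in the text; everything else is hypothetical
_sql_log = []    # one entry per logged call, in call order


def sql_logged(func):
    """Record each call to func, with its arguments, while SQLlogging is set."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if SQLlogging:
            shown = [repr(a) for a in args]
            shown += ["%s=%r" % item for item in sorted(kwargs.items())]
            _sql_log.append("%s(%s)" % (func.__name__, ", ".join(shown)))
        return func(*args, **kwargs)
    return wrapper


def format_sql_log():
    """Return the recorded calls as a plain-text bulleted list."""
    return "\n".join("  * " + entry for entry in _sql_log)


@sql_logged
def SQLexampleLookup(record_id, table="titles"):
    # Placeholder body; a real function would run a query here.
    return None
```

Since each entry records the arguments that were actually passed, a log captured from a working page could also serve as a source of realistic inputs for SQL unit tests.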

Status Trackers

System Upgrade Notes

Details on how to bring up a LAMP stack (on two different OSs), and how to set up HTTPS:

User:Alvonruff/Test Page

Other Loose Notes

Obituary Sources

Reading List