Difference between revisions of "User talk:Alvonruff"


Revision as of 14:51, 16 September 2022

MediaWiki

Hi. I am responsible for a MediaWiki-based wiki for work, and in early 2020 I did a major upgrade from 1.16 to 1.34, which included having to upgrade MySQL, PHP, and a bunch of extensions. My environment is Windows, but from a MediaWiki point of view, I don't think underlying OS makes much difference. I realize ISFDB's MediaWiki is even older than what I started with, but if I can help, answer questions, or experiment, let me know. I'm happy to try to lend a hand. --MartyD 08:46, 1 January 2021 (EST)

So I might as well start with a current accounting of everything. My notes (from May 2014) from a previous attempt to move to MediaWiki 1.22 (that I never finished) showed that we needed to do the following:
  • Move to SQL 5.0.2 or later. We were on 5.0.45 at that time (and we still are)
  • Move to PHP 5.3.2 or later. We were on 5.2.4 at that time (and we still are)

I have the following packages/add-ons lying around in my home directory at the ISFDB:

  • mediawiki-1.12.0rc1.tar (original Mediawiki version)
  • ImageMagick-6.4.1-8.i386.rpm
  • highlight-2.6.10
  • geshi-1.0.7.21.tar
  • SyntaxHighlight_GeSHi.class.php
  • SyntaxHighlight_GeSHi.i18n.php
  • SyntaxHighlight_GeSHi.php

The following extensions are installed in wiki/extensions:

  • ConfirmEdit
  • SyntaxHighlight_GeSHi
  • SVGtag.php

Moving PHP is easy, because nothing else on our system relies on it. Looks like MySQL 5.5.8 is required for the latest version of MediaWiki. I'm running 8.0.22 on my fresh install at home, and I am seeing errors on all pages with incorrect date values. I'm not exactly done with my installs, so some of these might be artifacts of some other issues, but there are some notes/queries elsewhere in our wiki about date format issues while using later versions of MySQL.

If a MySQL move is required to get to the new MediaWiki, then obviously we need to move the ISFDB along as well. So that seems like the first step to me. Alvonruff 14:15, 1 January 2021 (EST)

I have been running MySQL 5.5.17 on my development server for the last 6 years or so. I have encountered only one problem so far. A particularly tricky SQL query ran fine under 5.5.17, but it hung on the production server. I had to break up the query into two separate queries for it to work. Based on my experience with it, we should be able to upgrade the production version to 5.5.17 without running into any issues. Ahasuerus 17:03, 1 January 2021 (EST)
I'm running MySQL 5.7.29 with MediaWiki 1.34 and PHP 7.4.4 at the moment. (I have not done the 1.34 -> 1.35 update yet). I did not have any MySQL problems but did have some PHP-related issues where behaviors, packages/functions, and defaults have changed. And then MediaWiki itself needs different syntax and different packages/settings in LocalSettings.php. I recall having to do the upgrade in multiple steps (I thought I had to do something like get to 1.27 first, then from there go to 1.34), but the UPGRADE information seems to suggest being able to convert directly from 1.1x to 1.35. If you'd like, I can get my environment up to snuff for 1.35 and then try upgrading a recent dump and see what happens. DB upgrade will probably take hours. I could also see what's up with those three extensions. --MartyD 08:02, 2 January 2021 (EST)
That would be awesome, given you've already done this before. Do you need a current listing of LocalSettings.php ? Alvonruff 16:40, 2 January 2021 (EST)
I'm happy to help with the extensions and LocalSettings.php, as I have experience with those where I work (I maintain several different wikis, and have updated them multiple times now). I'd need to have admin access via command line, though, as that's how I know how to do things. :) ···日本穣 · 投稿 · Talk to Nihonjoe 12:23, 4 January 2021 (EST)
@Al: Yes, if you could send me the LocalSettings.php, that would be great. You can XXXX out the database credentials -- I will use my own -- and I don't think there is anything else sensitive in there. You might need to zip it or rename it to ".txt" to get it through any mail filtering. --MartyD 14:38, 6 January 2021 (EST)

Ongoing work on HTTPS-Support of ISFDB

I'm currently in contact with Ahasuerus, who is working on converting the ISFDB Python code to configure/enable HTTPS (#1298 Support HTTPS). Doing it right is a somewhat bigger task than I thought at the beginning.

I'm currently running my own HTTPS implementation on a local server, using the newest MariaDB, MediaWiki 1.36, and Apache 2.4. Upgrading ISFDB to a similar setup requires OS updates, including MySQL and other upgrades (as mentioned above).

I won't go into detail here, but Ahasuerus asked me to jump into the current discussion. There is currently a need for the system administrator, who is responsible for the server (upgrading...) --elsbernd 06:25, 28 November 2021 (EST)

I have moved the discussion from my Talk page to Development/HTTPS. I plan to add a list of identified dependencies next. Ahasuerus 16:28, 28 November 2021 (EST)
I have created a list of dependencies and sent Al an e-mail. Ahasuerus 18:15, 28 November 2021 (EST)

2022-03-05 performance issues -- a DDOS attack?

Our current performance issues may be due to a DDOS attack -- see these findings for details. Would you happen to have any ideas? Ahasuerus 13:16, 5 March 2022 (EST)

SQLloadNextSubmission error

I am trying to recreate the SQLloadNextSubmission error that you ran into on my development server. A couple of questions to make sure that we are on the same page:

Ahasuerus 16:16, 5 March 2022 (EST)

Yes to both. I do only have exactly 1 user on the system. The observed error is:
   Traceback (most recent call last):
     File "/usr/lib/cgi-bin/mod/submission_review.cgi", line 44, in <module>
       ApproveOrReject('%s.cgi' % submission_filer, submission_id)
     File "/usr/lib/cgi-bin/mod/common.py", line 110, in ApproveOrReject
       PrintSubmissionLinks(submission_id, reviewer_id)
     File "/usr/lib/cgi-bin/mod/common.py", line 127, in PrintSubmissionLinks
       next_sub = SQLloadNextSubmission(submission_id, reviewer_id)
     File "/usr/lib/cgi-bin/mod/SQLparsing.py", line 2139, in SQLloadNextSubmission
       db.query(query)
   ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'groups\n
I changed PrintSubmissionLinks to:
       try:
               next_sub = SQLloadNextSubmission(submission_id, reviewer_id)
       except:
               next_sub = 0
But there is another failure case when SQLloadNextSubmission actually succeeds, and there is a next_sub, but I need to figure out the steps to get there. --The preceding unsigned comment was added by Alvonruff (talk • contribs).
According to the error message above, it's a syntax error in the following SQL query:
       query = """select * from submissions s
               where s.sub_state = 'N'
               and s.sub_holdid = 0
               and s.sub_id > %d
               and not exists (
                       select 1 from mw_user u, mw_user_groups groups
                       where s.sub_submitter != %d
                       and s.sub_submitter = u.user_id
                       and u.user_id = groups.ug_user
                       and groups.ug_group = 'sysop'
                       )
               order by s.sub_reviewed
               limit 1""" % (int(sub_id), int(reviewer_id))
Would it be possible to use Python's "print" statements to display the values of "sub_id" and "reviewer_id" before the query is executed?
submission_id = 5243522 (same as the argument to the cgi script), reviewer_id = 2
Also, when you say that you have only 1 user in the database, do you mean that you are not using the publicly available backups? Or did you truncate mw_user after installing them? Ahasuerus 18:28, 5 March 2022 (EST)
No, the mw_user table has 1,977,439 entries in it. The mw_user_groups table, which controls the editing permissions, was empty in the backup, so it now has two entries for me. Since I was already present in mw_user, I modified the create_user.py script to not insert me again into that table, and let it do all the password stuff, and then add the two entries into mw_user_groups (sysop and bureaucrat).
What MySQL version are you using? "GROUPS" is a reserved word as of 8.0.2 per https://dev.mysql.com/doc/refman/8.0/en/keywords.html#keywords-8-0-detailed-G ErsatzCulture 19:24, 5 March 2022 (EST)
That will do it, thanks! Let me change it to something else... Ahasuerus 19:51, 5 March 2022 (EST)
Done, although the list of recently added MySQL reserved words is so long that I wouldn't be surprised if we ran into something else during testing. Ahasuerus 20:20, 5 March 2022 (EST)
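For illustration, a minimal sketch of the kind of fix discussed above: the same query with the `groups` table alias renamed so that it no longer collides with the MySQL 8.0 reserved word. The alias `ug` and the wrapper function `build_next_submission_query` are assumptions made for this example; the thread only says the alias was changed to "something else".

```python
def build_next_submission_query(sub_id, reviewer_id):
    """Build the 'next submission' query without the reserved word GROUPS.

    Hypothetical helper for illustration; the alias "ug" is an assumption,
    not necessarily the name used in the actual ISFDB fix.
    """
    return """select * from submissions s
        where s.sub_state = 'N'
        and s.sub_holdid = 0
        and s.sub_id > %d
        and not exists (
                select 1 from mw_user u, mw_user_groups ug
                where s.sub_submitter != %d
                and s.sub_submitter = u.user_id
                and u.user_id = ug.ug_user
                and ug.ug_group = 'sysop'
                )
        order by s.sub_reviewed
        limit 1""" % (int(sub_id), int(reviewer_id))


# Values taken from the report above.
query = build_next_submission_query(5243522, 2)
print(query)
```

Only the alias changes; the table name mw_user_groups is unaffected because the reserved word applies to the bare identifier.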

Python 2.7

I am not sure how much it may help with Linux, but here is what I have been running on my Windows 10 development server for the last few days:

I am in the process of testing every Web page and every ISFDB process on my development server. Hopefully everything will check out. The current Python code does a number of funky things like occasionally redefining "id" and "type" and it's possible that Python 2.7 may be less tolerant of that level of abuse. We should find out soon enough. Ahasuerus 18:46, 6 April 2022 (EDT)

Python2.7 does seem to be the likely culprit (more so than MySQL itself). I'll be installing 2.5, and that will tell me where the problem lies. Alvonruff 06:04, 7 April 2022 (EDT)
I'm curious why Python 2.7 is being used for the upgraded site. Wouldn't it be better to switch to a newer version, like 3.8, or even 3.10, especially since 2.7 reached EOL in 2020? ···日本穣 · 投稿 · Talk to Nihonjoe 13:07, 9 May 2022 (EDT)
Unfortunately, Python 3.x, unlike Python 2.7.x, is not backward compatible with Python 2.5.x. Ahasuerus 13:30, 9 May 2022 (EDT)
That's true, but since so much of the site is being rewritten and redone, wouldn't it be better for the future to redo any Python used on the site to use Python 3.x? ···日本穣 · 投稿 · Talk to Nihonjoe 13:35, 9 May 2022 (EDT)
I am sure we would all like to move to Python3; however, that would entail many changes (which could be done, and probably should be, but they are fraught with issues and thus will take significant time to untangle). For example, the website is serving pages in Latin1 (not UTF-8) and I believe the database is storing strings that way too. We get Unicode support by allowing HTML entity coding in said strings. It is quite ugly but it works. Strings in Python2 are basically binary strings (although in late versions there is also a "unicode" string type). In Python3 all strings are Unicode (though not UTF-8; there is a PEP for the encoding someplace) but there is also a "bytes" type which is basically a binary string (as well as a mutable "bytearray" type). We would likely want to update the database to use UTF-8 strings, get the website to serve UTF-8, and get rid of all the HTML entity encodings for non-Latin1 content, but updating all the Python, JavaScript and SQL code to handle that is a nontrivial undertaking. Could we move to Python3 while keeping our current encoding mess? Maybe, but I am not sure it is worthwhile. --Uzume (talk) 01:54, 7 September 2022 (EDT)
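To make the storage scheme described above concrete, here is a minimal Python 3 sketch of Latin1 bytes with HTML character references standing in for anything outside Latin1. That ISFDB uses decimal numeric references in exactly this form is an assumption; the sample string is made up.

```python
# A Python 3 str is a sequence of Unicode code points.
text = "Café Лем"

# Encoding to Latin-1 with "xmlcharrefreplace" keeps Latin-1 characters as
# single bytes and replaces everything else with &#NNNN; references.
stored = text.encode("latin-1", errors="xmlcharrefreplace")

print(type(text).__name__)    # str
print(type(stored).__name__)  # bytes
print(stored)                 # b'Caf\xe9 &#1051;&#1077;&#1084;'
```

The "é" survives as the single Latin1 byte 0xE9, while each Cyrillic letter becomes a run of ASCII characters, which is why searching and sorting on such strings is awkward.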
There are three pretty big issues moving to Python 3:
  • The primary function of the isfdb is to gather information from MySQL, and then organize, format, and print that data. Since the fundamental data type for strings is different in Python 3, and the methods for printing are different (including the syntax of those methods), all of the formatting/output code of the isfdb code would require rewrites - and that's the vast majority of the isfdb. There are some automated tools to help in such a conversion, but I haven't tried those as yet.
  • The current connector between MySQL and the ISFDB only works on Python 2.X. Moving to Python 3 requires moving to a new connector (most likely the connector produced by MySQL). That connector uses a different paradigm for the storage of queries, so all of SQLparsing.py would require a rewrite. I have done experiments with Python 3 and the new connector, and have written up how to convert our current SQL design patterns into the new connector requirements, but our SQL code isn't isolated to just SQLparsing.py (we have a ton of support scripts that would also need conversion).
  • As Uzume discussed above, the character set problem is probably the biggest issue of all, and was the single biggest issue in the modernization project this year (see User:Alvonruff/The Charset Problem). Since python and the connector both have charset changes, we would almost certainly attempt to simplify the charset problem by moving to the unicode standard (which we absolutely should do at some point), which would require converting all of the data resident in MySQL (as in we would need to write conversion tools to pull every line of data out of MySQL, convert it from the current mish-mash of Latin1+Other Stuff into unicode), and then write it back into MySQL. This part of the project comes with a super-crazy amount of risk.
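As a rough sketch of what the per-string part of such a conversion tool might look like, the function below decodes Latin1 bytes and expands decimal numeric character references into real Unicode characters. The function name and the assumption that only `&#NNNN;` references need expanding are hypothetical; the actual data is described above as a "mish-mash", so a real tool would need to handle more cases.

```python
import re

# Matches decimal numeric character references such as &#1051;
_NUMERIC_REF = re.compile(r"&#(\d+);")


def latin1_entities_to_unicode(raw):
    """Decode Latin-1 bytes and expand decimal numeric references.

    Hypothetical helper: named entities such as &amp; are deliberately
    left untouched, since they may be intentional HTML escaping.
    """
    text = raw.decode("latin-1")

    def expand(match):
        code_point = int(match.group(1))
        # Leave out-of-range references as-is rather than raising.
        if code_point > 0x10FFFF:
            return match.group(0)
        return chr(code_point)

    return _NUMERIC_REF.sub(expand, text)


converted = latin1_entities_to_unicode(b"Caf\xe9 &#1051;&#1077;&#1084; &amp; Co")
print(converted)                  # Café Лем &amp; Co
utf8_bytes = converted.encode("utf-8")  # what would be written back
```

Using a narrow regular expression instead of a general HTML unescape is a deliberate choice here: it avoids silently changing strings where an entity is meant to stay escaped.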
On a scale of 1 to 10, I would put the difficulty of this year's modernization project at a 2 or 3, and we started that at the beginning of the year, and still haven't quite completed that project. That said, most of the changes we did for the modernization project were configuration changes, and a smattering of single-line code additions, not a rewrite. I estimate the difficulty of a python 3 conversion to be more like a 7 or 8, and it may take more than a year to complete. I've done some experimentation for a python 3 move, and will likely use the isfdb2 staging server for that project next year. But we were out of time due to the hosting change, which put python 3 out of scope for this year.
It's really on the Python team for breaking compatibility between 2.X and 3.X. I'm sure there were fabulous reasons for doing so, but the barrier to entry for 3.X is causing widespread delays in its adoption. For instance, in the 3D printer community, the most modern firmware available is Klipper, which is still on Python 2.X, as moving to 3.X is a massive undertaking. Most teams are making decisions on whether they want to burn their limited volunteer time on doing new features, or on trying to move to the latest version of Python. In my mind, there have been two major versions of the ISFDB: the first was the non-database version written in C, with the indices compiled for online use, and the second was the move to python/MySQL. Moving to python 3 is such a large project, that I consider that to be the next major phase of the ISFDB, as it would be mostly a rewrite of everything. Alvonruff (talk) 07:10, 7 September 2022 (EDT)
We are far from the only ones so affected (e.g., Ren'Py used to use pygame mostly for its SDL binding, but it needed to move from SDL1 to SDL2; pygame moved but also moved to Python3, so Ren'Py created its own SDL2 binding for Python2 until it can be migrated. Migrating to Python3 is complex, however, since Python2 is exposed to users of Ren'Py, necessitating a major breaking change). That said, this has been coming for a very long time and we could have been more proactive to not let ourselves get put into the position we are now in, e.g., Python has had a Unicode string type since 2.0 (introduced with PEP 100; 2.0 was released 2000-10-16). Regardless, we are here now and it is a major issue that will take significant work to fix. As Al said, we will likely have to do a major database conversion (effectively making a new Unicode database based upon the current pseudo-Unicode one). That will necessitate significant changes to the codebase. It might make more sense to consider rewriting the code. For example, make a script to convert the database and then Python3 access code, and keep the current Python2 code for database changes until we can get new submission and moderator interfaces developed, etc. (i.e., the new Python3 code would likely start as just a database viewer, and its database would have to get, say, periodic updates converted from the current one until a full switchover could be made). --Uzume (talk) 15:00, 12 September 2022 (EDT)
Moving from HTML-encoded characters to UTF-8 is clearly beneficial because it will allow proper searching and other functional improvements. However, it can be done under Python 2.7. What are the benefits of moving to Python 3? Ahasuerus (talk) 18:05, 12 September 2022 (EDT)
All of my own Python code for the past 5+ (?) years has been Python 3. However, there's very little that I use that isn't available in 2.7. From the discourse I see on Twitter, the main things of note that have been added in recent years are type hints, async and pattern matching (basically like a more powerful switch statement AIUI). None of these things are particularly interesting to me - I'm sure they have their uses, but I don't see any major wins for the stuff I personally do.
AFAIK (and I don't pay much attention to this) all distros still ship with both Python 2 and 3, none has moved to just Python 3. (And generally people use virtualenvs, Docker etc for their own preferred version/libs/etc.) In the context of ISFDB moving to Python 3, I think any benefits would be (a) not getting caught out if there are security issues discovered in Python 2; (b) longer term, new Python devs are likely to start on Python 3, so might struggle with Python 2 codebases; (c) picking up any new tooling/enhancements/whatever that almost certainly won't be available for Python 2. None of these strike me as convincing arguments to justify the work necessary for a 2->3 migration, at least at this moment in time.
IMHO there are plenty of other areas that might deserve attention over a 2->3 migration, but that's a different discussion... (Which is why I'd kept out of this talk item prior to this time) ErsatzCulture (talk) 18:29, 12 September 2022 (EDT)
My concerns with staying on Python2.7 are: 1) There are issues with the Python2 unicode model, and those issues were a driving factor in the creation of Python3 (see: https://python-notes.curiousefficiency.org/en/latest/python3/questions_and_answers.html). If we undertake a large charset project, we will encounter those issues. 2) The unicode support in MySQLdb is sketchy, and the newer connectors don't work with Python2. 3) Staying with Python2 leaves us marooned on MySQLdb, which was abandoned a decade ago. If we stay on Python2, then we'll either have to take ownership of MySQLdb to fix the unicode issues we encounter, or we'll need to fork mysql.connector to support Python2 (which may not be possible).
That said, I don't think we yet have a concrete (and reasonable) proposal for how to proceed. It can't be: we rewrite everything, including all data in the database - as we'll be debugging that for years, as we don't have a formalized, automated test system. I suspect the MySQL conversion and the MySQLdb/Python3 work can be done separately, but it's currently unknown how well one would work without the other, or how that work would be verified. --Alvonruff (talk) 07:52, 13 September 2022 (EDT)
What do other Python 2.7-based projects do when they need to store Unicode data in a database? Do they use MySQLdb in spite of its "sketchy Unicode support"? Ahasuerus (talk) 10:40, 13 September 2022 (EDT)
P.S. I see that there are ways to write Python code that should work under both Python 2.7 and Python 3.4+ -- see this page. Some parts require importing from "future" and related "futurization" modules, but it looks doable. If true, it should be possible to update our code to be Python 3-compatible while we are still on Python 2.7. It wouldn't immediately address some parts of the Python 2 libraries that have been deprecated in later versions of Python 3, e.g. cgi, but it's a start. Ahasuerus (talk) 08:07, 14 September 2022 (EDT)
This is a pretty attractive approach, which would create the following substeps to Python3:
  • Use 2to3 (https://docs.python.org/3/library/2to3.html) to do a quick side conversion of files from python2 to python3. We would not directly use these files, but would instead use them to diff the delta between python2.7 and python3 to create a roadmap of changes we would need to implement via futurization.
  • Do a futurization of the files, ultimately minimizing the list of outliers to those that cannot be addressed with this methodology. This can be done file by file to minimize overall risk.
  • Do a full conversion to python3 and move to mysql.connector, but do not modify the database. It's a TBD to determine if this step is feasible, but tests show we can use python3 and mysql.connector and extract the same strings from the database that we currently see in python2.7. The open question is whether python3 can format them correctly for output to a browser (I don't know the answer yet).
  • Technically, the python3 conversion would be done at that point, leaving unicode as a different next big project. --Alvonruff (talk) 08:33, 14 September 2022 (EDT)
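As a small illustration of the futurization step, here is a sketch of code written so that it behaves the same under Python 2.7 and Python 3. The function names are hypothetical and not taken from the ISFDB code base; only the standard `__future__` imports are used, so no third-party package is assumed.

```python
from __future__ import division, print_function


def format_title_line(title, year):
    """Format a title and year; %-formatting works on both versions."""
    return "%s (%d)" % (title, year)


def average(values):
    """With 'division' imported, / is true division on Python 2.7 as well."""
    return sum(values) / len(values)


# print is a function on both versions thanks to print_function.
print(format_title_line("Dune", 1965))
print(average([1, 2]))
```

Without the `division` import, `average([1, 2])` would return 1 on Python 2.7 and 1.5 on Python 3, which is the kind of silent difference a file-by-file conversion needs to catch.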

(unindent) I just did a quick run of both 2to3 (which does an automated attempt at porting python2.7 to python 3) and futurize on all files in our common directory (22 files). futurize was probably built on top of 2to3, as they take exactly the same arguments. Diffing the output of the two tools generates a pretty small diff, mostly around the handling of iterators, which requires the futurized version to import object from builtins, which is then fed into every class method. Observations:

  • The output from 2to3 fails to execute under python3 due to "inconsistent use of tabs and spaces in indentation", so we'll need to post-process the files to meet the isfdb indentation standard (using something like PyCharm). The output from futurize seems to run fine under Python2.7. So an additional step for moving to Python3 is to fix the indentation issues.
  • New imports from futurize that would need to be removed to run under python3 were:
    • from builtins import chr
    • from builtins import map
    • from builtins import object
    • from builtins import range
    • from builtins import str
    • from __future__ import print_function
    • from future import standard_library
  • We would need to remove all object references before a final move to Python3.
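The indentation problem noted above could be located before conversion with a small check like the following. This is a coarse sketch, assuming we simply want to flag any file whose lines use both tabs and spaces for indentation; Python 3's own rule is narrower (it rejects only ambiguous mixing), and the function names are made up for this example.

```python
def indentation_styles(source):
    """Return which indentation characters ("tab", "space") a source uses."""
    styles = set()
    for line in source.splitlines():
        stripped = line.lstrip(" \t")
        if not stripped:
            continue  # ignore blank and whitespace-only lines
        indent = line[: len(line) - len(stripped)]
        if "\t" in indent:
            styles.add("tab")
        if " " in indent:
            styles.add("space")
    return styles


def has_mixed_indentation(source):
    """True if the source indents with both tabs and spaces somewhere."""
    return len(indentation_styles(source)) > 1


sample = "if x:\n\ty = 1\n        z = 2\n"
print(has_mixed_indentation(sample))
```

Running such a check over each directory would give a list of files to normalize before the futurized output is tried under Python 3.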

This seems pretty promising. We still have a need to run the current code base on isfdb2.org for now, so I'll explore the possibilities of having both futurized and non-futurized versions running at the same time (with different paths).

I futurized everything in biblio and common, and have it online at: https://www.isfdb2.org/cgi-bin/p3/index.cgi. Observations:
I agree that it's an attractive approach. Some of the changes outlined above, e.g. spaces vs. tabs, are just cleanup which will be beneficial even if we don't move to Python 3 in the foreseeable future.
It's also good that we identified the issue with non-Latin-1 characters early in the process. If it turns out to be a show-stopper, we'll know about it sooner rather than later. Ahasuerus (talk) 20:21, 14 September 2022 (EDT)

CC license: 2.0 vs. 4.0

Earlier today User:Nihonjoe mentioned that Creative Commons 4.0 is the latest -- and almost identical -- version of CC 2.0, which we currently use. I know little about licenses, but it seems to be a correct assessment. Would you happen to be aware of any reasons why we shouldn't upgrade to 4.0? Ahasuerus 20:17, 18 April 2022 (EDT)

I have no issue with Attribution 4.0 International (CC BY 4.0). Alvonruff 06:37, 19 April 2022 (EDT)
Thanks! Ahasuerus 10:42, 19 April 2022 (EDT)

Searching

I don't know whose lap this would fall into (I will point Ahasuerus to this), but with regard to this Wiki search behavior, I think that may mean that maintenance/rebuildtextindex.php ought to be (re)run. IIRC, the normal background search index rebuilding only works off of the recent changes list. The move might have had some other oddball effect on the existing index that is causing it to not align correctly somehow. --MartyD (talk) 14:39, 14 September 2022 (EDT)

I'll leave it in Al's capable hands :-) Ahasuerus (talk) 15:07, 14 September 2022 (EDT)
Well, before I embark on an update that "may take several hours" (according to the docs), I guess I'd like to know exactly what issue we're trying to solve, and how we will know the update had any effect. I'm not seeing any search issues, and I'm not using quotes. For example, if I search for the words normal background search (without quotes) from MartyD's post above, I see this page in the results. So is there a test search that fails, that we know should not fail? --Alvonruff (talk) 16:21, 14 September 2022 (EDT)
So just to be clear on what I'm seeing, let's take the Lord of the Rings example cited above:
  • If I type the words Lord of the Rings into the search bar, without quotes, I see a single hit for the page "Rules and standards discussion", which does not actually contain the phrase "Lord of the Rings".
  • If I then click on "Everything" (just below the search bar), I then see 5 hits, the two most important being "User talk:Animebill" and "User talk:Mavmaramis", both of which do contain the phrase "Lord of the Rings".
  • If I click on "Advanced" it reveals why this is the case. The default search namespace is "(Main)" and none of the other namespaces are checked, so searches will not look on the "User talk" namespace pages, which is why the phrase on both the "User talk:Animebill" and "User talk:Mavmaramis" pages fail to show up as hits.
  • If I select the "User talk" namespace, and then click on "Remember selection for future searches", and then search again, I see 4 hits. It doesn't display all 5 hits, because the fifth hit is "ISFDB:Moderator noticeboard", where the issue was reported, and that page is in the "ISFDB" namespace - which I did not enable.
  • If I go to "Advanced" and click the "All" button on the right (under "Check") as well as the box for "Remember selection for future searches", then a search shows all five hits. --Alvonruff (talk) 17:34, 14 September 2022 (EDT)
  • P.S. It now shows 6 hits, since I said the words on this page :) --Alvonruff (talk) 17:38, 14 September 2022 (EDT)
If I type "lord of the rings" with no quotes into the top search box, I get 6 hits, none showing "lord of the rings". The search results page shows that the context for me is Advanced, with Main + User + User talk + ISFDB + ISFDB talk + File + Help + Help talk + Magazine. Re-pushing the Search button there produces the same results. If I type "lord of the ring" with quotes, in either location, and all other things the same, it produces 5 hits, all of which show that phrase. I do notice the five pages found are a subset of the 6 pages found in the original hit, so perhaps what's displayed is just an artifact of how the snippet is selected; I did not notice that before. BUT.... Taking the original complaint that there should be many hits at face value, I also notice that all of the hits are on recently modified pages, and none is on a page not recently modified. Which makes me think (Male Answer Syndrome, I admit) that the background index rebuild is working on the recently changed pages, and so searching is finding those, but older pages need to be re-indexed. --MartyD (talk) 12:46, 16 September 2022 (EDT)
I think that Marty is right - the problem is not new pages - these get indexed just fine. The problem is old pages after the DB change. One page I always look for via search is Entering non-genre periodicals. The current search cannot find it in any way or form (it does find a few pages where it was referenced and which had been updated lately but not the main page). I just went via google search to pull it up but I suspect that the lack of an index may have made all our help pages not findable and this is not optimal... Annie (talk) 13:44, 16 September 2022 (EDT)
Ok. maintenance/rebuildtextindex.php has completed. Let's see if it rectifies the situation. --Alvonruff (talk) 14:51, 16 September 2022 (EDT)