WAMP, localhost, Mediawiki: Cannot access the database

From Cor ad Cor
Jump to navigation Jump to search

or The Bug that Ate a Week of My Life!

Mediawiki on localhost

Since 2:42 PM on 14 March 2009, I have run a private wiki for myself on a WAMP stack — 6.5 years and counting.

I have migrated the wiki from one desktop to another and installed it on two different laptops.

I have never had everything work perfectly right out of the box, but I have never been frustrated for too long. People have solved all of the basic problems more than once, and it has been just a matter of googling around long enough to debug the setup.

All that has changed.

I've found a bug that has defeated me.

Mediawiki 1.26, WAMPserver 2.5, and WAMPserver 3

Error Messages

Shorter version

Sorry! This site is experiencing technical difficulties.

Try waiting a few minutes and reloading.

(Cannot access the database)

With debugging and backtracking enabled

Cannot access the database: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. ())

Backtrace:

#0 C:\wamp\www\mediawiki\includes\db\LoadBalancer.php(842): DatabaseBase->reportConnectionError('Unknown error (...')
#1 C:\wamp\www\mediawiki\includes\db\LoadBalancer.php(537): LoadBalancer->reportConnectionError()
#2 C:\wamp\www\mediawiki\includes\GlobalFunctions.php(3631): LoadBalancer->getConnection(-1, Array, false)
#3 C:\wamp\www\mediawiki\includes\User.php(1208): wfGetDB(-1)
#4 C:\wamp\www\mediawiki\includes\User.php(397): User->loadFromDatabase(0)
#5 C:\wamp\www\mediawiki\includes\User.php(362): User->loadFromId(0)
#6 C:\wamp\www\mediawiki\includes\User.php(2459): User->load()
#7 C:\wamp\www\mediawiki\includes\User.php(1167): User->getToken(false)
#8 C:\wamp\www\mediawiki\includes\User.php(365): User->loadFromSession()
#9 C:\wamp\www\mediawiki\includes\User.php(2029): User->load()
#10 C:\wamp\www\mediawiki\includes\User.php(3204): User->getId()
#11 C:\wamp\www\mediawiki\includes\MediaWiki.php(627): User->isLoggedIn()
#12 C:\wamp\www\mediawiki\includes\MediaWiki.php(476): MediaWiki->main()
#13 C:\wamp\www\mediawiki\index.php(51): MediaWiki->run()
#14 {main}
reportConnectionError "Unknown error"

Failure message when MySQL is totally disabled

Warning: mysqli::mysqli(): (HY000/2002): No connection could be made because the target machine actively refused it. in C:\wamp\www\mediawiki\LocalSettings.php on line 152

Connection failed: No connection could be made because the target machine actively refused it.

Not a code or configuration problem

Except for a few times when I shot myself in the foot while debugging, the proof that the PHP code base and my configuration are correct is that I can immediately obtain the page I want by reloading the page.

It is only on the first call after a long time (five minutes? ten?) of not accessing the wiki that the problem occurs.

Timing is everything

The most common time for the bug to bite is when I open my browser.

I've got two tabs to pages on the local wiki in my collection of home pages.

The first one usually fails if I've had the browser closed for a while; the second never does.

I dropped the second page from the set to see if that would help the first.

It didn't.

Once MySQL is awakened from its slumbers, all the links in the wiki work reasonably well — no worse than ever before — unless I spend a long time composing a page, and then the connection fails when I try to save the page. Refresh the page, and it saves perfectly.

Something somewhere keeps MySQL from answering the FIRST request for a connection in the time frame that Mediawiki expects. After that, everything is normal enough until MySQL decides to fall asleep again.

Round up the usual suspects

It's some interaction between Windows, Apache, MySQL, and the Wikimedia routines that has got me down.

Why can't we all just get along?

The articles in the links below have got excellent information and links in them. They cover pretty much everything I've tried, and then some. I'm just going to list the failed fixes that float into mind:

  • turn off IPv6 on my machine
  • have Apache listen only to 127.0.0.1:80
  • bind MySQL to 127.0.0.1
  • turn off name resolution in MySQL
  • change localhost to 127.0.0.1 everywhere I can to avoid DNS calls
  • edit Hosts file to make sure localhost is defined
  • enable opcache and install APCu
  • turn off Zend debugger
  • moved from WAMPserver 2.5 to 3

Total failure, but intriguing: get MySQL to use a socket. There are paths for it in the configuration files, but I've never gotten it going. I must do more googling! It seems as though it SHOULD work.

Sysinfo

Hardware

BIOS Manufacturer: American Megatrends Inc.
Release Date: 10/23/2012
BIOS Version: ALASKA - 1072009
BIOS Version: BIOS Date: 10/23/12 15:23:54 Ver: 04.06.05
BIOS Version: BIOS Date: 10/23/12 15:23:54 Ver: 04.06.05
Board Product ID: Gigabyte 990FXA-UD3
AMD FX-8350 Eight-Core Processor, 4 Ghz
16 GB RAM
480 Gb SSHD
1 Tb HD
650 W power supply
AMD Radeon HD 7700 with 2 Gb RAM

Report Generated at: 1/8/2013 1:02:45 AM

OK, so the machine is sneaking up on its third year in service. It still gets a decent rating from me and Windows:

And, as with everything else except for Wikimedia version and the WAMPserver versions, I haven't changed anything from last week, when all was well in wikiland.

OS

Windows 7 Professional, 64-bit
Service Pack 1

Browser

Chrome 46.0.2490.86 m

Software Guardians

Malware Bytes
Set to exclude 127.0.0.1, httpd.exe, and mysql.exe from inspection.
CylanceProtect
I've got no control over this. All I can say is that with MediaWiki 1.20, I had no connection problems.
Firewall
Made sure port 3306 is open.

WAMPserver 3

Apache 2.4.17
MySQL 5.7.9
PHP 5.6.15

Ugly Workaround #1

Because the connection ALWAYS worked on the very next try after a failure, I thought that perhaps I could arrange a sacrificial connection to feed the ravenous Connection Monster; while it was chomping on the carcass, the real Mediawiki connection might sneak by unnoticed.

So I put this in my LocalSettings.php:

$servername = "127.0.0.1";
$username = "dbuser";
$password = "dbuserpassword";

// Create connection
$conn = new mysqli($servername, $username, $password);

// Check connection
if ($conn->connect_error) {
    die("Connection failed: " . $conn->connect_error);
} 
#echo "Connected successfully";
mysqli_close($conn);
#die("test finished");

This idiotic connection hasn't ever failed yet.

And the wiki page hasn't yet failed to connect afterward, either.

Pasta faggiole!

I hate it.

But not as much as I hate seeing the "connection failed" page.

Not yet, anyway.

Using the hrtime extension, I can see that this little connection check only takes a couple of milliseconds, so it's not costing me huge time in loading pages.

But I hate it.

It's a feature, not a bug

The MediaWiki wizards know 'that there is something awkward in LoadBalancer.php: "'Unknown error' in LoadBalancer->reportConnectionError() hides a real database problem" (Sep 25 2015, 9:15 AM):

Change 157488 abandoned by Hashar: Do not hide real DB error on first connect

Reason: That would cause T31233 Broken failover for DB slave connection errors.

The try/catch was done on purpose, see Tim comment on https://www.mediawiki.org/wiki/Special:Code/MediaWiki/90266#c20239

We have a task for a better fix T35036

Do not hide real DB error on first connect

LoadBalancer should not silence database exception. Bug: 70223

This is not a problem on public linux production sites, such as this one (moleski.net) and another one that I run, even though they are, like localhost, single-server sites with no database slaves to worry about.

There is something about the Windows environment that unsettles Mediawiki's database procedures ...

Ugly Workaround #2

I've commented out the "try ... catch" code and replaced with with the rejected simplification suggested by Saper. I don't care that this simplification "would cause T31233 Broken failover for DB slave connection errors." Not my circus, not my monkeys. I'm running this code on localhost. There are no slave connections.

# MXM Cutting out "ugliest thing" and substituting another's guy's remedy
#		try {
#			$db = DatabaseBase::factory( $server['type'], $server );
#		} catch ( DBConnectionError $e ) {
#			// FIXME: This is probably the ugliest thing I have ever done to
#			// PHP. I'm half-expecting it to segfault, just out of disgust.  —  TS
#			$db = $e->db;
#                       die('MXM: yep, came through here.')
#		}
# MXM
$db = DatabaseBase::factory( $server['type'], $server );
# MXM

What difference does this difference make? $db is not assigned to $e->db if the connection fails. I put some feedback in the "catch" block, to see if for some reason my system is taking that route. After I rebooted, I got the confirmation that the program execution flowed through that branch.

But making this change did not make all things well again. On a cold boot the next day, all of my "technical difficulties" came crashing back. By itself, Ugly Workaround #2 does not scratch the itch.

What I don't know is why there is any problem on startup at all. What is the difference between this setup and the one that hummed along for almost three years without any difficulty? I can't disable Cylance to find out. It is imposed on me from above. In theory, I have configured Malware Bytes to ignore things served from localhost.

Just the facts, Ma'am.

- My setup is slow on startup after a reboot.
- Mediawiki is not willing to wait for localhost to wake up.
- Ugly Workaround #1 does work around the problem, probably because it has no Mediawiki timer running but just sits there stupidly for hundreds of thousands of nanoseconds while something in the background preoccupies my system. If I can find the timing loop for the connection, I can maybe develop a more elegant workaround.

Tweak MySQL

"MySQL Service takes a very long time to start up."
I've set httpd and mysqld to run with administrator privileges after reading this article, FWIW.

I give up

I'm not going to find an elegant solution.

Ugly Workaround #1 works.

I presume that is because my request for a connection has no timing constraints.

If the system is slow to respond, that line of code just waits until MySQL starts talking. Once aroused, then the server is ready to respond to queries from Mediawiki in a reasonable time frame.

It works.

That is really all that matters.

It's ugly.

Once the inital lag is over, it doesn't cost me more than a few microseconds at the most per fresh page load. That's a small price to pay for not having to manually reload my home pages after cold boots or long delays.

I can live with it.

Yes, I can.

Any Day Now.

Links

The intriguing line: we implement here what we already do for mysql connections: if we set a connection timeout to a very small value (in the order of magnitude of twice the RTT to the poolcounter server) and retry twice, we can have very fast failures in case one server goes down, and the failover to the other active servers would happen almost harmlessly.
I'm convinced this is a clue, but I haven't waded into the code to see what I might be able to play with.
David Random's essay on this page is superb. It didn't fix what ailed me, but it made me feel that I was understanding the problem better.
Contains an excellent set of links to various tips and tricks, as well as making some fresh suggestions.
  • Count jobs on my localhost. This won't do you any good or me any harm, if I'm lucky. This is a handy spot for me to drop this link to see whether jobs have accumulated on my local wiki. They haven't.