Difference between revisions of "WAMP, localhost, Mediawiki: Cannot access the database"

From MXnet
Jump to: navigation, search
(Ugly Workaround #2)
(It's a feature, not a bug)
Line 186: Line 186:
 
The MediaWiki wizards know about this.
 
The MediaWiki wizards know about this.
  
There is something awkward in '''LoadBalancer.php''': [https://phabricator.wikimedia.org/T72223 "'Unknown error' in LoadBalancer->reportConnectionError() hides a real database problem."]
+
There is something awkward in '''LoadBalancer.php''': [https://phabricator.wikimedia.org/T72223#1673869 "'Unknown error' in LoadBalancer->reportConnectionError() hides a real database problem":]
  
 
<blockquote>
 
<blockquote>

Revision as of 21:23, 2 December 2015

or The Bug that Ate a Week of My Life!

Mediawiki on localhost

Since 2:42 PM on 14 March 2009, I have run a private wiki for myself on a WAMP stack--6.5 years and counting.

I have migrated the wiki from one desktop to another and installed it on two different laptops.

I have never had everything work perfectly right out of the box, but I have never been frustrated for too long. People have solved all of the basic problems more than once, and it has been just a matter of googling around long enough to debug the setup.

All that has changed.

I've found a bug that has defeated me.

Mediawiki 1.26, WAMPserver 2.5, and WAMPserver 3

Error Messages

Shorter version

Sorry! This site is experiencing technical difficulties.

Try waiting a few minutes and reloading.

(Cannot access the database)

With debugging and backtracking enabled

Cannot access the database: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond. ())

Backtrace:

#0 C:\wamp\www\mediawiki\includes\db\LoadBalancer.php(842): DatabaseBase->reportConnectionError('Unknown error (...')
#1 C:\wamp\www\mediawiki\includes\db\LoadBalancer.php(537): LoadBalancer->reportConnectionError()
#2 C:\wamp\www\mediawiki\includes\GlobalFunctions.php(3631): LoadBalancer->getConnection(-1, Array, false)
#3 C:\wamp\www\mediawiki\includes\User.php(1208): wfGetDB(-1)
#4 C:\wamp\www\mediawiki\includes\User.php(397): User->loadFromDatabase(0)
#5 C:\wamp\www\mediawiki\includes\User.php(362): User->loadFromId(0)
#6 C:\wamp\www\mediawiki\includes\User.php(2459): User->load()
#7 C:\wamp\www\mediawiki\includes\User.php(1167): User->getToken(false)
#8 C:\wamp\www\mediawiki\includes\User.php(365): User->loadFromSession()
#9 C:\wamp\www\mediawiki\includes\User.php(2029): User->load()
#10 C:\wamp\www\mediawiki\includes\User.php(3204): User->getId()
#11 C:\wamp\www\mediawiki\includes\MediaWiki.php(627): User->isLoggedIn()
#12 C:\wamp\www\mediawiki\includes\MediaWiki.php(476): MediaWiki->main()
#13 C:\wamp\www\mediawiki\index.php(51): MediaWiki->run()
#14 {main}
reportConnectionError "Unknown error"

Failure message when MySQL is totally disabled

Warning: mysqli::mysqli(): (HY000/2002): No connection could be made because the target machine actively refused it. in C:\wamp\www\mediawiki\LocalSettings.php on line 152

Connection failed: No connection could be made because the target machine actively refused it.

Not a code or configuration problem

Except for a few times when I shot myself in the foot while debugging, the proof that the PHP code base and my configuration are correct is that I can immediately obtain the page I want by reloading the page.

It is only on the first call after a long time (five minutes? ten?) of not accessing the wiki that the problem occurs.

Timing is everything

The most common time for the bug to bite is when I open my browser.

I've got two tabs to pages on the local wiki in my collection of home pages.

The first one usually fails if I've had the browser closed for a while; the second never does.

I dropped the second page from the set to see if that would help the first.

It didn't.

Once MySQL is awakened from its slumbers, all the links in the wiki work reasonably well--no worse than ever before--unless I spend a long time composing a page, and then the connection fails when I try to save the page. Refresh the page, and it saves perfectly.

Something somewhere keeps MySQL from answering the FIRST request for a connection in the time frame that Mediawiki expects. After that, everything is normal enough until MySQL decides to fall asleep again.

Round up the usual suspects

It's some interaction between Windows, Apache, MySQL, and the Wikimedia routines that has got me down.

Why can't we all just get along?

The articles in the links below have got excellent information and links in them. They cover pretty much everything I've tried, and then some. I'm just going to list the failed fixes that float into mind:

  • turn off IPv6 on my machine
  • have Apache listen only to 127.0.0.1:80
  • bind MySQL to 127.0.0.1
  • turn off name resolution in MySQL
  • change localhost to 127.0.0.1 everywhere I can to avoid DNS calls
  • edit Hosts file to make sure localhost is defined
  • enable opcache and install APCu
  • turn off Zend debugger
  • moved from WAMPserver 2.5 to 3

Total failure, but intriguing: get MySQL to use a socket. There are paths for it in the configuration files, but I've never gotten it going. I must do more googling! It seems as though it SHOULD work.

Sysinfo

Hardware

BIOS Manufacturer: American Megatrends Inc.
Release Date: 10/23/2012
BIOS Version: ALASKA - 1072009
BIOS Version: BIOS Date: 10/23/12 15:23:54 Ver: 04.06.05
BIOS Version: BIOS Date: 10/23/12 15:23:54 Ver: 04.06.05
Board Product ID: Gigabyte 990FXA-UD3
AMD FX-8350 Eight-Core Processor, 4 Ghz
16 GB RAM
480 Gb SSHD
1 Tb HD
650 W power supply
AMD Radeon HD 7700 with 2 Gb RAM

Report Generated at: 1/8/2013 1:02:45 AM

OK, so the machine is sneaking up on its third year in service. It still gets a decent rating from me and Windows:

Homebrew 5 performance.jpg

And, as with everything else except for Wikimedia version and the WAMPserver versions, I haven't changed anything from last week, when all was well in wikiland.

OS

Windows 7 Professional, 64-bit
Service Pack 1

Browser

Chrome 46.0.2490.86 m

Software Guardians

Malware Bytes
Set to exclude 127.0.0.1, httpd.exe, and mysql.exe from inspection.
CylanceProtect
I've got no control over this. All I can say is that with MediaWiki 1.20, I had no connection problems.
Firewall
Made sure port 3306 is open.

Ugly Workaround #1

Because the connection ALWAYS worked on the very next try after a failure, I thought that perhaps I could arrange a sacrificial connection to feed the ravenous Connection Monster; while it was chomping on the carcass, the real Mediawiki connection might sneak by unnoticed.

So I put this in my LocalSettings.php:

$servername = "127.0.0.1";
$username = "dbuser";
$password = "dbuserpassword";

// Create connection
$conn = new mysqli($servername, $username, $password);

// Check connection
if ($conn->connect_error) {
    die("Connection failed: " . $conn->connect_error);
} 
#echo "Connected successfully";
mysqli_close($conn);
#die("test finished");

This idiotic connection hasn't ever failed yet.

And the wiki page hasn't yet failed to connect afterward, either.

Pasta faggiole!

I hate it.

But not as much as I hate seeing the "connection failed" page.

Not yet, anyway.

Using the hrtime extension, I can see that this little connection check only takes a couple of milliseconds, so it's not costing me huge time in loading pages.

But I hate it.

It's a feature, not a bug

The MediaWiki wizards know about this.

There is something awkward in LoadBalancer.php: "'Unknown error' in LoadBalancer->reportConnectionError() hides a real database problem":

Change 157488 abandoned by Hashar: Do not hide real DB error on first connect

Reason: That would cause T31233 Broken failover for DB slave connection errors.

The try/catch was done on purpose, see Tim comment on https://www.mediawiki.org/wiki/Special:Code/MediaWiki/90266#c20239

We have a task for a better fix T35036

Do not hide real DB error on first connect

LoadBalancer should not silence database exception. Bug: 70223

Ugly Workaround #2

I've commented out the "try ... catch" code and replaced with with the rejected simplification suggested by Saper. I don't care that this simplification "would cause T31233 Broken failover for DB slave connection errors." Not my circus, not my monkeys. I'm running this code on localhost. There are no slave connections.

# MXM Cutting out "ugliest thing" and substituting another's guy's remedy
#		try {
#			$db = DatabaseBase::factory( $server['type'], $server );
#		} catch ( DBConnectionError $e ) {
#			// FIXME: This is probably the ugliest thing I have ever done to
#			// PHP. I'm half-expecting it to segfault, just out of disgust. -- TS
#			$db = $e->db;
#		}
# MXM
$db = DatabaseBase::factory( $server['type'], $server );
# MXM

Links

The intriguing line: we implement here what we already do for mysql connections: if we set a connection timeout to a very small value (in the order of magnitude of twice the RTT to the poolcounter server) and retry twice, we can have very fast failures in case one server goes down, and the failover to the other active servers would happen almost harmlessly.
I'm convinced this is a clue, but I haven't waded into the code to see what I might be able to play with.
David Random's essay on this page is superb. It didn't fix what ailed me, but it made me feel that I was understanding the problem better.
Contains an excellent set of links to various tips and tricks, as well as making some fresh suggestions.