Hi,
Apologies if you tried to access the site earlier this week and got a 503 (Service Temporarily Unavailable) error. This was due to timeouts in our web host/payment processor’s security scripts, exacerbated by a couple of inefficiently coded helper functions on our own part of the site. As it was an intermittent error, it took several days (and in my case more-or-less sleepless nights) to track down and fix.
To make it up to you, there will be bonus new updates between Xmas and New Year, plus at least THREE archive sets a day for all of December!
Cheers, Hywel
UPDATE 16/11/2018: This morning’s error log looks MUCH more hopeful. If you continue to have problems accessing the site please let me know. But it looks like yesterday’s changes have fixed it.
UPDATE 15/11/2018: OK, we have tracked down two or three code errors which may have been consuming a lot of server CPU, and we’ve increased several settings which should give more headroom before a timeout is declared. We’ve also taken several measures to improve efficiency, and we now cache a few more of the database-access-heavy pages to help reduce load on the server.
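For the technically curious, here is a rough sketch of the kind of page caching mentioned above. It is illustrative Python only, not the site’s actual code: `PageCache`, `ttl_seconds` and the `render` callback are made-up names, and the 60-second lifetime is an arbitrary assumption. The idea is simply that a database-heavy page is built once and then served from memory until the cached copy gets too old.

```python
import time


class PageCache:
    """Tiny in-memory cache for rendered pages (hypothetical example)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds   # how long a cached page stays fresh
        self.clock = clock       # injectable clock, handy for testing
        self._store = {}         # url -> (time_cached, html)

    def get(self, url, render):
        """Return the cached page for url, calling render() only when
        there is no copy yet or the copy is older than the TTL."""
        now = self.clock()
        entry = self._store.get(url)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]      # fresh copy: no database work needed
        html = render()          # the expensive, database-heavy step
        self._store[url] = (now, html)
        return html


if __name__ == "__main__":
    cache = PageCache(ttl_seconds=60)   # assumed lifetime, not a real setting
    page = cache.get("/archive", lambda: "<p>expensive page</p>")
    print(page)
```

The trade-off is that visitors may see a page that is up to one TTL out of date, which is usually fine for archive listings that change rarely.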
The error rate has decreased significantly, and although I’ve seen a couple of timeout errors in the last hour or so, it is nothing like the pages full of them that we’ve had for the last few days. If the problem keeps recurring for you, please let me know. I’m going to leave the site running as-is overnight again and assess the apparent error rate in the morning to see if this has made a significant impact.
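As an illustration of what “assessing the error rate” can look like, here is a minimal sketch that tallies 503 responses per hour from a web server access log. It assumes the log is in the common Apache-style access log format; the sample lines, the `count_503_by_hour` name and the IP addresses are invented for the example and are not taken from our real logs.

```python
import re
from collections import Counter

# Matches the timestamp and status code of an Apache-style access log line,
# e.g. ... [14/Nov/2018:21:03:17 +0000] "GET /members/ HTTP/1.1" 503 299
LINE_RE = re.compile(
    r'\[(\d{2}/\w{3}/\d{4}):(\d{2}):\d{2}:\d{2} [^\]]+\] "[^"]*" (\d{3}) '
)


def count_503_by_hour(lines):
    """Return a Counter mapping (date, hour) to the number of 503 responses."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group(3) == "503":
            counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    sample = [
        '203.0.113.5 - - [14/Nov/2018:21:03:17 +0000] "GET /members/ HTTP/1.1" 503 299',
        '203.0.113.9 - - [14/Nov/2018:21:10:02 +0000] "GET /members/ HTTP/1.1" 503 299',
        '203.0.113.5 - - [14/Nov/2018:21:15:40 +0000] "GET / HTTP/1.1" 200 5120',
        '203.0.113.7 - - [14/Nov/2018:22:01:11 +0000] "GET /members/ HTTP/1.1" 503 299',
    ]
    for (date, hour), n in sorted(count_503_by_hour(sample).items()):
        print(date, hour + ":00", n)
```

Comparing the hourly counts before and after a change gives a rough idea of whether the change helped, though with an intermittent fault a quiet hour proves little on its own.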
In parallel, I have been processing all of December’s updates and I promise to make up for the unacceptably poor service over the last few days with a bunch of bonus updates for the Xmas season! Apologies again!
Hywel
UPDATE 4: Last update for today. Have I mentioned how much I hate trying to chase down intermittent problems? Tech support have made some changes to the security scripts and things now appear to be running more smoothly, so I am going to go to bed and see how the site survives the night. At least we should get a cleaner error log tomorrow if the problems recur.
Apologies again for the very poor service over the last few days. I will make it up to you with lots of bonus updates in December! 🙁
UPDATE 3:
The whole site is currently giving the 503 error. Whatever is causing it, it doesn’t seem to have been the code I was playing with. On the plus side, that’s eliminated all the other spurious warnings and errors from the error logs, so maybe tech support will find it easier to track down the problem now. Apologies, we’ll be back up as soon as we can 🙁 🙁
UPDATE 2: 14th November evening
OK, so I’ve found and fixed a couple of runaway function calls which were using a lot of server resources. The error logs are much quieter since I made that change. Let’s see if that’s properly fixed it.
UPDATE: 14th November
We are having strange intermittent problems following the server outage. The main symptom seems to be a 503 error message (“Service Temporarily Unavailable”) when trying to access the members’ area. The public pages seem to be working fine, and the members’ area works some of the time. But at other times it will give the 503 error when you try to access it. There are a bunch of error messages in the logs, and it looks as though something is failing to terminate properly and using up a lot of server resources; when the server load gets too high, that triggers the 503 errors.
I am working on it along with my developer friend and the tech support people at our web hosts. We can’t quite understand why it has suddenly started happening, as nothing on our side of the site has changed recently; it might be that a security update patch or something similar has caused a problem with deprecated PHP database calls.
I’m very sorry for the spotty service; intermittent problems are a bastard to find and fix. We just need to detective-hunt every warning in the log files and run some experiments to figure out where the “first cause” is: which process is using a bunch of resources, not terminating properly, and then triggering a cascade of 503 errors? Please rest assured that we are working as hard as we can to fix it!!! Many apologies.
Hywel
RestrainedElegance.com was offline briefly yesterday because of a database access issue. We’ve solved the immediate problems but are still getting a bunch of warning flags in the log files, so we are chasing down the cause. The site is running “at risk” until we’ve figured out exactly what the issue is; we may need to reboot the server again to apply fixes. We will keep the disruption to a minimum, so please bear with us.
Thanks, and apologies. We’ll be back to smooth running as soon as possible.
That explains it. I was getting 503 errors all day yesterday – I could get to the RE public site, but not the member area. So I opened a support ticket with SurfNet. And this morning I got a “we don’t recognize your email; did you sign up with a different one?” message from them.
But I was also able to access the member area this morning, so the “at risk” is working to at least that extent. OTOH, I can understand you not wanting to keep running in “risk” mode.