Over the past couple of weeks you will have noticed that we are having severe hosting issues. Our websites are slow and sometimes even unreachable. Apologies for any inconvenience this has caused so far.
The main reason we are experiencing this at the moment is that our main hosting machine has some hardware or software issues. We are not one hundred percent sure which it is, since both are quite old (hardware from 2010 running CentOS 5) and we have no time to take this machine off the network to investigate, because our services have to be up and running. In the end this is a hosted machine, so even if we find something, we will probably not be able to do anything about it.
Every couple of weeks, this machine crashes with some random kernel panic and needs a hardware reset. After that, a very long process of restarting and recovering all the services we are running takes place. Every time. Over and over again. The filesystem is usually corrupted, the RAID is out of sync, database files are broken and overall the system is in a messy state.
On top of that, this machine is overloaded all of the time anyway. It runs our main web server, our main database server, chat and telephony services for the people involved in this project, our build services (or at least major parts of them) and our mail server. Probably everything that is publicly accessible. Restarting all of that takes a very long time, since it hits exactly the weak spot of this server, which is I/O. Most of the time the server is starved for I/O, which causes connections to time out because the requested data cannot be retrieved from the disks. Using some services is a real pain at the moment. We are trying as hard as we can to run the applications that most people use as smoothly as possible, but I am sure you will have noticed that the forums are still quite slow at times.
So far we have not really been able to look for any solutions. Essentially we need more hardware to host this project, but we are not at all able to afford this at the moment. We do have many ideas about what we want and what this project would need, but all of these are way out of scope.
We will see what the future brings. For the time being, I just wanted to share this information with those of you who want to know what is going on in the “IPFire Infrastructure” team and what is causing these hosting outages.
Posted: September 18, 2015