Over the last few weeks, Meshcentral.com has had reliability problems: the site would just go down. When I looked into it, the database seemed locked, and all the queries for data would hang and time out. This affected all server components: the web site (IIS), the binary routing server (Swarm Server) and the HTTP routing server (AJAX server). It was as if someone was holding a lock on the database and not releasing it. I would restart the software and it would run again for a time, and then the problem would come back.
Last night, I finally figured it out (I think). Each time there was a lockup, the log files showed a line like this:
Autogrow of file 'MeshCentral_log' in database 'MeshCentral' was cancelled by user or timed out after 896 milliseconds. Use ALTER DATABASE to set a smaller FILEGROWTH value for this file or to explicitly set a new file size.
In a nutshell, the database file on disk was too small and needed to be resized to be larger. The database was about 4 gigabytes in size and the policy was set to enlarge it in 1 megabyte increments. Even when it did grow a little, it was not long before it needed to grow again. So, last night I made the database file 10 gigabytes in length, giving it plenty of room. I am likely going to go back in a few days and make it 100 gigabytes. Since it's a dedicated server, there is no need to be stingy.
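To put rough numbers on this, here is a small Python sketch. It counts how many autogrow events it takes to get from 4 gigabytes to 10 gigabytes at a given increment, and builds the kind of ALTER DATABASE statement the error message points to. The helper names are made up for illustration, the 512 megabyte growth value is only an assumed example (not what I actually set), and the logical file name is simply the one that appears in the log line.

```python
def autogrow_events(current_mb: int, target_mb: int, increment_mb: int) -> int:
    """Count the autogrow events needed to grow from current_mb to target_mb."""
    if increment_mb <= 0:
        raise ValueError("increment_mb must be positive")
    needed = max(0, target_mb - current_mb)
    return -(-needed // increment_mb)  # ceiling division


def resize_statement(database: str, logical_file: str,
                     size_mb: int, growth_mb: int) -> str:
    """Build a T-SQL statement that sets a new file size and growth increment.

    Hypothetical helper: it only formats text and does not talk to a server.
    """
    return (
        f"ALTER DATABASE [{database}] "
        f"MODIFY FILE (NAME = N'{logical_file}', "
        f"SIZE = {size_mb}MB, FILEGROWTH = {growth_mb}MB);"
    )


if __name__ == "__main__":
    # 4 GB -> 10 GB, taking 1 GB as 1024 MB.
    print(autogrow_events(4096, 10240, 1))    # 1 MB steps (the old policy)
    print(autogrow_events(4096, 10240, 512))  # 512 MB steps (assumed example)
    # File name taken from the log message; sizes are illustrative.
    print(resize_statement("MeshCentral", "MeshCentral_log", 10240, 512))
```

With 1 megabyte steps, that is thousands of separate grow operations, and each one is a chance to hit the timeout shown in the log. Setting the size up front avoids almost all of them.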
Sorry to anyone who noticed the site was down. If this was the problem, and I am confident it was, it was a very easy fix and should not happen again.
Thanks,
Ylian
