Today I spent a lot of time enhancing the status checker, something I should have put time into a long time ago.
A fair while back, I made some changes to the status monitoring program, but unfortunately introduced a bug that extended many outages by around 7 minutes (and no, that isn’t the test interval; we test the providers every minute).
I isolated and corrected that issue, so we no longer extend outages. Unfortunately, that now leaves us with a large number of 60 second outages (and frankly, a provider can go down for 60 seconds and not bother a customer).
So my new plan is to refocus the monitoring: remove a lot of the issues that stem from how I put it together a year ago, and work on capturing the raw data so that it can eventually be put out for analysis.
What we do notice is that many of the 60 second outages are just a blip, where a provider appears momentarily unreachable.
That wasn’t fair on providers who might not have been down at all, when a packet simply didn’t make it back within the time allowed.
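As a rough illustration of how a blip could be kept apart from a real outage, here is a minimal C++ sketch. It is not the checker’s actual code: the names `Outage` and `find_outages` and the `min_failed_checks` threshold are made up for the example, and it assumes one reachable/unreachable result per one-minute check.

```cpp
#include <cstddef>
#include <vector>

// One detected outage, measured in whole checks (one check per minute here).
struct Outage {
    std::size_t first_failed_check;  // index of the first failed check
    std::size_t failed_checks;       // number of consecutive failed checks
};

// Hypothetical helper: collapse per-check results (true = reachable) into
// outages, dropping runs shorter than min_failed_checks as momentary blips.
std::vector<Outage> find_outages(const std::vector<bool>& reachable,
                                 std::size_t min_failed_checks) {
    std::vector<Outage> outages;
    std::size_t run_start = 0;
    std::size_t run_length = 0;
    // Loop one step past the end so a run of failures at the tail is closed.
    for (std::size_t i = 0; i <= reachable.size(); ++i) {
        const bool failed = i < reachable.size() && !reachable[i];
        if (failed) {
            if (run_length == 0) run_start = i;
            ++run_length;
        } else {
            if (run_length > 0 && run_length >= min_failed_checks) {
                outages.push_back({run_start, run_length});
            }
            run_length = 0;
        }
    }
    return outages;
}
```

With a threshold of 2, a single failed check between two good ones is ignored, while three failed checks in a row are reported as one outage. The raw results stay untouched, so the threshold can be changed later without losing data.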
So my enhancements focus on speeding the testing up. When we get a response, we don’t really need to do much apart from getting past the outage checks; the effort should go into finding the providers that are down, and finding out why they are down.
We obviously can’t do that without logging a lot of detail about each outage. So the plan is to move away from a passive test and towards a more active one, and for the most part it is working well.
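To make the passive versus active idea concrete, here is a minimal C++ sketch under some assumptions. It is an illustration only, not the real status system: `ProbeResult`, `CheckRecord`, `active_check` and the `max_attempts` parameter are hypothetical names, and the probe itself is passed in as a function so that no particular network call is implied.

```cpp
#include <functional>
#include <string>
#include <vector>

// Result of a single probe. The error text is whatever the probe reports.
struct ProbeResult {
    bool reachable;
    std::string error;  // empty when reachable
};

// What gets recorded for one check of one provider.
struct CheckRecord {
    bool down = false;                // true only if every attempt failed
    std::vector<std::string> errors;  // one entry per failed attempt
};

// Hypothetical active check: a success returns straight away, while a
// failure is retried up to max_attempts in total, keeping the reason for
// each failed attempt so there is detail to analyse later.
CheckRecord active_check(const std::function<ProbeResult()>& probe,
                         int max_attempts) {
    CheckRecord record;
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        const ProbeResult result = probe();
        if (result.reachable) {
            return record;  // fast path: provider answered, down stays false
        }
        record.errors.push_back(result.error);
    }
    record.down = !record.errors.empty();
    return record;
}
```

A provider that answers first time costs a single probe, which keeps the common case fast. Only a provider that fails gets the extra attempts, and the collected error strings are the sort of detail that could be written to the outage log.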
C++ is a beautiful language, and I’ve come to have some fun with it today.
Hopefully I’ll be able to put the finishing touches on the status system so that it is ready to slot into the server tomorrow. After that I’m probably going to work on the database overhaul to group providers, so that servers and providers are more distinct than they are at the moment, and then swap the system over to the new method.
Once that is running well, we can put the time into getting the website working nicely, the way I want it to.
I’ve got a big list of changes planned for the site, and they are in progress. I’ve had thoughts about dropping the RRD graphs, as they seem a little wasteful to me, though I would need to look at the logs to see who is actually using them.
I also want to reconsider the site layout, but that’s not critical right now (though I do want to have something more appealing).
I always get stuck on picking a nice logo or header for the site: something that depicts VoIP and, at the same time, downtime (or uptime), and perhaps even something a little comical.
I don’t really have a vision of a logo for the site, but really am looking for something that sort of takes that look and adds a real VoIP touch to it.
The site has a long way to go before I will be happy to consider it finished. Like all good creations, however, you keep on maintaining them, and keep them growing!
The issue for me is that there’s never enough time to take what’s in my head as a vision for the site and put it into action.
That’s changed a bit, as there’s free time on the weekends (I spent all day on the status checker, with the exception of my whirlpool exploration today).
So, hopefully, the progress continues and come Christmas, we’ll be locked and loaded with the site in a good state, ready for me to tackle my Diploma in IT next year.