
Server downtime, as explained by a software developer

System Member, NoReporting Posts: 178,019 Arc User
edited March 2012 in Ten Forward
First, let me start by saying that I love the game, I love the storyline, and some of the Foundry content has a good story too. Like many of you, I am frustrated by the "unexpected downtime" that has been happening lately.

I am in no way affiliated with Cryptic or PWE, and I don't know anybody there. Also, I am not great at explaining things, so I am trying my best.

I am a software developer. I have not worked on too many games (and the ones I did work on are only moderately complex). Like all software developers, I get bug reports, and let me tell you, 90% of bug reports tend to be stupid or are really feature requests. Also, there are people in here who call themselves software developers, and you can tell they are not.

The devs determined that the main problem is "heap corruption". I have seen this problem, and let me tell you that, in my experience, it is the HARDEST problem to track down in big projects. (1) Beta and alpha testing can't catch it, because Tribble and the internal builds that the devs use don't have a high enough load to cause the heap corruption. (2) The memory dump almost never tells you where the problem is; it only tells you where the crash is. With most bugs, where the crash is and where the problem is are close enough together to track down. Not so with heap corruption, where the actual bug can be in a different function or even a different file.
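To make point (2) a bit more concrete, here is a toy sketch in Python. It only simulates a heap with a shared bytearray, and every name in it (`set_name`, `read_score`, the block layout) is made up for illustration; it says nothing about how the actual game servers are written. The bug is in `set_name`, but the error only shows up later, in `read_score`.

```python
# Toy model of a heap: one shared bytearray carved into fixed-size blocks.
# All names and sizes are hypothetical, chosen only for this illustration.
BLOCK = 8
heap = bytearray(BLOCK * 2)

NAME_AT = 0        # block 0: a player name buffer
SCORE_AT = BLOCK   # block 1: a score stored in one byte, expected to be <= 100


def store_score(score: int) -> None:
    heap[SCORE_AT] = score


def set_name(name: bytes) -> None:
    # BUG: no length check, so a long name spills into the next block.
    heap[NAME_AT:NAME_AT + len(name)] = name


def read_score() -> int:
    score = heap[SCORE_AT]
    if score > 100:
        # The "crash" is reported here, far away from the real bug.
        raise RuntimeError(f"corrupt score {score}")
    return score


def run_demo() -> str:
    store_score(42)
    set_name(b"Kirk")            # fits in block 0, nothing goes wrong
    assert read_score() == 42
    set_name(b"JeanLucPicard")   # 13 bytes: silently overruns block 0
    try:
        read_score()
    except RuntimeError as exc:
        return str(exc)
    return "no crash"


print(run_demo())
```

A crash report from this would point at `read_score`, which is perfectly fine code. Nothing in the report mentions `set_name`, and that is the whole difficulty.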

With a game like this, on Holodeck (the main shard) you can have tens of thousands of players hitting the faulty code on any given day, until the errors in the heap stack up enough to make the heap unusable. On Tribble you only see maybe 1/10 of that number, so the errors build up slowly and are never caught before the system is brought down for normal maintenance. When that happens, the errors go unnoticed, because normal maintenance simply clears them out automatically.
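The load argument above can be put into rough numbers. This is a back-of-the-envelope sketch only: the error rate, the crash threshold, and the weekly maintenance interval are all invented for the example, and the only thing taken from the post is the 10-to-1 ratio between the two shards.

```python
from typing import Optional


def first_crash_day(requests_per_day: int,
                    days_between_maintenance: int,
                    requests_per_error: int = 1000,
                    errors_to_crash: int = 200) -> Optional[int]:
    """Return the day the heap becomes unusable, or None if maintenance
    wipes the errors first. All default numbers are made-up assumptions."""
    errors = 0
    for day in range(1, days_between_maintenance + 1):
        # Assume one silent heap error per `requests_per_error` requests.
        errors += requests_per_day // requests_per_error
        if errors >= errors_to_crash:
            return day
    return None  # maintenance resets the heap before it breaks


holodeck = first_crash_day(requests_per_day=50_000, days_between_maintenance=7)
tribble = first_crash_day(requests_per_day=5_000, days_between_maintenance=7)
print(holodeck, tribble)
```

With these made-up numbers the busy shard falls over on day 4, while the test shard reaches maintenance with only 35 of the 200 errors it would need, so nobody ever sees the bug there.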

The sad part with this error is that the only way to catch it the first few times is to run the database in a debug mode and monitor it until the next crash. If people are complaining about lag now, wait until something like that happens: live debugging a database slows down reads and writes, because all of the read and write code has to take extra steps. I know one game dev who tells me that when they have a heap corruption problem, the online section of their game has to crash a good number of times before they get enough information to track it down.
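As an illustration of what those "extra steps" can look like, here is a toy sketch of one well-known debugging trick: putting a guard byte after each block of memory and re-checking every guard after every write. I am not claiming this is what their database debug mode does; the `DebugHeap` class and everything in it are hypothetical. It does show the trade-off, though: the bad write gets caught on the spot, but every single write now pays for the checks.

```python
GUARD = 0xAB   # marker byte placed after every block (arbitrary value)
BLOCK = 8      # usable bytes per block (arbitrary size)


class DebugHeap:
    """Toy heap with one guard byte after each block (hypothetical design)."""

    def __init__(self, blocks: int) -> None:
        self.stride = BLOCK + 1
        self.blocks = blocks
        self.mem = bytearray(self.stride * blocks)
        self.checks = 0  # counts the extra work done in debug mode
        for b in range(blocks):
            self.mem[b * self.stride + BLOCK] = GUARD

    def write(self, block: int, data: bytes, who: str) -> None:
        start = block * self.stride
        self.mem[start:start + len(data)] = data  # unchecked, possibly too long
        self._verify(who)                         # the extra debug step

    def _verify(self, who: str) -> None:
        # Walk every guard byte after every write: slow, but it names the culprit.
        for b in range(self.blocks):
            self.checks += 1
            if self.mem[b * self.stride + BLOCK] != GUARD:
                raise RuntimeError(f"guard after block {b} overwritten by {who}")


def run_debug_demo():
    heap = DebugHeap(blocks=2)
    heap.write(0, b"Kirk", who="set_name")  # fits, both guards intact
    try:
        heap.write(0, b"JeanLucPicard", who="set_name")  # 13 bytes into 8
    except RuntimeError as exc:
        return str(exc), heap.checks
    return "no error", heap.checks


print(run_debug_demo())
```

Here the error names the function that did the bad write instead of some innocent reader later on. The cost is that the number of checks grows with the number of blocks on every write, which is the kind of slowdown that would show up as lag on a live server.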

Also, I have been a member of other MMO games as they transitioned to F2P. STO made the transition better than other MMO games did. One MMO was down every day for 5 hours because of some crash. There have to be thousands of lines of functional code (not counting blank lines and comments) just to handle an STF mission.

P.S. I know this was probably all over the place, but it is my attempt to help the people who don't know and want to know. It will not help the willfully ignorant.

P.P.S. This is now the first F2P MMO game that I have enjoyed playing.

Comments

  • Archived Post Member Posts: 2,264,498 Arc User
    edited March 2012
    Thank you for that calm and detailed explanation. This is worth a hundred "Boooo let me complain about the server going down" threads.