AWS Deployment
Hey community, Rick (Lead Game Dev) here to share with you another weekly update.
The rapid implementation of the server backend (AWS_V15b) is well on its way, but I ran into a pretty unexpected “issue” this week. If you look at the date, it would be hard to believe this wasn’t just a bad April Fools’ prank, but I assure you it very much happened.
So at the beginning of the week, I go to log into Shiryo’s development server and it fails, throwing an error… strange. I naturally assume something commonplace, like the internet being down. But that explanation quickly falls apart the moment my morning caffeine kicks in and I realise that I am talking about the Shiryo server here.
For those who are interested, we run a fairly straightforward development setup.
We host our development servers locally so that we at Shiryo have full ownership of (and, with that, full responsibility for) our assets and data… plus there is something reassuring about being able to actually see the hardware with our own eyes, as opposed to being fully cloud-based.
For data storage, we run a doubly redundant NAS (Network Attached Storage) drive, along with a hosted VPN server. This gives anyone on the team easy access to Shiryo’s shared storage drive from anywhere in the world, simply and efficiently, through something as familiar as My Documents on their own machine.
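(For the technically curious, here is a minimal sketch in Python of the kind of sanity check a team machine can run to confirm the mapped share is reachable and writable over the VPN. The share path is a hypothetical placeholder, not our actual drive.)

    # check_share.py: quick sanity check that the mapped NAS share is reachable.
    # NOTE: the share path below is a hypothetical placeholder, not our real drive.
    import tempfile
    from pathlib import Path

    SHARE = Path(r"\\shiryo-nas\shared")  # hypothetical mapped network share

    def share_is_writable(share: Path) -> bool:
        """Return True if the share exists and a temp file can be written to it."""
        if not share.exists():
            return False
        try:
            with tempfile.NamedTemporaryFile(dir=share):
                return True
        except OSError:
            return False

    if __name__ == "__main__":
        state = "reachable and writable" if share_is_writable(SHARE) else "not reachable"
        print(f"{SHARE} is {state}")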
For source control (the software used for tracking and managing changes to Shiryo’s codebase) we use the industry standard, Perforce, which integrates tightly with Unreal Engine and is used by Epic Games themselves, so it’s a no-brainer. Without source control, development would quickly become a headache, with multiple people working on the same part of the project, adding and deleting code; managing those changes manually between team members would be a significant overhead. On top of that, each individual change is stored, so we can revert back to a previous build version should there be an unexpected issue.
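(Again for the curious: “reverting back” in practice just means syncing the workspace to an older changelist. Here is a minimal sketch that shells out to the standard p4 command-line client; the depot path and changelist number are hypothetical placeholders, not our actual setup.)

    # sync_to_changelist.py: roll a Perforce workspace back to a known-good changelist.
    # The depot path and changelist number are hypothetical placeholders.
    import subprocess

    DEPOT_PATH = "//Shiryo/Main/..."   # hypothetical depot path
    KNOWN_GOOD_CHANGELIST = 12345      # hypothetical changelist number

    def sync_to(changelist: int) -> None:
        """Sync every file under DEPOT_PATH to its state at the given changelist."""
        subprocess.run(["p4", "sync", f"{DEPOT_PATH}@{changelist}"], check=True)

    if __name__ == "__main__":
        sync_to(KNOWN_GOOD_CHANGELIST)
        print(f"Workspace synced to changelist {KNOWN_GOOD_CHANGELIST}")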
We underpin these systems with multiple stages of redundancy, to address the almost inevitable event of a drive failing (losing a significant amount of project data is a rough life lesson, and a mistake you seldom make twice). All of our servers are set up with redundancy in the form of mirrored storage drives and remote backups, so that no single point of failure can cause permanent data loss (an almost incomprehensible outcome at this stage).
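(One more sketch for the curious: the basic idea behind trusting a mirrored or remote copy is to verify it against the primary with checksums, so silent corruption is caught early. The file paths below are hypothetical examples, not our actual backup tooling.)

    # verify_backup.py: compare checksums of a primary file and its backup copy.
    # The two paths are hypothetical examples, not our real storage layout.
    import hashlib
    from pathlib import Path

    PRIMARY = Path("/mnt/nas_primary/Shiryo/Content.pak")  # hypothetical primary copy
    BACKUP = Path("/mnt/nas_backup/Shiryo/Content.pak")    # hypothetical mirrored copy

    def sha256_of(path: Path) -> str:
        """Stream the file through SHA-256 so large assets never need to fit in memory."""
        digest = hashlib.sha256()
        with path.open("rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                digest.update(chunk)
        return digest.hexdigest()

    if __name__ == "__main__":
        if sha256_of(PRIMARY) == sha256_of(BACKUP):
            print("Backup verified: checksums match")
        else:
            print("MISMATCH: investigate before trusting this backup")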
Now, I am guessing this long-winded explanation has tipped you all off that yes, one of our servers had indeed failed, and the ole “turn it off and on again” was unfortunately a non-starter. After some stressful diagnosing, it became clear that we had been hit by some kind of hardware failure, and the most suitable solution was to take the opportunity to replace the hardware and rebuild the OS and data wherever corruption had occurred.
Due to the multiple levels of redundancy, we were never at any risk of losing development progress, but as you can imagine, I did not have a fun few days.
Fortunately, the majority of the team were able to continue their work uninterrupted, using the most recent version of the Perforce depot synced to their local machines. I did, however, have to temporarily pause my work on the AWS deployment to address this issue. There are only a few things that could be more important to us than the continued, steadfast development progress of Shiryo, but data loss… that is one of them.
I am disappointed that I don’t have more exciting progress to share with you this week, but I hope that our continued transparency, not only showing you the “positive” updates but also sharing the “hurdles” we overcome, portrays more clearly the difficult realities of creating a project like Shiryo.
Once again, I hope you have found this week’s behind-the-scenes update interesting, and I look forward to returning to our regularly scheduled positivity.