The eTool website is built on Microsoft’s ASP.NET MVC web framework. It uses the ASP.NET Forms Authentication library to protect users’ accounts and their content. Users sign in to their eTool workspace with a combination of email address and password. Passwords must meet minimum complexity requirements, and we follow industry best practice by storing salted hashes rather than the passwords themselves.
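To illustrate what storing a salted hash means in practice, here is a minimal sketch in Python. It is not eTool’s actual implementation (the site runs on ASP.NET); the function names, the choice of PBKDF2 with SHA-256 and the iteration count are all assumptions made for the example.

```python
import hashlib
import hmac
import os
from typing import Tuple

# Assumed work factor for illustration only, not eTool's real setting.
ITERATIONS = 200_000


def hash_password(password: str) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for storage; the password itself is never stored."""
    salt = os.urandom(16)  # a fresh random salt for every password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)
```

Because each password gets its own random salt, two users with the same password end up with different stored digests, which defeats precomputed lookup tables.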
All of eTool’s services are built on top of very secure foundations. Using Amazon’s Elastic Compute Cloud (Amazon EC2) to host eToolLCD ensures a high level of security and protection against malicious attacks at the infrastructure level. Amazon have stringent controls, policies, procedures, hardware standards and robust systems that meet or exceed the CESG cloud computing principles as far as the data centre responsibilities are concerned.
eToolLCD uses a single back-end MySQL database, hosted by Amazon Web Services, to store all data. The only separation of data involves documents uploaded by users, which are stored in a separate “S3 bucket” that the application calls when documents are to be retrieved. The application and database share the same instance. We deploy redundant servers, daily backups and an additional level of backup for disaster recovery. In the event of an outage on one server, we continue to work on the redundant copies. In the event of a large-scale incident, we can recover within 10 minutes using the daily backups or within 2 hours using the disaster recovery backups. The whole instance is backed up at 3-hourly intervals, with the backups retained on the following schedule:
1. 3-hourly backups stored for 24 hours
2. Daily backups stored for 7 days
3. Weekly backups stored for 2 months
4. Quarterly backups stored indefinitely
The Skeddly service (skeddly.com) is used to back up the AWS instances. In all, eight separate scheduled actions are set up in Skeddly to achieve the above backup schedule and history. The actions are a combination of “Create Snapshot” and “Delete Snapshot” actions (the delete actions are filtered to ensure preservation of historical backups). At daily intervals, Skeddly also copies snapshots to another region (that is, from the Singapore data centre to the Sydney data centre) so that a catastrophic event at the Singapore data centre will not compromise disaster recovery efforts.
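The retention rules behind the filtered delete actions can be sketched as a simple lookup. This is an illustrative Python model only, not Skeddly’s configuration format or API; the tier names, the `is_expired` helper and the reading of “2 months” as 60 days are assumptions.

```python
from datetime import timedelta

# Retention period per backup tier; None means the backup is kept indefinitely.
RETENTION = {
    "3-hourly": timedelta(hours=24),
    "daily": timedelta(days=7),
    "weekly": timedelta(days=60),  # assumption: "2 months" taken as 60 days
    "quarterly": None,
}


def is_expired(tier: str, age: timedelta) -> bool:
    """Return True if a snapshot of this tier and age is eligible for deletion."""
    limit = RETENTION[tier]
    return limit is not None and age > limit
```

For example, `is_expired("daily", timedelta(days=8))` is true, while a quarterly snapshot never expires regardless of its age.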
AWS CloudWatch alarms have been set up to alert key eTool staff when the eToolLCD server does not pass a status check. The alerts have been set up for hourly intervals. This ensures that eTool staff will become aware of an issue within an hour of the fault if they haven’t already been alerted by users.
eTool have a disaster recovery and business continuity strategy that enables very fast full recovery to production. The disaster recovery strategy has the following objectives, which have been successfully tested in simulated catastrophic failure events:
• Recovery Point Objective: Maximum of 3 hours
• Recovery Time Objective: Maximum of 4 hours including:
o Communication: Maximum of 1 hour
o Implementation of DR procedure: Maximum of 2 hours
The essential elements of the DR plan are:
• Communication: Automated communication to key eTool staff if the eToolLCD server fails any status checks
• Pre-prepared reboot instances, with the operating system etc. installed and ready to boot, across multiple regions
• Procedure for creating a new volume from the backup snapshots and attaching this volume to the pre-prepared reboot instance.
• Post recovery procedure to ensure subsequent failure can be managed in the same fashion and that the root cause of the issue (if internal) is established and re-occurrence prevented.
• Bi-annual simulation of DR event, including full disaster recovery test.
eTool have simulated data recovery incidents and have been able to restore the application to the last backup within 20 minutes of being alerted to the failure.
eTool maintains an excellent track record when it comes to software delivery. Unplanned maintenance at eTool has been limited to two occurrences, of approximately 4 and 2 hours respectively, in the five years since eToolLCD has been live. This represents an availability of 99.986% of total calendar hours (incidentally, both of these events happened outside of UK business hours).
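The availability figure follows directly from the two outages. A short worked check in Python, assuming 365-day years (the function name is illustrative):

```python
HOURS_PER_YEAR = 365 * 24  # assumption: leap days ignored


def availability_percent(downtime_hours: float, years: float) -> float:
    """Percentage of total calendar hours during which the service was up."""
    total_hours = years * HOURS_PER_YEAR
    return 100 * (1 - downtime_hours / total_hours)


# Two unplanned outages of roughly 4 and 2 hours over five years.
print(f"{availability_percent(4 + 2, 5):.3f}%")  # 99.986%
```

Six hours of downtime out of 43,800 calendar hours is about 0.014% unavailability.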
eToolLCD updates are conducted at 2-4 week intervals and are scheduled for shoulder periods when user activity is minimal. Because the system is used globally, finding such periods does present some challenges. The normal update window is 20 minutes. When the application is down for such maintenance, a clone is made available in case users are in the middle of a demonstration.
eTool support services are committed to resolving critical support tickets within 4 hours. A critical ticket is defined as one in which an organisation or group of users cannot work and is relying on eToolLCD to meet a deadline, or one in which a process is affected and users cannot perform certain functions.