Prepared for setbacks
No matter how well prepared you are, in a project of this magnitude you are always bound to run into challenges. The first surprise we encountered was the labeling of the racks. The label coding used in the data center was different from our own, which meant we needed to update the rack designs accordingly. We also discovered that some parcels had been delivered to other Switch locations. And at first glance, it looked like the rails we had ordered for the racks didn’t quite fit.
Then there were things we simply had to deal with: all deliveries came in unmarked boxes, so we had no idea which box contained which hardware. As a result, we had to unpack and sort everything, and we needed a total of 30 boxes to sort all the equipment and the accompanying rails, cables, screws and other accessories.
Pressure cage
Once things get going, you are anything but alone. Multiple teams – from the data center, the customer, vendors, and us – are working simultaneously. Mix-ups start to happen, especially when the Switch team unexpectedly changes their schedule and someone turns up to work in a rack where the Switch team just happens to be testing PDUs.
At first, everything is still neat and orderly. The racks are empty, and things are quiet. As the week progresses, more devices start humming and the temperature in the cage rises. Our own temperature rises with it, especially once the schedule changes and our access to the management network layer is postponed by three days. Things get tense as the pressure mounts, because we need enough time to finish the cable management properly.
Triple testing
Once we step out of the data center, it is time to start the actual testing of the management and infrastructure layers. Since we want to make absolutely sure that the entire chain is tested, we do this in three test cycles. Only if each test succeeds will the data center be taken into production.
1. Platform test
The first test is the platform test, which we perform on-site in Las Vegas. We test all infrastructure components according to our test plan: we check whether monitoring is up, redundancy is in place, everything is labeled, any errors pop up, all systems have been set up according to the naming convention, and we can log on using LDAP.
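To give an idea of how such checks can be scripted rather than ticked off by hand, here is a minimal Python sketch. The hostnames, naming pattern, and LDAP details are invented placeholders, not the real environment, and the actual test plan obviously covers much more.

```python
# Minimal sketch of scripted platform checks; the hostnames, naming
# pattern, and LDAP details below are invented placeholders.
import re
import subprocess

from ldap3 import ALL, Connection, Server  # pip install ldap3

NAMING_CONVENTION = re.compile(r"^lv1-(net|srv|stor)-\d{2}$")  # hypothetical pattern
HOSTS = ["lv1-net-01", "lv1-srv-01", "lv1-stor-01"]            # hypothetical inventory


def naming_violations(hosts):
    """Return every host name that does not follow the naming convention."""
    return [h for h in hosts if not NAMING_CONVENTION.match(h)]


def is_reachable(host):
    """Send a single ping to confirm the component is up (Linux/macOS flags)."""
    return subprocess.run(["ping", "-c", "1", host], capture_output=True).returncode == 0


def ldap_login_works(host, user, password):
    """Check that we can bind to the directory with the given account."""
    return Connection(Server(host, get_info=ALL), user=user, password=password).bind()


if __name__ == "__main__":
    print("naming violations:", naming_violations(HOSTS))
    print("unreachable:", [h for h in HOSTS if not is_reachable(h)])
    print("LDAP bind ok:", ldap_login_works("ldap.example.local",
                                            "cn=tester,dc=example,dc=local",
                                            "secret"))
```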
2. Power test
The remote project team tests the interfaces between the individual stacks in order to detect any errors before going live.
3. Soft shutdown test
The last step is to turn the entire cage off. The Switch data center team then turns every piece of equipment back on in a specific order, according to our rack design. This order is determined by the dependencies in our schedule. All three tests passed successfully.
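That dependency-driven order is essentially a topological sort of the rack design: network gear before the storage that needs it, storage before the hosts that boot from it, and so on. A rough Python sketch of the idea, with invented equipment names:

```python
# Rough sketch: derive a power-on order from equipment dependencies.
# The device names and dependencies are invented, not the real rack design.
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each device maps to the devices that must be powered on before it.
dependencies = {
    "core-switch-a": set(),
    "core-switch-b": set(),
    "storage-array": {"core-switch-a", "core-switch-b"},
    "esx-host-01": {"storage-array"},
    "esx-host-02": {"storage-array"},
    "mgmt-vms": {"esx-host-01", "esx-host-02"},
}

power_on_order = list(TopologicalSorter(dependencies).static_order())
print(" -> ".join(power_on_order))
```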
Finalizing the stack
Everything has been tested. We now have a few weeks to implement generic services on VMs. These include, among others, Active Directory, Global Directory Service, NetIQ, MFA-NAAF, Splunk, SCOM and Darktrace: standard services we need to install in order for the customer to run their applications.
A total of 23 VMs need to be implemented, for several different teams. Because every team is responsible for its own configuration, and each person has a different approach, this takes some organization. Solid planning is, once again, indispensable: mapping the requirements and specifications, user rights, and schedule for each VM and team.
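One way to keep that overview in a single place is a structured inventory of VM specifications, owners, and deadlines, rather than a pile of separate documents. A small, purely illustrative Python sketch of that idea (all names, sizes, and dates are invented):

```python
# Illustrative per-VM planning record; every value below is an invented example.
from dataclasses import dataclass, field
from datetime import date


@dataclass
class VmSpec:
    name: str                  # VM name
    service: str               # e.g. "Active Directory", "Splunk"
    owner_team: str            # team responsible for the configuration
    vcpus: int
    memory_gb: int
    admin_groups: list[str] = field(default_factory=list)  # user rights
    build_deadline: date = date.max


inventory = [
    VmSpec("srv-ad-01", "Active Directory", "Windows team", 4, 16,
           ["ad-admins"], date(2021, 6, 1)),
    VmSpec("srv-splunk-01", "Splunk", "Monitoring team", 8, 32,
           ["splunk-admins"], date(2021, 6, 8)),
]

# A build schedule per team then falls straight out of the inventory.
for vm in sorted(inventory, key=lambda v: v.build_deadline):
    print(vm.build_deadline, vm.owner_team, vm.name)
```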
The project was a success, with only a few setbacks along the way. It has shown me that, as is often the case, good preparation is half the battle. But beyond that, the use of standardized building blocks, automation templates, and predefined scripts is indispensable for completing tasks of this magnitude in a limited time frame.
