Production deployment starts with the first line of code: an application should be built to work in production infrastructure, to be deployable through the existing process (whether CD or a manual update), and to handle the load expected in production. Even in a nearly ideal CD process that uses automation wherever possible, some human interaction may be required when a new release changes the DB, build scripts or infrastructure. In this article, we will take a look at such scenarios.

What is a good deployment plan?

A complex production update requires detailed documentation of all required changes. The easiest way to describe it is a table that consists of the following columns:
- Step order #
- Step Description
- Step Owner
- Step Duration
- Notes to QA if applicable
- Additional information (paths to scripts, endpoints, credentials etc)
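The columns above can also be kept in a machine-readable form, which makes it easy to total the expected downtime or pull out only the steps that QA has to act on. The following is a minimal sketch, not a prescribed format: the `DeploymentStep` class, the helper functions, and every step, owner, duration and path in the sample plan are hypothetical, and it assumes the steps run one after another.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class DeploymentStep:
    """One row of the deployment plan table (field names are illustrative)."""
    order: int
    description: str
    owner: str
    duration: timedelta
    qa_notes: str = ""
    additional_info: str = ""


def total_duration(steps):
    """Sum step durations, assuming the steps run sequentially."""
    return sum((s.duration for s in steps), timedelta())


def steps_for_qa(steps):
    """Return the steps that carry notes for QA, in execution order."""
    return [s for s in sorted(steps, key=lambda s: s.order) if s.qa_notes]


# Hypothetical plan used only to exercise the helpers.
plan = [
    DeploymentStep(1, "Back up database", "DBA", timedelta(minutes=20)),
    DeploymentStep(2, "Run schema migration script", "DBA", timedelta(minutes=15),
                   qa_notes="Check that the new field is used in responses",
                   additional_info="scripts/migrate.sql (hypothetical path)"),
    DeploymentStep(3, "Deploy new build", "DevOps", timedelta(minutes=10),
                   qa_notes="Run the noninvasive production scenario"),
]

print("Planned duration:", total_duration(plan))
for step in steps_for_qa(plan):
    print(step.order, step.owner, "-", step.qa_notes)
```

Keeping the QA notes as a field on each step, rather than in free text, is one way to generate the separate QA document automatically when the plan grows.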
What could be the role of QA?

As mentioned in the deployment plan section, notes for QA can contain instructions for validation. When the deployment plan is already overflowing with information, it makes sense to keep the data for QA in a separate document that includes:
- Specific noninvasive test scenario for production to validate presence of new functionality in the build.
- Instructions and checklist for each testable production configuration step, such as:
  - Configuration changes: pom.xml, IIS, etc.
  - Integrations with correct service endpoints
  - DB schema/Data changes
  - Data migration
  - Master/feature branch merges: all last-minute fixes are present in the build
  - Access to shared locations
  - Free space on the servers after deployment (if this is not automated in AWS)
  - Short performance test
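Some checklist items, such as free space on the servers, are simple enough to script. The sketch below uses Python's standard `shutil.disk_usage`; the function names and the choice to pass the minimum threshold as a parameter are assumptions made for illustration, and the path to check depends on where the application is installed.

```python
import shutil


def free_space_gb(path):
    """Return the free space on the filesystem that contains `path`, in GiB."""
    return shutil.disk_usage(path).free / 1024 ** 3


def has_enough_space(path, min_free_gb):
    """True if at least `min_free_gb` GiB are free on the filesystem of `path`."""
    return free_space_gb(path) >= min_free_gb


if __name__ == "__main__":
    # "." is a placeholder; point this at the deployment volume.
    print(f"Free space: {free_space_gb('.'):.1f} GiB")
```

A check like this is mainly useful on servers that have no space alerts configured, where it can be run as the last step of the checklist.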
Some of the bullet points for configuration change tests are explained in detail below.

Configuration changes

Depending on your process, this could be a very long list of possible scenarios, from a simple configuration file change to a change of the server itself, but let’s limit this to a few typical examples from this range.
| Change type | Change source | QA approach |
| --- | --- | --- |
| Change in configuration file | Manual or new configuration file from build | Verify that the application is operational and the change is present. Typical example of such a change: a changed or added URL for a service. Choose a test approach where this service is used, to check that the system receives data from it. |
| Server changes | New hardware / new AMI / new Docker image | A full integration test is required to ensure connectivity from the new machine, plus at least short performance tests to check that the configuration is correct. Usually these tests are done before switching over to the new instances, so time is not a huge factor and the team can run extensive load and stability tests on the new instances. Performance tests are even more critical if the server migrates to a new datacenter, because in this case the server change is accompanied by networking changes. |
| Changes in software/plugins | Manual, or automated through an AMI change or Docker images | Choose scenarios where this software is used. Typical examples: URL rewrite installation and configuration; registration of new DLLs on the server. |
| Networking change | Automatic/manual | Integration test to ensure connectivity with services and operation of the software; performance test to see whether the setup can handle production load. |
| OS update | Automatic/manual | Integration test to ensure connectivity with services and operation of the software; load and stability testing to ensure that there is no degradation of performance. |
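For the first row of the table, the "change is present" part of the check can be automated by comparing the deployed configuration file with the values the release is expected to contain. The following is a minimal sketch that assumes a simple `key = value` file; the keys, URLs and function names are hypothetical, and XML formats such as pom.xml or IIS configuration would need a different parser.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def verify_config(deployed_text, expected):
    """Return a list of mismatches between the deployed config and expected values."""
    deployed = parse_properties(deployed_text)
    problems = []
    for key, want in expected.items():
        got = deployed.get(key)
        if got != want:
            problems.append(f"{key}: expected {want!r}, found {got!r}")
    return problems


# Hypothetical deployed file and expected values for the release.
deployed = """
# production settings (example content)
billing.service.url = https://billing.example.com/api
cache.ttl.seconds = 300
"""
expected = {
    "billing.service.url": "https://billing.example.com/api",
    "cache.ttl.seconds": "600",
}

for problem in verify_config(deployed, expected):
    print(problem)
```

A check like this only confirms that the file contains the change. It does not replace a scenario that actually exercises the service behind the URL.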
Integration with correct service endpoints

These types of tests are self-explanatory, but they may be tricky when multiple environments are available for the integrated services. The easiest way to validate that the correct endpoint is used is to execute a scenario that requires a call to this service and to verify the configuration file. An alternative approach uses data markers: postfixes in record names, test records that are absent in production, etc.

DB schema/Data changes

Tests for the presence of DB changes differ by change type. For data, UI or response validation can show whether new or updated values are used in the system. Schema changes are different. Some changes will not have a visible effect on the UI, so there is no real way to validate their presence without a connection to the Prod DB, which is not always possible. In some cases even a missing stored procedure change (new field, updated data type, etc.) may not be visible right away, because the code that uses these fields is sometimes wrapped in try/catch and the system falls back to a default value. For these cases QA should choose scenarios where the new or updated field is used in logic and does not match the default value in code, to be sure that the code is up to date.

If a new release requires adding new fields to records or changing some values in records, it is usually safe to check that the script was executed (the LoE to add this to automated deployment is usually much higher than running it once manually). One common issue that a team can face during deployment is script execution time, caused by the difference in record counts between the QA and prod environments. Unfortunately, there is no 100% safe way to predict it, unless it is possible to set up a full copy of the production DB in QA. A few examples of such issues that required downtime to be extended:
- For SQL DBs, it usually results in query plan optimization or creation of indexes to make script execution faster.
- For NoSQL DBs (e.g. AWS DynamoDB), it may require changes to throughput, changes to scripts to limit load on the DB, or changes to scripts and the DB that optimize queries to decrease the required throughput values.
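One way to change a script so that it limits load on the DB is to process records in small batches with a pause between them, instead of in one long-running statement. The sketch below illustrates the idea with an in-memory SQLite database so that it is self-contained; the `records` table, the `status` column, the batch size and the pause are all hypothetical, and a real production script would use the project's own DB driver and tuning values.

```python
import sqlite3
import time


def update_in_batches(conn, batch_size=1000, pause_seconds=0.0):
    """Backfill `status` for rows where it is NULL, one short transaction per batch."""
    total = 0
    while True:
        cur = conn.execute(
            "UPDATE records SET status = 'active' "
            "WHERE id IN (SELECT id FROM records WHERE status IS NULL LIMIT ?)",
            (batch_size,),
        )
        conn.commit()
        if cur.rowcount == 0:
            break  # nothing left to update
        total += cur.rowcount
        time.sleep(pause_seconds)  # give other queries room between batches
    return total


# Hypothetical data set standing in for production-sized record counts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany("INSERT INTO records (status) VALUES (?)", [(None,)] * 2500)
conn.commit()

updated = update_in_batches(conn, batch_size=1000)
print("Rows updated:", updated)
```

Because each batch commits separately, the script can also be stopped and restarted during the deployment window without redoing rows that were already updated.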
Master/feature branch merges: all last-minute fixes are present in the build

As a minimal requirement for testing the presence of last-minute changes, QA needs to perform at least one specific scenario for each of the commits made since the latest tested branch synchronization.

Free space on the servers after deployment (if this is not automated in AWS)

Some new versions of an application require framework updates or installation of new software, and may have much bigger builds. For such cases, it is required to ensure that at least 10 GB of free space is available after the deployment if the server does not have space issue alerts configured.

Conclusion

As mentioned before, production deployment starts with the first line of code. Meeting all functional requirements is important, but testability of the application is also an important factor in the development and deployment process. Proper logging, log aggregation stacks, access to documentation, and the ability to check the configuration of the servers all make development, debugging, QA and deployment much easier for everybody. Mutual respect, good communication and a common goal will always be a great addition to any deployment plan.
Rama R Anem
Rama is a quality professional who manages teams and complex technology-driven projects. She has over 13 years of experience and has worked with companies like IBM and AMD in the past, in addition to leading global teams. Currently she is responsible for managing multiple software quality products at Sunpower Corporation, where she makes sure active releases are on target with no live-site issues. At AMD she managed the enterprise-wide data platform practice and was directly responsible for the success of the EDW testing practice. Under Rama’s leadership AMD was able to build the first EDW Center of Excellence. She oversaw multiple EDW projects and teams from envisioning through go-live. At IBM she worked as a staff software engineer, contributed to DB2 product development, and also worked as Technical Lead for multiple teams. She served as one of the brand ambassadors for IBM DB2. Rama has contributed to international technical software conferences, workshops and magazines.