Compuware recently published a study showing that up to 60% of development and testing time is spent on data-related tasks. These tasks include creating and modifying data, as well as writing up, investigating, and resolving data-related defects. Simple math demonstrates how high this cost can become. As an example, if ten resources are working at a rate of $50 per hour on a project estimated at 1,000 total hours of effort, the total cost of the project will be $50,000. Based on Compuware's study, up to 600 hours of that effort, or $30,000 (60%) of the project cost, will be spent on data-related tasks.
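
The arithmetic in this example can be sketched in a few lines of Python. The rate, hours, and percentage come from the example above; the function and variable names are illustrative only.

```python
HOURLY_RATE = 50          # dollars per hour, from the example above
ESTIMATED_HOURS = 1_000   # total project effort in hours
DATA_TASK_PERCENT = 60    # upper bound cited from the study


def data_task_cost(rate, hours, percent):
    """Return (total cost, hours on data tasks, cost of data tasks)."""
    total_cost = rate * hours
    data_hours = hours * percent / 100
    data_cost = total_cost * percent / 100
    return total_cost, data_hours, data_cost


total_cost, data_hours, data_cost = data_task_cost(
    HOURLY_RATE, ESTIMATED_HOURS, DATA_TASK_PERCENT
)
print(f"Total project cost:      ${total_cost:,.0f}")
print(f"Hours on data tasks:     {data_hours:,.0f}")
print(f"Cost of data tasks:      ${data_cost:,.0f}")
```

Changing DATA_TASK_PERCENT to a lower figure shows how quickly the cost falls when even a portion of the data-related effort is removed.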

If that cost can be reduced (note: it can never be totally eliminated), it is fair to assume that software test and development projects can become more efficient from a cost and delivery timeline perspective. Through careful planning of test data management, businesses are seeing a tangible return on investment by reducing the amount of time spent juggling data-related tasks alongside the normal development and testing effort. Resources become more effective when the potential for costly data-related defects is reduced and less time is spent creating and refreshing valid test data. The result is the ability to supply the software to end users more quickly and ultimately deliver the competitive advantage that the software is designed to provide.

Where to Start
If the concept of a test data management plan is new to the organization, it is understandable to have little to no idea what goes into the plan. Listed below is a base set of data strategy considerations that have been applied to projects.

Under ideal circumstances, a test data strategy is built at the same time that planning for the application implementation begins. This provides ample time to surface, and subsequently resolve, test data strategy issues. Software project initiatives are at an all-time high; however, pairing those efforts with a test data strategy is still somewhat new. For that reason, businesses are struggling to find information about industry best practices and how those practices might be applied to their projects.

There are three ways to incorporate test data considerations into a software testing project. The test manager or project manager can build tasks and time into the testing project plan to allow testers to create the data necessary to execute their tests; the task can be assigned to the development team, which assists with data creation for testing purposes; or a dedicated resource can be assigned to focus on test data as their primary function. In each of these three solutions, whether it is the entire test team, the development team, or a single individual, one or all of these people are taking on the role of test data management.

A solid test data strategy requires preparation time from a test data manager that covers each of the following:

Data Flow Through the System
Data flow is important to understand when determining whether the tester will be testing a single function or the entire workflow. Function testing may require a less comprehensive view of the data, and may simply accept any value within the constraints of the field being tested. Workflow testing, however, is more complex because the data moves out of the function and into the workflow. Therefore, this data needs to be acceptable to every function it will pass through to complete the entire workflow.

Most often, this requires a graphical representation including, but not limited to:
  • The individual internal components, and the data expectations that each component is built around
  • Data dependencies, specifically, how the data will transfer between internal components
  • User-specific functional calls
  • Role-specific functionality of the system
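
The difference between function-level and workflow-level data can be illustrated with a short Python sketch. The order workflow, its three steps, and their acceptance rules are hypothetical and exist only to show the idea: a record that satisfies one function may still be rejected further along the workflow.

```python
# Hypothetical workflow: each step is a name plus a rule describing
# the data that step is built to accept.
ORDER_WORKFLOW = [
    ("capture_order", lambda record: record["quantity"] > 0),
    ("check_inventory", lambda record: record["quantity"] <= 100),
    ("bill_customer", lambda record: bool(record["customer_id"])),
]


def rejecting_steps(record, workflow):
    """Return the names of the workflow steps that would reject the record."""
    return [name for name, accepts in workflow if not accepts(record)]


# Valid for a function test of capture_order, but not for the full workflow.
record = {"quantity": 250, "customer_id": "C-1"}
print(rejecting_steps(record, ORDER_WORKFLOW[:1]))  # function test only
print(rejecting_steps(record, ORDER_WORKFLOW))      # entire workflow
```

A record is suitable for workflow testing only when the list of rejecting steps is empty for every step the data will pass through.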

An Understanding of the Internal Application Integrations

Integration points are the systems that the application under test will interface with and that are owned internally by the company. Since the applications are internal to the company, the test data management resource has the flexibility to request access to the system and, ideally, to work with a subject matter expert (SME) to gain a thorough understanding of the integration point. In addition, the test data manager may be able to extend their services to the project by utilizing the SME to assist in creating data for or from the integration point.

Knowledge of the Integrations of the System with External Vendor Systems
On the opposite end of the spectrum is the external vendor application. Since the application is not housed or developed internally, flexibility and internal SME knowledge are limited. There is little control over what types of environments are offered, or over the capacity of the vendor system to allow for specific types of testing, such as volume and load testing.

The test data manager will need to investigate the standards used in the vendor production system, and how those compare with the test region. If there is a discrepancy, scaling is necessary to avoid overloading the vendor's test environment. As an example, if the test environment is housed on one server, but the production system is housed on five servers, scaling to 20 percent of the expected production load may be necessary to avoid crashing the vendor test environment.
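
The scaling in this example can be sketched as follows. This is a rough illustration that assumes every server has equal capacity and that load scales linearly with server count, which may not hold for a real vendor system; the production load of 10,000 transactions per hour is a made-up figure.

```python
def scaled_test_load(production_load, production_servers, test_servers):
    """Scale a production load figure in proportion to the server count.

    Assumes equal-capacity servers and linear scaling (a simplification).
    The result is capped at the full production load.
    """
    if production_servers <= 0 or test_servers <= 0:
        raise ValueError("server counts must be positive")
    usable_servers = min(test_servers, production_servers)
    share = usable_servers / production_servers
    load = production_load * usable_servers / production_servers
    return share, load


# One test server against five production servers, as in the example above.
share, load = scaled_test_load(
    production_load=10_000,   # hypothetical transactions per hour
    production_servers=5,
    test_servers=1,
)
print(f"Share of production load: {share:.0%}")
print(f"Target test load: {load:,.0f} transactions per hour")
```

The vendor's own guidance on environment capacity, where available, should take precedence over a simple ratio like this one.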

An Environment and/or Data Refresh Plan
Through some combination of data model changes, defect fixes, or change requests, a portion (if not all) of the data created during the first week of a testing project becomes obsolete as the project progresses, and cannot be re-used during the final cycle of a software testing effort. Furthermore, if testers are running the same test multiple times, the chances of having junk data, or irrelevant and unrealistic data, in the system are high. Using obsolete data can result in uncharacteristic behavior of the system and non-reproducible defects, and may drain time on both the testing and development sides of the project. To combat this issue, an environmental or data refresh plan can assist.

The environmental refresh plan ideally includes the who, what, how, and how often of the project's data refresh strategy. It identifies which teams or specific individuals are responsible for performing the refresh activities, along with what data sets will be removed, overwritten, or updated. The timeline for refresh (how often) is usually dependent upon the development process chosen for the project, and the timeline for each development and testing cycle.

Data Tracking & Reuse
Given that data-related tasks can account for up to 60 percent of development and testing effort, the ability to re-use data that has already been created can reduce the effort spent on data.

Test data managers can increase their effectiveness by identifying data that can be used across multiple testing efforts. By keeping an updated record of each test's pre-conditions and data end states, the data manager can identify more quickly which scripts can logically use the data next. When new tests are introduced, the data manager can check where their pre-conditions match an existing data end state to determine whether data may be re-used.
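
One way to picture this matching is the sketch below. It assumes that pre-conditions and end states can be recorded as simple labels; the script names and labels are hypothetical. A script is a candidate to re-use the data when every one of its pre-conditions appears in the end state left by the script that just finished.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Script:
    name: str
    preconditions: frozenset  # data states required before the script runs
    end_states: frozenset     # data states left behind after it runs


def reuse_candidates(finished, scripts):
    """Names of scripts whose pre-conditions are all met by `finished`."""
    return [
        script.name
        for script in scripts
        if script.name != finished.name
        and script.preconditions <= finished.end_states
    ]


create_account = Script(
    "create_account", frozenset(), frozenset({"account_active"})
)
add_payment = Script(
    "add_payment",
    frozenset({"account_active"}),
    frozenset({"account_active", "payment_on_file"}),
)
close_account = Script(
    "close_account",
    frozenset({"account_active", "zero_balance"}),
    frozenset({"account_closed"}),
)

scripts = [create_account, add_payment, close_account]
print(reuse_candidates(create_account, scripts))
```

Here only add_payment can pick up the data left by create_account, because close_account also needs a zero balance that no script has produced yet.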

Test Data Management Measurement
Measurement is a key obstacle in reporting the success of a test data manager. Unless a dedicated resource is used to manage the test data for your project, it may be difficult to track and measure the benefits of proactively handling the data tasks through the planning and execution phases. However, if measurement can be accomplished, it will help the team to display return on investment to its stakeholders, sponsors, and the entire IT project organization. Measurement can take on many forms, such as number of requests made, requests made by each workstream, percentage of each type of request (e.g., static integration data, transactional, payment, XML, vendor), time required to deliver requests, and percentage of time an agreed-upon service level agreement is met or exceeded. To keep the test data management activities visible, consider a periodic test data status report that outlines successes and difficulties encountered through the process. Well-documented results pertaining to test data management become an argument for maintaining a test data management focus on future projects within the organization.
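
As a rough sketch, the measures listed above could be computed from a simple log of data requests. The record fields, the sample requests, and the 24-hour service level are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class DataRequest:
    workstream: str
    request_type: str
    hours_to_deliver: float


def summarize(requests, sla_hours):
    """Summarize a list of data requests against an agreed service level."""
    if not requests:
        raise ValueError("no requests to summarize")
    total = len(requests)
    by_type = Counter(r.request_type for r in requests)
    within_sla = sum(1 for r in requests if r.hours_to_deliver <= sla_hours)
    return {
        "total_requests": total,
        "requests_by_workstream": dict(Counter(r.workstream for r in requests)),
        "percent_by_type": {t: 100 * n / total for t, n in by_type.items()},
        "average_hours_to_deliver": sum(r.hours_to_deliver for r in requests) / total,
        "percent_within_sla": 100 * within_sla / total,
    }


# Hypothetical request log for one reporting period.
requests = [
    DataRequest("billing", "transactional", 4),
    DataRequest("billing", "vendor", 30),
    DataRequest("claims", "transactional", 8),
    DataRequest("claims", "xml", 20),
]

report = summarize(requests, sla_hours=24)
for measure, value in report.items():
    print(f"{measure}: {value}")
```

A summary like this can feed directly into the periodic test data status report.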

Keep in mind that although one may have a clear picture of how to build a test data management plan, the data manager will likely face obstacles during the execution of the plan. Of the many obstacles to consider, three appear frequently on testing projects. First, test managers often overlook the test data element as a whole, and therefore do not allot an appropriate amount of time for data creation. Second, and without fail I have seen this on every large project, test teams always, always, always neglect to think through the consequences of modifying administrative, or backend, data. The third, the accumulation of incomplete data, is common across platforms; however, it is typically handled as a reactionary activity rather than an opportunity to be proactive and assist the test team's effectiveness.

Test Data Creation Takes Time:
It takes time to both identify and then populate valid data into a stand-alone testing region. That time needs to be accounted for in the testing project plan. Too often, test managers omit the test data creation activity from their project plan, or do not allocate enough resources to ensure its timely turnover. If it is ignored altogether, or de-emphasized during the planning phase, the time it takes to create the data will present itself during a later stage of the testing effort. Frequently, neglecting test data creation results in a slip in either cost (e.g., resources working overtime to complete) or time (e.g., pushing back the delivery date).

Modification of Administrative Data Can Have a Negative Impact:
Applications are designed to accept or expect only certain values to route data through the system properly. In some capacity, on every application, this is true. When the data that the system is expecting is modified or deleted, it wreaks havoc on underlying functionality. In certain cases, that can mean bringing the entire system to a standstill, resulting in a loss of resource hours. If the system does not come down altogether, modifying that data will almost certainly cause new defects to be reported, which will take time to investigate, resolve, and retest.

Accumulation of Incomplete or Junk Data Will Become an Issue:
During the normal progression of a software project, data is created at different times to test new functionality as it is introduced into the system. However, through careful observation, you will notice that these rows of data can quickly become obsolete. Whether it is through changes to the data model, defect fixes, or change requests, any data created prior to the freshest build of code can never confidently be called pristine. If a tester attempts to re-use a piece of data he or she created during a previous cycle of testing, there is no way to know whether a subsequent error is a true defect or a data-related defect. Typically, the buildup of data is handled as a reactionary activity rather than an opportunity to be proactive and assist the test team's effectiveness.
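
A proactive check along these lines can be sketched in Python. It assumes each test record carries a creation timestamp and that the time of the latest build is known; the record IDs and dates are hypothetical. Anything created before the latest build is flagged as suspect rather than assumed to be pristine.

```python
from datetime import datetime


def split_by_build(records, latest_build_at):
    """Separate records created since the latest build from older ones."""
    fresh = [r for r in records if r["created_at"] >= latest_build_at]
    suspect = [r for r in records if r["created_at"] < latest_build_at]
    return fresh, suspect


latest_build_at = datetime(2024, 3, 10, 9, 0)  # hypothetical build time

records = [
    {"id": "ORD-1", "created_at": datetime(2024, 3, 1, 14, 30)},
    {"id": "ORD-2", "created_at": datetime(2024, 3, 10, 11, 0)},
    {"id": "ORD-3", "created_at": datetime(2024, 3, 9, 16, 45)},
]

fresh, suspect = split_by_build(records, latest_build_at)
print("Created since latest build:", [r["id"] for r in fresh])
print("Candidates for refresh:", [r["id"] for r in suspect])
```

Running a check like this before each test cycle turns data cleanup into a planned activity instead of a reaction to unexplained defects.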

Whether the test data management task is distributed across the entire test team, the development resources, or a single dedicated individual, a strong test data management plan supports successful execution. By knowing the data flow, the internal and external system integrations, and how to track and re-use data for additional testing, any perceived mystery surrounding test data becomes an understandable piece of the testing project. Creating data refresh initiatives proactively rather than reactively allows for optimal environmental data expectations and reduces the number of accidental data-related defects being reported. As with anything worth doing, successfully implementing a test data strategy requires time and effort, and allocating for that effort in the planning phase of the project allows for maximum efficiency in data-related work.

In reference to the Compuware study stating that up to 60% of development and testing time is spent tending to data-related tasks, an argument can be made that not all projects experience the 60% time commitment noted. That is a fair argument; however, if any amount of time is spent tending to unforeseen data-related tasks, it uncovers inefficiencies in the testing project. Inefficiencies lead to wasted time and money. That additional time could mean the difference between beating your competitors to market, implementing a regulatory requirement, or introducing a business-critical process improvement that will directly affect the organization's bottom line. By effectively managing test data, we gain the ability to deliver the project less expensively, more quickly, and ultimately more competitively.

About the Author

Daven Kruse

My background consists of 8 years of IT experience in varying roles. Most recently, my time spent as a software testing consultant has lent itself to experience with various industry verticals. Whether it is the pharma industry, the energy industry, the insurance industry, or anything in between, the value of software testing throughout organizations continues to grow. I have a passion for test management, and feel great accomplishment when a team I am part of delivers a completed project.

I strive to help customers realize their needs, and communicate that into a buildable solution. I enjoy developing test strategies, and incorporating continuous improvement techniques to improve test team efficiencies.

I am a lifelong learner, and continue to educate myself through conferences, workshops, online courses, discussion forums, and formal classes. I am a high energy individual whose flexibility and drive help me succeed.