Strand Management Solutions, Inc. 3525 Quakerbridge Road Suite 6325, Hamilton, NJ 08619


Data Analysis

Data analysis is the most consequential component of any information technology project. It is the starting point and the basis from which all subsequent design decisions flow.

Analysis begins with an investigation meant to arrive at an understanding and precise definition of 1) the business need for the data, 2) where the data originates, 3) where it gets modified, 4) its data security requirements, 5) the validity checks that can be used to scrutinize the data and ensure its accuracy and 6) the auditability of its changes.

In most organizations these considerations have either been overlooked or were reviewed so long ago that the conclusions are out of date and likely forgotten.

Detailed data analysis then continues with agreement on the entities that must be tracked and the attributes of those entities. Each entity occurrence needs to be uniquely identified, and each attribute must be considered with respect to its occurrence (one or n) and whether it needs to be tracked over time.

An example might be entities of students, teachers, sessions and session enrollment. Session enrollment would be the intersection of sessions with their teacher (1 teacher per session) and their students (n students per session). An example of tracking attributes over time would be deciding whether we need to know not only a student’s current major but also what the student’s major was at previous points in time.
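The question of tracking an attribute over time can be made concrete with a small sketch. The example below assumes a hypothetical student_major table that keeps one row per change of major, and uses Python's built-in sqlite3 module purely for illustration. The table names, column names and sample values are our own assumptions and are not taken from any particular client system.

```python
import sqlite3

# In-memory database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
-- One row per change of major, so earlier values are never overwritten.
CREATE TABLE student_major (
    student_id     INTEGER NOT NULL REFERENCES student(student_id),
    major          TEXT NOT NULL,
    effective_from TEXT NOT NULL,   -- ISO date, e.g. 2022-09-01
    PRIMARY KEY (student_id, effective_from)
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO student VALUES (1, 'Ann Example')")
conn.executemany(
    "INSERT INTO student_major VALUES (?, ?, ?)",
    [(1, "Biology", "2021-09-01"), (1, "Chemistry", "2022-09-01")],
)


def major_as_of(conn, student_id, as_of_date):
    """Return the major in effect on as_of_date, or None if none was recorded yet."""
    row = conn.execute(
        "SELECT major FROM student_major "
        "WHERE student_id = ? AND effective_from <= ? "
        "ORDER BY effective_from DESC LIMIT 1",
        (student_id, as_of_date),
    ).fetchone()
    return row[0] if row else None
```

If only the current major mattered, a single major column on the student row would be enough. The separate history table is what allows a question such as "what was this student's major in January 2022" to be answered later.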

The relationships between the entities are discussed, again with careful consideration being given to defining 1 to n and m to n relationships. The result of these data analysis steps will likely be an entity relationship diagram or model, which is a graphic representation of the data and its relationships.
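As an illustration of how such a model might later be expressed, the sketch below declares the student, teacher, session and session enrollment entities from the earlier example as tables. It assumes Python's built-in sqlite3 module and uses table and column names of our own choosing. The 1 to n relationship (one teacher, many sessions) becomes a foreign key on the session, and the m to n relationship (students and sessions) becomes the enrollment table.

```python
import sqlite3


def build_schema():
    """Create an in-memory database holding the example entities."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this is switched on per connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
    CREATE TABLE teacher (
        teacher_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    -- 1 to n: each session has exactly one teacher; a teacher may have many sessions.
    CREATE TABLE session (
        session_id INTEGER PRIMARY KEY,
        title      TEXT NOT NULL,
        teacher_id INTEGER NOT NULL REFERENCES teacher(teacher_id)
    );
    -- m to n: a student may attend many sessions and a session has many students.
    -- The composite key allows each student into a given session only once.
    CREATE TABLE session_enrollment (
        session_id INTEGER NOT NULL REFERENCES session(session_id),
        student_id INTEGER NOT NULL REFERENCES student(student_id),
        PRIMARY KEY (session_id, student_id)
    );
    """)
    return conn
```

With these declarations the database itself rejects a session that names a nonexistent teacher, or a second enrollment of the same student in the same session, by raising sqlite3.IntegrityError.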

In most cases, our experience with systems of similar types helps us proceed through the design stage quickly.

Data Analysis Pitfalls

Incorrect information about the nature of the data will, however, almost always result in time-consuming and costly changes when the discrepancies are found. It is therefore important that data analysis results be shared with and explained to each of the team members with knowledge of the data. It is particularly important that the people who deal with data entry and verification are given sufficient time to verify that the conceptual work accurately matches the day-to-day, real-life occurrences.

One common mistake we regularly encounter is the assumption that conversations about data need only cover the common and usual cases. If you expect the system to accommodate all cases, then all data cases must be taken into consideration. If a data situation occurs one time in a million and you want the system to deal with it properly, then that case must receive the same analysis and attention as if it were the case that applies 99.99 percent of the time.

For most systems we construct processes for the entry, review and modification of data. For other systems we deal with data that comes from external processes or data that has been gathered by other systems. Where we are processing data gathered from other systems, the analysis step will likely be more easily defined and therefore of shorter duration.

The Importance of Accuracy in Data Analysis

Because data analysis is the foundation of all subsequent development, it is crucial that it be given proper attention. We convey this to clients in clear language in our standard service agreement. The agreement includes the following paragraph:

If your business processes in any way rely on data that is being provided to your systems that do not precisely meet your stated representations as to their data types, ranges, compliance with uniqueness claims or data relationship integrity problems you run the danger of requiring a material amount of additional services to analyze, and either clean the data, add additional system validation, or modify already programmed business rules to meet the new circumstances. Inaccurate representation of data conditions is a most significant impediment to on-time on-budget delivery. If we find data conditions that conflict with your representations, we will notify you to discuss a proper course of action for dealing with the anomalies. Such a discovery will likely affect both the scheduling and budget of the project.
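The data conditions named in that paragraph (data types, ranges, uniqueness and relationship integrity) can be checked against real data before development begins. The following is a minimal sketch of such a check, assuming records arrive as Python dictionaries. The field names student_id and entry_year, the year limits and the function name find_anomalies are all hypothetical and chosen only for illustration.

```python
def find_anomalies(students, enrollments, min_year=1950, max_year=2100):
    """Return a list of plain-language descriptions of data anomalies.

    Assumed representations being verified (all hypothetical):
      - student_id is an integer and unique across students
      - entry_year is an integer between min_year and max_year
      - every enrollment refers to a known student_id
    """
    anomalies = []
    seen_ids = set()

    for row in students:
        sid = row.get("student_id")
        # Data type check.
        if not isinstance(sid, int):
            anomalies.append(f"student_id {sid!r} is not an integer")
            continue
        # Uniqueness check.
        if sid in seen_ids:
            anomalies.append(f"student_id {sid} is duplicated")
        seen_ids.add(sid)
        # Range check.
        year = row.get("entry_year")
        if not isinstance(year, int) or not min_year <= year <= max_year:
            anomalies.append(f"student {sid}: entry_year {year!r} is out of range")

    # Relationship integrity check.
    for row in enrollments:
        sid = row.get("student_id")
        if sid not in seen_ids:
            anomalies.append(f"enrollment refers to unknown student_id {sid!r}")

    return anomalies
```

Running a check of this kind on a sample of production data early on is one way to surface conflicts with the stated representations while they are still inexpensive to discuss.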

Database Development Steps That Follow Data Analysis

In addition to providing data analysis services we also provide: