It is fairly meaningless to say 'System X is modifiable'. Firstly, modifiable with respect to what? Some changes will be handled more easily than others. If the system was designed with a client-server pattern, is changing it to peer-to-peer easy? It might be, but I find it rather unlikely. Even if that is the case, what are the consequences of such a highly modifiable system? What other quality attributes were sacrificed in order to make this happen?
Ideally, therefore, we should be able to characterize quality attributes in a more objective way. SEI suggests a general solution: scenarios. Each one poses a 'what if' question: what if something happened? The idea is that for every quality attribute you create a list of 'likely' scenarios, e.g. for reliability: no response from the server when data is sent. Then you explain how the system would handle such a case. The format of a scenario is fairly simple and consists of 6 parts:
- Source of Stimulus - e.g. human - who triggers the set of events
- Stimulus - e.g. mouse click - how do they do it
- Environment - e.g. the system is in normal operation - what are the constraints, overall conditions, global assumptions made
- Artifact - e.g. application - what artifact is stimulated? What is being affected by this mouse click?
- Response - e.g. save data and restart application - activity undertaken after the arrival of stimulus
- Response Measure - e.g. no data lost (or recover within 1 min) - how do we measure the success rate
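The six parts map naturally onto a small record type. What follows is only a minimal sketch in Python: the `Scenario` class and its `describe` helper are hypothetical names chosen for illustration, not part of the SEI material, and the environment value "normal operation" is an assumption. The field values reuse the examples from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One quality attribute scenario; fields follow the six parts."""
    source: str            # who or what triggers the set of events
    stimulus: str          # how they trigger it
    environment: str       # conditions the system is in at that moment
    artifact: str          # what is stimulated
    response: str          # activity undertaken after the stimulus arrives
    response_measure: str  # how success is measured

    def describe(self) -> str:
        """Render the scenario as a single readable sentence."""
        return (
            f"When {self.source} causes '{self.stimulus}' on {self.artifact} "
            f"during {self.environment}, the system should {self.response}, "
            f"measured by: {self.response_measure}."
        )


# Example values taken from the list above ("normal operation" is assumed).
example = Scenario(
    source="human",
    stimulus="mouse click",
    environment="normal operation",
    artifact="application",
    response="save data and restart application",
    response_measure="no data lost",
)
print(example.describe())
```

Keeping the record immutable (`frozen=True`) is a design choice here: a scenario that has been agreed on should not change silently while testers are working against it.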
A set of those scenarios can be found here: ftp://ftp.sei.cmu.edu/pub/documents/01.reports/pdf/01tr014.pdf. SEI also gives a number of keywords which can be used for the assessment.
These are known as general scenarios. For example, for availability:
- Source: internal/external
- Stimulus: fault, crash, timing, response
- Environment: normal operation, degraded operation
- Artifact: process, storage, processor, communication
- Response: record, notify, disable, continue (normal mode, degraded mode), be unavailable
- Response Measure: repair time, availability, available/downgraded time interval
This kind of format enables uniform communication between different people. The scenarios are ultimately used by testers to verify correct behavior of the system. They also serve as proof that the various important and likely scenarios have been taken into account to ensure the system meets the required standards. However, in addition to the general scenarios we also need specific ones, for example:
- Source: external to system
- Stimulus: unanticipated message
- Environment: normal operation
- Artifact: process
- Response: inform operator and continue to operate
- Response Measure: no downtime.
The specific scenarios are derived from general scenarios by instantiating each part.
It is not always so simple to evaluate the scenarios. As an example, consider a financial transaction system. The architecture might need to support the following:
- Checking all paths to each component
- Ensuring mechanisms exist to catch the unanticipated message
- Un-doing any actions taken based on the message
- Understanding where the message came from and whether its impact could be less severe
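To make the list above more concrete, here is a minimal sketch in Python of a process that catches an unanticipated message, undoes the actions already taken, records where the message came from, and keeps running. Everything in it is an assumption for illustration: the `TransactionProcessor` class, the `deposit` message type, and the dictionary message format are hypothetical and do not describe any real financial system.

```python
from typing import Callable, Dict, List


class UnanticipatedMessage(Exception):
    """Raised when a message type has no registered handler."""


class TransactionProcessor:
    """Hypothetical process that must stay up when unexpected input arrives."""

    def __init__(self) -> None:
        self.balance = 0
        self.operator_log: List[str] = []
        self._undo_stack: List[Callable[[], None]] = []
        self._handlers: Dict[str, Callable[[dict], None]] = {
            "deposit": self._deposit,
        }

    def _deposit(self, msg: dict) -> None:
        amount = msg["amount"]
        self.balance += amount

        def undo() -> None:
            self.balance -= amount

        # Remember how to reverse this action if the batch fails later.
        self._undo_stack.append(undo)

    def handle_batch(self, source: str, batch: List[dict]) -> bool:
        """Apply a batch; on an unknown message, roll back and keep running."""
        self._undo_stack.clear()
        try:
            for msg in batch:
                handler = self._handlers.get(msg.get("type"))
                if handler is None:
                    raise UnanticipatedMessage(msg.get("type"))
                handler(msg)
        except UnanticipatedMessage as exc:
            # Undo actions in reverse order, newest first.
            while self._undo_stack:
                self._undo_stack.pop()()
            # Record where the message came from so the operator is informed.
            self.operator_log.append(
                f"unanticipated message {exc} from {source}; batch rolled back"
            )
            return False
        return True


processor = TransactionProcessor()
processor.handle_batch("branch-A", [{"type": "deposit", "amount": 100}])
ok = processor.handle_batch(
    "branch-B",
    [{"type": "deposit", "amount": 50}, {"type": "refund", "amount": 5}],
)
print(processor.balance, ok)  # prints "100 False"
```

The second batch is rejected as a whole: the deposit of 50 is undone, the operator log gains one entry naming `branch-B`, and the processor remains available for the next batch.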
An excellent example of where this goes horribly wrong is Ariane flight 501. An unanticipated message arrived in a function, the message was recognized as a fault, and the system shut itself down.
Availability doesn't have to be uniform all the time. For example, at night it may be possible to
The same format applies to performance. The general scenario:
- Source: one of a number of independent sources, possibly from within the system
- Stimulus: Periodic events arrive; sporadic events arrive; stochastic (aperiodic) events arrive
- Artifact: system
- Environment: normal/overload
- Response: process stimuli, change level of service
- Response Measures: latency, deadline, throughput, jitter, miss rate, data loss
A specific performance scenario derived from it:
- Source: users making financial transactions
- Stimulus: significant number of sporadic events
- Artifact: system
- Environment: system already overloaded
- Response: all events are slowed down fairly (equally)
- Response measure: average end-to-end response time over moving 1-minute windows
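The response measure in this scenario can be computed with a small sliding-window helper. The sketch below is one possible way to do it in Python, not a prescribed method: the `MovingWindowAverage` class is a hypothetical name, timestamps are plain seconds, and samples are assumed to arrive in non-decreasing time order.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class MovingWindowAverage:
    """Average end-to-end response time over a moving time window."""

    def __init__(self, window_seconds: float = 60.0) -> None:
        self.window_seconds = window_seconds
        # Each sample is (completion timestamp, response time), oldest first.
        self._samples: Deque[Tuple[float, float]] = deque()

    def record(self, completed_at: float, response_time: float) -> None:
        """Add one finished request; assumes timestamps never go backwards."""
        self._samples.append((completed_at, response_time))
        self._evict(completed_at)

    def _evict(self, now: float) -> None:
        """Drop samples that have fallen out of the window ending at `now`."""
        cutoff = now - self.window_seconds
        while self._samples and self._samples[0][0] <= cutoff:
            self._samples.popleft()

    def average(self, now: float) -> Optional[float]:
        """Return the mean response time in the window, or None if empty."""
        self._evict(now)
        if not self._samples:
            return None
        return sum(rt for _, rt in self._samples) / len(self._samples)


window = MovingWindowAverage(window_seconds=60.0)
window.record(completed_at=0.0, response_time=1.0)
window.record(completed_at=30.0, response_time=2.0)
window.record(completed_at=70.0, response_time=4.0)  # sample at t=0 expires
print(window.average(now=70.0))  # prints 3.0
```

Summing on demand keeps the sketch simple and avoids floating-point drift from a running total; for very high event rates a running sum would be the usual optimization.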
To support this, the architecture may need to:
- Provide mechanisms to cope with the overload
- Identify potential bottlenecks
- Log enough to demonstrate fairness and to support future optimizations