Software Design Introduction INTRODUCTION ------------ DEF: Software Design - is a process through which requirements are translated into a representation of software. - Focus is on HOW the program will work. - Try to avoid programming language and hardware specific details that affect HOW. - Develop a program architecture and map requirements to portions of the architecture. Designer's goal - produce a model or representation of an entity that will later be built. NEED FOR DESIGN --------------- Design is the only way we can accurately translate a customer's requirements into a finished software product Without a design we risk building an unstable system: - one that will fail when small changes are made (maintainability) - one that may be difficult to test (testability) - one whose quality can not be determined until late in the development process DESIGN PROCESS -------------- The design for a system is created in stages. From a project management point of view (refinement of lifecycle) \ \ High Level Design (Preliminary Design) \ \ \ \ Design == Detailed Design \ \ \ \ Design Review \ \ High Level Design ----------------- - Also called Preliminary Design - Concerned with the transformation of requirements into data and software architecture - PRIMARY INPUT is the SRS - PRIMARY OUTPUT is an architectural design and data design Detailed Design --------------- - Focuses on refinements to the architectural representation that lead to detailed data structure and algorithmic representations for software - PRIMARY INPUTS are the outputs of the High Level Design step - PRIMARY OUTPUT is the SDD (Software Design Description) + contains architectural design (structure charts), detailed design (module specifications) data design (Design data dictionary) and a Software Design Walkthrough document. Design Review ------------- - Examines and evaluates design documents for completeness, correctness, quality, agreement with requirements. - PRIMARY INPUTS are the outputs of the detailed design stage. - PRIMARY OUTPUTS is an acceptance or rejection of the SDD, plus a list of issues or modifications to be handled. Running through the High Level and Detailed Design Phases are: Data Design - transforms the information domain model created during the requirements analysis into the data structures that will be required to implement the software. Architectural Design - Defines the relationships among major structural components of the system. Procedural Design - transforms structural components into a procedural description of the software. Interface Design (?) - establishes layout and interaction mechanisms for human-machine interaction. STRIVING FOR QUALITY -------------------- Quality is an important objective of software design. - Quality in the system really begins at the design stage - Designs can be assessed for quality - One proven metric that can predict quality is (Fan-Out)^2 i.e For a given module, as the number of other modules called by it increases, the complexity of the calling module is likely to increase, making that module more likely to contains mistakes. Guidelines for Design Quality (Pressmans Book) 1) A design should exhibit a hierarchical organization that makes intelligent use of control among components of software. ** Abstraction 2) A design should be modular; that is, the software should be logically partitioned into components that perform specific functions and subfunctions. ** Cohesion 3) A design should contain distinct and separable representations of data and procedure. 4) A design should lead to modules (e.g functions or procedures) that exhibit independent functional characteristics. ** Coupling, Cohesion 5) A design should lead to interfaces that reduce the complexity of connections between modules and with the external environment. ** Coupling 6) A design should be derived using a repeatable method that is driven by information obtained during software requirements analysis. ** methods, techniques Structured Design Introduction and Notation DEF: Structured Design : a method that takes a logical model for a system, in the form of a DFD hierarchy, together with non- functional requirements, and produces a system design. Structured Design (SD) Products ------------------------------- A design created by the SD method is represented using structure charts, module specifications, and a design dictionary. Structure Chart Hierarchy - A model of the software architecture. Identifies the structural components and their interfaces. (High-level design) Module Specifications - A detailed description of how a module transforms its inputs into outputs or the task performed by a module. (Detailed design.) Design Dictionary - Contains a text based description of all items appearing in a structure chart hierarchy. (This includes the modspecs.) (Both stages of design.) * The walkthrough document is a tool built to assist with the software design review. It is not maintained after this phase. (Do we do the walkthroughs anymore. No.) *NOTE: Avoid stating implementation specific/dependent information in the design artifacts. No pointers!! Don't want to restrict the design to one specific programming language. NOTATION Drawing Structure Charts -------- There is a basic style of notation that is commonly used. Module - Module is a generic term. A module can be a function in C, a subroutine or function in Fortran, a procedure in Pascal, etc. - Common module properties: 1) A module consists of a set of lexically contiguous statements, i.e. statements comprising the module are written together, sequentially. 2) A module has an identifier by which the entire module, all of its statements, may be referenced as a single piece. (Abstraction, here it is procedural abstraction). - Represented by a simple rectangle with the module name contained in the rectangle. - Module names should be verb phrases. (Modules perform an action) Intermodular Connection - Used to denote that one module refers to another, mainly to use the functionality of the referenced module. - Represented as a downward pointing arrow connecting two modules. The top module always calls the bottom module. - Connecting arrows NEVER point up or horizontally. - If a module calls several modules. The called modules are generally drawn from left to right in appropriate calling order. NOTE: this is not always possible or accurate. Parameters - Arguments or parameters passed between modules are drawn along side an intermodular connection. - A parameter arrow is always labeled with a noun. Data - Denotes information or data that is passed from one module to another. Data is acted on ( manipulated ) by a module. - Drawn as an arrow with a circle on one end. The arrow points to the direction of flow. Control - Denotes control or state information or a signal that is passed from one module to another. Control information determines which parts of a module will be executed. - Is different from checking a property of data to make a decision (i.e sorting on names, names are not control.) - Drawn as an arrow with a dot on one end. The arrow points to the direction of a flow. Combined - An arrow with no circle or dot on one end may be either a control or data item or contain both. (Rarely used.) Global - The name of a global variable used in a module can be written below a module box with either (IN) or (OUT) appearing next to the global variable name to indicate the direction of flow. Procedural Annotations - represent procedural aspects of a module. Conditional Access - Represented by enclosing the calling end of an intermodular connection in a diamond. - This means that some decision in the calling module will determine if a connected module is really called or not. - If several intermodular connections originate from a diamond, this can be interpreted to mean that just one of the connected modules will be called (Selection). Iteration - Used when intermodular references are used repeatedly within an iterative procedure. - Represented by drawing three quarters of a loop around the affected connections. The loop is a curved arrow. Extensions to the basic notation can be used to denote interactions between modules and external entities such as files and the user. File Interaction - Used to show that a module accesses a file. Accesses include opening, read/write, and closing a file. - A file is represented by two parallel lines with a name given between the lines. - The name of a file is a noun. - A connection between a file and a module does not have arrows. The direction of data flow is given by flow arrows (with labels) that indicate the information read or written into a file. Interactive Interaction - Used to show information entered by a user or information displayed by a module directly to a user. - Drawn as arrows entering or leave the SIDE of a module. Input - Represented by an arrow entering the left side of a module. - The arrow will be labeled with a noun. Output - Represented by an arrow leaving the right side of a module. - The arrow will be labeled with a noun. * The names of External Input or Output flows should not be the same as the names of parameters or globals used elsewhere in the structure chart hierarchy. * When drawing a structure chart, include no more than 8 modules in a diagram. This helps prevent clutter, but also increases the overall number of diagrams. Structured Design : Transformation Methods The techniques for transforming a DFD hierarchy into a structure chart hierarchy rests on some basic concepts. Factoring --------- Basic idea - is that in a structure chart the higher level modules should consist mainly of iterative and decision logic, i.e. control, and contain little or no processing. Most if not all processing should be contained in the bottom level atomic modules (i.e modules with no subordinates). (Processing is the transformation of data into different data.) DEF: EXECUTIVE MODULE - is a module that contains little or no iterative, decision logic, or processing. It only (for the most part) contains calls to subordinate modules. DEF: A COMPLETELY FACTORED SYSTEM - is a system where all processing is accomplished by bottom-level atomic modules, and all non-atomic modules consist only of control and coordination. ** Factoring is good, because experience shows that systems whose design is factored cost less to maintain and are easier to modify. Types of Modules ----------------- 1) AFFERENT MODULE - Accepts data from subordinate modules and passes this data to superordinates. 2) EFFERENT MODULE - Accepts data from superordinate and passes it to subordinate modules. 3) TRANSFORM MODULE - Exists solely to transform data into some other form. Takes data from superordinate, transforms it, and passes transformed data to superordinate. 4) COORDINATE MODULE - Takes data from one or more subordinates and passes it to other subordinate modules. *) MIX - Some modules are a combination of the above. METHOD: TRANSFORM ANALYSIS -------------------------- This design method is appropriate for systems whose primary task is to receive a flow of similar input transactions and turn them into an output flow of data, after having taken some action on each of the input transactions. DEF: TRANSFORM ANALYSIS - is a design strategy for deriving initial structural designs that are good with respect to modularity. Primary purpose of this strategy is to identify - the primary processing functions of the system - the high-level inputs to those functions - the high-level outputs of those functions. This strategy requires that a dataflow model of the problem be created first. We did this using Structured Analysis. STEPS ------ 0) State the problem as a data flow diagram (Structured Analysis) 1) Identify the afferent (input flows) and efferent (output flows) 2) Do first-level factoring 3) Factor the afferent, efferent, and transform branches STEPS (Details) ----- STEP 1: Identify the afferent (input flows) and efferent (output flows) in the DFD hierarchy. DEF: AFFERENT DATA ELEMENTS are those high-level elements of data that are furthest removed from physical input, yet still constitute inputs to the system. Think of these as data items ready to be processed, they have been - read in (removing file structure) - edited, checked, validated - possibly formatted - it is data ready to be processed. 1.1) TO IDENTIFY AFFERENT DATA ELEMENTS * Identify these data elements by starting at the physical inputs to the system (stores and external entities) and moving inwards along the DFD until you identify a flow that can no longer be considered as incoming (input). * Do this by tracing for each input flow. 1.2) TO IDENTIFY EFFERENT DATA ELEMENTS DEF: EFFERENT DATA ELEMENTS are those elements of data that are furthest removed from the physical outputs but may still be regarded as outgoing flows. These data elements should require little processing to be converted to physical outputs. * Identify these data elements by starting at the physical outputs from the system (where flows enter stores or external entities) and move backwards along the DFD until you identify a flow that will still requires transformational processing. 1.3) This partitioning of the DFD hierarchy should leave one or more processes in the middle of the DFDs that were not marked as afferent or efferent. These processes are called "central transforms." These processes do the main work of the system. STEP 2: Do first-level factoring 2.1) Specify a top-level (executive) module for the structure chart hierarchy that when activated will perform the entire task of the system by calling its subordinates. * The main module is the overall control or executive module for the design. 2.2) For each afferent data element flowing into a central transform, an afferent module is defined that is an immediate subordinate to the main module. This afferent module will pass its afferent data element to the main module. 2.3) For each efferent data element emerging from any central transform, define a subordinate efferent module that will accept the efferent data element from the main module and transform it into the physical output. 2.4) For each central transform module or functionally cohesive composition of central transforms, specify a subordinate transform module that accepts the appropriate input data from the main module and transforms it into the appropriate output data which is passed back to the main module. STEP 3: Factor the afferent, efferent, and transform branches Different strategies are used to factor the three types of subordinate modules. 3.1) AFFERENT FACTORING 3.1.1) For each afferent module, identify the transform (or computations) required to produce the output of the afferent module. This transform becomes a new transform module that is subordinate to the afferent module. 3.1.2) For each new transform module identified in step 3.1.1, identify a new afferent module that will pass data needed by the new transform module to the superordinate afferent module. (i.e new afferent passes to old afferent which passes to new transform.) 3.1.3) Each new afferent module identified in step 3.1.2 must also be factored by recursively performing steps 3.1.1 and steps 3.1.2. This recursion continues until the physical inputs are reached. 3.2) TRANSFORM FACTORING * There is no good procedure to be followed to factor the central transforms. * The goal is to identify sub-transforms, transform modules that are subordinate to the original transform module and will compose the overall transform. 3.3) EFFERENT FACTORING 3.3.1) For each efferent module, identify the transform (or computations) required to produce the output of the efferent module. This transform becomes a new transform module that is subordinate to the efferent module. 3.3.2) The output from each transform module identified in step 3.3.1 becomes input to a new efferent module that is also subordinate to the original top-level efferent module. 3.1.3) Each new efferent module identified in step 3.3.2 must also be factored by recursively performing steps 3.3.1 and steps 3.3.2. This recursion continues until the physical outputs are reached. * This completes the first-cut at the structure chart. It will need to be refined to reflect design considerations and trade-offs. * The design heuristics (module size, fan-in, fan-out, etc) and idea of cohesion and coupling can be used to evaluate and guide refinements to the design. METHOD: TRANSACTION ANALYSIS ---------------------------- Transaction analysis is another design strategy for deriving structure charts from dataflow diagrams. This strategy is useful when a DFD diagram contains processes that split an input flow into several discrete output flows. Transaction analysis can be helpful in PART of the design process. DEF: A TRANSACTION is defined (loosely) to be any element of data, control, signal, event, or change of state that causes, triggers, or initiates some action or sequence of actions. This strategy simply recognizes that data flow graphs of the form shown in the slide can be mapped into a particular modular structure. A transaction center (is a module) that must be able to: - get (obtain or respond to) transactions in raw form - analyze each transaction to determine its type - dispatch each type of transaction - complete the processing of each transaction type STEPS ----- STEP 1) Identify the sources of transactions. * These may be apparent from the inputs to the DFD, or the design may recognize afferent, transform, or efferent modules that generate transactions. (Detected during transform analysis.) STEP 2) Specify the appropriate transaction-centered organization. * Slide shown in class gave a good structure, but others are possible STEP 3) Identify the transactions and their defining actions. * define carefully the processing that must take place for each transaction STEP 4) Note potential situations in which modules can be combined. * Try to detect when an intermediate-level module can be created from a functionally cohesive group of low-level modules. * Perhaps a module can be called by different transactions. STEP 5) For each transaction, or cohesive collection of transactions, specify a transaction module to completely process it. * Avoid the temptation to group the processing of several transactions into one module. This should be avoided if the resulting module has low cohesion. STEP 6) For each action in a transaction, specify an action module subordinate to the appropriate transaction module(s). * This is a factoring step. STEP 7) For each detailed step in an action module, specify an appropriate detail module subordinate to any action module that needs it. * This continues the factoring step. Note several levels of detail modules are possible for large systems. Software Design Fundamentals DESIGN FUNDAMENTALS (SEE SLIDES) ------------------- - A set of fundamental software design concepts has evolved over time. - Each fundamental provides the designer with a foundation from which more sophisticated design methods can be applied. 1) ABSTRACTION - allows a designer to represent an object/component at varying levels of detail. + Highest level - a solution is stated in broad terms using the language of the problem environment, i.e. "problem oriented terms" are used + Middle level - "Procedural oriented terms" are added to the solution description. + Lower level - "Implementation oriented terms" are combined with procedural oriented terms to state the solution + Lowest level - The solution is stated in a manner that can be directly implemented. - Types of Abstraction + Procedural Abstraction : is a named sequence of instructions that has a specific and limited function. + Data Abstraction : is a named collection of data that describes a data object. - KEY IDEA: Hide Details 2) REFINEMENT - create successive representations of a object where each new representation is more detailed than the last. - a hierarchy is developed by decomposing a macroscopic statement of function (a procedural abstraction) in a stepwise fashion. (Decomposition of DFD processes is one example of refinement.) - with respect to design, the architecture of a program is developed by successively refining levels of procedural design. - the use of refinement is the key to creating levels of abstraction 3) MODULARITY - Software is divided into separately named and addressable components, called modules, that are integrated to satisfy problem requirements. - modularity is an important principle that allows a system to be intellectually manageable. - Let C(x) be a function that defines the perceived complexity of a problem. Let E(x) be a function the defines the effort (in time) required to solve a problem. Let p1 and p2 be two problems: IF C(p1) > C(p2) THEN E(p1) > E(p2) It has been shown by experimentation that: C(p1 + p2) > C(p1) + C(p2) E(p1 + p2) > E(p1) + E(p2) - Lesson : it is easier to solve a complex problem when you break it into manageable pieces. 4) SOFTWARE ARCHITECTURE - is composed of: + the hierarchical structure of procedural components (modules) ** may be control hierarchy or composition hierarchy (Pascal) + the structure of data - architecture is created by successive refinements that produce a modular structure - For a given set of requirements there are many different architectures that can satisfy those requirements. + Effect of using different design methods + Effect of the sequence of design decisions + We really don't know how to determine the "best" architecture 5) CONTROL HIERARCHY - represents the organization (often hierarchical) of program components (modules) and implies a hierarchy of control. ** Structure charts (module A calls module B) - created without regard to the sequence of processing and decisions. 6) DATA STRUCTURE - represents logical relationships among individual elements of data. - Data structures are selected/created during design work. + Choose data structures carefully. - Think about impact on rest of system design. - Think about algorithms that will work on that type of structure. - Choice of good data structure has important impact on overall design. 7) SOFTWARE PROCEDURE - focuses on the processing details of each module individually. - defines : sequences of events, exact decision points, repetitive operations, data organization/structure if necessary - Algorithms are often selected during detailed design work. + Must analyze the needs for an algorithm. - what are the characteristics of the data it will be used on. + Try to isolate the algorithm from other parts of the design. + If possible, select simplest algorithm to solve a problem. - Is it really necessary to have the most efficient algorithm? + Trade off: efficiency of algorithm vs understandability of code + Faster algorithms are sometimes more complicated, consuming programmer time and generally more costly to maintain. + However faster algorithms are not always too complicated. Versions whose time complexity varies only by a constant can still be important. 3 seconds versus 3 minutes for an interactive program. + Generally, the more tricks that go into making a design or implementation efficient the harder the code becomes to understand. - Is the algorithm fundamental to the functionality of the program or is it just triggered occasionally? How much effort is the algorithm worth? 8) INFORMATION HIDING - modules should be specified and designed so that information (both data and procedures) contained within a module are inaccessible to other modules that have no need for such information. - implies that effective modularity can be achieved by defining a set of independent modules that only exchange the information necessary to achieve software function. - use of information hiding creates modular systems that are easier to test and maintain. Software Design: Heuristics, Coupling/Cohesion DESIGN HEURISTICS ----------------- Here are some heuristics - tricks - which can help increase the modularity of a system. These serve as checks or indicators by which a structure chart can be examined for potential improvements. 1) MODULE SIZE * The larger the size of a module, i.e. the more lines of code it contains, the harder it becomes to understand. * Empirical evidence suggests that a module should not be larger than 30 to 100 lines of code. * Modules as a rule of thumb should not be smaller than 5 to 10 lines of code. * If the size goes outside some optimal range of 5 to 100 LOC the total system costs will rise. * There may also be a very good reason for a module to be very large or small. But generally a large module represents an incomplete breakdown into appropriate subordinate modules. 2) SPAN OF CONTROL (Fan-out) * The number of immediate subordinates of a module. * Very high or very low spans are possible indicates of poor design. Generally, be concerned if the span of control exceeds 10 or is simply 1 or 2. (Note: SC standard limits fan-out to 7) * A low span of control can be increased in some cases by breaking the few subordinate modules into additional subordinate modules. or by compressing the few subordinates into their superordinate. * A high span of control could be indicative of an over-zealous breakdown of a module into subordinates or just a failure to define intermediate levels. The solution is to introduce a few intermediate modules. * Fan-Out^2 is a good measure of design quality. A high value indicates a greater likelihood of errors in that part of design. 3) FAN-IN * Fan-in is GOOD. Fan-in means that a module is called by many other modules. * During structured design, as a new module is about to be added, stop and ask if there is already a module available that performs the required function? If so, draw an arrow to the existing module. 4) SCOPE OF EFFECT/SCOPE OF CONTROL * DEF : SCOPE OF EFFECT - of a decision is the collection of all modules containing any processing that is conditional on that decision. If the activation of an entire module is conditional on the outcome of the decision, that module's superordinate is also included in the scope of effect. - Setting an internal state flag as a decision effect. * DEF : SCOPE OF CONTROL - is the module itself and all of its subordinates (children, grandchildren, etc). * Here is an important design heuristic - For any given decision, the scope of effect should be a subset of the scope of control of the module in which the decision is located. - Decisions should not occur higher in the hierarchy than is necessary to place the scope of effect within the scope of control. COUPLING -------- DEF: Modular Independence - two modules are totally independent if each can function completely without the presence of the other. DEF: Coupling - a measure of the strength of interconnection between two modules. (The degree of interconnection.) Highly coupled modules - are joined by strong interconnections Loosely coupled modules - are joined by weak interconnections uncoupled or decoupled modules - have no interconnection KEY ISSUE : How much of one module must be known in order to understand another module. If two modules are highly coupled, then there is a high probability that a programmer trying to modify one of them will have to understand or make a change to the other module. We want loosely coupled modules. Why? - So we can understand one module without learning others. - Easier to debug - Easier to maintain - Easier to replace - Total systems cost are strongly influenced by the degree of coupling between modules. TYPES OF COUPLING ----------------- These are presented from lowest (desirable) to highest (undesirable), i.e data coupling is loose coupling and content coupling is a form of tight coupling. 1) Data Coupling : simple data passing - Two modules communicate by parameters, where each parameter is a single field or a homogeneous table. 2) Stamp Coupling : sharing of parts of a data structure - variation of data coupling - two modules refer to the same data structure, but use different parts of the structure. - create dependencies between otherwise unrelated modules. 3) Control Coupling : Communication via control variables - when state (control) variables are passed between modules to control the internal logic of the receiving module. - often occurs as passing a "flag" to a module. 4) External Coupling : tie to environment external to the system - when a module is tied to an environment external to the software being designed. - coupling to i/o devices, special i/o formats, communication protocols, windowing system, etc. - affects portability 5) Common Coupling : sharing of global data area - occurs when two modules refer to the same global data area. - the danger is that one module will use some data that has been erroneously produced by some other module a long time ago. Or unexpected sequences of set/use occur. 6) Content Coupling : access to info private to a module, branch into middle of a module. - occurs when one module refers to data that is private to another module, or one module transfers execution control to a point inside of a second module rather than to the top of the module. COHESION -------- KEY ISSUE : To what degree are parts of a module related functionality? DEF: Cohesion - a measure of how tightly bound or related are the internal elements of one module. Cohesion may be thought of as the cement that holds the processing elements of a module together. A high degree of module cohesion is an indication of close approximation of inherent problem structure. The greater the cohesion of individual modules in a system, the lower the coupling between modules. We want modules with high cohesion. Why? - a highly cohesive module performs a single task within a system requiring little interaction with other modules. - Reduces coupling TYPES OF COHESION ------------------ - Cohesion can be put into practice by using an "associative principle". - When deciding to put certain processing steps into a module, ask yourself whether there are certain properties that relate the new steps with steps already in the module? - There are seven levels of cohesion. These are listed below from least (i.e weakest, undesirable) to greatest (i.e strongest, desirable) cohesion strength. 1) Coincidental association - little or no constructive relationship among the elements of a module. - can occur when several statements that appear together in several places are put into a module and then called from the various places they used to be used in. (Save code space.) - The problem is that each use of the module may not have the same meaning in terms of the application domain. 2) Logical association - occurs when several similar, but slightly different functions are combined into one general module, rather than several specialized modules. - Example, all functions related to "validating" data are put into a module that validates ALL inputs regardless of their source, type, or use. - This form of cohesion represents some minimal problem oriented basis for associating the functionality in a module. - A logically associated module does not perform a single function. 3) Temporal association - occurs when a module performs a set of functions that are related in time. - All functional elements in a module occur within the same limited period of time, thus are put into a module that executes them all at once. - Example, all initialization or termination steps. - Temporal association is stronger than logical because the time- relatedness more closely associates the process steps. 4) Procedural association - occurs when the processing steps in a module must be executed in a specific order. - There may be several processing steps that are unrelated except for the control-flow that binds them together. - procedural modules tend to be fairly strongly coupled and are somewhat clumsy to use as independent entities. (i.e reuse) - the point is that procedural cohesion cuts across functional lines. A module may contain only part of a complete function or one or more functions plus parts of other. 5) Communicational association - occurs when process steps that operate on common data are grouped together. The steps operate either on the same input data or produce the same output data. - This is the lowest level of cohesion at which we encounter a relationship among processing steps that is intrinsically problem-dependent (related to the application domain). - Example, a module to print results to a file, a module that accepts data from several sources, and transforms it into a report line. 6) Sequential association - occurs when the output data from one process step serves as input to the next processing step. - A sequential module may not perform a complete function, or may contains more than one function. 7) Functional association - occurs when every processing step within a module contributes to performing a single function and the function is completely performed by the module. - Functionally cohesive modules an usually be described by a single phrase. - This is the highest level of cohesion and the most desirable. *Note in practice it is not necessary to determine the precise level of cohesion for each module. Rather it is important to strive for high cohesion and to recognize low cohesion during design.