Tuesday, February 9, 2016

Dimensions of Programming

The following information is protected by USA and international law. Any use of this publication or any part of it may be allowed on request. Consultation is available: leave your email address as a comment.

 
This post describes a tendency that has appeared in programming over recent years, especially in ETL programming. The analysis was mostly prepared in 2011 and is finalized now because the tendency toward multidimensional programming has not disappeared; it has even become dominant, sometimes turning into truly dispersed programming, especially in user interface implementation. Most of the narrative below is unchanged, because the tendency remains strong even though new versions of various tools and applications have been released with some changes in details.
 

History


1.    Böhm and Jacopini showed in 1966 that any program can be built from 3 basic constructions: a sequence of commands, if-logic, and a loop. They considered a program as a sequence of steps executed one by one, that is, in 1 dimension. Later, Dahl, Dijkstra and Hoare developed this approach and built the “without go-to” style of programming, the so-called “structured programming”. In this way, the “spaghetti” style of programming, where the logic of a program was not even multidimensional but simply scattered, was eliminated. From that moment, computer programming turned from logical knitting into an industrial technology.
2.    Later, program execution became parallel, and the view of a program became 2-dimensional. To make such parallel processes easier to track, they were synchronized. In this way, the relations and dependencies between parallel processes became observable and readable, because we can see them on a flat surface.
3.    With the arrival of multi-featured components that have a variety of tunable abilities to transform data (for example, ETL transformations), the details (variations) of each component (transformation) became hidden behind the scenes, in the 3rd dimension. Such 3-dimensional programming (which later became multidimensional) made tracking of data processing more complicated and, hence, slower.

Example 

To analyze how program parts are spread across 3 dimensions, let’s take as an example the very popular ETL tool Informatica®. I chose it because it is a popular and powerful tool with which I worked for a long time. It offers many advantages and has numerous achievements in ETL processing. To show how it could be made better and more efficient, let’s consider it from the perspective of 3-dimensional programming.
 

It has different types of objects, each of which has different types of links to the others:
  • Mappings and Mapplets 
  • Sessions  
  • Workflows and Worklets 
  • Parameter files
  • Parameters and variables with or without persistent values
  • External objects (for example, files and database tables)
  • External operating system and database processes

 
Each of them has several levels of sub-objects in depth, spread over several squares (one inside another) or even over several other objects. Sometimes the entry to a lower-level description is attached like a patch (for example, the Set File Properties point in the Mapping tab of the session Task Editor). As a result, tracking a particular process and keeping the whole big picture in mind interfere with each other. This resembles the noodle-style programming that existed before structured programming came to IT, when programmers crawled through a program trying to grasp how it worked. The effort developers spend working with such an ETL program is much greater than it would be if the program were presented compactly, without numerous jumps back and forth from one window to another.
 
To reduce these difficulties, Informatica added some extra options (for example, tracing Link Path Forward and Backward through fields in a mapping; gathering all Connections on one plate as a separate item of the session Mapping tab; providing the Compare… and Dependencies… reports for Workflows). The list of ways to mitigate the issues of 3-dimensional programming could be extended. These options help a lot, but they are not organically embedded into the product and do not give a complete picture of a particular ETL program.
We can see similar 3-dimensional programming issues in other ETL tools, and not only in ETL but also in other types of tools and in homemade applications. It is especially notable in Enterprise Architecture, where data processing runs in a heterogeneous environment.


How to get rid of multidimensional programming?

To get rid of a multidimensional program, it is necessary to transform it into a 1-dimensional or at least a 2-dimensional view. For this purpose, it is first necessary to define metrics for how well a particular program can be observed, in other words, how “flat” it is.
 
To assess the level of flatness, we will call a “plate” any description shown on one surface that we can observe visually at once (for example, program text presented on a computer monitor or on a sheet of paper). The smallest object on a plate will be called an “item” (for example, a radio button or a check-box).
 
Flatness is highest when all items are on one plate (all items are visible) and lowest when each item is on a separate plate (only one item is visible at a time).
 
Based on this terminology, we propose the following symbols to characterize how easily a program can be verified:
 
            P – number of plates on which the program is located;
 
            i – number of items on a plate of the program;
 
            I – total number of items on all plates of the program;
 
            L – number of levels of the program in depth;
            V – level of the program visibility;
            D – dispersion of items on the program plates.

 
Using these symbols, we can calculate the indexes of flatness and visibility:
 
Flatness of a program (of all plates together):

 
            F = 1 / P,

 
which shows that the program is flat when P = 1 (the whole program is on 1 plate) and, as a result, F = 1.
 
Average flatness of one plate:

 
            f = i / I,

 
which shows that the program is flat when i = I (all items are on 1 plate) and, as a result, f = 1.

 
Level of program visibility / readability:

 
            V = 1 / L,

 
which shows that the program is flat, and therefore fully visible, when L = 1 (only 1 level of the program, that is, only 1 plate on which all items are placed) and, as a result, V = 1.
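The three indexes can be computed mechanically once the plates and items of a program are counted. The following is only a minimal sketch in Python, not part of any tool: the function name `flatness_metrics` and the sample numbers are hypothetical, and it assumes that every item sits on exactly one plate, so that I is the sum of all i.

```python
from fractions import Fraction


def flatness_metrics(items_per_plate, levels):
    """Return (F, f_per_plate, V) for one program.

    items_per_plate: list holding i, the number of items on each plate.
    levels: L, the number of levels of the program in depth.
    Assumption: each item belongs to exactly one plate, so I = sum of all i.
    """
    if not items_per_plate or levels < 1:
        raise ValueError("need at least one plate and one level")
    P = len(items_per_plate)                        # number of plates
    I = sum(items_per_plate)                        # total number of items
    if I == 0:
        raise ValueError("need at least one item")
    F = Fraction(1, P)                              # F = 1 / P
    f = [Fraction(i, I) for i in items_per_plate]   # f = i / I for each plate
    V = Fraction(1, levels)                         # V = 1 / L
    return F, f, V


# Hypothetical program: 3 plates with 10, 5 and 5 items, 2 levels in depth.
F, f, V = flatness_metrics([10, 5, 5], levels=2)
print("F =", F)
print("f =", [str(x) for x in f])
print("V =", V)
```

Exact fractions are used instead of floating-point numbers so that the results read the same way as the formulas, for example 1/3 rather than 0.333.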

 
Using these indexes, it is possible not only to measure the visibility of a particular program but also to compare the visibility of different programs. This means we can compare tools and applications to find which of them allow building flatter (that is, more visible) programs, and in this way find which tool allows building programs that are easier to verify. In other words, with the listed indexes it is possible to evaluate which tool allows building programs that are more efficient to test, support in production and enhance.

 
The measure of dispersion can be calculated as:

 
            D = P*L,

 
which means: the more plates and the more levels a program has, the more its code is dispersed and the more difficult the program code is to read and verify. Obviously, the most efficient program has D = 1 (one plate and one level of program).

 
Example of calculation:
The Informatica session description (Edit Task) has 6 tabs (P = 6) and up to 3 levels (L = 3) on some tabs.
Hence, its flatness index is F = 1/6, its level of visibility is V = 1/3, and its dispersion is D = 6 x 3 = 18.
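The same arithmetic can be checked with a few lines of Python. This is just an illustrative sketch: the variable names are chosen here for convenience, and the tabs of the Edit Task dialog are treated as plates, as in the example.

```python
from fractions import Fraction

P = 6   # tabs of the session Edit Task dialog, treated as plates
L = 3   # deepest level found on some tabs

F = Fraction(1, P)   # flatness index, F = 1 / P
V = Fraction(1, L)   # level of visibility, V = 1 / L
D = P * L            # dispersion, D = P * L

print(f"F = {F}, V = {V}, D = {D}")   # prints "F = 1/6, V = 1/3, D = 18"
```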
 
Certainly, other metrics can be used, but the most important point is that we can evaluate in numbers how convenient a tool is to use.
 
Based on these metrics, we can compare different tools and see which of them saves more time on program development and, especially, on enhancement.
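Such a comparison could look roughly like the sketch below. It is only an illustration under stated assumptions: the class `ToolProfile` and the function `rank_by_dispersion` are hypothetical names, “Hypothetical tool B” and its numbers are invented for the example, and only the Informatica session figures (P = 6, L = 3) come from the calculation above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolProfile:
    """Plate and level counts for one program description in some tool."""
    name: str
    plates: int   # P
    levels: int   # L

    @property
    def flatness(self) -> float:
        return 1 / self.plates            # F = 1 / P

    @property
    def visibility(self) -> float:
        return 1 / self.levels            # V = 1 / L

    @property
    def dispersion(self) -> int:
        return self.plates * self.levels  # D = P * L


def rank_by_dispersion(tools):
    """Sort from the least to the most dispersed (D = 1 is the ideal)."""
    return sorted(tools, key=lambda t: t.dispersion)


tools = [
    ToolProfile("Informatica session (Edit Task)", plates=6, levels=3),
    ToolProfile("Hypothetical tool B", plates=2, levels=2),  # invented numbers
]

for t in rank_by_dispersion(tools):
    print(f"{t.name}: F={t.flatness:.2f} V={t.visibility:.2f} D={t.dispersion}")
```

Sorting by D alone is one possible choice; F or V could serve as the sort key in the same way if flatness or visibility matters more for a particular comparison.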


Summary 

Multidimensional programming reduces the productivity of programmers during development and testing, and especially during software support and enhancement. Despite that, the use of multidimensional programming is growing. This happens because software producers do not always pay attention to the readability of programs and to the ability of their tools to produce the most readable programs. Users of such software do not demand a better level of readability, first of all because there is no tradition of demanding readable software or a tool that allows building such software.
 
One way to eliminate multidimensional programming is to measure the level of readability of software. Such a measurement, as a set of metrics, makes it possible to compare the level of readability of different software and see how efficient each is for programming. Paying attention to the readability of software and to the ability of a tool to create readable software will help to increase the productivity of programming and, especially, of software support and enhancement. It will help to reduce IT expenses, which is important for IT users.