The following information is protected by U.S. and international law. Use of this publication or any part of it may be permitted on request. Consultation is available: leave your email address in a comment.
This post describes a tendency in programming that has appeared over the last several years, especially in ETL programming. The analysis was mostly prepared in 2011 and is finalized now because the tendency toward multidimensional programming has not disappeared. It has even become dominant, sometimes turning into truly dispersed programming, especially in user interface implementation. Most of the narrative below is unchanged, because the tendency stays strong despite new versions of various tools and applications that differ only in details.
History
1. Böhm and Jacopini showed in 1966 that any program can be built from 3 basic constructions: a sequence of commands, if-logic, and a loop. They treated a program as a sequence of steps performed one by one, that is, in 1 dimension. Later, Dahl, Dijkstra and Hoare developed this approach into the go-to-free style of programming known as “structured programming”. In this way the “spaghetti” style, in which program logic was not even multidimensional but simply scattered, was eliminated. From that moment, computer programming turned from logical knitting into an industrial technology.
2. Later, program execution became parallel, and the view of a program became 2-dimensional. To keep such parallel processes trackable, they were synchronized. In this way the relations and dependencies between parallel processes became observable and readable, because we can see them on a flat surface.
3. With the appearance of multi-featured components that offer a variety of tunable ways to transform data (for example, ETL transformations), the details (variations) of each component (transformation) became hidden behind the scenes, in a 3rd dimension. Such 3-dimensional programming (which later became multidimensional) made the tracking of data processing more complicated and, hence, slower.
Example
To analyze how program parts spread across 3 dimensions, let’s take as an example the very popular ETL tool Informatica®. I chose it because it is a popular and powerful tool that I worked with for a long time. It offers many advantages and has numerous achievements in ETL processing. To show how it could be made more efficient, let’s consider it from the perspective of 3-dimensional programming.
It has different types of objects, each of which has different types of links to the others:
- Mappings and Mapplets
- Sessions
- Workflows and Worklets
- Parameter files
- Parameters and variables with or without persistent values
- External objects (for example, files and database tables)
- External operating system and database processes
Each of them has a few levels of sub-objects in depth, spread over a few nested boxes (one inside another) or even over several other objects. Sometimes the entry to a lower-level description is attached like a patch (for example, the Set File Properties point in the Mapping tab of the session Task Editor). As a result, tracking a particular process and keeping the whole big picture in mind interfere with each other. This resembles the noodle-style programming that existed before structured programming came to IT, when programmers crept through a program trying to grasp how it works. The effort developers spend on such an ETL program is much greater than it would be if the program were presented compactly, without numerous jumps back and forth from one window to another.
To reduce these difficulties, Informatica added some extra options (for example, tracing Link Path Forward and Backward through fields in a mapping; gathering all Connections on one plate as a separate item of the session Mapping tab; providing Compare… and Dependencies… reports for Workflows). The list of ways to reduce the 3-dimensional programming issues could be continued. These ways help a lot, but they are not organically embedded in the product and do not give a complete picture of a particular ETL program.
We can see similar 3-dimensional programming issues in other ETL tools, and not only in ETL but also in other types of tools and homemade applications. It is especially notable in Enterprise Architecture, where data processing runs in a heterogeneous environment.
How to get rid of multidimensional programming?
To get rid of a multidimensional program, it is necessary to transform it into a 1-dimensional or at least a 2-dimensional view. For this purpose, it is first necessary to calculate metrics of how well a particular program can be observed, in other words, how “flat” it is.
To assess the level of flatness, we will call a “plate” any set of descriptions shown on one surface that we can observe visually (for example, the program text presented on a computer monitor or on a sheet of paper). The smallest object on a plate will be called an “item” (for example, a radio button or a check-box).
Flatness is highest when all items are on one plate (all items are visible) and lowest when each item is on a separate plate (only one item is visible at a time).
Based on this terminology, we propose the following symbols to characterize how easily a program can be verified:
P – number of plates the program occupies;
i – number of items on one plate of the program;
I – total number of items on all plates of the program;
L – number of levels of the program in depth;
V – level of program visibility;
D – dispersion of items over the program plates.
Using these symbols, we can calculate the indexes of flatness and visibility:
Flatness of a program (of all plates together):
F = 1 / P,
which shows that the program is flat when P = 1 (the whole program is on 1 plate) and, as a result, F = 1.
Average flatness of one plate:
f = i / I,
which shows that the program is flat when i = I (all items are on 1 plate) and, as a result, f = 1.
Level of program visibility / readability:
V = 1 / L,
which shows that the program is flat if L = 1 (only 1 level of the program, that is, only 1 plate where all items are located).
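The three indexes above can be written as small functions. This is only a sketch: the function names and the sample item counts (10, 6 and 4 items on three plates, 2 levels) are assumptions made up for illustration and are not taken from any real tool.

```python
def flatness(plates: int) -> float:
    """F = 1 / P, where P is the number of plates."""
    return 1 / plates


def plate_flatness(items_on_plate: int, total_items: int) -> float:
    """f = i / I for a single plate."""
    return items_on_plate / total_items


def visibility(levels: int) -> float:
    """V = 1 / L, where L is the number of levels in depth."""
    return 1 / levels


# Made-up layout: three plates holding 10, 6 and 4 items, 2 levels deep.
items_per_plate = [10, 6, 4]
total = sum(items_per_plate)  # I = 20

F = flatness(len(items_per_plate))
f_values = [plate_flatness(i, total) for i in items_per_plate]
V = visibility(2)

print(F, f_values, V)
```

A program with a single plate and a single level gives F = f = V = 1, which matches the "flat" case described above.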
Using these indexes, it is possible not only to measure the visibility of a particular program but also to compare the visibility of different programs. This means it is possible to compare tools and applications to find which of them allow building flatter (that is, more visible) programs, and in this way to find which tool allows building programs that are easier to verify. In turn, the listed indexes make it possible to evaluate which tool allows building programs that are more efficient to test, support in production, and enhance.
The measure of dispersion can be calculated as:
D = P*L,
which means: the more plates and levels a program has, the more its code is dispersed and the more difficult it is to read and verify. Obviously, the most efficient program has D = 1 (one plate and one level).
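To see how quickly dispersion grows, the formula D = P*L can be tabulated for a few small values. The function name and the choice of a 3-by-3 grid are assumptions for illustration only.

```python
def dispersion(plates: int, levels: int) -> int:
    """D = P * L; a program has at least one plate and one level."""
    if plates < 1 or levels < 1:
        raise ValueError("plates and levels must both be at least 1")
    return plates * levels


# Rows correspond to P = 1..3, columns to L = 1..3.
grid = [[dispersion(p, l) for l in range(1, 4)] for p in range(1, 4)]
for row in grid:
    print(row)
```

Only the top-left cell (P = 1, L = 1) reaches the ideal D = 1; every extra plate or level multiplies the dispersion.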
Example of calculation:
The Informatica session description (Edit Task) has 6 tabs (P = 6) and up to 3 levels (L = 3) on some tabs.
Hence, its flatness index is F = 1/6, its level of visibility is V = 1/3, and its dispersion is D = 6 × 3 = 18.
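The same calculation can be reproduced with exact fractions. The values P = 6 and L = 3 are the ones given above for the session Edit Task dialog; the variable names are just a convenient choice for this sketch.

```python
from fractions import Fraction

P = 6  # tabs in the session description (Edit Task), as stated in the post
L = 3  # maximum number of levels on some tabs, as stated in the post

F = Fraction(1, P)  # flatness index, F = 1 / P
V = Fraction(1, L)  # level of visibility, V = 1 / L
D = P * L           # dispersion, D = P * L

print(f"F = {F}, V = {V}, D = {D}")  # F = 1/6, V = 1/3, D = 18
```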
Certainly, other metrics can be used, but the most important point is that we can evaluate in numbers how convenient a tool is to use.
Based on these metrics, we can compare different tools and see which of them saves more time on program development and, especially, on enhancement.
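Such a comparison could be sketched as a simple ranking by dispersion. Only the Informatica session figures (P = 6, L = 3) come from this post. "Hypothetical tool A" and "Hypothetical tool B" and their numbers are invented purely to show the mechanics and do not describe any real product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolMetrics:
    name: str
    plates: int  # P
    levels: int  # L

    @property
    def dispersion(self) -> int:
        # D = P * L
        return self.plates * self.levels


tools = [
    ToolMetrics("Informatica session (Edit Task)", plates=6, levels=3),  # from the post
    ToolMetrics("Hypothetical tool A", plates=2, levels=2),              # invented numbers
    ToolMetrics("Hypothetical tool B", plates=4, levels=1),              # invented numbers
]

# Lower dispersion means the program is easier to read and verify.
ranked = sorted(tools, key=lambda t: t.dispersion)
for tool in ranked:
    print(f"{tool.name}: D = {tool.dispersion}")
```

Ranking by D alone is a design choice for brevity; F and V could be added as secondary keys when two tools have the same dispersion, as the two invented tools do here.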
Summary
Multidimensional programming lowers the productivity of programmers during development, testing, and especially software support and enhancement. Despite that, the use of multidimensional programming is growing. This happens because software producers do not always pay attention to the readability of programs and to the ability of their tools to produce readable programs. Users of such software do not demand better readability, mainly because there is no tradition of demanding readable software or tools that allow building it.
One way to eliminate multidimensional programming is to measure the readability of software. Measurement with a set of metrics makes it possible to compare the readability of different software and to see how efficient each is for programming. Paying attention to the readability of software and to the ability of a tool to create readable software will help increase the productivity of programming and, especially, of software support and enhancement. This will help reduce IT expenses, which is important for IT users.