Threesology Research Journal

Examples of "Threes"-oriented Web Pages
page 19

~ The Study of Threes ~
http://threesology.org

The following are references culled from other websites that concern the number 3 or have "three" as a focus, though other labeling may be used. Please give all respective authors their due credit. Links to their websites are provided following each section. However, many of the links may no longer be viable, since some of the information was compiled in 2004 or earlier, unless otherwise noted.

Three Schema Approach

The notion of a three-schema model was first introduced in 1975 by the ANSI-SPARC Architecture, which defined three levels at which to model data.

--- ANSI/X3/SPARC three level architecture ---
http://en.wikipedia.org/wiki/ANSI-SPARC_Architecture

The three-schema approach, or the Three Schema Concept, in software engineering is an approach to building information systems and systems information management from the 1970s. It proposes to use three different views in systems development, in which conceptual modelling is considered to be the key to achieving data integration. ^[2]

Overview

The three-schema approach offers three types of schemas with schema techniques based on formal language descriptions:^[3]

External schema, which defines user views.
Conceptual schema, which integrates the external schemata.
Internal schema, which defines physical storage structures.

At the center, the conceptual schema defines the ontology of the concepts as the users think of them and talk about them. The physical schema according to Sowa (2004) "describes the internal formats of the data stored in the database, and the external schema defines the view of the data presented to the application programs". ^[4] The framework attempted to permit multiple data models to be used for external schemata.^[5]
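As a rough illustration of the three levels, the following Python sketch uses the standard sqlite3 module. The table, index and view names (employee, idx_employee_department, staff_directory, payroll_summary) and the sample rows are invented for this example and do not come from the ANSI/SPARC report. Mapping the conceptual schema to a base table, the internal schema to an index, and the external schemata to views is one common reading of the approach in a relational setting, not the only one.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

conn.executescript("""
    -- Conceptual schema (assumed): one integrated definition of the data.
    CREATE TABLE employee (
        id         INTEGER PRIMARY KEY,
        name       TEXT    NOT NULL,
        department TEXT    NOT NULL,
        salary     INTEGER NOT NULL
    );

    -- Internal schema (assumed): a storage/access-path decision that can
    -- change without touching the definitions above or the views below.
    CREATE INDEX idx_employee_department ON employee (department);

    -- External schemata (assumed): two user views over the same data.
    CREATE VIEW staff_directory AS
        SELECT name, department FROM employee;

    CREATE VIEW payroll_summary AS
        SELECT department, SUM(salary) AS total_salary
        FROM employee
        GROUP BY department;
""")

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO employee (name, department, salary) VALUES (?, ?, ?)",
    [("Ada", "Engineering", 120), ("Grace", "Engineering", 130), ("Alan", "Research", 110)],
)

# Each user group queries only its own external view.
directory = conn.execute(
    "SELECT name, department FROM staff_directory ORDER BY name"
).fetchall()
payroll = conn.execute(
    "SELECT department, total_salary FROM payroll_summary ORDER BY department"
).fetchall()

print(directory)
print(payroll)
```

In this sketch the index could be dropped or replaced without changing either view, and a new view could be added without changing the table, which is the kind of independence between levels that the approach aims for.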

Over the years, the skill and interest in building information systems have grown tremendously. However, for the most part, the traditional approach to building systems has only focused on defining data from two distinct views, the "user view" and the "computer view". From the user view, which will be referred to as the "external schema," the definition of data is in the context of reports and screens designed to aid individuals in doing their specific jobs. The required structure of data from a usage view changes with the business environment and the individual preferences of the user. From the computer view, which will be referred to as the "internal schema," data is defined in terms of file structures for storage and retrieval. The required structure of data for computer storage depends upon the specific computer technology employed and the need for efficient processing of data.^[6]

Figure: Traditional View of Data^[6]

Figure: Three Schema Approach^[6]

These two traditional views of data have been defined by analysts over the years on an application-by-application basis as specific business needs were addressed (see figure). Typically, the internal schema defined for an initial application cannot be readily used for subsequent applications, resulting in the creation of redundant and often inconsistent definitions of the same data. In early information systems, data was defined by the layout of physical records and processed sequentially. The need for flexibility, however, led to the introduction of:

--- Database Management Systems (DBMSs) ---
http://en.wikipedia.org/wiki/Database_Management_System

...which allow for random access of logically connected pieces of data. The logical data structures within a DBMS are typically defined as either hierarchies, networks or relations. Although DBMSs have greatly improved the shareability of data, the use of a DBMS alone does not guarantee a consistent definition of data. Furthermore, most large companies have had to develop multiple databases, which are often under the control of different DBMSs and still have the problems of redundancy and inconsistency.^[6]

The recognition of this problem led the ANSI/X3/SPARC Study Group on Database Management Systems to conclude that in an ideal data management environment a third view of data is needed. This view, referred to as a “conceptual schema,” is a single integrated definition of the data within an enterprise which is unbiased toward any single application of data and is independent of how the data is physically stored or accessed (see figure). The primary objective of this conceptual schema is to provide a consistent definition of the meanings and interrelationships of data which can be used to integrate, share, and manage the integrity of data.^[6]

History

Image of the six layers in the Zachman Framework
http://en.wikipedia.org/wiki/Zachman_Framework

The notion of a three-schema model consisting of a conceptual model, an external model, and an internal or physical model was first introduced by the ANSI/X3/SPARC Standards Planning and Requirements Committee directed by Charles Bachman in 1975. The ANSI/X3/SPARC Report characterized DBMSs as having a two-schema organization. That is, DBMSs utilize an internal schema, which represents the structure of the data as viewed by the DBMS, and an external schema, which represents various structures of the data as viewed by the end user. The concept of a third (conceptual) schema was introduced in the report. The conceptual schema represents the basic underlying structure of data as viewed by the enterprise as a whole.^[2]

The ANSI/SPARC report was intended as a basis for interoperable computer systems. All database vendors adopted the three-schema terminology, but they implemented it in incompatible ways. Over the next twenty years, various groups attempted to define standards for the conceptual schema and its mappings to databases and programming languages. Unfortunately, none of the vendors had a strong incentive to make their formats compatible with their competitors'. A few reports were produced, but no standards.^[4]

As the practice of data administration has matured and more graphical techniques have emerged, the term "schema" has given way to the term "model". The conceptual model represents the view of data that is negotiated between end users and database administrators, covering those entities about which it is important to keep data, the meaning of the data, and the relationships of the data to each other.^[2]

One further development is the IDEF1X information modeling methodology, which is based on the three-schema concept. Another is the Zachman Framework, proposed by John Zachman in 1987 and developed ever since in the field of Enterprise Architecture. In this framework, the three-schema model has evolved into a layer of six perspectives. In other Enterprise Architecture frameworks some kind of view model is incorporated.

References

This article incorporates public domain material from websites or documents of the National Institute of Standards and Technology.

[1] Matthew West and Julian Fowler (1999). Developing High Quality Data Models. The European Process Industries STEP Technical Liaison Executive (EPISTLE).
[2] STRAP SECTION 2 APPROACH. Retrieved 30 September 2008.
[3] Mary E.S. Loomis (1987). The Database Book. p. 26.
[4] John F. Sowa (2004). "The Challenge of Knowledge Soup". Published in: Research Trends in Science, Technology and Mathematics Education. Edited by J. Ramadas & S. Chunawala, Homi Bhabha Centre, Mumbai, 2006.
[5] Gad Ariav & James Clifford (1986). New Directions for Database Systems: Revised Versions of the Papers. New York University Graduate School of Business Administration, Center for Research on Information Systems.
[6] itl.nist.gov (1993). Integration Definition for Information Modeling (IDEF1X). 21 Dec 1993.

The Best New BI Invention You’ve Never Heard Of
The Triadic Continuum
John Zuchero

Would you pay good money for a data structure that made gathering business intelligence (BI) data quicker, cheaper and easier? While this may sound like a no-brainer, some computer scientists at Unisys Corporation have been having a difficult time getting the BI community to learn about the positive results of their newest discovery.

Over the past few years, computer software engineer and theoretical mathematician Jane Mazzagatti has developed and patented a new data structure that she and her colleagues have shown can not only find simple relationships among data but also discover more complex, harder-to-find relationships in vast real-time data streams. And by real-time data they mean real time - answers to queries may change as more and more data are introduced into the structure - similar to how we are able to change our perceptions and decisions based on the introduction of new facts.

Mazzagatti calls this new data structure the Triadic Continuum, in honor of the theories and writings of Charles Sanders Peirce, one of the least well-known scientific geniuses of the late 19th century. Peirce, who is recognized as the father of pragmatism, is also known for his work in semiotics, the study of thought signs. Using Peirce’s theoretical writings on how thought signs are organized into the structure of the human brain, Mazzagatti extrapolated a computer data structure that is self-organizing - in other words, a data structure that naturally organizes new data by either building on the existing data sequences or adding to the structure as new data are introduced.

She and her colleagues began their quest for a new data structure because of the perceived limitations of databases and data cubes. While both technologies have proved their usefulness in gathering, storing and querying large amounts of business data, there are issues associated with modifying, updating and adding information into an existing structure. For example, one of the main problems with data cubes is that they are time-consuming to design and program, and the queries are limited to the exact data in the cube at the time it is created. Therefore, every time the data in the cube changes, the cube must be recreated, which is especially bothersome if the data is transactional data that changes constantly. Say, for example, a nationwide building supply company uses database cubes to identify potential trends in their business, and it takes weeks to create a cube. The data in the cube is weeks old before it is ready to query. A time lag of weeks, if not many months in some cases, means that the data is outdated before the first query can be asked. Consequently, this limitation virtually eliminates the ability to perform queries in real or near-real time. In Mazzagatti’s Triadic Continuum there is no need to recreate the structure; the structure changes naturally as new data is added or changed or old data is deleted. With new information, the structure continuously reorganizes without the need of additional programmer help.
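The article does not describe how the Triadic Continuum is implemented, so the following Python sketch is only an analogy for the behavior described above: a structure that extends existing sequences or adds new ones as data arrive, so that query answers change without a rebuild. It uses an ordinary counting prefix tree, which is a swapped-in, well-known technique and not the patented structure. The class name CountingTree, its methods, and the sample sales records are all hypothetical.

```python
class CountingTree:
    """Prefix tree that counts how many learned records pass through each node."""

    def __init__(self):
        self.count = 0
        self.children = {}

    def learn(self, record):
        """Add one record (a sequence of field values), reusing existing paths."""
        node = self
        node.count += 1
        for value in record:
            # Build on an existing branch if one exists, otherwise add a new one.
            node = node.children.setdefault(value, CountingTree())
            node.count += 1

    def count_prefix(self, prefix):
        """Return how many learned records start with the given field values."""
        node = self
        for value in prefix:
            node = node.children.get(value)
            if node is None:
                return 0
        return node.count


# Hypothetical transactional records: (state, product).
store = CountingTree()
store.learn(("PA", "lumber"))
store.learn(("PA", "nails"))
before = store.count_prefix(("PA", "lumber"))

# A new transaction arrives; the same query now gives a different answer,
# with no separate rebuild step.
store.learn(("PA", "lumber"))
after = store.count_prefix(("PA", "lumber"))

print(before, after)
```

The point of the sketch is only the contrast with a precomputed cube: here each new record updates the structure in place, so the next query already reflects it.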

In the BI world, this means that the traditional approach of assembling data in one place, generating cubes or OLAP queries, and turning information into knowledge by recognizing patterns, is shortened dramatically. Mazzagatti and her colleagues believe that the time it takes to design and develop a BI solution, generally from identifying an information need through designing the schema and the cube to mining it, can be reduced by as much as 75 percent. This is possible because there is no need to create a schema and a cube, and the process of extracting, transforming and loading data is simplified. This all leads to the ability to create usable knowledge faster and more cheaply. It also moves BI from just a strategic endeavor to one that can be used tactically, since there is no longer a significant time lag between knowing what information you need and getting it.

So what is this structure, and why is it so powerful? And why haven’t you heard of it already? Some of these three questions are easy to answer, and others are difficult. Let’s start with what you might assume is the most difficult question but which is actually the easiest: what is this structure?

The conceptual model of the structure of the Triadic Continuum is quite simple. Mazzagatti and colleagues use the phrase “simple and elegant” in explaining how it is organized. Briefly, the structure is composed of a continuous tree-like arrangement of units, called “triads.” In a traditional tree-like structure, one often sees nodes that are connected to one another by branches or paths. The triads that make up the Triadic Continuum can be visualized as three nodes arranged in a somewhat triangular formation. Node one is connected to node two by a bidirectional pointer, and node two is connected to node three by another bidirectional pointer. The pointers identify to where and from where a node is connected - thus allowing all nodes to always know their relation within the continuum of branches through only two pointers. And, theoretically, each individual particle of data occurs only once within the structure, and because of the organization of the bidirectional pointers the relationship of one datum to another is always known. While this may seem powerful, it’s not the only thing that makes this structure so important.
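The description above gives few implementation details, so the following Python sketch is only one possible reading of a single triad: three nodes joined by two bidirectional pointers, so that each node knows both where it leads and where it was reached from. The names Node, link, make_triad, labels_forward and labels_backward, and the sample labels, are hypothetical, and nothing here should be taken as the patented design.

```python
class Node:
    """One node of a triad, with a pointer in each direction."""

    def __init__(self, label):
        self.label = label
        self.came_from = None  # the node this one is reached from
        self.goes_to = None    # the node this one leads to


def link(a, b):
    """Join two nodes with one bidirectional pointer."""
    a.goes_to = b
    b.came_from = a


def make_triad(first, second, third):
    """Build three nodes joined by two bidirectional pointers."""
    n1, n2, n3 = Node(first), Node(second), Node(third)
    link(n1, n2)
    link(n2, n3)
    return n1, n2, n3


def labels_forward(node):
    """Follow the 'goes_to' pointers and collect the labels along the way."""
    labels = []
    while node is not None:
        labels.append(node.label)
        node = node.goes_to
    return labels


def labels_backward(node):
    """Follow the 'came_from' pointers and collect the labels along the way."""
    labels = []
    while node is not None:
        labels.append(node.label)
        node = node.came_from
    return labels


one, two, three = make_triad("C", "A", "T")
forward = labels_forward(one)
backward = labels_backward(three)

print(forward)
print(backward)
```

Because every link is recorded in both directions, the relation of any node to its neighbors can be recovered starting from either end, which is the property the paragraph above attributes to the bidirectional pointers.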

At the core of BI is data mining and data modeling, which are both interested in uncovering the knowledge within data and information. However, it’s increasingly clear that both of these disciplines are having a difficult time doing this. In traditional data structures, data is entered in a fixed format (tables, lists or trees) so that the data can be easily, reliably and consistently found. And, in a very real sense, data and information are discovered; programmers must write programs to enable users to query the data and write other specialized programs to search the distinct data structure in a prescribed way, searching for bits of data and particles of information until eventually something which matches the query criteria is found.

However, and this is significant, in the Triadic Continuum, data are learned into a structure whose format and organization systematically build a record or recording of the associations and relations between the particles of data. Besides that, the physical structure of the Triadic Continuum shapes the methods to obtain information and knowledge from the structure. So, instead of data and information being “found,” “analyzed” or “discovered,” it is already there waiting to be “realized.” About this incredibly unique aspect of the Triadic Continuum, Mazzagatti often says, “It’s all in the structure.” By this she means that the format and organization of the Triadic Continuum not only hold the representation of the data, but also the associations and relations between the data and the methods to obtain information and knowledge.

And while traditional databases deal mostly with finding data and information, the focus of the Triadic Continuum is on knowledge - the acquisition of useful and purposeful knowledge.

The third question is actually the most difficult to answer: why have you never heard of this invention before?

Interestingly enough, the idea to develop a new data structure started on a Pacific island during World War II when a young serviceman by the name of Eugene “Gene” Pendergraft was introduced to the writings of Charles S. Peirce. From Peirce’s writings, Pendergraft became intellectually stimulated by the notion that thinking, reasoning and learning are based on biological structures that function through a series of physical operations.

After the war, Pendergraft went on to develop his own theories and in the early 1960s attempted to develop computerized language translation while co-directing a project at the University of Texas at Austin to use computers to translate German, Russian, Japanese and Chinese into English. Through this research Pendergraft theorized that machines, specifically computers, could be made to learn. However, computer memory at the time was too small to allow this. While he and his team were able to demonstrate a rudimentary form of mechanical translation, the project was halted when U.S. Air Force officials cancelled funding. With this setback, Pendergraft put his ideas of mechanized learning on hold until computer technology caught up to his prophetic thinking.

In the early 1990s, when Pendergraft thought the time was right for mechanized learning, he and a small group of programmers and entrepreneurs formed a company. They quickly realized that they needed to pursue a financial and technical relationship with a larger company. Of the five computer companies they contacted, only Unisys Corporation was interested.

Beginning in 1997, a team of engineers and scientists at Unisys Corporation began working on a prototype of Pendergraft’s mechanized learning software. When Pendergraft unexpectedly died, others took over his role. But, it was quickly discovered that no one understood Pendergraft’s interpretation of Peirce.

Into this came Jane Campbell Mazzagatti. With an extensive background in computer hardware and software, degrees in theoretical mathematics and educational psychology, a deep personal interest in cognitive development, an unrelenting quest for knowledge and a strong personality, Mazzagatti was the right person to judge the validity of Pendergraft’s interpretation of Peirce’s theories. After extensive study of Peirce’s writings, she realized that while Pendergraft had understood the import of Peirce’s writings, he had not correctly seen how Peirce’s triad might be implemented as a computer data structure.

By 1999, others began to agree with the conclusion that Pendergraft’s interpretation was flawed and project funding was halted. While the program may have ceased, on her own, Mazzagatti continued research into how Peirce’s sign theory could be adapted to create a logical structure composed of signs that could be used in computers. The structure that she finally conceived of and turned into an invention fits into the general computer category of data structures, devices for storing and locating information on computers.

Beginning in early 2000, Mazzagatti worked with another colleague to make her discoveries into a prototype that could be shown to others. This skunk-works project and its prototype were so successful that the management of Unisys R&D began funding a new program. Over the last few years, Mazzagatti’s prototype has been developed into a product, which is called the Knowledge Store (K-Store).

So again, why haven’t you heard about this product? Well, for a number of simple reasons. I believe that the number one reason is that this new technology is both revolutionary and evolutionary. By this I mean that in order to adopt it, a company must be willing to take a gigantic risk - not in the reliability of the technology - but in the change to process and infrastructure. This technology has the potential to be the next evolutionary step in databases, but it has been difficult to find those willing to transform their operations and infrastructure into the next stage of evolution without having seen someone else’s success.

The second and equally important reason can be explained in terms of the dynamics and background of Unisys Corporation. Unisys, which was formed by a merger of the Burroughs and Sperry corporations, has tried to transform itself from a hardware vendor to a services-led business. However, while this transformation occurs, many of the R&D staff and the majority of its old guard management still see the company as a hardware company; software is often misunderstood, and efforts to market it are often poorly organized and lacking in innovation and vision. This is especially true with K-Store. Since its inception, K-Store’s marketing and sales efforts have been stuck trying to find ways to brand this evolutionary product and to decide to whom to market it.

So, to help introduce this technology, Mazzagatti and I wrote a book called The Practical Peirce: An Introduction to the Triadic Continuum Implemented as a Computer Data Structure. As well, Mazzagatti has taken to the convention circuit to explain her theory at international data structure conferences. This grassroots effort is our attempt to shed light on the best new BI invention you’ve never heard of.

John Zuchero is a freelance writer, communications consultant and principal of Zuchero Communications. His background is in biology and instructional design, and he has experience in computer software training and documentation. He has written on topics in biology and computer science and has developed training in artificial intelligence, expert systems and computer software.

--- Information Management ---
http://www.information-management.com/specialreports/2007_44/10000157-1.html?

Latest Updated Posting: Saturday, 17-June-2007... 4:24 PM

Your Questions, Comments or Additional Information are welcomed:

Herb O. Buckland
herbobuckland@hotmail.com