 


Intelligence Science Lab
- Zhongzhi Shi
 
Intelligent Decision Support System – IDSS
 

1. Introduction

2. Data-Driven DSS

3. Model-Driven DSS

4. Knowledge-Driven DSS

5. Web-Based DSS

6. Simulation-Based DSS

7. GIS-Based DSS

8. Communication-Driven DSS

9. Application Example

 

1. Introduction

Information Systems researchers and technologists have built and investigated Decision Support Systems (DSS) for more than 35 years. This page outlines the developments in DSS, beginning with the building of model-oriented DSS in the late 1960s, theory development in the 1970s, and the implementation of financial planning systems and Group DSS in the early and mid-1980s. During the mid-1980s we proposed and implemented Intelligent DSS by combining knowledge systems with DSS. The page then covers the origins of Executive Information Systems, OLAP and Business Intelligence. In the mid-1990s the implementation of Web-based DSS became an active topic and had wide influence.


2. Data-Driven DSS

       Data-driven DSS is a type of DSS that emphasizes access to and manipulation of a time-series of internal company data and sometimes external data. Simple file systems accessed by query and retrieval tools provide the most elementary level of functionality. Data warehouse systems that allow the manipulation of data by computerized tools tailored to a specific task and setting or by more general tools and operators provide additional functionality. Data-driven DSS with On-line Analytical Processing (OLAP) provides the highest level of functionality and decision support that is linked to analysis of large collections of historical data. Executive Information Systems (EIS) and Geographic Information Systems (GIS) are special purpose Data-Driven DSS.

       Metadata is data about data: it describes the content, quality, condition, and other characteristics of data. It plays an important role not only in the design, implementation and maintenance of the data warehouse, but also in organizing data, querying information and understanding results [3]. It usually records the location and description of warehouse system components. Here we expand the scope of metadata and use it to describe and manage the data and environment of the whole system, which includes not only the data in the data warehouse platform but also the task models and algorithms (or functions) used in ETL and data mining. Metadata sits at the core of the whole system because it integrates the ETL, data warehouse, and data mining tools. It controls the whole flow from ETL through the data warehouse to data mining, so ETL and data mining tasks can be defined and executed more conveniently and effectively.

    In MSMiner, the contents of the metadata are as follows:

    (1) Description of the external data source. The external data source can be a relational database or another kind of data, such as Excel data, plain text, or XML text. The metadata records the location and environment information of the external data source, its data structure, and a description of its contents.

    (2) Description of the subject, including the name and remark of the subject and when the subject was created and updated.

    (3) Description of the databases under a subject, including the name, type and remark of each database, the login information and other information.

    (4) Description of the tables in a database, including fact tables, dimension tables and temporary tables. It contains table information and field information.

    (5) Description of the ETL task, containing the organization and steps of the task, the data source, the selected transformation functions, the parameter settings, the creation and execution history of the task, and so on.

    (6) Description of the data mining task, containing the organization and steps of the task, the data source, the selected mining algorithms, the parameter settings, the evaluation and output of the results, the creation and execution history of the task, and so on.

    (7) Description of the data cube, containing the dimensions and measures of the extracted information, the construction information of the star structure, and so on.

    (8) Management of the algorithm base for data mining, containing the registration and management of the mining algorithms.

    (9) Management of the functions for ETL, containing the registration and management of the functions.

   (10) User information, containing the user's basic information, authority, operational history, and so on.

       We build the corresponding metadata classes using an object-oriented method. The system uses a three-tier architecture, with the metadata management subsystem in the middle tier, where it can be regarded as a metadata management server. The upper tier accesses and manages metadata through the middle tier.

       Metadata is generated automatically whenever a component of the system is created, and it changes during the daily maintenance of the system. MSMiner provides a dedicated metadata manager subsystem that can maintain the metadata directly, so that the whole system is managed effectively.
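
As a minimal sketch of how such metadata classes might look, the Python below models three of the descriptions listed above (external data source, table, ETL task) and a small in-memory metadata manager standing in for the middle tier. All class names, field names and sample values are illustrative assumptions; they are not MSMiner's actual classes.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class DataSourceMeta:
    """Describes an external data source (hypothetical fields)."""
    name: str
    kind: str          # e.g. "relational", "excel", "text", "xml"
    location: str
    remark: str = ""


@dataclass
class TableMeta:
    """Describes a table in a subject database (hypothetical fields)."""
    name: str
    role: str          # "fact", "dimension" or "temporary"
    fields: list


@dataclass
class EtlTaskMeta:
    """Describes an ETL task: its source, its steps and its run history."""
    name: str
    source: str
    steps: list
    history: list = field(default_factory=list)


class MetadataManager:
    """Middle-tier registry: upper tiers read and write metadata only through it."""

    def __init__(self):
        self._items = {}

    def register(self, item):
        # Metadata is recorded at the moment a component is created.
        self._items[(type(item).__name__, item.name)] = item
        return item

    def get(self, kind, name):
        return self._items[(kind, name)]

    def record_run(self, task_name, when=None):
        # Maintenance updates the stored metadata in place.
        task = self.get("EtlTaskMeta", task_name)
        task.history.append(when or datetime.now())
        return task


manager = MetadataManager()
manager.register(DataSourceMeta("sales_db", "relational", "server01/sales"))
manager.register(TableMeta("sales_fact", "fact", ["time_id", "product_id", "amount"]))
manager.register(EtlTaskMeta("load_sales", "sales_db", ["dedupe", "to_subject"]))
manager.record_run("load_sales", datetime(2003, 1, 1))
```

Keying the registry by class name and component name keeps every kind of metadata behind one access point, which mirrors the role the text gives to the middle tier.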

The ETL (extraction, transformation and loading) subsystem is an important subsystem of MSMiner. The main purpose of the ETL module is to transform the operational data in source databases into analytical data in the data warehouse. The data in the data warehouse is extracted and integrated from dispersed databases (for example Oracle, SQL Server, Access, FoxPro, Excel, DB2), and the operational data in a source database differs in many ways from the analytical data in the data warehouse, so loading data from the various sources directly into the data warehouse is not a good approach. To obtain clean data, the source data must be cleaned, collected and transformed before it is integrated into the data warehouse. This is a key and complex step in building a data warehouse. Generally speaking, the ETL subsystem has to do the following work:

    (1) Because the source data from dispersed databases contains repetitions and conflicts, the subsystem should reconcile the conflicting data.

    (2) To obtain comprehensive data in the data warehouse, the subsystem should transform the original data structure from an application-oriented one to a subject-oriented one, and generate and compute derived data where needed.

 

   

     The basic architecture of the ETL subsystem is shown in Figure 2. The ETL subsystem has four modules:

    (1) A friendly user interface

      Through this interface users can conveniently carry out any ETL operation, such as designing ETL tasks, registering new ETL DLL functions, scheduling the execution of ETL tasks and viewing the results of ETL tasks.

    (2) Integrated ETL function management and ETL task management

    This module covers registering new ETL DLL functions, building new ETL tasks, and scheduling and processing ETL tasks.

    (3) Uniform metadata management

    The whole subsystem is developed in a metadata-oriented way: all information in the subsystem, including data sources, algorithms and results, is managed through metadata.

    (4) The database server

    The ETL subsystem supports dispersed and varied databases (for example Oracle, SQL Server, Access, FoxPro, Excel, DB2).

    The subsystem supports an extensible ETL function base. The main ETL algorithms are implemented as dynamic link libraries (DLLs) with uniform interfaces. Users design an ETL task according to their needs by choosing the relevant ETL DLLs. At present the subsystem provides about 30 kinds of ETL DLLs. In addition, users can develop new ETL DLLs that conform to the uniform interfaces and add them to the ETL function base. To improve efficiency, ETL tasks can be scheduled to run at designated times and processed concurrently.
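
The idea of an extensible function base with a uniform interface can be sketched in a few lines of Python. This sketch swaps the DLLs described above for plain Python functions that all take and return a list of row dictionaries; the registry, the function names and the sample rows are assumptions made for illustration, not part of MSMiner.

```python
# Registry of ETL functions; the uniform interface assumed here is
# "take a list of row dicts, return a list of row dicts".
ETL_FUNCTIONS = {}


def register_etl(name):
    """Decorator that adds a function to the ETL function base."""
    def wrap(func):
        ETL_FUNCTIONS[name] = func
        return func
    return wrap


@register_etl("trim_text")
def trim_text(rows):
    # Strip surrounding whitespace from every string value.
    return [{k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
            for row in rows]


@register_etl("drop_duplicates")
def drop_duplicates(rows):
    # Keep the first occurrence of each distinct row, preserving order.
    seen, out = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out


def run_task(step_names, rows):
    """Run an ETL task defined as an ordered list of registered function names."""
    for name in step_names:
        rows = ETL_FUNCTIONS[name](rows)
    return rows


source_rows = [
    {"customer": " Li ", "city": "Beijing"},
    {"customer": "Li", "city": "Beijing"},
    {"customer": "Wang", "city": "Shanghai"},
]
clean_rows = run_task(["trim_text", "drop_duplicates"], source_rows)
```

Because a task is only a list of registered names, a newly registered function becomes available to every task without changing `run_task`.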

 

A data warehouse is “a subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management decisions”. The data warehouse function in MSMiner provides a general data warehouse environment in which users can create and maintain their own data warehouses according to their needs, carry out data analysis and processing, and prepare data for data mining tasks.

        The data warehouse in MSMiner consists of many subjects. When a data warehouse is created, users establish several subject fields according to the needs of the application, and the system helps them extract the data for each subject and model it with a star schema. On this basis the data warehouse implements multi-dimensional data cubes and OLAP and provides a valid data source for data mining and decision-making. The final results can be shown with visualization tools.

        The data warehouse in MSMiner is modeled with a star schema. At the subject's request, the system extracts data from source tables or views and builds multiple fact tables through data extraction, transformation and loading. A star schema consists of one fact table and several dimension tables related to it; the fact table includes multiple dimensions and measures. A dimension is a particular perspective for viewing the data, such as a time dimension, a distribution dimension or a product dimension. A measure is the value being analyzed; it describes what the data actually records. Each dimension table describes one dimension and its values, and each dimension consists of several levels. For example, a time dimension may be divided into three levels (year, season, and month), each corresponding to a different level of query. One or several star schemas form a subject, which is the basic unit of the data warehouse.
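
A small star schema of this shape can be sketched with Python's built-in `sqlite3` module. The table names, column names and sample rows below are assumptions chosen for illustration; the sketch only shows one fact table, two dimension tables, and a query that views the measure along the time dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_time (
    time_id INTEGER PRIMARY KEY,
    year INTEGER, season INTEGER, month INTEGER   -- three levels
);
CREATE TABLE dim_product (
    product_id INTEGER PRIMARY KEY,
    category TEXT, name TEXT
);
CREATE TABLE fact_sales (
    time_id INTEGER REFERENCES dim_time(time_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    amount REAL                                    -- the measure
);
""")
conn.executemany("INSERT INTO dim_time VALUES (?, ?, ?, ?)",
                 [(1, 2002, 1, 1), (2, 2002, 1, 2), (3, 2002, 2, 4)])
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                 [(1, "fish", "hairtail"), (2, "fish", "mackerel")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                 [(1, 1, 10.0), (2, 1, 20.0), (3, 2, 5.0)])

# Join the fact table to a dimension table to view the measure by month.
rows = conn.execute("""
    SELECT t.year, t.month, SUM(f.amount)
    FROM fact_sales f JOIN dim_time t ON f.time_id = t.time_id
    GROUP BY t.year, t.month
    ORDER BY t.year, t.month
""").fetchall()
```

Grouping by `season` or `year` instead of `month` moves the same query to a coarser level of the time dimension.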

        OLAP can be implemented in two ways: by building a dedicated multi-dimensional database system (MOLAP), or by simulating multi-dimensional data on a relational database (ROLAP). MSMiner supports ROLAP, based on the star schema. The star structure, with its multiple dimension tables, simulates the multi-dimensional data cube; the dimensions and measures of the data cube come from the dimensions and measures of the star schema. When an OLAP operation is executed on the data cube, the multi-dimensional analysis translates the request into SQL statements, queries the fact tables, and then presents the results in multi-dimensional form.
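
The translation from an OLAP request to SQL can be illustrated with a short Python sketch using `sqlite3`. The helper `rollup_sql`, the table and column names, and the sample data are all hypothetical; the sketch shows one plausible way a roll-up request becomes a join and a GROUP BY over a star schema, not how MSMiner generates its SQL.

```python
import sqlite3


def rollup_sql(fact, dim, key, level, measure):
    """Build the SQL for rolling a measure up to one level of a dimension.

    Identifiers are assumed to come from trusted metadata, not from user input.
    """
    return (f"SELECT d.{level}, SUM(f.{measure}) "
            f"FROM {fact} f JOIN {dim} d ON f.{key} = d.{key} "
            f"GROUP BY d.{level} ORDER BY d.{level}")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dim_time (time_id INTEGER, year INTEGER, month INTEGER)")
conn.execute("CREATE TABLE fact_sales (time_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO dim_time VALUES (?, ?, ?)",
                 [(1, 2001, 12), (2, 2002, 1), (3, 2002, 2)])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?)",
                 [(1, 4.0), (2, 10.0), (3, 20.0)])

# Drill down to the month level, then roll up to the year level.
by_month = conn.execute(
    rollup_sql("fact_sales", "dim_time", "time_id", "month", "amount")).fetchall()
by_year = conn.execute(
    rollup_sql("fact_sales", "dim_time", "time_id", "year", "amount")).fetchall()
```

Roll up and drill down differ only in the level passed to the helper, which is why a ROLAP engine can serve both from the same fact table.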

        At present the system supports the standard OLAP operations, such as slice, dice, roll up, drill down and pivot. The results can be displayed in many forms, such as cross-tabulation tables, bar charts, pie charts or other graphical output.
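
To make slice, dice and pivot concrete, the following Python sketch applies them to a tiny cube held as a list of cells. The dimension names, the helper functions and the numbers are invented for illustration, and the cross-tabulation is returned as nested dictionaries rather than rendered as a table or chart.

```python
from collections import defaultdict

cells = [
    {"year": 2002, "region": "East", "product": "A", "sales": 10},
    {"year": 2002, "region": "West", "product": "A", "sales": 7},
    {"year": 2003, "region": "East", "product": "B", "sales": 5},
    {"year": 2003, "region": "West", "product": "A", "sales": 3},
]


def slice_(rows, dim, value):
    """Slice: fix one dimension to a single value."""
    return [r for r in rows if r[dim] == value]


def dice(rows, **allowed):
    """Dice: keep cells whose value lies in the allowed set for each dimension."""
    return [r for r in rows if all(r[d] in vals for d, vals in allowed.items())]


def cross_tab(rows, row_dim, col_dim, measure):
    """Aggregate a measure into a row_dim by col_dim cross-tabulation."""
    table = defaultdict(lambda: defaultdict(int))
    for r in rows:
        table[r[row_dim]][r[col_dim]] += r[measure]
    return {k: dict(v) for k, v in table.items()}


east = slice_(cells, "region", "East")
sub_cube = dice(cells, year={2002, 2003}, product={"A"})
by_year_region = cross_tab(cells, "year", "region", "sales")
by_region_year = cross_tab(cells, "region", "year", "sales")  # pivot: swap the axes
```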

        The results of OLAP operations and the data in fact tables can serve as data sources for the data mining subsystem, and they can help with the preparation work for data mining.


3. Model-Driven DSS

       Model-Driven DSS emphasize access to and manipulation of a model, for example a statistical, financial, optimization and/or simulation model. Simple statistical and analytical tools provide the most elementary level of functionality. Some OLAP systems that allow complex analysis of data may be classified as hybrid DSS that provide modeling, data retrieval and data summarization functionality. In general, model-driven DSS use complex financial, simulation, optimization or multi-criteria models to provide decision support. Model-driven DSS use data and parameters provided by decision makers to help them analyze a situation, but they are not usually data intensive; that is, very large databases are usually not needed for model-driven DSS. Early versions of Model-Driven DSS were called Computationally Oriented DSS by Bonczek, Holsapple and Whinston (1981). Such systems have also been called model-oriented or model-based decision support systems.
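
A very small financial model shows the pattern: the decision maker supplies a handful of parameters, and the model does the work without any large database. The Python sketch below uses a standard net present value formula as the model; the project cash flows and discount rates are made-up inputs, and the function names are illustrative.

```python
def npv(rate, cash_flows):
    """Net present value; cash_flows[0] occurs now, cash_flows[t] after t periods."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def what_if(cash_flows, rates):
    """Evaluate the same model under several discount rates supplied by the user."""
    return {rate: round(npv(rate, cash_flows), 2) for rate in rates}


# Hypothetical project: pay 1000 now, receive 600 after each of two periods.
project = [-1000.0, 600.0, 600.0]
scenarios = what_if(project, [0.0, 0.10, 0.20])
```

With these inputs the project is worth 200.0 at a zero discount rate, 41.32 at 10 percent and -83.33 at 20 percent, so the recommendation changes with a single parameter.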

 


4. Knowledge-Driven DSS

       Knowledge-Driven DSS can suggest or recommend actions to managers. These DSS are person-computer systems with specialized problem-solving expertise. The "expertise" consists of knowledge about a particular domain, understanding of problems within that domain, and "skill" at solving some of these problems. A related concept is Data Mining. It refers to a class of analytical applications that search for hidden patterns in a database. Data mining is the process of sifting through large amounts of data to discover relationships in the data content. Tools used for building Knowledge-Driven DSS are sometimes called Intelligent Decision Support methods (cf. Zhongzhi Shi, 1988; Dhar and Stein, 1997).
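
One simple way to encode such expertise is as condition-action rules. The Python sketch below is a toy illustration under that assumption: the rules, the fact names and the recommended actions are all invented, and a real knowledge-driven DSS would use a much richer knowledge representation and inference mechanism.

```python
RULES = [
    # (condition over the facts, recommended action); hypothetical domain rules
    (lambda f: f["stock"] < f["reorder_level"], "place a replenishment order"),
    (lambda f: f["days_overdue"] > 30, "escalate the overdue invoice"),
    (lambda f: f["demand_trend"] == "falling", "review the production plan"),
]


def recommend(facts):
    """Return every action whose rule condition holds for the given facts."""
    return [action for condition, action in RULES if condition(facts)]


facts = {"stock": 40, "reorder_level": 100, "days_overdue": 5, "demand_trend": "falling"}
actions = recommend(facts)
```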


5. Web-Based DSS

       Web-Based DSS deliver decision support information or decision support tools to a manager or business analyst using a "thin-client" Web browser like Netscape Navigator or Internet Explorer that is accessing the Global Internet or a corporate intranet. The computer server that is hosting the DSS application is linked to the user's computer by a network with the TCP/IP protocol. Web-Based DSS can be communications-driven, data-driven, document-driven, knowledge-driven, model-driven or a hybrid. Web technologies can be used to implement any category or type of DSS. Web-based means the entire application is implemented using Web technologies; Web-enabled means key parts of an application like a database remain on a legacy system, but the application can be accessed from a Web-based component and displayed in a browser.


6. Simulation-Based DSS

       Simulation-Based DSS deliver decision support information or decision support tools to help managers analyze semi-structured problems through simulation. These diverse systems were all called Decision Support Systems. DSS could support operations, financial management and strategic decision-making. A variety of models were used in DSS, including optimization and simulation.
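
As a rough illustration of decision support through simulation, the Python sketch below uses a Monte Carlo experiment to estimate how often demand would exceed an ordered quantity. The normal demand model, the parameter values and the function name are assumptions made for this example only.

```python
import random


def simulate_stockout_rate(order_qty, mean_demand, sd_demand, trials=10000, seed=0):
    """Estimate how often random demand exceeds the ordered quantity."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    stockouts = sum(1 for _ in range(trials)
                    if rng.gauss(mean_demand, sd_demand) > order_qty)
    return stockouts / trials


# Compare three candidate order quantities under the same demand assumptions.
rates = {q: simulate_stockout_rate(q, mean_demand=100, sd_demand=20)
         for q in (80, 100, 120)}
```

A manager can compare the estimated stockout rates across the candidate quantities and weigh them against the cost of holding extra stock.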


7. GIS-Based DSS

       GIS-Based DSS deliver decision support information or decision support tools to a manager or business analyst using GIS. General-purpose GIS tools are programs such as ARC/INFO, MapInfo and ArcView that have extensive functionality and can be difficult to learn for users unfamiliar with GIS and cartographic principles. Specific-purpose GIS tools are programs written by a GIS programmer to provide a user group with specific functions in an easy-to-use package. In the past, specific-purpose GIS tools were written primarily in a macro language. This method of delivering specific-purpose GIS tools requires each user to have a copy of the host program (ARC/INFO or ArcView) to run the macro language application. GIS programmers now have a far richer set of tools for application development. Programming libraries with classes for interactive mapping and spatial analysis functions have made it possible to develop specific-purpose GIS tools in industry-standard programming languages that can be compiled and run stand-alone, without a host program. Internet development tools have matured as well, making it possible to develop fairly complex GIS-based programs that users can access through the World Wide Web.


8. Communication-Driven DSS

       Communications-Driven DSS is a type of DSS that emphasizes communications, collaboration and shared decision-making support. A simple bulletin board or threaded email is the most elementary level of functionality. The comp.groupware FAQ defines groupware as "software and hardware for shared interactive environments" intended to support and augment group activity. Groupware is a subset of a broader concept called Collaborative Computing. Communications-Driven DSS enable two or more people to communicate with each other, share information and co-ordinate their activities. Group Decision Support Systems or GDSS is a hybrid type of DSS that allows multiple users to work collaboratively in groupwork using various software tools. Examples of group support tools are: audio conferencing, bulletin boards and web-conferencing, document sharing, electronic mail, computer supported face-to-face meeting software, and interactive video.


9. Application Example

       We have applied IDSS to many application areas, such as tax deviation, analysis of fishery information, and analysis of VIP (very important person) customers for a telecom corporation. For example, the fishing ground prediction system is a good application example of case-based reasoning (CBR). This system has been applied to fishing center prediction in the East China Sea. In 2002 it was awarded a second-grade National Science and Technology Progress Award.
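
The retrieval step at the heart of case-based reasoning can be sketched briefly in Python. Everything in the sketch is hypothetical: the features (sea surface temperature and salinity), the stored cases and the outcome labels are invented to show nearest-case retrieval, and they do not describe the fishing ground prediction system itself.

```python
import math

# Hypothetical past cases: observed conditions and the catch level that followed.
CASE_BASE = [
    ({"temp": 18.0, "salinity": 33.5}, "high"),
    ({"temp": 24.0, "salinity": 34.5}, "low"),
    ({"temp": 20.0, "salinity": 34.0}, "medium"),
]


def distance(a, b):
    """Euclidean distance over the numeric features of two cases."""
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in a))


def retrieve(query, case_base):
    """Retrieve step of CBR: return the stored case most similar to the query."""
    return min(case_base, key=lambda case: distance(query, case[0]))


def predict(query, case_base=CASE_BASE):
    """Reuse step: adopt the outcome of the nearest case without adaptation."""
    return retrieve(query, case_base)[1]


prediction = predict({"temp": 18.5, "salinity": 33.6})
```

A fuller CBR cycle would also revise the retrieved solution and retain the new case once its outcome is known.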


 

Copyright © 2002-2003 Intelligent Science Research Group, at Key Lab of IIP, ICT, CAS, China.