Research Field

Inductive Inference by Stepwise Pair Expansion
Contact Person

Kenichi Yoshida

Systems Development Laboratory, Hitachi, Ltd.

yoshida@sdl.hitachi.co.jp.
Motivation

Inductive learning, which tries to find rules from data, has been an important area of investigation. One major research theme of this area is the data representation language that the learning methods can use. The conventional learning methods use attribute-value table as the data representation language, whereas inductive logic programming (ILP) uses first-order logic.
We propose colored directed graph as a data representation language for inductive learning methods. Graph-based induction (GBI)[1,2] uses this data representation language, and Stepwise Pair Expansion is the main procedure of the GBI method. The expressiveness of graph is in between the attribute-value table and the first-order logic. Thus its learning potential is weaker than that of ILP, but stronger than that of the conventional attribute-value learning methods.
In this research, we investigate the potential of GBI method through various application fields.
Research Plan

The real advantage of GBI appears in the domain where the dependency between data bears the essential information. The behavior analysis of computer user[3] is a typical example of such a domain. In this domain, the complex structure of dependency between the user tasks prevents us from using the conventional attribute-value learning methods, and ILP cannot meet the requirement for the efficiency.
Another interesting and important area is the network traffic analysis. The WWW (world wide web) service has recently become a widely used network service which symbolizes the benefits of a networked society. By using the WWW, the user can access various types of information easily and quickly. However, a rapid growth in demand sometimes causes a heavy network overload, and the resulting slow-response spoils the benefits of the WWW. Statistics clearly indicate the potential overload in the backbone of the wide area network. Thus, the demand for better tuning methods in wide area networks has been increasing.
However, tuning a distributed file system is a difficult problem. In [4], we examine a method for laying out the distributed cache storage on the network using the GBI method. By analyzing the access patterns of WWW data, this method can lay out distributed cache storage which is adapted to the access patterns.
Since the graph representation can handle the network topology, GBI method seems to be a promising method to analyze network traffic. Thus, the network traffic analysis and control is an important application area of this research.
References

[1]
K. Yoshida and H. Motoda. CLIP: Concept Learning from Inference Patterns. J. of Artificial Intelligence Vol. 75, No. 1 pp, 63-92 (1995)
[2]
K. Yoshida and H. Motoda. Table, Graph and Logic for Induction. Machine Intelligence Workshop (1995)
[3]
K. Yoshida and H. Motoda. Automated User Modeling for Intelligent Interface. International Journal of Human Computer Interaction Vol. 8, No. 3 pp. 237-258 (1996)
[4]
K. Yoshida. WWW Cache Layout to Ease Network Overload -Analyzing Regularity of Structured Data- 6th Int. Workshop on AI and Statistics (1997)