Computer Science at the University of Helsinki 1994
3.4. Abstracts of recent Ph.D. theses
Jukka Paakki: Paradigms for attribute-grammar-based language
implementation. (A-1991-1).
Attribute grammars are a formalism for specifying and
implementing programming languages. Methods and techniques
are presented for expressing attribute grammars themselves as a
language. These methods are based on relating attribute
grammars with programming paradigms. The presented formalisms
are classified as structured, object-oriented, functional,
logic, and concurrent attribute grammars. The characteristics
of these attribute grammar paradigms are reviewed and
analyzed. The central results of seven self-standing papers
are summarized. These papers discuss empirically and
theoretically the practical significance of relating the
nonterminal concept of attribute grammars with the block
concept of programming languages, the nonterminal concept
with the class concept, and the attribute concept with the
logical variable concept. Accordingly, structured,
object-oriented, and logic attribute grammars are emphasized.
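As a rough illustration of the nonterminal-as-class correspondence
mentioned above (the example and the names Expr, Term and value are
mine, not the thesis's), the sketch below encodes a toy attribute
grammar in an object-oriented style: each nonterminal becomes a
class, and a synthesized attribute becomes a method of that class.

    # Hypothetical toy grammar:
    #   Expr -> Expr '+' Term | Term        Term -> number
    # One class per nonterminal; the synthesized attribute `value`
    # is a method, mirroring the nonterminal-as-class view.

    class Term:
        def __init__(self, number):
            self.number = number

        def value(self):                 # synthesized attribute of Term
            return self.number

    class Expr:
        def __init__(self, left=None, right=None, term=None):
            self.left, self.right, self.term = left, right, term

        def value(self):                 # synthesized attribute of Expr
            if self.term is not None:    # production Expr -> Term
                return self.term.value()
            return self.left.value() + self.right.value()  # Expr -> Expr '+' Term

    # Hand-built parse tree for "1 + 2 + 3":
    tree = Expr(left=Expr(left=Expr(term=Term(1)), right=Term(2)),
                right=Term(3))
    print(tree.value())                  # -> 6
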
Jyrki Kivinen: Problems in computational learning theory.
(A-1992-1).
This thesis summarizes results on some computational learning
problems within the probabilistic framework proposed by
Valiant. We give a polynomial time learning algorithm for
concepts that can be represented as hierarchical rule sets.
These representations consist of an arbitrary number of rules
'if c then l' organized into a fixed number of levels. The
rule means that an instance satisfying the condition c
belongs to the class l. However, the rule can be overridden
by a rule at a preceding level. If the instances are strings,
the condition could be a test for the presence of a given
substring in the instance. Instances could also be value
assignments for a potentially infinite number of Boolean
attributes, in which case the condition could be a Boolean
conjunction of constant length. We also study reliable and
probably useful learning, proposed by Rivest and Sloan. We
derive upper and lower bounds for the sample complexity of
reliable and probably useful learning in terms of
combinatorial measures of the class of concepts to be
learned. These measures resemble the Vapnik-Chervonenkis
dimension. The results imply that monotone Boolean monomials
cannot be learned reliably and probably usefully with a
sample size that is polynomial in the number of variables.
Rectangles cannot be learned reliably and probably usefully
with any finite sample size. The sample complexity of
reliable and probably useful learning remains high even if
the examples are assumed to be drawn according to the uniform
probability measure.
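To make the hierarchical rule sets described above concrete, here
is a minimal sketch (mine, not the thesis's; the function names and
the first-match tie-breaking within a level are assumptions): rules
'if c then l' are grouped into a fixed number of levels, and a rule
that fires at an earlier level overrides the later levels.

    # A rule is a pair (condition, label); a hierarchical rule set
    # is a list of levels, each an arbitrary list of rules.

    def classify(levels, instance, default=None):
        for rules in levels:              # earlier levels take precedence
            for condition, label in rules:
                if condition(instance):   # first firing rule at this level wins
                    return label
        return default

    # String instances with substring tests, as in the abstract.
    levels = [
        [(lambda s: "abba" in s, "positive")],       # level 1: exceptions
        [(lambda s: "ab" in s, "negative"),          # level 2: general rules
         (lambda s: "ba" in s, "negative")],
    ]

    print(classify(levels, "xabbax", default="negative"))   # -> positive
    print(classify(levels, "xabx", default="positive"))     # -> negative
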
Patrik Floréen: Computational complexity problems in neural
associative memories. (A-1992-5).
Neural associative memories are memory constructs consisting
of a weighted network with a large number of simple nodes,
whose computations collectively result in retrieving stored
information when partial or distorted information is given as
input. We study two types of neural associative memories: the
Hopfield and the Hamming memories.
For the Hopfield memory, we first study worst-case bounds on
the convergence times in both asynchronous and synchronous
updating. However, the main theme in our study of Hopfield
networks is the computational complexity of assessing the
memory quality of a given network. We show that (assuming
P ≠ NP) it is computationally hard to compute the size of
attraction domains (in synchronous updating) and to compute
the attraction radii (in synchronous updating). It is hard
even to approximate the attraction radii, unless P = NP.
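For readers unfamiliar with the model, the following sketch shows a
standard Hopfield associative memory (Hebbian outer-product storage
and asynchronous threshold updating). It is only meant to fix the
setting in which convergence times and attraction domains are
studied; it is not a construction from the thesis.

    import numpy as np

    def hopfield_store(patterns):
        # Hebbian outer-product weights for a set of +/-1 patterns
        n = patterns.shape[1]
        W = np.zeros((n, n))
        for x in patterns:
            W += np.outer(x, x)
        np.fill_diagonal(W, 0)            # no self-connections
        return W

    def hopfield_recall(W, x, max_sweeps=100):
        # asynchronous updating: one node at a time, until a fixed point
        x = x.copy()
        for _ in range(max_sweeps):
            changed = False
            for i in range(len(x)):
                new = 1 if W[i] @ x >= 0 else -1
                if new != x[i]:
                    x[i], changed = new, True
            if not changed:               # converged (ideally to a stored pattern)
                break
        return x

    patterns = np.array([[1, -1, 1, -1, 1, -1],
                         [1, 1, 1, -1, -1, -1]])
    W = hopfield_store(patterns)
    noisy = np.array([1, -1, 1, -1, 1, 1])   # distorted first pattern
    print(hopfield_recall(W, noisy))
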
For the Hamming memory, we study which conditions the
parameters of the network must satisfy in order to ensure
convergence and we obtain a tight bound on the convergence
time. With an optimal choice of parameter values, the
worst-case convergence time is Θ(p log(pn)), where p is the
memory capacity and n is the vector length. Restricting the
instances to those with a unique nearest stored vector to the
input vector, the worst-case convergence time is Θ(p log n).
For random vectors, the probability for such instances to
occur approaches 1 as n grows, if the number of stored
vectors grows subexponentially in n. By dynamically changing
the connection weights and by taking powers of the activity
levels, the convergence is sped up considerably. Finally,
we present a new memory model based on the Hamming memory,
but with a feed-forward network structure made possible by
the use of more complicated node functions. In simulations,
we show that the new memory model behaves as expected.
Throughout the study, we also explore the consequences of
allowing "don't know" elements in both the input and the
output. Experiments show that "don't know" elements are
especially effective in the input.
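The Hamming memory referred to above is, in its textbook form, a
two-stage network: a matching layer scores the input against each of
the p stored vectors, and an iterative winner-take-all competition
(often called MAXNET) then singles out the best match; roughly
speaking, the convergence time discussed above is the number of
iterations this competition needs. The sketch below shows that
textbook construction with an assumed inhibition parameter, not the
thesis's parameter choices or its feed-forward variant.

    import numpy as np

    def hamming_recall(stored, x, eps=None, max_iters=10000):
        # stored: p stored +/-1 vectors of length n; x: +/-1 input vector
        p, n = stored.shape
        a = (stored @ x + n) / 2.0               # matching scores: agreeing components
        eps = (1.0 / p) if eps is None else eps  # lateral inhibition (assumed choice)
        for _ in range(max_iters):               # winner-take-all competition
            a = np.maximum(0.0, a - eps * (a.sum() - a))
            if np.count_nonzero(a) <= 1:         # a single candidate is left
                break
        return stored[int(np.argmax(a))]

    stored = np.array([[1, 1, 1, 1, -1, -1],
                       [-1, -1, 1, 1, 1, 1]])
    probe = np.array([1, 1, 1, 1, -1, 1])        # Hamming-closest to the first vector
    print(hamming_recall(stored, probe))
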
Pekka Kilpeläinen: Tree matching problems with applications
to structured text databases. (A-1992-6).
Tree matching is concerned with finding the instances, or
matches, of a given pattern tree in a given target tree. We
introduce ten interrelated matching problems called tree
inclusion problems. A specific tree inclusion problem is
defined by specifying the trees that are instances of the
patterns. The problems differ from each other in the amount
of similarity required between the patterns and their
instances. We present and analyze algorithms for solving
these problems, and show that the computational complexities
of the problems range from linear to NP-complete.
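As one concrete reference point (a naive illustration, not an
algorithm from the thesis), the sketch below decides ordered tree
inclusion by backtracking: the pattern is included in the target if
it can be obtained from the target by deleting nodes, where a
deleted node's children take its place in left-to-right order. This
naive search can take exponential time; the thesis analyzes which
variants admit efficient algorithms and which are NP-complete.

    class Node:
        def __init__(self, label, *children):
            self.label, self.children = label, list(children)

    def forest_included(pf, tf):
        # Can the ordered pattern forest pf be obtained from the ordered
        # target forest tf by deleting nodes (a deleted node's children
        # take its place, in order)?
        if not pf:
            return True
        if not tf:
            return False
        t = tf[0]
        # Option 1: delete the root of the first target tree.
        if forest_included(pf, t.children + tf[1:]):
            return True
        # Option 2: map the first pattern root onto the root of t.
        p = pf[0]
        return (p.label == t.label
                and forest_included(p.children, t.children)
                and forest_included(pf[1:], tf[1:]))

    def tree_included(pattern, target):
        return forest_included([pattern], [target])

    # The pattern a(b, c) is included in the target a(x(b), y, c):
    pattern = Node("a", Node("b"), Node("c"))
    target = Node("a", Node("x", Node("b")), Node("y"), Node("c"))
    print(tree_included(pattern, target))      # -> True
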
The problems are motivated by the study of query languages
for structured text databases. The structure of a text
document can be described by a context-free grammar, and text
collections can be represented as collections of parse trees.
Matching-based operations are an intuitive basis for
accessing the contents of structured text databases. In
"G-grammatical" tree inclusion problems the target tree is a
parse tree over a context-free grammar G. We show that a
certain natural class of grammars allows solving some of the
grammatical inclusion problems in linear time.
Tree inclusion problems are extended by introducing logical
variables in the patterns. These variables allow extracting
substructures of the pattern matches and posing equality
constraints on them. We show that most of the tree inclusion
problems with logical variables are NP-hard, and also
consider solving their polynomial versions. As an application
of these problems we finally show how tree inclusion with
logical variables can be used for querying structured text
databases, and discuss how the inclusion queries should be
evaluated in practice.
Jaana Eloranta: Minimal transition systems with respect to
divergence preserving behavioural equivalences. (A-1994-1).
Three divergence preserving behavioural equivalences for
labelled transition systems (lts), namely D-bisimilarity,
branching D-bisimilarity and CFFD-equivalence, are analyzed
and compared. For an image-finite lts that does not contain
an infinite sequence of distinct τ-connected states, it is
possible to find the unique lts which is state- and
transition-minimal with respect to D-bisimilarity or
branching D-bisimilarity. It is well known how to find an
equivalent state-minimal lts. We show how the state- and
transition-minimal lts can be found, and moreover give a
transition-minimization algorithm for the finite case.
case of CFFD-equivalence, which is a modification of
failures-equivalence, a unique state- and transition-minimal
lts cannot always be found.
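For orientation only, the sketch below shows the textbook way of
computing the state-minimal quotient of a finite lts under plain
strong bisimilarity by naive partition refinement. It ignores
divergence, internal actions and rootedness, so it is not the
D-bisimilarity, branching D-bisimilarity or CFFD minimization
studied here; it merely illustrates what state minimization of an
lts means.

    from collections import defaultdict

    def minimise_states(states, actions, transitions):
        # Coarsest partition of `states` under strong bisimilarity,
        # computed by naive signature refinement.  `transitions` is a
        # set of (source, action, target) triples.
        succ = defaultdict(set)
        for s, a, t in transitions:
            succ[(s, a)].add(t)

        block = {s: 0 for s in states}         # start from the trivial partition
        while True:
            def signature(s):                  # old block + reachable (action, block) pairs
                return (block[s],
                        frozenset((a, block[t])
                                  for a in actions for t in succ[(s, a)]))
            ids = {}
            new_block = {s: ids.setdefault(signature(s), len(ids)) for s in states}
            if len(ids) == len(set(block.values())):
                return new_block               # stable: blocks are the equivalence classes
            block = new_block

    # Tiny example in which states 1 and 2 are bisimilar:
    states = {0, 1, 2, 3}
    transitions = {(0, "a", 1), (0, "a", 2), (1, "b", 3), (2, "b", 3)}
    print(minimise_states(states, {"a", "b"}, transitions))
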
For rooted versions of D-bisimilarity and branching
D-bisimilarity the uniqueness result holds only for
root-unwound lts's. Related equivalences, D-congruence and
branching D-congruence, handle the first transitions from a
root, rather than a root itself, in a special way. The
relation between rooted equivalences and corresponding
congruences is fully analyzed. For D-congruence and branching
D-congruence a unique root-unwound minimal lts is found for
an image-finite lts (without an infinite sequence of distinct
τ-connected states). Moreover, D-congruence and branching
D-congruence are compositional with respect to Basic LOTOS
operations. Since the same result holds for CFFD-equivalence,
all of D-congruence, branching D-congruence, and
CFFD-equivalence are compatible with compositional reduction.
In order to apply and demonstrate the theoretical results,
we specify and verify several concurrent systems. The
specifications are reduced with respect to several
equivalences, including D-congruence, branching D-congruence
and CFFD-equivalence. The analysis shows that it is possible
to find errors, as well as reasons for divergences, by
studying reduced lts's. Moreover, these case studies are used
to test the effect of the transition-minimization algorithm,
to compare divergence preserving and divergence ignoring
equivalences, and to compare branching and weak versions of
equivalences. In some studies compositional reduction is used
and the results are analyzed. Furthermore, based on the
results of these applications, we propose some modifications
to the reduction algorithm with respect to CFFD-equivalence.