﻿ A Combinatorial Analysis of Tree-Like Sentences

Open Journal of Discrete Mathematics
Vol.05 No.03(2015), Article ID:58263,21 pages
10.4236/ojdm.2015.53004

A Combinatorial Analysis of Tree-Like Sentences

Gilbert Labelle1, Louise Laforest2

1Département de Mathématiques, Université du Québec à Montréal, Montreal, Canada

2Département d’Informatique, Université du Québec à Montréal, Montreal, Canada   Received 3 June 2015; accepted 21 July 2015; published 24 July 2015

ABSTRACT

A sentence over a finite alphabet A, is a finite sequence of non-empty words over A. More generally, we define a graphical sentence over A by attaching a non-empty word over A to each arrow and each loop of a connected directed graph (digraph, for short). Each word is written according to the direction of its corresponding arrow or loop. Graphical sentences can be used to encode sets of sentences in a compact way: the readable sentences of a graphical sentence being the sentences corresponding to directed paths in the digraph. We apply combinatorial equations on enriched trees and rooted trees, in the context of combinatorial species and Pólya theories, to analyze parameters in classes of tree-like sentences. These are graphical sentences constructed on tree-like digraphs.

Keywords:

Pólya Theory, Combinatorial Species, Digraphs, Tree-Like Sentences 1. Introduction

Figure 1 (left) shows a completely unlabelled1 connected digraph. We define a graphical sentence over a finite alphabet A by attaching a non-empty word over A to each arrow and each loop of a completely unlabelled connected digraph. Each word must be written according to the direction of its corresponding arrow or loop, from source to target. Figure 1 (middle) shows a graphical sentence over alphabet {A, C, G, T} and Figure 1 (right) shows another over alphabet .

Graphical sentences can be used to encode sets of ordinary sentences in a compact way: The readable sentences of a graphical sentence being the sentences corresponding to directed paths in its digraph. For example,

Figure 1. Unlabelled digraph and graphical sentences over alphabets {A, C, G, T} and .

TTT C GCCTG CAT CAT GCAATT, is a readable sentence arising from the graphical sentence of Figure 1 (middle).

In the present paper we focus our attention on the structure of graphical sentences as combinatorial objects using methods from the theory of combinatorial species   and classical Pólya theory  . We leave aside the generation of the readable sentences of a graphical sentence since this is easily done via the computation of powers of incidence matrices2. Of course, special sentences among the readable sentences can be selected by adding extra structure to graphical sentences (such as source points, sink points, STOP points, counters, extensions of the alphabet by adding special characters such as ò, !, ?, etc). We also leave aside this aspect in our analysis of graphical sentences.

Various descriptive parameters can be attached to each graphical sentence over a given alphabet A. For example, the graphical sentence of Figure 1 (middle) is made of 7 vertices, 11 arrows, 2 loops, 52 letters, letter A appears 13 times, letter C appears 9 times, letter G appears 10 times and letter T appears 20 times.

As usual in enumerative combinatorics, families of parameters associated to structures are conveniently encoded by weight-monomials.

Definition 1.1. The weight of a graphical sentence s over an alphabet A is the (commutative) formal monomial3 (1)

where is the number of occurrences of letter a in s.

In (1), each letter is reinterpreted as a formal variable. For example, the weight of the graphical sentence s of Figure 1 (middle) is given by (2)

Definition 1.2. Let be any class of totally unlabelled connected digraphs and . Let be the (countable) set of all graphical sentences over alphabet A arising from digraphs in , the word on each arrow or loop having a length . The inventory of is the formal sum of the weights of all graphical sentences in : (3)

As usual in enumeration problems, the (explicit or recursive) computation of an inventory of a class of structures provides a great deal of information about the structures to which it is associated. This information is extracted from the inventory through expansion, collection of terms, specialization/confluence of variables, algebraic/differential manipulations and coefficient extraction.

For example, in the present situation, expanding and collecting terms in (3) gives, of course, (4)

where the coefficient is the total number of graphical sentences having m vertices, n ar-

rows, p loops, a total number of q letters, of which are letter a, for each.

Assigning the value 1 to each letter a and collecting terms gives

(5)

where is the number of having m vertices, n arrows, p loops and q letters. Letting, gives

(6)

where is the number of made of q letters occurring with frequencies.

Letting and assigning the value 1 to each letter a gives

(7)

where is the number of made of q letters.

Let be the number of graphical sentences made of p words (i.e., p is the total number of arrows and loops in s) and q letters. Then

(8)

Moreover, if we let in (8) and if is a finite set, then

(9)

is a polynomial, where is the number of made of p words. Differentiation gives

(10)

Of course, a variety of other similar manipulations of the inventory are possible. In Sec-

tion 2 we apply methods from the theory of species and Pólya theory, to express inventories of general classes of graphical sentences in terms of cycle index series. Section 3 deals with specific classes of graphical sentences: linear sentences (corresponding to path-like digraphs) and general tree-like sentences (corresponding to classes of tree-like digraphs). We conclude (Section 4) by giving suggestions for possible extensions and generalizations of our results. Various explicit examples are given and to make the text easier to read, the proofs of the main results are collected in Section 5. In a previous paper,  , we studied the distribution of runs in arborescent words.

We assume that the reader is familiar with Pólya theory  and with the basic concepts of the theory of combinatorial species   .

2. Inventory of Graphical Sentences via Cycle Index Series

In order to give a rigorous meaning to the notion of a totally unlabelled digraph and to be able to take into account the possible symmetries within graphical sentences, we must recall first some definitions concerning labelled digraphs. A digraph on (or labelled by) a finite set V of vertices, a finite set of arrows, and a finite set of loops is an ordered pair of injections,

(11)

where and. Given any loop and any vertex, the equality means that is a loop at v in the digraph g. Given any arrow and any ordered pair of distinct vertices, the equality means that arrow is going from v to w in the digraph g. In other words

(12)

Figure 2(a) shows a digraph on the sets of vertices, of arrows and of loops, with loop at vertex 4 and loop at vertex 1. Let be a digraph on and be a digraph on. An isomorphism from digraph g to digraph, is an ordered triple, , of bijections, , , such that

(13)

(14)

If then is called an automorphism (or symmetry) of the digraph g. For example, the triple of permutations, defined (in cyclic notation) by

(15)

is an automorphism of the digraph of Figure 2(a). Note that labelled digraphs are elastic and not considered as embedded in the plane. Only the incidence relations between vertices, arrows and loops are taken into account. A totally unlabelled digraph (see Figure 2(b)), is simply an isomorphism class of labelled digraphs. The class of a labelled digraph g is denoted [g]; so that [g] is a totally unlabelled digraph with representative g.

We now give a rigorous definition of the notion of a graphical sentence.

Definition 2.1. Let A be a finite alphabet and, be the set of non-empty words over A. A graphical sentence over A is an equivalence class, s, of ordered triples, , where g is a connected labelled digraph on, and, are arbitrary functions assigning a non-empty word to each loop and each arrow of g. Two such triples and being equivalent if there exists an isomorphism such that and. We write to mean that s is a graphical sentence with representative.

4This means that is a class of connected digraphs on arbitrary finite sets, V of vertices, V1 of arrows and V0 of loops, which is closed under arbitrary isomorphisms.

5More precisely, X is the species of singleton vertices, Y is the species of singleton arrows and Z is the species of singleton loops.

Now take any species of connected labelled digraphs4. Our goal is to compute the inventory (3) of the class of all graphical sentences where. To emphasize the fact that digraphs are made of three sorts of elements, vertices, arrows and loops, the given species of digraphs can be written in the form, where X is the sort of vertices, Y is the sort of arrows, and Z is the sort of loops5. Any digraph is called a -structure for short.

Following standard notations from the theory of species, the set of all -structures on a set V of vertices, a set of arrows and a set of loops is denoted by (note the square brackets). Given bijections, , , we denote by

(16)

the bijection that transforms (or transports) each digraph into a corresponding isomorphic digraph, as described above. Note that if, and, then is a permutation of the set (i.e., a bijection of into itself).

Many power series can be associated to any species. An important one is the Pólya-Joyal cycle index series. In the context of a species of digraphs, this is a power series in a triple infinity of variables, , defined by

(17)

(18)

(a) (b)

Figure 2. (a) A labelled digraph g; (b) [g] = totally unlabelled g.

where, for each permutations, and, is the number6 of all digraphs, on the sets of vertices, of arrows and of loops, for which is an automorphism (i.e., is the cardinality (or total weight) of the set of fixed points of the permutation). For a permutation, the notation is used to denote the number of cycles7 of length i in the cyclic decomposition of.

6Or total weight, in the case of weighted digraphs.

7Not to be confused with which is the image of i under.

8In particular, the empty sequence, 0, has no component and satisfies 0  0.

The sequence of integers is called the cyclic type of and it is well known that the number of having cyclic type, with, is

(19)

Note that each sequence k has a finite number of nonzero terms and will be considered, in the present text, as a finite sequence with components. By this convention, k can be viewed as a partition of the integer, having parts of length i, for and we use the classical notation8

(20)

to express this fact. Note also that, in (18), only depends on the cyclic types of the three permutations. Hence, every three permutations having given cyclic types, , contribute to the same monomial

(21)

in (18). In order to eliminate this redundancy, we regroup monomials which correspond to each of these types, and taking (19) into account we obtain the following more compact variant expression for the cycle index series of:

(22)

in which each monomial appears only once and is the number (or total weight) of all -struc- tures for which any given of types is an automorphism.

The following proposition is a consequence of general principles from the theory of species and Pólya theory. It shows that the computation of is an essential step in the determination of the inventory series (3) of classes of graphical sentences arising from digraphs.

Proposition 2.1. Let be any species of connected digraphs, let, and let be the set of all graphical sentences over alphabet A arising from, the word on each arrow or loop having a length. Then, the inventory of is given by

(23)

where, , , and is the formal k-th power sum of the letters of the alphabet.

Proof. See Section 5.

Making use of the compact expression (22) for the cycle index series and collecting terms, inventory (23) can be rewritten in the following more explicit form.

Corollary 2.2. The inventory, , of the set of all graphical sentences arising from a species of digraphs is given by

(24)

where, , , and

(25)

Note. If in Proposition 2.1, then and no restrictions are put on the lengths of the words in inventory (23). If, then and the lengths of the words are all odd. If, then and the lengths of the words are bounded by N, etc.

3. Analysis of Classes of Tree-Like Sentences

9Where only the vertices are labelled.

10However, for the 3-sort species of all digraphs, ZDig can be computed explicitly (see Section 4).

As shown in the preceding section, the computation of the inventory of a class of graphical sentences can be reduced to the computation of the cycle index series provided that arises from a 3-sort species of connected digraphs. However, the explicit or recursive computation of the cycle index series of most species of graphical structures is a very difficult (or intractable) task. For example, even in the ordinary one-sort case9, the complete cycle index series of the species of all ordinary plane digraphs and all transitive digraphs are still unknown10.

For this reason, we focus our study on the following basic classes of graphical sentences:

1) Linear sentences (arising from the species of path-shaped digraphs).

2) General tree-like sentences (arising from various species of tree-like digraphs).

Note that linear sentences are special kinds of tree-like sentences. Due to their close relationship with ordinary sentences, we have chosen to present first a separate subsection devoted to their study. Our methods will use the fact that species of tree-like digraphs can be built from simpler species by making use of basic combinatorial operations and that cycle index series behave well with respect to these operations. For example, if F, G and H are species, then

(26)

where denotes the classical plethystic substitution of cycle index series (see  ).

3.1. Linear Sentences

We say that a digraph is path-shaped if its underlying simple graph is a simple path. A graphical sentence is linear if it comes from a path-shaped digraph. Figure 3 shows a path-shaped digraph, together with its underlying simple path and a linear sentence over alphabet. Note that a path-shaped digraph can have non-trivial automorphisms. For example, the 180˚ rotation, , where the cyclic decompositions of the permutations are given by

(27)

(28)

is an automorphism of the path-shaped digraph of Figure 3.

Special kinds of linear sentences over an alphabet include ordinary sentences (Figure 4 top), corresponding to directed paths without loops, and ordinary sentences with (possible) loops (Figure 4 bottom), corresponding to directed paths with (possible) loops.

For example, the sentence

(29)

is one of the readable sentences in Figure 4 bottom.

Proposition 3.1 Let be the set of all ordinary sentences, , the set of all ordinary sentences with loops, , the set of all linear sentences without loops, and, the set of all linear sentences with loops over an alphabet A and a set of allowed word-lengths. Then, the following inventories hold

(30)

(31)

(32)

where, ,.

Proof. See Section 5.

In view of (30)-(32), the alternate general inventory formula (24) in the case of any class of linear sentences, does not involve. We have the following explicit expansions.

Corollary 3.2. For the classes of linear sentences, we have

(33)

(34)

(35)

Figure 3. Path-shaped digraph, its underlying simple path and a linear sentence.

Figure 4. An ordinary sentence and an ordinary sentence with loops.

(36)

where

(37)

(38)

(39)

(40)

using the convention if and are not both integers.

Proof. (Sketch) Formulas (34) and (36) are immediate since no loops are involved. Careful computations, starting from (30) and (32) using geometric series, the binomial theorem, and manipulation or indices lead to expansions (33) and (35).

Sample of explicit examples of computations.

Example 3.1. General shape of the inventory of the class.

The first few terms of read as follows

(41)

The coefficients of in (41) are polynomials in and each having one or two terms, despite the fact that (35) suggests three terms. This is true for every m, n, p since and cannot be both non

zero, due to the fact that and cannot be both integral in (39) and (40).

Example 3.2. Counting linear sentences with given parameters.

Corollary 3.2 is particularly useful when one wants to compute an individual term in the inventory of linear sentences. For example, consider the 3-letter alphabet and take. This means that we impose no restrictions on the lengths of the words that are assigned to each arrow or loop in linear sentences. Suppose that we want to know the number of such linear sentences having vertices, arrows, loops which are made of 7 times the letter a, 6 times the letter b and 12 times the letter c. In this case, the coefficient of is given by

(42)

where

(43)

The required number of linear sentences equals the coefficient of in (42). Using Maple, this number is equal to

(44)

If we take, then the words that are assigned to each arrow or loop in linear sentences are of length 1 or 2. In this case, (43) is replaced by

(45)

and (44) goes down to 16882686796464720.

Example 3.3. Manipulating the inventory of linear sentences.

As we have seen in Section 1, manipulations of inventories (specialization of variables, expansions, etc) can be made to analyze various parameters in graphical sentences. Proposition 3.1 is generally more suitable for such manipulations than Corollary 3.2. For example, let be the number of letters in alphabet A, , and assign the value 1 to x, y and to each letter in given by (31). Then, , and, by (7)

(46)

where is the number of linear sentences without loops that are made of q letters. Note that (46) is a rational function of t, so that the sequence satisfies a linear recurrence with constant coefficients and the

asymptotic expansion of, as, can be established using standard classical methods. The first few terms in expansion (46) are given by

(47)

Example 3.4. Fibonacci numbers versus ordinary sentences with loops.

Consider now the class of ordinary sentences with possible loops. Let and assign the value 1 to x and to each letter in given by (30). Then we have

(48)

11, for and.

where are the Fibonacci numbers11 and is the number of ordinary sentences with possible loops made of p words and q letters. Now, fix and consider the finite class of all made of q letters. Since, , then collecting the coefficient of in (48) we have the polynomial inventory

(49)

Finally, making use of (10) and invoking Binet’s formula

(50)

where is the golden number, , we find that the expected number of words in a random ordinary sentence with possible loops made of letters is

(51)

The reader can check that if we do not allow loops, then (51) is replaced by.

A multitude of other similar examples can be obtained using Proposition 3.1 and Corollary 3.2.

3.2. General Tree-Like Sentences

We say that a digraph is tree-like if its underlying simple graph is a simple tree or a simple rooted tree. A graphical sentence is tree-like if it comes from a tree-like digraph. Figure 5 shows a tree-like digraph, its underlying simple tree and a tree-like sentence over alphabet.

The tree-like structures of Figure 5 are free in the sense that they are not restricted to be embedded in the plane and no other constraints are assumed on the vertices, arrows and loops. More generally, by allowing such constraints, one can consider, for example, the above linear sentences (see Figure 3 and Figure 4), one way free binary rooted tree sentences (see Figure 6 left), one way free full binary rooted tree sentences (see Figure 6 right), plane tree sentences (see Figure 5 right) where, this time, the underlying tree is considered as being embedded in the plane), etc. We shall deal with these cases in a uniform manner by adding extra structure on the underlying trees or rooted trees. More precisely, the underlying trees or rooted trees will be enriched according to the following definition.

Definition 3.1.  Let be any given one-sort species.

1) A R-enriched rooted tree is a rooted tree in which the set of immediate descendants (away from the root) of every vertex is equipped with a R-structure (see Figure 7 left, in which each dotted arc represents a R-structure).

Figure 5. A tree-like digraph, its underlying simple tree, a tree-like sentence over A = {0, 1}.

Figure 6. A one way free binary rooted tree sentence and a full one.

Figure 7. A R-enriched rooted tree and a R-enriched tree.

2) A R-enriched tree is a tree in which the set of immediate neighbors of each vertex is equipped with a R-structure (see Figure 7 right, in which each dotted circle represents a R-structure).

Lemma 3.3  The species of R-enriched rooted trees is characterized recursively by the combinatorial equation

(52)

and the species of R-enriched trees satisfies the combinatorial equality12

(53)

where is the species of R'-enriched rooted trees (R' being the combinatorial derivative of the species R).

It is easy to see that the species of ordinary rooted trees (resp. ordinary trees) corresponds to the species AR (resp. aR) with the choice, the species of all finite sets. The species of binary rooted trees (resp. full binary rooted trees) corresponds to the species AR with the choice (resp.), where 1 denotes, as usual, the species of the empty set and E2, the species of 2-element sets. The species of all plane trees corresponds to the species aR with the choice, where C is the species of cyclic permutations (see Example 3.8 below), etc.

For the computation of the inventories of various classes of enriched tree-like graphical sentences, we will make use of the following 3-sort extension of Lemma 3.3 which includes a new extension, (56) below, of the dissymmetry formula (53).

Lemma 3.4. The species of one-way R-enriched rooted trees and of R-enriched rooted trees on the sorts X of vertices, Y of arrows and Z of loops are characterized recursively by the combinatorial equations

(54)

where. They can also be expressed explicitly in terms of the 1-sort species as follows

(55)

The species of R-enriched trees on sorts X of vertices, Y of arrows and Z of loops satisfies the combinatorial equality (extended dissymmetry formula)

(56)

where is the species of R'-enriched rooted trees on sorts X, Y, Z.

Proof. See Section 5.

In our analysis of tree-like graphical sentences, we will use of the following useful compact “plethystic notation” which is classical in the theory of species and cycle index series.

Notation 3.5. Let be a (formal) power series in the variables. For any integer, denotes the series S in which each variable is raised to the power k:

(57)

In particular,. Furthermore, given power series,

(58)

(59)

then, , etc, denote the series

(60)

in the variables.

For example, taking the variables for the’s, then formula (23) of Proposition 2.1 takes the compact form

(61)

where, , and is the formal sum of the letters in A.

We now describe how to compute the inventory of classes of tree-like sentences.

Proposition 3.6 Given an arbitrary species, let

(62)

(63)

(64)

over an alphabet A and a set of allowed word-lengths. Then, using Notation 3.5, the inventories and can be computed recursively as follows

(65)

(66)

They can also be expressed explicitly in terms of the cycle index series of the 1-sort species by

(67)

(68)

where, ,. Moreover, let R' be the combinatorial derivative of the species R. Then, the inventory has the form

(69)

In the case of the corresponding sets, , , , in which no loops are allowed, we have

(70)

Proof. (sketch) Apply Proposition 2.1, taking into account Lemma 3.4.

Note. When written explicitly, (65) takes the form

(71)

where, , ×××, ,

. Formulas (65) and (66) give rise to iterative schemes for the computation of the inven- tories,. See, for example,  - for descriptions of efficient ways to do such computations, including adaptations of quadratically convergent Newtonian methods. Formula (69) reduces the computation of to that of. Recall that.

Sample of explicit examples of computations.

Example 3.5. One way free binary rooted tree sentences.

Consider the set of one way free binary rooted tree sentences without loops (Figure 6 left, shows such a tree-like sentence over the 26-letter alphabet) and the set of such sentences where loops are allowed. These tree-like sentences correspond to R-enriched rooted trees with. Since , formula (65) of Proposition 3.6 immediately gives the following recursive scheme for the computation of the inventory

(72)

(73)

Of course, as many terms as we want in (72) and (73) can be computed using a computer algebra system. For a more specific application, let be the number of letters in alphabet A and let and as in (7). Then, and (72), (73) give, after some symbolic manipulation,

(74)

(75)

(76)

(77)

The coefficient of in (75) (resp. (77)) is the number of one way free binary rooted tree sentences with loops (resp. without loops) on a -letter alphabet A that are made of q letters. As an illustration, for the usual 26-letter alphabet, series (75) and (77) read as follows up to:

(78)

(79)

Example 3.6. Ordinary tree and rooted tree sentences.

Let, be the species of finite sets. Then, by Definition 3.1, the species of E-enriched rooted trees coincides with the species of ordinary (free) rooted trees and the species of E-enriched trees coincides with the species of ordinary (free) trees. Lemma 3.3 produces the familiar combinatorial equations,

(80)

the second equation being the classical dissymmetry formula of Leroux. Taking cycle index series in (80) and using and, we obtain the classical formulas

(81)

(82)

(83)

(84)

from which and can be computed to arbitrary degree13.

Now, let

(85)

(86)

(87)

Then, by (67)-(69),

(88)

(89)

(90)

For more specific applications, let be the number of letters in the alphabet and consider the specializations and. Then, , and by (66) and (69) with, we have

(91)

(92)

(93)

(94)

The coefficient of in (92) (resp. (94)) is the number of free rooted tree sentences (resp. free tree sentences) with loops on a -letter alphabet A that are made of q letters. For example, in the case of a 4-letter alphabet, say, series (92) and (94) read as follows up to:

(95)

(96)

Consider now the 2-letter alphabet, and make the substitutions, , , , then, no loops are allowed, and (89)-(90) take the forms

(97)

(98)

where (resp.) is the number of rooted tree sentences (respected tree sentences) without loops that contain exactly i times the letter a and j times the letter b.

If we choose (only words of length 1 or 2 are allowed), then, and (89)-(90) begin with the terms

(99)

(100)

etc. Again, all the above series, and many variants, can be expanded to arbitrary orders.

Example 3.7. Back to linear sentences.

Since linear sentences are special kinds of tree-like sentences, it is interesting to look at the dissymmetry formula (56) in the context of path-shaped graphs. Take the 1-sort species. Then a R-structure is either void, a singleton, or an unordered pair of singletons. This means that a R-enriched tree is a simple path (see Definition 3.1, Figure 7 right and Figure 3 middle). Hence, the 2-sort species of all path-shaped digraphs without loops (see Figure 9) coincides with the 2-sort species of

R-enriched trees. Moreover, since, a R'-enriched rooted tree is a simple path pointed at an extremity (see Definition 3.1 and Figure 7, left). Hence, the 2-sort species of all path-shaped digraphs without loops pointed at an extremity (see Figure 9) coincides with the 2-sort species of R'-enriched rooted trees. In this setting, the dissymmetry formula (56), with, becomes

(101)

where Now, using, we can solve (101) for P as follows,

(102)

This formula coincides with formula (124) which is used in the proof of Proposition 3.1.

Example 3.8. Plane tree sentences.

A plane tree is a (unrooted) tree that is embedded in a plane. Such tree-structures have fewer automorphisms than free trees. Take any vertex p of a plane tree and draw a vector starting at p which is perpendicular to the plane in which is embedded. This gives an orientation to that plane and the vertices that are adjacent to p are cyclically turning around p according to that orientation (see Figure 7). In other words, the set of immediate neighbors of p is equipped with a -structure, where C is the species of non-empty oriented cycles (the empty set species, 1, corresponds to the special case where the tree is reduced to one point, , for which the the set of immediate neighbors of p is empty). Since p is arbitrary, this shows that the species of plane trees coincides with the species of -enriched trees.

Take, then is the species of linear orders. The species of L- enriched rooted trees coincides with the species of linearly ordered rooted trees (the set of immediate descendants, away from the root, of every vertex is linearly ordered). Using the classical formulas,

(where denotes Euler function) and in Lemma 3.3, we have

(103)

(104)

Using the expansion14, we get, from (68) and (69), the following

explicit expressions for the inventory (resp.,) of linearly ordered rooted tree sentences with loops (resp., plane tree sentences with loops)

(105)

(106)

(107)

where

(108)

Note that since linearly ordered rooted tree structures are asymmetric structures, no except appear in (105). As before, let be the number of letters in alphabet A and let and. Then

(109)

(110)

This time, the coefficient of in (109) (resp. (110)) is the number of linearly ordered rooted tree sentences (resp. plane tree sentences) with loops on a -letter alphabet A that are made of q letters.

As a final illustration, fix and consider the inventory of linearly ordered rooted tree sentences without loops having exactly m vertices. Letting in (105) and (108), we have

(111)

Now, let be the number of letters in the alphabet and assume that the length of the word on each arrow is at most k. Then, making the substitutions, , , , we get

(112)

which is a polynomial in t, since has k terms. Differentiation gives

(113)

where is the expected total number of letters in random m-vertex linearly ordered rooted tree sentence without loops on a -letter alphabet in which the word on each arrow has at most k letters. Further computations give

(114)

Again, all the above inventories can be manipulated in a great number of ways.

4. Concluding Remarks

It would be interesting to extend the above analysis to other classes of graphical sentences arising from other families of 3-sort species of connected digraphs. As said before, this is generally a very difficult task. However, the analysis can be done, for example, for the class of cyclic graphical sentences (for which the underlying simple graphs are unoriented cycles) by making use of (3-sort) cycle index series related to subgroups of the dihedral groups. The analysis can also be done for the whole class of all graphical sentences since the cycle index series of the 3-sort species, , of all digraphs (with labelled vertices of sort X, arrows of sort Y and loops of sort Z) turns out to be tractable15. In fact,

(115)

15The cycle index series of the 1-sort species, , of ordinary digraphs (labelled vertices only) is well known: , where = greatest common divisor of i and j (see  , for example).

where is the 2-sort species of all digraphs with vertices of sort X, arrows of sort Y and no loop, which, in the spirit of   , can be expressed in terms of simpler species by making use of a 2-sort version of the more advanced operation, , called functorial composition of species.

Another direction of investigation would be to replace digraphs by dimultigraphs (directed multigraphs) and study associated inventories of classes of multigraphical sentences. For example, in the case of the 3-sort (resp. 2-sort) species (resp.) of all dimultigraphs, equation (115) must be replaced by and operation can be used.

5. Proofs of the Main Results

Proof of Proposition 2.1. First, consider the weighted species of all -struc- tures in which each vertex is given a weight x, each arrow a weight y and each loop a weight z. Then

(116)

Next, let be the species of all k-lists of arrows, where. Figure 8(a) shows an unla-

belled -structure with. Now, define the 2-sort species, by substituting for Y

and for Z in:

(117)

Figure 8(b) shows an unlabelled -structure of weight on a set of vertices and arrows.

Taking the cycle index series of (117) we get

(118)

Next, assigning a weight, , to every arrow of every -structure, gives the species

(119)

whose cycle index, , is obtained by the substitutions, , in (118). Note that an unlabelled -structure can be canonically identified with a graphical sentence. So that (23) follows by the substitutions, in that unlabels the arrows and vertices in -structures.

Proof of Proposition 3.1. The inventories (30) of the two special kinds of linear graphical sentences, and, are very easy to compute since directed paths are sequences of very simple structures with trivial automorphism groups. More precisely, let be the species of dipaths without loops and be the species of dipaths with possible loops. Since XY-structures are of the form and ZXY-structures are of the

form, we have

(120)

(121)

and, since, Proposition 2.1 immediately gives (30).

On the other hand, the inventories of the sets and of linear graphical sentences are more difficult to compute since a path-shaped digraph can have a nontrivial automorphism (as we saw above). We first analyze the species of all path-shaped digraphs without loops (see Figure 9 top). Introduce the auxiliary species. An -structure is of the form

(122)

Let be the species of all P-structures pointed at an extremity (see Figure 9 bottom).

This pointing induces a global orientation to these pointed structures (see dotted arrow) and implies that the species K is a species of sequences:

(123)

As a consequence of the general dissymmetry formula (56) the species P can be expressed in terms of K and as follows (see details in Example 3.7)

(124)

where denotes the species of 2-element sets. Formula (31) then follows from Proposition 2.1 by taking the cycle index series of (124) and using the fact that. Finally, let be the species of all path-shaped digraphs possible with loops, then the following combinatorial equation holds

(125)

since every -structure is obtained from a P-structure by adding a loop to each vertex (that is,) or doing nothing to the vertex (that is,). So that (32) follows by substituting for X in (124). The computations are elementary but long and are left to the reader.

Proof of Lemma 3.4. Consider an -structure and look at its underlying R-enriched rooted tree t (see Figure 7 left). To reconstruct from t, one must replace the root of t by adding a possible loop (that is by replacing the root by an -structure) and by replacing each edge adjacent to the root of t by an (outward)

(a) (b)

Figure 8. (a) Unlabelled -structure, (b) Unlabelled - structure (weight:).

Figure 9. A -structure and a -structure.

arrow (that is, by replacing each such edge by an Y-structure). This establishes (54a). The proof of the combinatorial Equation (54b) is similar, where, this time, each edge adjacent to the root of t is replaced by an outward arrow, an inward arrow or a double arrow (that is, by replacing each such edge by a -structure). To obtain the explicit formula (55a), multiply first both sides of (54a) by Y. This gives

(126)

But, by (52), the species also satisfies (126). Hence, by the unicity of solution in the implicit species Theorem of Joyal  , we must have, and (55a) follows by factoring out Y. A similar argumentation can be used to prove (55b) from (54b).

The dissymmetry formula (56) is much more difficult to establish since more automorphisms are involved in enriched trees. To prove this combinatorial equality, we express in two ways the auxiliary species of -structures which are pointed either at a single vertex or at two adjacent vertices:

(127)

・ The first expression for reads as follows

(128)

To prove it, consider a -structure and look at its underlying pointed or bipointed R-enriched tree, f. We have two cases to consider:

1) If is a -structure, then by Figure 10 we see that is canonically equivalent to a

-structure since to recover from f, the vertices of f must be replaced by -struc- tures and the edge adjacent to the pointed vertex (and subsequently, all other edges) must be replaced by an Ω-structure.

2) If is a -structure, then Figure 11 shows that is canonically equivalent to a

-structure since to recover from f, the edge of f between the two adjacent pointed

vertices must be replaced either by a double arrow (i.e., is equivalent to a -structure) or by a single arrow (i.e., is equivalent to a -structure). This establishes (128).

・ The second expression for reads as follows

(129)

To prove it, we first split the species into two subspecies according to whether the pointing(s) coincides exactly with the center or not:

(130)

Since the center of a tree is a canonical object, pointing a tree exactly at its center is naturally equivalent to doing nothing to the tree and we have

(131)

(132)

Now, consider a -structure that is not pointed at its center. We have two cases to consider:

1) If is an -structure, then Figure 12 shows that is canonically equivalent to a

-structure. Indeed, the pointing induces an orientation on the edge from the pointed vertex of f in the direction of the center (see dotted arrow) giving rise to an ordered pair of rooted trees. Moreover, to

Figure 10. Underlying structures for.

Figure 11. Underlying structures for.

Figure 12. Underlying structures for.

recover from f, that edge must be replaced by an arrow going in same direction, or in the opposite direction of the dotted arrow, or by a double arrow. That is, the edge must be replaced by a Ω-structure. Note that is not an arbitrary -structure, since the global center is on the side pointed by the dotted arrow.

2) If is an -structure, then Figure 13 shows that is canonically equivalent to a

-structure. Indeed, the bi-pointing induces an orientation on the edge between the pointed vertices of f in the direction opposite to the center (see dotted arrow) giving rise to an ordered pair of rooted trees. To recover from f, that edge must be replaced, as above, by an Ω-structure. Note that is not

Figure 13. Underlying structures for.

an arbitrary -structure, since the global center is now on the side of the source of the dotted arrow.

This establishes (129) since any -structure is either of the form or of the form. The general 3-sort dissymmetry formula (56) follows by cancelleing the common term in the right-hand- sides of the expressions (128) and (129) for.

Cite this paper

GilbertLabelle,LouiseLaforest, (2015) A Combinatorial Analysis of Tree-Like Sentences. Open Journal of Discrete Mathematics,05,32-53. doi: 10.4236/ojdm.2015.53004

References

1. 1. Bergeron, F., Labelle, G. and Leroux, P. (1998) Combinatorial Species and Tree-Like Structures (Encyclopedia of Mathematics and Its Applications). Cambridge University Press, Cambridge.

2. 2. Labelle, G. (1992) Counting Asymmetric Enriched Trees. Journal of Symbolic Computation, 14, 211-242.
http://dx.doi.org/10.1016/0747-7171(92)90037-5

3. 3. Pólya, G. and Read, R.C. (1987) Combinatorial Enumeration of Groups, Graphs and Chemical Compounds. Springer-Verlag, Berlin, Heidelberg, and New-York.

4. 4. Décoste, H., Labelle, G. and Leroux, P. (1982) Une approche combinatoire pour l’itération de Newton-Raphson. Advances in Applied Mathematics, 3, 407-416.
http://dx.doi.org/10.1016/S0196-8858(82)80013-4

5. 5. Joyal. A. (1981) Une théorie combinatoire des séries formelles. Advances in Mathematics, 42, 1-82.
http://dx.doi.org/10.1016/0001-8708(81)90052-9

6. 6. Labbé, J.-P. and Labelle, G. (2013) Counting Types of Runs in Classes of Arborescent Words. Open Journal of Discrete Mathematics, 3, 7-15.
http://dx.doi.org/10.4236/ojdm.2013.31002

7. 7. Leroux, P. and Miloudi, B. (1992) Généralisations de la formule d’Otter. Annales des Sciences Mathématiques du Québec, 16, 53-80.

8. 8. Labelle, G. (1986) Some New Computational Methods in the Theory of Species. In: Labelle, G. and Leroux, P., Eds., Combinatoire Enumérative, Lecture Notes in Mathematics 1234, Springer Berlin Heidelberg, Berlin, 192-209.
http://dx.doi.org/10.1007/bfb0072517

9. 9. Labelle, G. (1981) Une nouvelle démonstration combinatoire des formules d’inversion de Lagrange. Advances in Mathematics, 42, 217-247.
http://dx.doi.org/10.1016/0001-8708(81)90041-4

10. 10. Pivoteau, C., Salvy, B. and Soria, M. (2012) Algorithms for Combinatorial Structures: Well-Founded Systems and Newton Iterations. Journal of Combinatorial Theory, Series A, 119, 1711-1773.
http://dx.doi.org/10.1016/j.jcta.2012.05.007

11. 11. Décoste, H., Labelle, G. and Leroux, P. (1992) The Functorial Composition of Species, a Forgotten Operation. Discrete Mathematics, 99, 31-48.
http://dx.doi.org/10.1016/0012-365X(92)90363-K

NOTES

1In the sense that both its vertices, arrows and loops are unlabelled.

2Entry (i, j) being defined as the word on the arrow or loop from vertex i to vertex j, etc.

3Variables x, y, z and t are taken distinct from the letters in alphabet A.

12Equality (53) is called the dissymmetry formula for trees. It is due P. Leroux  for R = E and to the first author  for general R.

13Explicit expressions for the individual coefficients of ZA and of Za have also been obtained, see  .

14Which can be proved by taking the derivative of both sides.