Tuesday, February 25, 2014

Categories and Subcategories

The adjacency model

The fundamental structure of the adjacency model is a one-to-many relationship between a parent entry and its child entries. As with any one-to-many relationship, the child entries carry a foreign key to their parent. What makes the adjacency model different is that the parent and child entries are both stored in the same table.

create table categories
( id       integer     not null  primary key 
, name     varchar(37) not null
, parentid integer     null
, foreign key parentid_fk (parentid) 
      references categories (id)
);
 
Here's some sample data that might populate this table. We should be able to get an idea of the parent-child relationships (if not grasp the entire hierarchy) just by looking at it:

id  name       parentid
1   animal     NULL
2   vegetable  NULL
3   mineral    NULL
4   doggie     1
5   kittie     1
6   horsie     1
7   gerbil     1
8   birdie     1
9   carrot     2
10  tomato     2
11  potato     2
12  celery     2
13  rutabaga   2
14  quartz     3
15  feldspar   3
16  silica     3
17  gypsum     3
18  hunting    4
19  companion  4
20  herding    4
21  setter     18
22  pointer    18
23  terrier    18
24  poodle     19
25  chihuahua  19
26  shepherd   20
27  collie     20
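As a rough illustration of how the parentid foreign key ties each child row to its parent in the same table, here is a minimal sketch using Python's built-in sqlite3 module. The use of SQLite, the helper name children_of, the subset of rows loaded, and the "orphan" row with id 99 are all assumptions made for this example; the foreign key clause is also adapted to SQLite's syntax, which has no place for the parentid_fk name in that position.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("pragma foreign_keys = on")
conn.execute(
    "create table categories ("
    " id integer not null primary key,"
    " name varchar(37) not null,"
    " parentid integer null,"
    " foreign key (parentid) references categories (id))"
)

# A subset of the sample data; parents are inserted before their children.
conn.executemany(
    "insert into categories values (?, ?, ?)",
    [
        (1, "animal", None),
        (4, "doggie", 1),
        (5, "kittie", 1),
        (18, "hunting", 4),
        (19, "companion", 4),
        (20, "herding", 4),
    ],
)


def children_of(conn, name):
    """Return the names of the direct children of the named category."""
    rows = conn.execute(
        "select child.name"
        "  from categories as parent"
        "  join categories as child"
        "    on child.parentid = parent.id"
        " where parent.name = ?"
        " order by child.name",
        (name,),
    )
    return [row[0] for row in rows]


# A row whose parentid points at a nonexistent id violates the foreign key.
try:
    conn.execute("insert into categories values (99, 'orphan', 500)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(children_of(conn, "doggie"))
print(orphan_rejected)
```

The self-join in children_of is the same one-to-many lookup used for any parent-child pair; the only unusual part is that both sides of the join are the categories table.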

Displaying all categories and subcategories: site maps and navigation bars

To display the hierarchy, we must first retrieve it. The following method uses as many LEFT OUTER JOINs as necessary to cover the depth of the deepest tree. For our sample data, the deepest tree has four levels, so the query references the categories table four times, connected by three LEFT OUTER JOINs. Each join goes "down" a level from the node above it. The query begins at the root nodes.

select root.name  as root_name
     , down1.name as down1_name
     , down2.name as down2_name
     , down3.name as down3_name
  from categories as root
left outer
  join categories as down1
    on down1.parentid = root.id
left outer
  join categories as down2
    on down2.parentid = down1.id
left outer
  join categories as down3
    on down3.parentid = down2.id
 where root.parentid is null
order 
    by root_name 
     , down1_name 
     , down2_name 
     , down3_name
 
Notice how the WHERE clause ensures that only paths from the root nodes are followed. This query produces the following result set:

root_name  down1_name  down2_name  down3_name
animal     birdie      NULL        NULL
animal     doggie      companion   chihuahua
animal     doggie      companion   poodle
animal     doggie      herding     collie
animal     doggie      herding     shepherd
animal     doggie      hunting     pointer
animal     doggie      hunting     setter
animal     doggie      hunting     terrier
animal     gerbil      NULL        NULL
animal     horsie      NULL        NULL
animal     kittie      NULL        NULL
mineral    feldspar    NULL        NULL
mineral    gypsum      NULL        NULL
mineral    quartz      NULL        NULL
mineral    silica      NULL        NULL
vegetable  carrot      NULL        NULL
vegetable  celery      NULL        NULL
vegetable  potato      NULL        NULL
vegetable  rutabaga    NULL        NULL
vegetable  tomato      NULL        NULL

Each row in the result set represents a distinct path from a root node to a leaf node. Notice how the LEFT OUTER JOIN, when extended "below" the leaf node in any given path, returns NULL (representing the fact that there is no node below that node, i.e. no row satisfying that join condition).

As we can see, this result set contains all our original categories and subcategories. If the categories and subcategories are being displayed on a web site, this query can therefore be used to generate the complete site map. An abbreviated query that goes down only a certain number of levels from the roots, regardless of whether there are nodes at deeper levels, can be used for the site's navigation bar.

We can display this sample data using nested unordered lists like this:
  • animal
    • birdie
    • doggie
      • companion
        • chihuahua
        • poodle
      • herding
        • collie
        • shepherd
      • hunting
        • pointer
        • setter
        • terrier
    • gerbil
    • horsie
    • kittie
  • mineral
    • feldspar
    • gypsum
    • quartz
    • silica
  • vegetable
    • carrot
    • celery
    • potato
    • rutabaga
    • tomato
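One way the nested list above might be produced from the query's result set is sketched below in Python with the built-in sqlite3 module. The function names load_sample, sitemap_query and render_list are hypothetical, as is the idea of generating the self-joins from a levels parameter; passing 4 reproduces the site map query, and a smaller number gives the abbreviated navigation bar query. The sketch relies on the rows arriving sorted, and it compares names only, so it assumes sibling categories have distinct names.

```python
import sqlite3

# Sample rows from the article: (id, name, parentid).
SAMPLE = [
    (1, "animal", None), (2, "vegetable", None), (3, "mineral", None),
    (4, "doggie", 1), (5, "kittie", 1), (6, "horsie", 1),
    (7, "gerbil", 1), (8, "birdie", 1),
    (9, "carrot", 2), (10, "tomato", 2), (11, "potato", 2),
    (12, "celery", 2), (13, "rutabaga", 2),
    (14, "quartz", 3), (15, "feldspar", 3), (16, "silica", 3),
    (17, "gypsum", 3),
    (18, "hunting", 4), (19, "companion", 4), (20, "herding", 4),
    (21, "setter", 18), (22, "pointer", 18), (23, "terrier", 18),
    (24, "poodle", 19), (25, "chihuahua", 19),
    (26, "shepherd", 20), (27, "collie", 20),
]


def load_sample():
    """Create an in-memory SQLite database holding the sample categories."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table categories ("
        " id integer not null primary key,"
        " name varchar(37) not null,"
        " parentid integer null,"
        " foreign key (parentid) references categories (id))"
    )
    conn.executemany("insert into categories values (?, ?, ?)", SAMPLE)
    return conn


def sitemap_query(levels):
    """Build the root-down query covering `levels` levels of the tree."""
    columns = ["root.name as root_name"]
    joins = []
    parent = "root"
    for i in range(1, levels):
        child = f"down{i}"
        columns.append(f"{child}.name as {child}_name")
        joins.append(
            f"left outer join categories as {child}"
            f" on {child}.parentid = {parent}.id"
        )
        parent = child
    # Sort by every output column, left to right, using column positions.
    order_by = ", ".join(str(n) for n in range(1, levels + 1))
    return (
        f"select {', '.join(columns)} from categories as root "
        f"{' '.join(joins)} where root.parentid is null order by {order_by}"
    )


def render_list(rows):
    """Turn sorted path rows into indented bullet lines, one per node."""
    lines = []
    previous = ()
    for row in rows:
        path = tuple(name for name in row if name is not None)
        # Skip the leading nodes this path shares with the previous row.
        shared = 0
        while (shared < min(len(path), len(previous))
               and path[shared] == previous[shared]):
            shared += 1
        for depth in range(shared, len(path)):
            lines.append("  " * (depth + 1) + "• " + path[depth])
        previous = path
    return lines


conn = load_sample()
site_map = render_list(conn.execute(sitemap_query(4)).fetchall())
nav_bar = render_list(conn.execute(sitemap_query(2)).fetchall())
print("\n".join(site_map))
```

Because each row is a complete path, a node shared by several paths (such as doggie) is printed only the first time it appears; later rows contribute only the part of the path that differs from the row before.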

The path to the root: the breadcrumb trail

Retrieving the path from any given node, whether it is a leaf node or not, to the root at the top of its path, is very similar to the site map query. Again, we use LEFT OUTER JOINs, but this time we go "up" the tree from the node, rather than "down."

select node.name as node_name
     , up1.name as up1_name 
     , up2.name as up2_name 
     , up3.name as up3_name 
  from categories as node
left outer 
  join categories as up1 
    on up1.id = node.parentid  
left outer 
  join categories as up2
    on up2.id = up1.parentid  
left outer 
  join categories as up3
    on up3.id = up2.parentid
order
    by node_name

Here's the result set from this query:

node_name  up1_name   up2_name  up3_name
animal     NULL       NULL      NULL
birdie     animal     NULL      NULL
carrot     vegetable  NULL      NULL
celery     vegetable  NULL      NULL
chihuahua  companion  doggie    animal
collie     herding    doggie    animal
companion  doggie     animal    NULL
doggie     animal     NULL      NULL
feldspar   mineral    NULL      NULL
gerbil     animal     NULL      NULL
gypsum     mineral    NULL      NULL
herding    doggie     animal    NULL
horsie     animal     NULL      NULL
hunting    doggie     animal    NULL
kittie     animal     NULL      NULL
mineral    NULL       NULL      NULL
pointer    hunting    doggie    animal
poodle     companion  doggie    animal
potato     vegetable  NULL      NULL
quartz     mineral    NULL      NULL
rutabaga   vegetable  NULL      NULL
setter     hunting    doggie    animal
shepherd   herding    doggie    animal
silica     mineral    NULL      NULL
terrier    hunting    doggie    animal
tomato     vegetable  NULL      NULL
vegetable  NULL       NULL      NULL

Here each row in the result set is a single path, one for every node in the table. On a web site, such a path is often called a breadcrumb trail. (This name is somewhat misleading, because it suggests that it might represent how the visitor arrived at the page, which is not always the case. The accepted meaning of breadcrumb is simply the path from the root.)

In practice, the query would have a WHERE clause specifying a single node. The result set above, produced without one, is in effect every breadcrumb in the table.

To display a breadcrumb trail in the normal fashion, from root to node, just display the result set columns in reverse order, and ignore the nulls. For example, let's say we run the above query for the category "companion" and get this:

node_name  up1_name  up2_name  up3_name
companion  doggie    animal    NULL

The breadcrumb would look like this:

animal > doggie > companion

Simple, eh?
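The single-node case might be sketched as follows, again in Python with the built-in sqlite3 module. The breadcrumb function, the WHERE clause on node.name, the " > " separator and the handful of rows loaded are assumptions for illustration; a real site would more likely look the node up by id, since names need not be unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table categories ("
    " id integer not null primary key,"
    " name varchar(37) not null,"
    " parentid integer null,"
    " foreign key (parentid) references categories (id))"
)
# Just enough of the sample data to cover one four-level path.
conn.executemany(
    "insert into categories values (?, ?, ?)",
    [(1, "animal", None), (2, "vegetable", None), (4, "doggie", 1),
     (19, "companion", 4), (25, "chihuahua", 19)],
)

# The "up" query from the article, restricted to a single node.
BREADCRUMB_SQL = """
select node.name as node_name
     , up1.name  as up1_name
     , up2.name  as up2_name
     , up3.name  as up3_name
  from categories as node
left outer
  join categories as up1
    on up1.id = node.parentid
left outer
  join categories as up2
    on up2.id = up1.parentid
left outer
  join categories as up3
    on up3.id = up2.parentid
 where node.name = ?
"""


def breadcrumb(conn, name, separator=" > "):
    """Return the root-to-node trail for `name`, or None if it is unknown."""
    row = conn.execute(BREADCRUMB_SQL, (name,)).fetchone()
    if row is None:
        return None
    # Columns run node, up1, up2, up3: reverse them and drop the NULLs.
    return separator.join(n for n in reversed(row) if n is not None)


print(breadcrumb(conn, "companion"))
```

As with the site map query, the number of joins fixes the maximum depth this can handle: a node more than three levels below its root would get a trail that stops short of the root.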

Resources

Listamatic: one list, many options
    The power of CSS when applied to the lowly UL.

Trees in SQL by Joe Celko
    The nested set model, alternative to the adjacency list model.

Storing Hierarchical Data in a Database
    Modified Preorder Tree Traversal method.

Relational Integrity
    Primary and foreign keys and stuff like that.
