IWETHEY v. 0.3.0 | TODO
1,095 registered users | 0 active users | 0 LpH | Statistics
Login | Create New User
IWETHEY Banner

Welcome to IWETHEY!

New How to store a tree structure in a relational db?
We have a tree data structure in Java which is currently stored in LDAP (precisely because an LDAP directory is a tree). We would like to use a relational DB for the datastore. Anyone have any good ideas on how to store a tree in a relational db in a way that will make it easy to get at the data? The tree consists of containers which can be nested, which contain other objects.
New Ask Bryce!
He seems to know exactly how to do this!
--
[link|mailto:greg@gregfolkert.net|greg],
[link|http://www.iwethey.org/ed_curry|REMEMBER ED CURRY!] @ iwethey

I've decided to become a perfectionist.
That way I'll have more reasons to hate people.
Your recycled electrons annoy me. Please use new electrons.
New wiki links
For nodes being in the same table, the simplest approach is to have a ParentID that references the parent node's key. Compound keys require more "toParent" columns, but think about using a generated key instead perhaps. Use zero or null for the top-most. That is all that is needed to encode a tree, but there are others ways some claim are faster. Anyhow, here are some wiki topics:

[link|http://www.c2.com/cgi/wiki?TreeInSql|http://www.c2.com/cgi/wiki?TreeInSql]

[link|http://www.c2.com/cgi/wiki?RelationalAndTrees|http://www.c2.com/cg...elationalAndTrees]
________________
oop.ismad.com
Expand Edited by tablizer June 3, 2004, 11:57:01 PM EDT
New Which database engine?
And what are the characteristics of the data? More reads than writes?
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
New Re: Which database engine?
A number, for now both MS-SQL and Oracle 9i. The data is configuration data that we display and that the user can change. There are probably more reads then writes, but writes definately need to be fast.
New CONNECT BY is out then. :-)
CONNECT BY is out if you have to support Transact-SQL.

You could use nested set trees, which for small hierarchies are probably fast enough on inserts, or simple parentID columns and some code, which can be considerably slower for reading the entire tree. A third method, encoding the path to the node with the node itself, limits the depth you can represent (not an issue if you have shallow trees) and requires string compares for lookups.

With a nested set tree, simple updates will be just as fast as any keyed update would be. Only inserts, deletes, and rare operations like moves are slower, and then only when you are manipulating the left side of the tree. I use a hybrid here, as finding the immediate children/ancestor only of a node can be problematic with a pure nested set.

If you want some example code, I have both Transact-SQL and plpgsql (Postgres, similar to PL/SQL).
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
New why not store it in a document
it seems that ldap would be a natural solution because it is a database designed for such uses.
thanx,
bill
Time for Lord Stanley to get a Tan
questions, help? [link|mailto:pappas@catholic.org|email pappas at catholic.org]
New We want to move away from LDAP ...
for a number of reasons:

  1. the system already uses a relational database to store most of the data it is very expensive to support another database.

  2. it means that some of the data is duplicated in both LDAP and the relational db causing data synchronization problems

  3. the system needs to be reliable, i.e. transactional, to the best of my knowledge LDAP doesn't support transactions meaning that we could end up with inconsistent data

  4. we use the data in reports

Expand Edited by bluke June 4, 2004, 02:06:02 AM EDT
New Just use Windows Server 2003 and Active Directory!
/me runs away


Peter
[link|http://www.debian.org|Shill For Hire]
[link|http://www.kuro5hin.org|There is no K5 Cabal]
[link|http://guildenstern.dyndns.org|Blog]
New **thwak!**
jb4
shrub\ufffdbish (Am., from shrub + rubbish, after the derisive name for America's 43 president; 2003) n. 1. a form of nonsensical political doubletalk wherein the speaker attempts to defend the indefensible by lying, obfuscation, or otherwise misstating the facts; GIBBERISH. 2. any of a collection of utterances from America's putative 43rd president. cf. BULLSHIT

New See what Scott said
There are multiple strategies for storing a tree in a relational database. Each results in some operations being much easier on the database than others. Which one you'll want to do depends strongly on what you'll be doing with the data.

In particular the "nested set-trees" implementation makes insertions expensive, but reads cheap. The naive "each child points at its parent" implementation makes insertions cheap, but reads expensive. Unless you have a database with extensions specifically for this job (I think that Oracle introduced that in 9i), you'll probably have to do some mix of procedural and relational logic to access the data.

Cheers,
Ben
To deny the indirect purchaser, who in this case is the ultimate purchaser, the right to seek relief from unlawful conduct, would essentially remove the word consumer from the Consumer Protection Act
- [link|http://www.techworld.com/opsys/news/index.cfm?NewsID=1246&Page=1&pagePos=20|Nebraska Supreme Court]
New Before 9i
At least 8.1.6, IIRC, possibly before then.

The Oracle extension (CONNECT BY seems to ring a bell) does not work with joins, so everything has to be in the same table.

For small hierarchies, or for insertions into the rightish side of the tree (like most forum posts),the nested set tree is not that expensive at all.
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
New I don't use it, so I wouldn't know. Correction appreciated.
To deny the indirect purchaser, who in this case is the ultimate purchaser, the right to seek relief from unlawful conduct, would essentially remove the word consumer from the Consumer Protection Act
- [link|http://www.techworld.com/opsys/news/index.cfm?NewsID=1246&Page=1&pagePos=20|Nebraska Supreme Court]
New Re: How to store a tree structure in a relational db?
[link|http://www.intelligententerprise.com/001020/celko.jhtml?_requestid=52999|Trees in SQL ]
New AKA Nested Set Trees. :-)
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
New After reading about nested sets ...
It sounds like that they are very expensive for inserts in that you have to shift all the left and rights after the node that you inserted and therefore if you have a tree where there are a lot of updates this becomes a problem. Since in my case teh data is fairly dynamic and is updated often it doesn't sound like nested sets fit the bill. Our tree is fairly shallow so encoding the hierarchy in the nodes may be the best solution.
New Updates aren't an issue.
Only inserts and deletes. If the tree itself stays fairly stable and only the data is changed, then you won't have an issue. And an update of a tree is only expensive if 1) the tree is very large and 2) there is lock contention from more than one insert at a time.

We have very large trees here (each thread is a full nested set hierarchy) without inserts being an issue. It's probably worth testing.
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
New If I had to do it over....
...I think I might've put the nested set data in a seperate table. This would help on the inserts/deletes by minimizing the amount of data to traverse.

One other side note: In my nested sets, I have different customers with unrelated trees. The nested set is keyed of the customer id first, meaning that the insert/delete operations for one customer do not have to effect all the data in the table. In effect, I have multiple roots in the table. Each root acts independently.

Not sure if it is similar, but I was thinking that in something like zIWTHEY, you could have each top level message be a seperate tree.
New That's exactly how I do it.
Each thread (top level message) is its own tree.

This version separates the tree data from the content data, but it's a pain in the ass to deal with. The new one keeps them in the same table.
Regards,

-scott anderson

"Welcome to Rivendell, Mr. Anderson..."
New Guess I'll have to try it for myself
I figure a view would be in order for the times I need to merge the tree data with the data itself. The problem in my case is that I have to keep historical data on the values (similar to you keeping the edit history on the messages), but the tree data only needs to be current (or if it needs to be kept, it needs to have it's own archive seperate from the rest).

New We have a lot of inserts
Thanks for all the info
     How to store a tree structure in a relational db? - (bluke) - (20)
         Ask Bryce! - (folkert) - (1)
             wiki links - (tablizer)
         Which database engine? - (admin) - (2)
             Re: Which database engine? - (bluke) - (1)
                 CONNECT BY is out then. :-) - (admin)
         why not store it in a document - (boxley) - (1)
             We want to move away from LDAP ... - (bluke)
         Just use Windows Server 2003 and Active Directory! - (pwhysall) - (1)
             **thwak!** -NT - (jb4)
         See what Scott said - (ben_tilly) - (2)
             Before 9i - (admin) - (1)
                 I don't use it, so I wouldn't know. Correction appreciated. -NT - (ben_tilly)
         Re: How to store a tree structure in a relational db? - (johnu) - (7)
             AKA Nested Set Trees. :-) -NT - (admin) - (6)
                 After reading about nested sets ... - (bluke) - (5)
                     Updates aren't an issue. - (admin) - (3)
                         If I had to do it over.... - (ChrisR) - (2)
                             That's exactly how I do it. - (admin) - (1)
                                 Guess I'll have to try it for myself - (ChrisR)
                     We have a lot of inserts - (bluke)

Ubersoft - Standing On The Necks Of Giants
525 ms