PostgreSQL & PHP Tutorials - Database Design Issues - Big Query

PostgreSQL »  Database Design Issues
Random Stuff »  Database Design Issues
Forum Project »  Database Design Issues

Posted By Chris Smith Posted on 20 Apr 2006, 10:54 PM
Viewing page 2 of 5
« | Back | Next | »

Using that database design, to get the number of threads and posts per forum, you end up with a complicated query looking something like this:


SELECT f.forumid, f.forumtitle, forumnode, f.createdate, COUNT(threadid) AS threadcount, (SELECT COUNT(postid) FROM myforum_posts p WHERE p.threadid=t.threadid) AS postcount FROM myforum_forums f LEFT OUTER JOIN myforum_threads t ON f.forumid=t.forumid GROUP BY f.forumid, f.forumtitle, f.forumnode, f.createdate, t.threadid ORDER BY forumnode ASC;


Woah!

We have to use a LEFT OUTER JOIN between the forum and thread tables in case a forum has no threads. If we used a straight or inner join, the forum would not show up if there were no threads - which means we can't go in and add a thread.

Then we have to do a subquery using that threadid against the posts table to see how many posts are in a thread. This is all added together by the group by clause.

This will become quite slow once a few forums, threads and posts are included because of postgresql's MVCC architecture - it has to go through the table and scan it to see how many rows are in that table, it can't be cached (if you have a lot of rows, a count will be quite slow).

What can we do?
Viewing page 2 of 5
« | Back | Next | »



Avg Rating: 5
Vote Count: 6


              

  New Reply


You've missed an important point. Consider the following:

---
create table forums (
forumid int primary key
);

create table threads (
threadid int primary key,
forumid int references forums
);

create table posts (
postid int primary key,
threadid int
);

insert into forums values(1);
insert into forums values(2);

insert into threads values(1, 1);
insert into threads values(2, 1);

insert into posts values(1, 1);
insert into posts values(2, 1);

insert into posts values(3, 2);
insert into posts values(4, 2);
insert into posts values(5, 2);
insert into posts values(6, 2);

select f.forumid, count(distinct t.threadid), count(distinct p.postid) from forums f
left outer join threads t on(f.forumid = t.forumid) left outer join
posts p on(t.threadid = p.threadid) group by 1;

---

This is a correct design and simpler query. However, this is not the best way to do what you're trying to achieve in PostgreSQL. Instead, you should use rules and triggers to maintain a forum, thread and post count in a count cache table. This will give you O(n) count performance and can be done without modifying your application code. Just create insert and delete triggers/rules to manage the counts when posts/threads are added and removed.
Gavin Sherry 24 Apr 2006 Reply
`-- I thought that's what I was trying to demonstrate.. or do you mean having a separate count table instead of including it in the forum/thread table ? Chris Smith 24 Apr 2006 Reply


Want to post a comment? Fill in the details below.

Your Name  : 
Your Email  : 
Your Website  : 
Spam Check! Please answer this question  : 5 + 7 =
Comment