mySQL statements - order by and group by
Anyone out there know mySQL really well? I understand what order by does, and I understand what group by does. Can these two be used together? If so, how does it work? Does it group stuff together and then WITHIN the groups do the order_by? ie. If we had a database of people’s info - and we did:
SELECT lastname, age FROM people GROUP BY lastname ORDER BY age DESC
would it list the the youngest to oldest people with the same last name - and then move onto the next group of lastnames, etc.? Would that be sorting the last names arbitrarily? If you wanted to sort the last names *reverse* alphabetically as well would you do this?:
SELECT lastname, age FROM people GROUP BY lastname ORDER BY lastname ASC, age DESC
Actually re-reading the GROUP BY descriptiong… maybe I DON’T understand GROUP BY… =sigh= anyone wanna help explain?
November 15th, 2002 at 10:41 am
Ok, (maybe) answering some of my own questions here:
I still don’t get the “GROUP BY” thing - but I think neither of my statements up there make use of it… in both cases, what I’m doing would be accomplished with ORDER BY.
So, for the first statement, it would be:
SELECT lastname, age FROM people ORDER BY lastname DESC, age DESC
To flip the lastnames, but keep the age from young to old:
SELECT lastname, age FROM people ORDER BY lastname ASC, age DESC
Is that right?
November 15th, 2002 at 10:42 am
I’m still looking for a good GROUP BY description…
November 15th, 2002 at 10:49 am
I think it’s GROUP BY that’s giving you trouble here. You use GROUP BY only when you want subaggregates; your sample query contains no aggregate functions, like SUM, and would return an error. If you wanted to know, say, the combined age of people with the same last name, then you would write:
SELECT lastname, SUM(age) FROM people GROUP BY lastname
ORDER BY can be combined with GROUP BY to your heart’s content. If there is no ORDER BY clause then the records are returned in what’s called “natural record order” (meaning it depends on how they were loaded in the first place, and you can’t rely on it). You should note, though, that you can only ORDER BY fields that appear in the aggregate. For instance, in the above query, you can order by lastname, or by SUM(age), but not by age, or firstname.
I hope that helps.
November 15th, 2002 at 12:08 pm
Here’s a blog-related example, since that’s the main things I’ve tested SQL on
For one project, I have recreated the Main Index template completely from the MySQL. I was having a hard time with the MTDateHeader and trying to replicate that.
Here’s the SELECT that worked for me:
$datearray = “SELECT entry_id, date_format(entry_created_on, ‘%m/%d/%Y’) AS date
FROM mt_entry WHERE entry_blog_id = $blogid GROUP BY date ORDER BY entry_id DESC LIMIT 20″;
I think that this meant that it would return a row for each date, as opposed to a row for each entry id. Then I just had another select statement inside of that to display the entries, and I used an if statement to test it to make sure that each entry date matched the header date.
I’m not 100% sure that that’s gonna help you, because I wasn’t positive that I was using it right at the time, but I suppose it might!!
I think Group By would be particularly powerful when used with a count. Here’s a couple sites that talked a bit more about that: this might help and so might this.
November 15th, 2002 at 12:38 pm
another blog related example.. say you wanted a list of all your entries in a particular category. You’d want to select the category and count from the table (let’s call it “weblog”). In order to return proper results, you’d need to group by category - otherwise it doesn’t know how to return the results.
Here’s an example (of a made-up table) that would give you a list of each category and the count in each category ordered by the count:
SELECT category, count(*)
FROM weblog
GROUP BY category
ORDER BY count(*);
Here’s a link that describes the GROUP BY keyword in SQL (which is what I’m using here. . . it’s more or less the same as mySQL anyway): here ya go
November 15th, 2002 at 3:57 pm
Ok, I’m being totally dense here. (Some things just don’t get through my brain so easily - Flash is like that too… but that’ll be another “call for help” post)
Let me “paraphrase” those select statements and put them into real English. It’s the only way I’m really going to understand what’s going on there… Please correct me if I’m wrong. First, Kristine’s select statement:
From the mt_entry table, get the entry_id column values, and the date values (format them like m/d/y - and I’ll refer to them as just “date” from now on) so long as that row’s entry-blog-id column value is $blogid…
but that’s where I get lost… I understand the rest - Sort the list of results by the entry id - biggest to smallest, and stop when you get to 20. But what is GROUP BY date doing in that statement?
As for Lynda’s, here’ my English for it:
Go to table “weblog” and grab the all the values from the category column. Squash the list so that we have one row per category - and count how many rows in each case we had to squash to do that. (default order by is ascending) so return the result-set from smallest to highest number of squashed rows, each category value.
In Lynda’s case - if you were to then display this info: the php code is usually:
$row = fetch_array($result_set) (I’m not looking up the function, but I think it’s fetch_array… you get the idea…)
then to display a value:
$row["category"]
but how would you display the count(*)? just $row["count(*)"] ??
November 15th, 2002 at 4:16 pm
I *think* in my case if you wanted to use php to call things up instead of:
SELECT category, count(*)
you would use
SELECT category, count(*) as count
and then do $row["count"]
I’m looking around to see if that’s right..
November 15th, 2002 at 4:48 pm
Also, order by is pretty easy. More or less it just orders ascending and descending by alphabetical order or count. Group by, as Aaron stated is only really used in aggregate functions (avg, count, max, min, sum, etc)
Here’s another example that might help you (I’m not good at explaining things and I only learn by example, so hopefully this will help.)
Say you wanted to pull up a list of categories like before, but also wanted to separate the info based on open entries versus closed entries. You’ll want to order by the category and then by the count so it’ll display the categories in alphabetical order, then the most amount of entries (open or closed).
In my own made-up little table, you’d use this SQL:
SELECT category, status, count(*) AS count
FROM weblog
GROUP BY category, source
ORDER BY category ASC, count(*) DESC;
This would display something like this (1 = open, 2 = closed):
November 15th, 2002 at 4:49 pm
Oh, phooey. Ignore the “graphical” I was trying to show what the result would look like, but it just didn’t work out.
November 15th, 2002 at 7:38 pm
(Lynda, I fixed that formatting… I needed to add table tags in what was allowed for html in comments)
Ok, now I think I understand it a little better, but I thought that count(*) brought back TOTAL number of rows - in this example here they show this example:
SELECT state, count(state)
FROM customers
GROUP BY state;
So why wouldn’t your statement be:
SELECT category, status, count(category), count(source)..etc.
What am I missing here?
November 15th, 2002 at 10:58 pm
COUNT(*) (as well as COUNT(any_column)) brings back the total number of returned rows, yes, but only in the absence of a GROUP BY clause.
So in your sample query COUNT(source), and COUNT(category) would all return the total number of rows, i.e., the same number. That’s not what you want. But in Lynda’s query the COUNT() function is applied over the groups that are included in the GROUP BY clause, and that is what you want (I think).
GROUP BY just applies aggregate functions to the individual groups that are specified. That’s its only reason for living.
November 16th, 2002 at 12:59 am
Jennifer, are you doing any mySQL commands through telnet or ssh? If you’re not, I recommend you login to your mysql database through a telnet or ssh prompt and do some select statements, etc to see what is returned.
Doing things that way will probably help you better understand both the order by and group by keywords. It’s kind of difficult to visualize what is being returned by just putting the mysql statements into php and hoping for the best.
November 17th, 2002 at 2:26 am
I agree with Lynda on the last comment. Use telnet or ssh and actually run these queries and look at the results, it will all become clear then. Also, it might be a good idea to downloat mysql and load it on your pc for practicing.
November 17th, 2002 at 5:57 pm
Could you please direct me to the “How To” location on how to do the LINK, BOLD and ITALICS buttons.
What about the “Email this Entry” feature? Is that a standard MT feature?
November 17th, 2002 at 11:09 pm
Well, I think I have a better understanding of the group by thing. I’m sure doing some tests on a real database will clarify further, but I had to get this far so I had a *vague* idea of what I was testing…
November 20th, 2002 at 9:43 am
DISCLAIMER: I have not done more than skim the above comments. I do not think what I will say is listed there. If it is, my apologies.
As mentioned, in SQL “GROUP BY” is used to aggregate information. Several examples were mentioned. (I find COUNT, SUM, and MAX to be the most useful.) What was not mentioned is this:
If all variables other than those in the GROUP BY section are aggregated in the SELECT section, then everything works the way you think. If, however, you include other variables, then SQL creates the requested aggregate variables and then merges them back. An example will make this clearer.
If you have a database with date and title and did this:
SELECT date, count(title) as count
FROM db
GROUP BY date
you would get a list of dates with the number of titles for each date.
If you did this instead
SELECT date, title, count(title) as count
FROM db
GROUP BY date
you would, instead get your original table with a new column containing the daily count of titles. You could get the same result by running the first statement and then another which joined the result back to the original table (by date).
As far as the ORDER BY part: it simply takes the final output and sorts it.
November 20th, 2002 at 10:14 am
Ok, ORDER BY - I get 100%. That’s easy. It’s just sorting the data.
I’m *beginning* to get GROUP BY - but I don’t see how your two examples are different…
If you had just:
SELECT date title FROM db
Each time you get a row (mysql_fetch_array) you can get the date and the title. (I’m assuming here it would be possible to have dates duplicated… more than one title per date)
In your example:
SELECT date, count(title) as count FROM db GROUP BY date
then each time you get a row, you can display the date and the count (number of titles for that date)
like:
11/20/2002 - count:2
11/21/2002 - count: 3
And here is where (I think) I get lost,
in your last example:
SELECT date, title, count(title) as count FROM db GROUP BY date
then each time you get a row you can display the date, and the count (number of titles for that date) AND(?) the title… so again, we’re back to showing mulitple dates? so displaying the results could look like:
11/20/2002 - This is a title - count: 2
11/20/2002 - This is another title - count: 2
11/21/2002 - This is a new title - count: 3
11/21/2002 - This is yet a new title - count: 3
11/21/2002 - And our last title - count: 3
Is that correct?
December 2nd, 2002 at 1:47 pm
Yes, that is exactly correct. (Sorry I took so long to come back.)
December 11th, 2002 at 11:37 pm
Just a note to myself here - found this:
SELECT seminar_name
FROM student_table GROUP BY seminar_name
SELECT DISTINCT seminar_name
FROM student_table
on this page on phpbuilder. (I had better luck with that “DISTINCT” version)