mySQL statements - order by and group by

Anyone out there know mySQL really well? I understand what order by does, and I understand what group by does. Can these two be used together? If so, how does it work? Does it group stuff together and then WITHIN the groups do the order_by? ie. If we had a database of people’s info - and we did:

SELECT lastname, age FROM people GROUP BY lastname ORDER BY age DESC

would it list the the youngest to oldest people with the same last name - and then move onto the next group of lastnames, etc.? Would that be sorting the last names arbitrarily? If you wanted to sort the last names *reverse* alphabetically as well would you do this?:

SELECT lastname, age FROM people GROUP BY lastname ORDER BY lastname ASC, age DESC

Actually re-reading the GROUP BY descriptiong… maybe I DON’T understand GROUP BY… =sigh= anyone wanna help explain?

19 Responses to “mySQL statements - order by and group by”

  1. Jennifer Says:

    Ok, (maybe) answering some of my own questions here:
    I still don’t get the “GROUP BY” thing - but I think neither of my statements up there make use of it… in both cases, what I’m doing would be accomplished with ORDER BY.

    So, for the first statement, it would be:

    SELECT lastname, age FROM people ORDER BY lastname DESC, age DESC

    To flip the lastnames, but keep the age from young to old:

    SELECT lastname, age FROM people ORDER BY lastname ASC, age DESC

    Is that right?

  2. Jennifer Says:

    I’m still looking for a good GROUP BY description…

  3. Aaron Haspel Says:

    I think it’s GROUP BY that’s giving you trouble here. You use GROUP BY only when you want subaggregates; your sample query contains no aggregate functions, like SUM, and would return an error. If you wanted to know, say, the combined age of people with the same last name, then you would write:

    SELECT lastname, SUM(age) FROM people GROUP BY lastname

    ORDER BY can be combined with GROUP BY to your heart’s content. If there is no ORDER BY clause then the records are returned in what’s called “natural record order” (meaning it depends on how they were loaded in the first place, and you can’t rely on it). You should note, though, that you can only ORDER BY fields that appear in the aggregate. For instance, in the above query, you can order by lastname, or by SUM(age), but not by age, or firstname.

    I hope that helps.

  4. kristine Says:

    Here’s a blog-related example, since that’s the main things I’ve tested SQL on :) For one project, I have recreated the Main Index template completely from the MySQL. I was having a hard time with the MTDateHeader and trying to replicate that.

    Here’s the SELECT that worked for me:

    $datearray = “SELECT entry_id, date_format(entry_created_on, ‘%m/%d/%Y’) AS date
    FROM mt_entry WHERE entry_blog_id = $blogid GROUP BY date ORDER BY entry_id DESC LIMIT 20″;

    I think that this meant that it would return a row for each date, as opposed to a row for each entry id. Then I just had another select statement inside of that to display the entries, and I used an if statement to test it to make sure that each entry date matched the header date.

    I’m not 100% sure that that’s gonna help you, because I wasn’t positive that I was using it right at the time, but I suppose it might!! :)
    I think Group By would be particularly powerful when used with a count. Here’s a couple sites that talked a bit more about that: this might help and so might this.

  5. lynda Says:

    another blog related example.. say you wanted a list of all your entries in a particular category. You’d want to select the category and count from the table (let’s call it “weblog”). In order to return proper results, you’d need to group by category - otherwise it doesn’t know how to return the results.

    Here’s an example (of a made-up table) that would give you a list of each category and the count in each category ordered by the count:

    SELECT category, count(*)
    FROM weblog
    GROUP BY category
    ORDER BY count(*);

    Here’s a link that describes the GROUP BY keyword in SQL (which is what I’m using here. . . it’s more or less the same as mySQL anyway): here ya go

  6. Jennifer Says:

    Ok, I’m being totally dense here. (Some things just don’t get through my brain so easily - Flash is like that too… but that’ll be another “call for help” post)

    Let me “paraphrase” those select statements and put them into real English. It’s the only way I’m really going to understand what’s going on there… Please correct me if I’m wrong. First, Kristine’s select statement:

    From the mt_entry table, get the entry_id column values, and the date values (format them like m/d/y - and I’ll refer to them as just “date” from now on) so long as that row’s entry-blog-id column value is $blogid…

    but that’s where I get lost… I understand the rest - Sort the list of results by the entry id - biggest to smallest, and stop when you get to 20. But what is GROUP BY date doing in that statement?

    As for Lynda’s, here’ my English for it:

    Go to table “weblog” and grab the all the values from the category column. Squash the list so that we have one row per category - and count how many rows in each case we had to squash to do that. (default order by is ascending) so return the result-set from smallest to highest number of squashed rows, each category value.

    In Lynda’s case - if you were to then display this info: the php code is usually:

    $row = fetch_array($result_set) (I’m not looking up the function, but I think it’s fetch_array… you get the idea…)

    then to display a value:
    $row["category"]

    but how would you display the count(*)? just $row["count(*)"] ??

  7. lynda Says:

    I *think* in my case if you wanted to use php to call things up instead of:
    SELECT category, count(*)
    you would use
    SELECT category, count(*) as count

    and then do $row["count"]

    I’m looking around to see if that’s right..

  8. lynda Says:

    Also, order by is pretty easy. More or less it just orders ascending and descending by alphabetical order or count. Group by, as Aaron stated is only really used in aggregate functions (avg, count, max, min, sum, etc)

    Here’s another example that might help you (I’m not good at explaining things and I only learn by example, so hopefully this will help.)

    Say you wanted to pull up a list of categories like before, but also wanted to separate the info based on open entries versus closed entries. You’ll want to order by the category and then by the count so it’ll display the categories in alphabetical order, then the most amount of entries (open or closed).
    In my own made-up little table, you’d use this SQL:

    SELECT category, status, count(*) AS count
    FROM weblog
    GROUP BY category, source
    ORDER BY category ASC, count(*) DESC;

    This would display something like this (1 = open, 2 = closed):

    CATEGORY STATUS COUNT
    Bookmarks open 50
    Bookmarks closed 7
    JavaScript open 10
    JavaScript closed 2
    PHP open 30
    PHP closed 1

  9. lynda Says:

    Oh, phooey. Ignore the “graphical” I was trying to show what the result would look like, but it just didn’t work out. :)

  10. Jennifer Says:

    (Lynda, I fixed that formatting… I needed to add table tags in what was allowed for html in comments)

    Ok, now I think I understand it a little better, but I thought that count(*) brought back TOTAL number of rows - in this example here they show this example:

    SELECT state, count(state)
    FROM customers
    GROUP BY state;

    So why wouldn’t your statement be:

    SELECT category, status, count(category), count(source)..etc.

    What am I missing here?

  11. Aaron Haspel Says:

    COUNT(*) (as well as COUNT(any_column)) brings back the total number of returned rows, yes, but only in the absence of a GROUP BY clause.

    So in your sample query COUNT(source), and COUNT(category) would all return the total number of rows, i.e., the same number. That’s not what you want. But in Lynda’s query the COUNT() function is applied over the groups that are included in the GROUP BY clause, and that is what you want (I think).

    GROUP BY just applies aggregate functions to the individual groups that are specified. That’s its only reason for living.

  12. lynda Says:

    Jennifer, are you doing any mySQL commands through telnet or ssh? If you’re not, I recommend you login to your mysql database through a telnet or ssh prompt and do some select statements, etc to see what is returned.

    Doing things that way will probably help you better understand both the order by and group by keywords. It’s kind of difficult to visualize what is being returned by just putting the mysql statements into php and hoping for the best. :)

  13. Mad Bull Says:

    I agree with Lynda on the last comment. Use telnet or ssh and actually run these queries and look at the results, it will all become clear then. Also, it might be a good idea to downloat mysql and load it on your pc for practicing.

  14. Russ Says:

    Could you please direct me to the “How To” location on how to do the LINK, BOLD and ITALICS buttons.

    What about the “Email this Entry” feature? Is that a standard MT feature?

  15. Jennifer Says:

    Well, I think I have a better understanding of the group by thing. I’m sure doing some tests on a real database will clarify further, but I had to get this far so I had a *vague* idea of what I was testing…

  16. Steven Gallaher Says:

    DISCLAIMER: I have not done more than skim the above comments. I do not think what I will say is listed there. If it is, my apologies.

    As mentioned, in SQL “GROUP BY” is used to aggregate information. Several examples were mentioned. (I find COUNT, SUM, and MAX to be the most useful.) What was not mentioned is this:

    If all variables other than those in the GROUP BY section are aggregated in the SELECT section, then everything works the way you think. If, however, you include other variables, then SQL creates the requested aggregate variables and then merges them back. An example will make this clearer.

    If you have a database with date and title and did this:

    SELECT date, count(title) as count
    FROM db
    GROUP BY date

    you would get a list of dates with the number of titles for each date.

    If you did this instead

    SELECT date, title, count(title) as count
    FROM db
    GROUP BY date

    you would, instead get your original table with a new column containing the daily count of titles. You could get the same result by running the first statement and then another which joined the result back to the original table (by date).

    As far as the ORDER BY part: it simply takes the final output and sorts it.

  17. Jennifer Says:

    Ok, ORDER BY - I get 100%. That’s easy. It’s just sorting the data.

    I’m *beginning* to get GROUP BY - but I don’t see how your two examples are different…

    If you had just:
    SELECT date title FROM db
    Each time you get a row (mysql_fetch_array) you can get the date and the title. (I’m assuming here it would be possible to have dates duplicated… more than one title per date)

    In your example:
    SELECT date, count(title) as count FROM db GROUP BY date
    then each time you get a row, you can display the date and the count (number of titles for that date)

    like:
    11/20/2002 - count:2
    11/21/2002 - count: 3

    And here is where (I think) I get lost,
    in your last example:
    SELECT date, title, count(title) as count FROM db GROUP BY date
    then each time you get a row you can display the date, and the count (number of titles for that date) AND(?) the title… so again, we’re back to showing mulitple dates? so displaying the results could look like:

    11/20/2002 - This is a title - count: 2
    11/20/2002 - This is another title - count: 2

    11/21/2002 - This is a new title - count: 3
    11/21/2002 - This is yet a new title - count: 3
    11/21/2002 - And our last title - count: 3

    Is that correct?

  18. Steven Gallaher Says:

    Yes, that is exactly correct. (Sorry I took so long to come back.)

  19. Jennifer Says:

    Just a note to myself here - found this:

    SELECT seminar_name
    FROM student_table GROUP BY seminar_name

    SELECT DISTINCT seminar_name
    FROM student_table

    on this page on phpbuilder. (I had better luck with that “DISTINCT” version)