SQLAlchemy - “必须出现在 GROUP BY 子句中或在聚合函数中使用”

刚刚将我的代码迁移到使用 Postgres 而不是 SQLite 的一切工作正常的地方。我正在使用 SQLAlchemy 生成我的所有 SQL,现在我收到此错误:

Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1248, in _execute_context
    self.dialect.do_execute(
  File "/usr/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 580, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.GroupingError: column "movie_category_scores.votes" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT movie_category_scores.votes >= 10 AS anon_1, movie_ca...
               ^

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/movieelo/cinelow/my_server/database/pers_movie_dbf.py", line 187, in get_common_categories
    cats = MovieCategoryScores.query.filter(MovieCategoryScores.movie_id.in_([movie1, movie2])).group_by(MovieCategoryScores.category_id).having(func.count(MovieCategoryScores.category_id) > 1).all()
  File "/usr/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3186, in all
    return list(self)
  File "/usr/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3342, in __iter__
    return self._execute_and_instances(context)
  File "/usr/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3367, in _execute_and_instances
    result = conn.execute(querycontext.statement, self._params)
  File "/usr/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 988, in execute
    return meth(self, multiparams, params)
  File "/usr/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 287, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/usr/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1101, in _execute_clauseelement
    ret = self._execute_context(
  File "/usr/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1252, in _execute_context
    self._handle_dbapi_exception(
  File "/usr/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1473, in _handle_dbapi_exception
    util.raise_from_cause(sqlalchemy_exception, exc_info)
  File "/usr/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 398, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/usr/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 152, in reraise
    raise value.with_traceback(tb)
  File "/usr/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1248, in _execute_context
    self.dialect.do_execute(
  File "/usr/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 580, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.GroupingError) column "movie_category_scores.votes" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT movie_category_scores.votes >= 10 AS anon_1, movie_ca...
               ^

[SQL: SELECT movie_category_scores.votes >= %(param_1)s AS anon_1, movie_category_scores.movie_id AS movie_category_scores_movie_id, movie_category_scores.category_id AS movie_category_scores_category_id, movie_category_scores.score AS movie_category_scores_score, movie_category_scores.votes AS movie_category_scores_votes 
FROM movie_category_scores 
WHERE movie_category_scores.movie_id IN (%(movie_id_1)s, %(movie_id_2)s) GROUP BY movie_category_scores.category_id 
HAVING count(movie_category_scores.category_id) > %(count_1)s]
[parameters: {'param_1': 10, 'movie_id_1': 550, 'movie_id_2': 11, 'count_1': 1}]
(Background on this error at: http://sqlalche.me/e/f405)

运行此功能时触发错误:

def get_common_categories(movie1, movie2):
    cats = MovieCategoryScores.query.filter(MovieCategoryScores.movie_id.in_([movie1, movie2])).group_by(MovieCategoryScores.category_id).having(func.count(MovieCategoryScores.category_id) > 1).all()
    categories = []
    for cat in cats:
        categories.append(cat.category.serialize)
    return categories

但错误的顶部实际上是引用另一个与我正在触发的函数没有任何关系的函数,即这个:

def get_category_score(movie_id, category_id):
    query = db.session.query(
    MovieCategoryScores,
    func.rank()
        .over(
            order_by=MovieCategoryScores.score.desc(),
            partition_by=MovieCategoryScores.category_id,
        )
        .label('rank')
    ).filter(or_(MovieCategoryScores.votes >= 10, MovieCategoryScores.movie_id == movie_id))
    # now filter
    query = query.filter(MovieCategoryScores.category_id == category_id)
    query = query.order_by(MovieCategoryScores.category_id, 'rank')
    all_movies = query.subquery()
    new_query = db.session.query(all_movies).filter(all_movies.c.movie_id == movie_id)
    my_movie = new_query.first()
    return my_movie

它们与如下所示的关联表和电影表交互:

class MovieCategoryScores(db.Model):
    movie_id = db.Column(db.Integer, db.ForeignKey('movie.id'), primary_key=True)
    category_id = db.Column(db.Integer, db.ForeignKey('category.id'), primary_key=True)
    score = db.Column(db.Integer)
    votes = db.Column(db.Integer)
    category = relationship("Category", back_populates="movies")
    movie = relationship("Movie", back_populates="categories")

class MoviePersonScores(db.Model):
    movie_id = db.Column(db.Integer, db.ForeignKey('movie.id'), primary_key=True)
    person_id = db.Column(db.Integer, db.ForeignKey('person.id'), primary_key=True)
    job = db.Column(db.String(30), primary_key=True) #0 = actor, 1 = director, 2 = writer
    score = db.Column(db.Integer)
    votes = db.Column(db.Integer, default=0)
    person = relationship("Person", back_populates="movies")
    movie = relationship("Movie", back_populates="people")

class Movie(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(64), nullable = False)
    poster_path = db.Column(db.String(30))
    categories = relationship("MovieCategoryScores", back_populates="movie")
    people = relationship("MoviePersonScores", back_populates="movie")

    @property
    def serialize(self):
        return {
            'id'    : self.id,
            'name'  : self.name,
            'poster_path' : self.poster_path
        }

(Person 和 category 是与 Movie 非常相似的基本类)这些是函数可用的所有导入:

from .. import db
from .dbhandler import Movie, Category, MovieCategoryScores, MoviePersonScores, Person, MovieUserScores
from .user_dbf import getUserById
from sqlalchemy.sql import func
import requests
import json
import random
from itertools import islice
from sqlalchemy import or_

当我在命令行中执行简单的 get_common_people(5, 10) 时,会触发上述错误。

如果我尝试做 get_category_score(550, 0) 我得到:

Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1248, in _execute_context
    self.dialect.do_execute(
  File "/usr/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 580, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.InFailedSqlTransaction: current transaction is aborted, commands ignored until end of transaction block
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/movieelo/cinelow/my_server/database/pers_movie_dbf.py", line 171, in get_category_score
    my_movie = new_query.first()
  File "/usr/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3240, in first
    ret = list(self[0:1])
  File "/usr/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3018, in __getitem__
    return list(res)
  File "/usr/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3342, in __iter__
    return self._execute_and_instances(context)
  File "/usr/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3367, in _execute_and_instances
    result = conn.execute(querycontext.statement, self._params)
  File "/usr/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 988, in execute
    return meth(self, multiparams, params)
  File "/usr/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 287, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/usr/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1101, in _execute_clauseelement
    ret = self._execute_context(
  File "/usr/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1252, in _execute_context
    self._handle_dbapi_exception(
  File "/usr/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1473, in _handle_dbapi_exception
    util.raise_from_cause(sqlalchemy_exception, exc_info)
  File "/usr/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 398, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/usr/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 152, in reraise
    raise value.with_traceback(tb)
  File "/usr/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1248, in _execute_context
    self.dialect.do_execute(
  File "/usr/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 580, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.InternalError: (psycopg2.errors.InFailedSqlTransaction) current transaction is aborted, commands ignored until end of transaction block
[SQL: SELECT anon_1.anon_2 AS anon_1_anon_2, anon_1.movie_id AS anon_1_movie_id, anon_1.category_id AS anon_1_category_id, anon_1.score AS anon_1_score, anon_1.votes AS anon_1_votes, anon_1.rank AS anon_1_rank 
FROM (SELECT movie_category_scores.votes >= %(param_1)s AS anon_2, movie_category_scores.movie_id AS movie_id, movie_category_scores.category_id AS category_id, movie_category_scores.score AS score, movie_category_scores.vo
tes AS votes, rank() OVER (PARTITION BY movie_category_scores.category_id ORDER BY movie_category_scores.score DESC) AS rank 
FROM movie_category_scores 
WHERE (movie_category_scores.votes >= %(votes_1)s OR movie_category_scores.movie_id = %(movie_id_1)s) AND movie_category_scores.category_id = %(category_id_1)s ORDER BY movie_category_scores.category_id, rank) AS anon_1 
WHERE anon_1.movie_id = %(movie_id_2)s 
 LIMIT %(param_2)s]
[parameters: {'param_1': 10, 'votes_1': 10, 'movie_id_1': 550, 'category_id_1': 0, 'movie_id_2': 550, 'param_2': 1}]
(Background on this error at: http://sqlalche.me/e/2j85)

我找到了一群遇到同样问题的人,但这些解决方案似乎都不适合我在这里所做的,特别是因为我知道这适用于 SQLite,但现在不适用于 Postgres?这似乎与 the order 有很大关系,当您订购东西与获取他们的信息时,但在我的情况下,我只是不知道在订单之前/之后我可能会切换到什么?

知道“get_common_people”函数的哪一部分触发了错误吗?非常感谢!

stack overflow SQLAlchemy - "must appear in the GROUP BY clause or be used in an aggregate function"
原文答案

答案:

作者头像

好吧,经过大量艰苦的研究,我设法解决了这个问题。 This article 很好地解释了正在发生的事情以及一些关于如何解决它的想法(尽管是在常规 SQL 中)。

问题基本上是您不能让具有相同值的列在组中重叠,所以就像错误解释的那样,您需要将它们包含在组中或使用函数使它们唯一。因此,由于是我的“movie_id”属性弄乱了常见类别,我只是在查询中将它们加在一起,这样它们现在只是一个值并且不会重叠。

在 common people 函数中,我可以简单地完全排除所有不必要的值,新函数如下所示:

def get_common_categories(movie1, movie2):
    cats = (
        db.session.query(
            MovieCategoryScores.category_id,
            func.sum(MovieCategoryScores.movie_id)
        )
        .filter(MovieCategoryScores.movie_id.in_([movie1, movie2]))
        .group_by(MovieCategoryScores.category_id)
        .having(func.count(MovieCategoryScores.category_id) > 1)
        .all()
    )
    print(cats)
    categories = []
    for cat in cats:
        categories.append(get_category(cat[0]).serialize)
    return categories

def get_common_people(movie1, movie2):
    myQuery = (
        db.session.query(MoviePersonScores.movie_id, MoviePersonScores.person_id)
        .filter(MoviePersonScores.movie_id.in_([movie1, movie2]))
        .group_by(MoviePersonScores.movie_id, MoviePersonScores.person_id)
        .subquery()
    )
    peps = (
        db.session.query(myQuery.c.person_id)
        .group_by(myQuery.c.person_id)
        .having(func.count(myQuery.c.person_id) > 1)
        .all()
    )
    people = []
    for pep in peps:
        person = get_person(pep[0])
        people.append(person.serialize)
    return people