explain

前面有很多探索，希望看正确解答可直接看 "right"；最终结果见 "pass" 。

pacman.py 的参数对于对于帮助我们进行测试很有用：

$ python pacman.py -h

Usage: 
    USAGE:      python pacman.py <options>
    EXAMPLES:   (1) python pacman.py
                    - starts an interactive game
                (2) python pacman.py --layout smallClassic --zoom 2
                OR  python pacman.py -l smallClassic -z 2
                    - starts an interactive game on a smaller board, zoomed in
    

Options:
  -h, --help            show this help message and exit
  -n GAMES, --numGames=GAMES
                        the number of GAMES to play [Default: 1]
  -l LAYOUT_FILE, --layout=LAYOUT_FILE
                        the LAYOUT_FILE from which to load the map layout
                        [Default: mediumClassic]
  -p TYPE, --pacman=TYPE
                        the agent TYPE in the pacmanAgents module to use
                        [Default: KeyboardAgent]
  -t, --textGraphics    Display output as text only
  -q, --quietTextGraphics
                        Generate minimal output and no graphics
  -g TYPE, --ghosts=TYPE
                        the ghost agent TYPE in the ghostAgents module to use
                        [Default: RandomGhost]
  -k NUMGHOSTS, --numghosts=NUMGHOSTS
                        The maximum number of ghosts to use [Default: 4]
  -z ZOOM, --zoom=ZOOM  Zoom the size of the graphics window [Default: 1.0]
  -f, --fixRandomSeed   Fixes the random seed to always play the same game
  -r, --recordActions   Writes game histories to a file (named by the time
                        they were played)
  --replay=GAMETOREPLAY
                        A recorded game file (pickle) to replay
  -a AGENTARGS, --agentArgs=AGENTARGS
                        Comma separated values sent to agent. e.g.
                        "opt1=val1,opt2,opt3=val3"
  -x NUMTRAINING, --numTraining=NUMTRAINING
                        How many episodes are training (suppresses output)
                        [Default: 0]
  --frameTime=FRAMETIME
                        Time to delay between frames; <0 means keyboard
                        [Default: 0.1]
  -c, --catchExceptions
                        Turns on exception handling and timeouts during games
  --timeout=TIMEOUT     Maximum length of time an agent can spend computing in
                        a single game [Default: 30]

若加入下列语句，测试运行：

print("Start:", problem.getStartState())
print("Is the start a goal?", problem.isGoalState(problem.getStartState()))
print("Start's successors:", problem.getSuccessors(problem.getStartState()))

# Start: (5, 5)
# Is the start a goal? False
# Start's successors: [((5, 4), 'South', 1), ((4, 5), 'West', 1)]

窥见了 problem/state 的结构，其中 successors 是两个 successor，其内容依次代表：(state，action，cost) => （坐标，方向，步长）。

随便补全 dfs 观察地图：

def depthFirstSearch(problem: SearchProblem):
    return []
    util.raiseNotDefined()

可以看出来，“坐标原点”在左下角，是经典的“2D函数坐标”而非“矩阵坐标”。

Q1 - Q4

explore

根据以前所学，dfs尝试 Q1：

def depthFirstSearch(problem: SearchProblem):
    reached = set()  # 记录已经访问过的节点
    _stack = util.Stack()  # 用栈来实现DFS
    _stack.push(
        (problem.getStartState(), [])
    )  # 将初始状态（坐标，到达该坐标需要的方向组）压入栈中
    while not _stack.isEmpty():
        state, actions = _stack.pop()
        if problem.isGoalState(state):
            return actions
        if state not in reached:
            reached.add(state)
            for successor, action, _ in problem.getSuccessors(state):
                new_path = actions + [action]
                _stack.push((successor, new_path))
    return actions
    util.raiseNotDefined()

顺利通过 Q1。

right

下一个？NO! 在 note02 中我们知道三个搜索算法总体差不多，且这里都是 graph search:

[!CITE]

Now it’s time to write full-fledged generic search functions to help Pacman plan routes! Pseudocode for the search algorithms you’ll write can be found in the lecture slides.

function GRAPH-SEARCH(problem, frontier) return a solution or failure
    reached ← an empty set
    frontier ← INSERT(MAKE-NODE(INITIAL-STATE[problem]), frontier)
    while not IS-EMPTY(frontier) do
        node ← POP(frontier)
        if problem.IS-GOAL(node.STATE) then
            return node
        end
        if node.STATE is not in reached then
            add node.STATE in reached
            for each child-node in EXPAND(problem, node) do
                frontier ← INSERT(child-node, frontier)
            end
        end
    end
    return failure

所以我们照猫画虎，给出：

def graphSearch(problem, frontier):
    """
    Implements the graph search algorithm using the provided frontier data structure.
    """
    reached = set()
    frontier.push((problem.getStartState(), []))  # take (state, actions) as node
    
    while not frontier.isEmpty():
        state, actions = frontier.pop()
    
        if problem.isGoalState(state):
            return actions
    
        if state not in reached:
            reached.add(state)
            for successor, action, _ in problem.getSuccessors(state):
                new_path = actions + [action]
                frontier.push((successor, new_path))
    
    return None  # No solution found

显然，这里的 frontier 在 DFS 中就是 Stack，在 BFS 中就是 Queue；在 UCS/ $A^*S$ 中是 PriorityQueue。

前两个好说；PrigirityQueue 的 push 与前二者参数就不同，自然是报错了；但是注意到 util.py 中的：

class PriorityQueueWithFunction(PriorityQueue): # success PriorityQueue
    """
    Implements a priority queue with the same push/pop signature of the
    Queue and the Stack classes. This is designed for drop-in replacement for
    those two classes. The caller has to provide a priority function, which
    extracts each item's priority.
    """
    def  __init__(self, priorityFunction):
        "priorityFunction (item) -> priority"
        self.priorityFunction = priorityFunction      # store the priority function
        PriorityQueue.__init__(self)        # super-class initializer
    
    def push(self, item):
        "Adds an item to the queue with priority from the priority function"
        PriorityQueue.push(self, item, self.priorityFunction(item))

根据注释我们可以看出来这就是对 PriorityQueue 的封装了，唯一不同的是我们需要使用一个 priorityFunction(item) 来实例化它（按照 Stack 类中的使用，item 应该指我们设定的元组 node）；再看：

class SearchProblem:
    ...
    def getCostOfActions(self, actions):
        """
         actions: A list of actions to take
    
        This method returns the total cost of a particular sequence of actions.
        The sequence must be composed of legal moves.
        """
        util.raiseNotDefined()

用这个方法获取 cost，作为 UCS minheap 的依据；而 $A^*S$ 多加了一个 heuristic value 即可；你说巧不巧，aStarSearch 的参数就有这么一个函数，Q1 - Q4 通过。

Q5 (3 pts): Finding All the Corners (Lecture 3)

完成这题使用了 bfs，所以需要先运行 $ python pacman.py -l tinyCorners -p SearchAgent -a fn=bfs 确保通过。

explore

这关实现一个 CornersProblem 类；首先我们看 class CornersProblem(search.SearchProblem): 其继承了我们之前 search.py 中的 SearchProblem 定好的模板，所以我们应当参考 Q1-Q4 所期望的功能来实现它；又在该文件中搜 search.SearchProblem ，找到了 class PositionSearchProblem(search.SearchProblem): 这个兄弟，还是实现好的；那么我们继续照猫画虎：

def getStartState(self):
    return self.startingPosition
    util.raiseNotDefined()

def isGoalState(self, state: Any):
    # 暂时不会，让其始终返回False
    util.raiseNotDefined()

def getSuccessors(self, state: Any):
    successors = []
    for action in [
        Directions.NORTH,
        Directions.SOUTH,
        Directions.EAST,
        Directions.WEST,
    ]:
        x, y = state
        dx, dy = Actions.directionToVector(action)
        nextx, nexty = int(x + dx), int(y + dy)
        if not self.walls[nextx][nexty]:
            nextState = (nextx, nexty)
            # cost = self.costFn(nextState)
            cost = 1
            successors.append((nextState, action, cost))
    self._expanded += 1  # DO NOT CHANGE
    return successors

我们让 IsGoalState 始终返回 False，那么 pacman 始终没有一条路径可以走，必然静止不动；运行 $ python pacman.py -l tinyCorners -p SearchAgent -a fn=bfs,prob=CornersProblem 看看。

那么我们该怎么写 IsGoalState 呢？

right

下面内容参考 szzxljr 的代码，其实也体现在后面的 FoodSearchProblem 中。

我们假想我们上述的 node 不仅带着位置信息和动作，还带着一个 corners；每经过一个对应的 corner，我们就丢掉这个 corner；当我们丢完时，就达到目标了；但是直接(state, actions, corners) 是不行的¹。所以我们将其和原 state 作为新元组，即 state <= (state, corners):

def getStartState(self):
    return (self.startingPosition, self.corners)
    util.raiseNotDefined()

def isGoalState(self, state: Any):
    return len(state[1]) == 0
    util.raiseNotDefined()

def getSuccessors(self, state: Any):
    successors = []
    for action in [
        Directions.NORTH,
        Directions.SOUTH,
        Directions.EAST,
        Directions.WEST,
    ]:
        x, y = state[0]
        corners = state[1]
        dx, dy = Actions.directionToVector(action)
        nextx, nexty = int(x + dx), int(y + dy)
        if not self.walls[nextx][nexty]:
            nextState0 = (nextx, nexty)
            corners = tuple(filter(lambda c: c != nextState0, corners))
            # cost = self.costFn(nextState0) # accoding to the hint, cost is 1
            cost = 1
            successors.append(((nextState0, corners), action, cost))
    # print(successors)
    self._expanded += 1  # DO NOT CHANGE
    return successors

Q5通过

Q6-7

explore

Q6 其实就是让我们想一个比较好的 def cornersHeuristic(state: Any, problem: CornersProblem): 函数；在数学中描述“距离”的范数就比较适合先试试手；曼哈顿距离也是其中一种，且 SearchAgent.py 中有实现：

def manhattanHeuristic(position, problem, info={}):
    "The Manhattan distance heuristic for a PositionSearchProblem"
    xy1 = position
    xy2 = problem.goal
    return abs(xy1[0] - xy2[0]) + abs(xy1[1] - xy2[1])

我们可以计算此时相比于剩下的 corner 的曼哈顿距离，并取最小者即可：

def manhattanDistance(m, n):
    return abs(m[0] - n[0]) + abs(m[1] - n[1])

def cornersHeuristic(state: Any, problem: CornersProblem):
    corners = problem.corners  # These are the corner coordinates
    walls = problem.walls  # These are the walls of the maze, as a Grid (game.py)
    if len(state[1]) == 0:
        return 0
    min_distance = float("inf")
    for corner in state[1]:
        tmp = manhattanDistance(corner, state[0])
        min_distance = min(min_distance, tmp)
    return min_distance

Q6半通过

right

但是想想，manhattenDistance 本身就一定不比实际需要的步数多，应当取最大值来贴近；同时发现 util.py 中实现了曼哈顿距离：

def cornersHeuristic(state: Any, problem: CornersProblem):
    if len(state[1]) == 0:
        return 0
    max_distance = 0
    for corner in state[1]:
        # tmp = manhattanDistance(corner, state[0])
        tmp = util.manhattanDistance(corner, state[0])
        max_distance = max(max_distance, tmp)
    return max_distance

Q6通过

Q7 也是同理，之前是 4 个豆子在四个角落，现在是若干个豆子在某某个位置；所以思路基本一致（其中的 mazeDistance 是后来看了其他人的题解发现的）：

def foodHeuristic(state: Tuple[Tuple, List[List]], problem: FoodSearchProblem):
    position, foodGrid = state
    distances = [0]
    for food in foodGrid.asList():
        distances.append(mazeDistance(position, food, problem.startingGameState))
    return max(distances)

Q7通过

Q8 (3 pts): Suboptimal Search

right

根据提示，我们先补全 AnyFoodSearchProblem：

class AnyFoodSearchProblem(PositionSearchProblem):
    ...
    def isGoalState(self, state: Tuple[int, int]):
    x, y = state
    return self.food[x][y] # 去看这个成员的定义就懂了
    util.raiseNotDefined()

也就是说，这个类在此实现的功能就是检查某个位置食物是否被吃了。

Q8 本身更简单了，Returns a path (a list of actions) to the closest dot, starting from gameState ，在每一步成本都为 1 的情况下，最近既是成本最小，uniformCostSearch 即可：

def findPathToClosestDot(self, gameState: pacman.GameState):
    """
    Returns a path (a list of actions) to the closest dot, starting from
    gameState.
    """
    # Here are some useful elements of the startState
    startPosition = gameState.getPacmanPosition()
    food = gameState.getFood()
    walls = gameState.getWalls()
    problem = AnyFoodSearchProblem(gameState)
    
    "*** YOUR CODE HERE ***"
    return search.uniformCostSearch(problem)
    util.raiseNotDefined()

Q8 通过

pass

project-1 全部通过
全代码

[!SUMMARY]

多看相关注释和已有的类似实现；开始前清楚自己能用哪些东西。

Footnotes

我们需要和之前实现的 BFS 接口，这个 node 应该只有两个成员。 ↩

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

project-1.md

project-1.md

explain

Q1 - Q4

explore

right

Q5 (3 pts): Finding All the Corners (Lecture 3)

explore

right

Q6-7

explore

right

Q8 (3 pts): Suboptimal Search

right

pass

Files

project-1.md

Latest commit

History

project-1.md

File metadata and controls

explain

Q1 - Q4

explore

right

Q5 (3 pts): Finding All the Corners (Lecture 3)

explore

right

Q6-7

explore

right

Q8 (3 pts): Suboptimal Search

right

pass

Footnotes