求二叉树中两结点的最小公共祖先

据说这是微软的一道面试题，谁知道呢。

问题描述：找出二叉树上任意两个指定结点的最近共同父结点（LCA，Least Common Ancestor）。

这算不上是一道算法题了，主要还是看数据结构基本知识和编程能力。

没有父指针，方法一

通常二叉树结点中并没有父结点指针，这时候就要遍历二叉树找到这两个结点，并找出它们的 LCA。

在遍历二叉树的时候，很容易就能够记录下根结点到任何结点的分支路径，只要有了分支路径，就可以对比找出 LCA。

我们采取前序遍历，即 N-L-R 的顺序，使用堆栈来避免递归并且记录完整的分支路径。那么，在二叉树中查找指定结点的算法可以这样写：

class Dir:
  (Undef, Left, Right) = range(3)

def FindNodes(root, nodeSet, findAll=True):
  if not root or not nodeSet:
    return None

  pathDict = {}
  path = []
  curr = root
  while curr or path:
    while curr:   # Go down along left branch
      path.append((curr, Dir.Left))
      if curr in nodeSet:
        pathDict[curr] = list(path)
        nodeSet.remove(curr)
        if not nodeSet or not findAll:
          return pathDict
      curr = curr.left
    (curr, dir) = path.pop()
    while dir == Dir.Right:   # Back from right branch
      if not path: return pathDict
      (curr, dir) = path.pop()
    path.append((curr, Dir.Right))  # Trun to right from left
    curr = curr.right

  return pathDict

其中 Dir 这个类相当于是一个枚举，用来定义当前的分支方向。FindNodes 除了需要二叉树根结点外，还需要一个待查找的结点集合。这个函数可以在二叉树中找到所有（或第一个）待查找结点的分支路径，并返回一个字典（结点 --> 路径）。

可以看出，FindNodes 函数按照前序顺序遍历整个二叉树，查找指定结点。每遇到一个结点，首先判断它是不是我们要找的，如果不是就沿着左边的分支下降到底，然后转入右侧分支。

有了 FindNodes 函数的支持，我们就可改写前面的 FindLCA 函数，即先遍历二叉树求出两个结点的分支路径，然后比较这两条路径找出 LCA：

def FindLCA(root, node1, node2):
  # Special cases.
  if not root or not node1 or not node2:
    return None
  if node1 is node2:
    return node1

  # Try to find the two nodes in the tree, and get their branch paths.
  nodeSet = set([node1, node2])
  pathDict = FindNodes(root, nodeSet)
  if nodeSet:
    return None

  path1 = [i[0] for i in pathDict[node1]]
  path2 = [i[0] for i in pathDict[node2]]

  # Compare the two paths, find out the LCA.
  lca = None
  minLen = min(len(path1), len(path2))
  for i in xrange(minLen):
    if path1[i] is not path2[i]:
      break
    lca = path1[i]

  return lca

遍历二叉树查找所有指定的结点需要 O(n) 时间，O(h) 额外空间；对比两条分支路径需要 O(h) 的时间，因此总的时间代价为 O(n)，空间代价为 O(h)。

没有父结点，方法二（2012-04-05 23:31 更新）

上面的代码有点儿太啰嗦了，如果不想缓存整条分支路径，或者只是想让代码更简洁一些，也很容易做到，只需要在遍历查找的时候做点儿小小的改动。关于遍历二叉树可以参考后面的一篇文章：程序基本功之遍历二叉树。这里我将在非递归的前序（N-L-R）遍历基础上修改得到求 LCA 的程序。

为什么用前序遍历？

首先考察一下 LCA 的特性，只有两种可能：

LCA 就是其中的一个结点，而另一个结点是它的子孙；
两个结点分别位于 LCA 的左子树和右子树中。

对于第一种可能，前序遍历时首先找到的结点就是 LCA，剩下的事情就是确定第二个结点在它下面。中序和后序也都可以做，但没有这么美妙。

对于第二种可能，假设在前序遍历过程中，首先找到了一个结点（比如下面的 H），根据非递归前序遍历的算法特性，这时候栈里一定是依次存储了结点 A（根节点）、B、D、G（请自行思考为什么没有 C、E、F），再结合 LCA 的特性，很容易发现，LCA 要么是 H 自身（对应于上面第一种情况），要么就只能是 A、B、D 或 G。剩下的事情就太美妙，继续遍历二叉树，直到找到另外一个结点。这时候看看 A、B、D、G 和 H 中还有谁在栈里，最靠下的那个就是 LCA。怎么判定谁在栈里？怎么判定最靠下？用辅助变量呗。

    A
   /
  B
 /
C
 \
  D
 /
E
 \
  F
   \
    G
   /
  H

示意程序代码：

def FindLCA(root, node1, node2):
  nodeset = set([node1, node2])   # Also supports 3 or more nodes.
  s = []          # A stack to help performing N-L-R traversing.
  lca = None      # Records the most possible least common ancestor.
  mindepth = -1   # The depth of lca.
  while root or s:
    if root:
      if root in nodeset:
        nodeset.remove(root)
        if mindepth < 0:
          # Yeah, found the first node. The lca must be itself or already in s.
          lca = root
          mindepth = len(s)
        if not nodeset:
          break
      s.append(root)
      root = root.left
    else:
      root = s.pop()
      if mindepth > len(s):
        lca = root
        mindepth = len(s)
      root = root.right
  return None if nodeset else lca

可以跟程序基本功之遍历二叉树中的 非递归前序遍历 的程序对比一下，会发现改动之处是非常小的。

这段程序时间复杂度都是 O(n)，空间复杂度是 O(h)，这些都是遍历二叉树所需的时间和空间消耗。在遍历之外，就只剩下常数量的时空开销了。

So what do you think? Did I miss something? Is any part unclear? Leave your comments below.

Comments

comments powered by Disqus

求二叉树中两结点的最小公共祖先

有父指针，方法一

有父指针，方法二（2012-04-02 22:10 添加）

没有父指针，方法一

没有父结点，方法二（2012-04-05 23:31 更新）

Comments

Keep Reading

Published

Last Updated

Category

Tags

Stay in Touch