元素唯一性问题

元素唯一性问题（英語：element distinctness/uniqueness problem）是計算複雜性理論中判断列表内所有元素是否唯一的问题。关于这一问题的研究已经完备，可以通过许多不同的计算模型解决。其中一种方法就是通过给列表排序，检验是否存在连续相等元素；也可以借由随机化算法将元素插入哈希表中，比较放在哈希表相同位置元素，从而在线性时间复杂度解决这一问题。^[1]通过将问题转化为查找问题可以最大程度优化算法的时间复杂度。

决策树的复杂度[编辑]

已知对于一组数字列表，其时间复杂度是O(n log n)，或言之其时间复杂度的界限由代数决策树（英语：algebraic decision tree）模型的线性对数所决定^[2]，而在决策树模型中数据不一定需要存入内存（插入哈希表中），只需要计算和比较运算树节点的代数值，这一解法又被称之为渐进最优算法。这一算法可以进一步优化为随机代数决策树（英语：algebraic decision tree）。^[3]^[4]

查找重复元素[编辑]

查找在在含有n项元素的多重集合（multiset）中查找出现超过n/k次的元素所需的时间复杂度为O(n log k)，而元素唯一性问题是前一问题在k=n时的一个特例，决策树是这种算法的理想解决方案。^[5]这种算法可以看成多数投票法（k=2时的特例）的一种归纳推演，两者在研究历史上关系相依。^[6]

上述算法仅仅依赖于检验识别元素，如果可以排序，那么就可以利用基于顺序统计量的搜索算法。比方说，当k=2时，查找中位数的时间复杂度最低，进而方便检测是否有多余n/2个中位数元素，这种算法的比较次数比顺序统计算法来的少。^[6]

参考资料[编辑]

^ Gil, J.; Meyer auf der Heide, F.; Wigderson, A., Not all keys can be hashed in constant time, Proc. 22nd ACM Symposium on Theory of Computing（英语：Symposium on Theory of Computing）: 244–253, 1990, doi:10.1145/100216.100247 .
^ Ben-Or, Michael, Lower bounds for algebraic computation trees, Proc. 15th ACM Symposium on Theory of Computing（英语：Symposium on Theory of Computing）: 80–86, 1983, doi:10.1145/800061.808735 .
^ Grigoriev, Dima; Karpinski, Marek; Heide, Friedhelm Meyer; Smolensky, Roman, A lower bound for randomized algebraic decision trees, Computational Complexity, 1996, 6 (4): 357, doi:10.1007/BF01270387 .
^ Grigoriev, Dima, Complexity lower bounds for randomized computation trees over zero characteristic fields, Computational Complexity, 1999, 8 (4): 316–329, doi:10.1007/s000370050002 .
^ Misra, J.; Gries, D., Finding repeated elements, Science of Computer Programming, 1982, 2 (2): 143–152, doi:10.1016/0167-6423(82)90012-0, hdl:1813/6345  .
^ ^6.0 ^6.1 Boyer, R. S.; Moore, J S., MJRTY - A Fast Majority Vote Algorithm, Boyer, R. S. (编), Automated Reasoning: Essays in Honor of Woody Bledsoe, Automated Reasoning Series, Dordrecht, The Netherlands: Kluwer Academic Publishers: 105–117, 1991 .

[1] Gil, J.; Meyer auf der Heide, F.; Wigderson, A., Not all keys can be hashed in constant time, Proc. 22nd ACM Symposium on Theory of Computing（英语：Symposium on Theory of Computing）: 244–253, 1990, doi:10.1145/100216.100247 .

[2] Ben-Or, Michael, Lower bounds for algebraic computation trees, Proc. 15th ACM Symposium on Theory of Computing（英语：Symposium on Theory of Computing）: 80–86, 1983, doi:10.1145/800061.808735 .

[3] Grigoriev, Dima; Karpinski, Marek; Heide, Friedhelm Meyer; Smolensky, Roman, A lower bound for randomized algebraic decision trees, Computational Complexity, 1996, 6 (4): 357, doi:10.1007/BF01270387 .

[4] Grigoriev, Dima, Complexity lower bounds for randomized computation trees over zero characteristic fields, Computational Complexity, 1999, 8 (4): 316–329, doi:10.1007/s000370050002 .

[5] Misra, J.; Gries, D., Finding repeated elements, Science of Computer Programming, 1982, 2 (2): 143–152, doi:10.1016/0167-6423(82)90012-0, hdl:1813/6345  .

[bm-6] 6.0 ^6.1 Boyer, R. S.; Moore, J S., MJRTY - A Fast Majority Vote Algorithm, Boyer, R. S. (编), Automated Reasoning: Essays in Honor of Woody Bledsoe, Automated Reasoning Series, Dordrecht, The Netherlands: Kluwer Academic Publishers: 105–117, 1991 .

[1]

[2]

[3]

[4]

[5]

[6]

决策树的复杂度[编辑]

查找重复元素[编辑]

相关条目[编辑]

参考资料[编辑]