LeetCode 692. Top K Frequent Words – 前缀树(Trie Tree or Prefix Tree)系列题4
Given an array of strings words
and an integer k
, return the k
most frequent strings.
Return the answer sorted by the frequency from highest to lowest. Sort the words with the same frequency by their lexicographical order.
Example 1:
Input: words = ["i","love","leetcode","i","love","coding"], k = 2 Output: ["i","love"] Explanation: "i" and "love" are the two most frequent words. Note that "i" comes before "love" due to a lower alphabetical order.
Example 2:
Input: words = ["the","day","is","sunny","the","the","the","sunny","is","is"], k = 4 Output: ["the","is","sunny","day"] Explanation: "the", "is", "sunny" and "day" are the four most frequent words, with the number of occurrence being 4, 3, 2 and 1 respectively.
Constraints:
1 <= words.length <= 500
1 <= words[i] <= 10
-
words[i]
consists of lowercase English letters. -
k
is in the range[1, The number of unique words[i]]
题目意思是统计一个数组里单词出现的频率,然后再找出k个频率最高的单词。处理一个单词数组可以用前缀树(Trie Tree)数据结构,便于单词的插入和查找。求Top K频率单词可以用优先级队列(Priority Queue)。
1)由于要统计单词频率,所以前缀树(Trie Tree)树定义一个计数变量,既用于标识单词的结束又用于表示单词出现的次数。
2)实现一个标准的前缀树插入函数,把单词数组中的所有单词插入前缀树中。
3)遍历整棵前缀树,把遇到的单词及其频率放入一个优先级队列(Priority Queue)中,优先级队列大小始终保持小于等于K。最后队列中的单词就是Top K频率单词。
import heapq
class FreqWord:
def __init__(self, f, w):
self.freq = f
self.word = w
def __lt__(self, other):
if self.freq == other.freq :
return self.word > other.word
else:
return self.freq < other.freq
class TrieNode:
def __init__(self):
self.cnt = 0
self.children = [None] * 26
class Solution:
def topKFrequent(self, words: List[str], k: int) -> List[str]:
self.root = TrieNode()
for word in words:
self.insert(word)
pq = []
self.search(self.root, '', pq, k)
res = []
tmp = []
prevFreq = 0
while len(pq) > 0:
fw = heapq.heappop(pq)
res.append(fw.word)
return res[::-1]
def search(self, root, word, pq, k):
if not root:
return
for i in range(26):
if not root.children[i]:
continue
node = root.children[i]
word += (chr(ord('a') + i))
if node.cnt > 0:
heapq.heappush(pq, FreqWord(node.cnt, word))
if len(pq) > k:
heapq.heappop(pq)
self.search(node, word, pq, k)
word = word[:-1]
def insert(self, word):
node = self.root
for c in word:
idx = ord(c) - ord('a')
if not node.children[idx]:
node.children[idx] = TrieNode()
node = node.children[idx]
node.cnt += 1