LeetCode 692. Top K Frequent Words – 前缀树(Trie Tree or Prefix Tree)系列题4

Given an array of strings words and an integer k, return the k most frequent strings.

Return the answer sorted by the frequency from highest to lowest. Sort the words with the same frequency by their lexicographical order.

Example 1:

Input: words = ["i","love","leetcode","i","love","coding"], k = 2
Output: ["i","love"]
Explanation: "i" and "love" are the two most frequent words.
Note that "i" comes before "love" due to a lower alphabetical order.

Example 2:

Input: words = ["the","day","is","sunny","the","the","the","sunny","is","is"], k = 4
Output: ["the","is","sunny","day"]
Explanation: "the", "is", "sunny" and "day" are the four most frequent words, with the number of occurrence being 4, 3, 2 and 1 respectively.

Constraints:

  • 1 <= words.length <= 500
  • 1 <= words[i] <= 10
  • words[i] consists of lowercase English letters.
  • k is in the range [1, The number of unique words[i]]

题目意思是统计一个数组里单词出现的频率,然后再找出k个频率最高的单词。处理一个单词数组可以用前缀树(Trie Tree)数据结构,便于单词的插入和查找。求Top K频率单词可以用优先级队列(Priority Queue)。

1)由于要统计单词频率,所以前缀树(Trie Tree)树定义一个计数变量,既用于标识单词的结束又用于表示单词出现的次数。

2)实现一个标准的前缀树插入函数,把单词数组中的所有单词插入前缀树中。

3)遍历整棵前缀树,把遇到的单词及其频率放入一个优先级队列(Priority Queue)中,优先级队列大小始终保持小于等于K。最后队列中的单词就是Top K频率单词。

import heapq
class FreqWord:
    def __init__(self, f, w):
        self.freq = f
        self.word = w
    def __lt__(self, other):
        if self.freq == other.freq :
            return self.word > other.word
        else:
            return self.freq < other.freq
        
class TrieNode:
    def __init__(self):
        self.cnt = 0
        self.children = [None] * 26
        
class Solution:
    def topKFrequent(self, words: List[str], k: int) -> List[str]:
        self.root = TrieNode()
       
        for word in words:
            self.insert(word)
        
        pq = []
        self.search(self.root, '', pq, k)
        
        res = []
        tmp = []
        prevFreq = 0
        while len(pq) > 0:
            fw = heapq.heappop(pq)
            res.append(fw.word)
        
        return res[::-1]
            
        
        
    def search(self, root, word, pq, k):
        if not root:
            return
        
        for i in range(26):
            if not root.children[i]:
                continue
            node = root.children[i]
            word += (chr(ord('a') + i))
            if node.cnt > 0:
                heapq.heappush(pq, FreqWord(node.cnt, word))
                if len(pq) > k:
                    heapq.heappop(pq)
            self.search(node, word, pq, k)
            word = word[:-1]
        
        
    def insert(self, word):
        node = self.root
        
        for c in word:
            idx = ord(c) - ord('a')
            if not node.children[idx]:
                node.children[idx] = TrieNode()
            node = node.children[idx]
        
        node.cnt += 1

本图文内容来源于网友网络收集整理提供,作为学习参考使用,版权属于原作者。
THE END
分享
二维码
< <上一篇
下一篇>>