Table of contents
- Given problem
- Analyze our problem
- Using Binary Search algorithm
- Using TreeMap data structure
- Using TreeSet data structure
- Wrapping up
Given problem
A social media company is trying to monitor activity on their site by analyzing the number of tweets that occur in select periods of time. These periods can be partitioned into smaller time chunks based on a certain frequency (every minute, hour, or day).
For example, the period [10, 10000] (in seconds) would be partitioned into the following time chunks with these frequencies:
- Every minute (60-second chunks):
[10,69], [70,129], [130,189], ..., [9970,10000]
. - Every hour (3600-second chunks):
[10,3609], [3610,7209], [7210,10000]
. - Every day (86400-second chunks):
[10,10000]
Notice that the last chunk may be shorter than the specified frequency’s chunk size and will always end with the end time of the period (10000 in the above example).
Design and implement an API to help the company with their analysis.
Implement the TweetCounts
class:
-
TweetCounts()
Initializes the
TweetCounts
object. -
void recordTweet(String tweetName, int time)
Stores the tweetName at the recorded time (in seconds).
-
List<Integer> getTweetCountsPerFrequency(String freq, String tweetName, int startTime, int endTime)
Returns a list of integers representing the number of tweets with
tweetName
in each time chunk for the given period of time[startTime, endTime]
(in seconds) and frequencyfreq
.freq
is one of “minute
”, “hour
”, or “day
” representing a frequency of every minute, hour, or day respectively.
For example:
Input
["TweetCounts","recordTweet","recordTweet","recordTweet","getTweetCountsPerFrequency","getTweetCountsPerFrequency","recordTweet","getTweetCountsPerFrequency"]
[[],["tweet3",0],["tweet3",60],["tweet3",10],["minute","tweet3",0,59],["minute","tweet3",0,60],["tweet3",120],["hour","tweet3",0,210]]
Output
[null,null,null,null,[2],[2,1],null,[4]]
Explanation
TweetCounts tweetCounts = new TweetCounts();
tweetCounts.recordTweet("tweet3", 0); // New tweet "tweet3" at time 0
tweetCounts.recordTweet("tweet3", 60); // New tweet "tweet3" at time 60
tweetCounts.recordTweet("tweet3", 10); // New tweet "tweet3" at time 10
tweetCounts.getTweetCountsPerFrequency("minute", "tweet3", 0, 59); // return [2]; chunk [0,59] had 2 tweets
tweetCounts.getTweetCountsPerFrequency("minute", "tweet3", 0, 60); // return [2,1]; chunk [0,59] had 2 tweets, chunk [60,60] had 1 tweet
tweetCounts.recordTweet("tweet3", 120); // New tweet "tweet3" at time 120
tweetCounts.getTweetCountsPerFrequency("hour", "tweet3", 0, 210); // return [4]; chunk [0,210] had 4 tweets
Constraints:
0 <= time, startTime, endTime <= 109
0 <= endTime - startTime <= 104
- There will be at most
10^4
calls in total torecordTweet
andgetTweetCountsPerFrequency
.
Analyze our problem
In this problem, we have recordTweet()
and getTweetCountsPerFrequency()
methods.
-
The
recordTweet()
method will save thetweetName
andtime
of this tweet. -
The
getTweetCountsPerFrequency()
method will check that for eachtweetName
, how many tweets per frequency.
Then we can use some suitable data structure for this problem:
-
HashMap data structure
In the
getTweetCountsPerFrequency()
method, we will get a set of times per tweet to produce the frequency of a tweet per minute, or hour, or day. -
LinkedList data structure
Normally, we will append a new tweet at the end. So the time complexity of this operation is
O(1)
.The drawback of this data structure is that in
getTweetCountsPerFrequency()
method, we will get frequency of a tweet. So using Linked List data structure can takeO(n)
.So we will not use the LinkedList data structure here.
-
In the previous lines, we will use HashMap data structure to map between
tweetName
and its a set oftime
s. So which data structure will be used to contain these times?There are two data structures here:
-
With the time complexity of write operation, read operation that is O(logn), we can use TreeMap, TreeSet data structure.
-
With the time complexity of write operation - O(n), read operation - O(logn), we can use ArrayList data structure that is sorted.
In the
recordTweet()
method, we will use Binary Search algorithm to find which position will be inserted. Then, insert thetime
that is corresponding totweetName
into that position.
-
Using Binary Search algorithm
Below is the solution that use Binary Search algorithm.
class TweetCounts {
private Map<String, List<Integer>> tweetInfos;
private Map<String, Integer> freqs;
public TweetCounts() {
this.tweetInfos = new HashMap<>();
this.freqs = new HashMap<>();
this.freqs.put("minute", 60);
this.freqs.put("hour", 3600);
this.freqs.put("day", 86400);
}
public void recordTweet(String tweetName, int time) {
List<Integer> currentTimesPerTweet = this.tweetInfos.computeIfAbsent(tweetName, t -> new ArrayList<>());
int idx = this.lowerBound(currentTimesPerTweet, time);
currentTimesPerTweet.add(idx, time);
}
public List<Integer> getTweetCountsPerFrequency(String freq, String tweetName, int startTime, int endTime) {
List<Integer> timesPerTweet = this.tweetInfos.get(tweetName);
int startIdx = this.lowerBound(timesPerTweet, startTime);
int endIdx = this.upperBound(timesPerTweet, endTime);
int duration = this.freqs.get(freq);
int outputSize = ((endTime - startTime) / duration) + 1;
int[] res = new int[outputSize];
for (int i = startIdx; i < endIdx; ++i) {
int tmpResIdx = (timesPerTweet.get(i) - startTime) / duration;
res[tmpResIdx] += 1;
}
List<Integer> output = new ArrayList<>();
for (int tmp : res) {
output.add(tmp);
}
return output;
}
/**
* Find the first element that is greater than or equal to target
*
* @param timesPerTweet
* @param currentTime
* @return
*/
private int lowerBound(List<Integer> timesPerTweet, int currentTime) {
int blue = -1;
int red = timesPerTweet.size();
while (blue + 1 != red) {
int mid = blue + (red - blue) / 2;
int midTime = timesPerTweet.get(mid);
if (midTime < currentTime) {
blue = mid;
} else {
red = mid;
}
}
if (red < timesPerTweet.size() && timesPerTweet.get(red) >= currentTime) {
return red;
}
return timesPerTweet.size();
}
/**
* Find the first element that is greater than target
*
* @param timesPerTweet
* @param currentTime
* @return
*/
private int upperBound(List<Integer> timesPerTweet, int currentTime) {
int blue = -1;
int red = timesPerTweet.size();
while (blue + 1 != red) {
int mid = blue + (red - blue) / 2;
int midTime = timesPerTweet.get(mid);
if (midTime <= currentTime) {
blue = mid;
} else {
red = mid;
}
}
if (red < timesPerTweet.size() && timesPerTweet.get(red) > currentTime) {
return red;
}
return timesPerTweet.size();
}
}
In this implementation, we will define the custom implementation of upper_bound()
and lower_bound()
methods in C++.
Using TreeMap data structure
public class TweetCounts {
private Map<String, TreeMap<Integer, Integer>> map;
private Map<String, Integer> freqMap;
public TweetCounts() {
this.map = new HashMap<>();
this.freqMap = new HashMap<>();
this.freqMap.put("minute", 60);
this.freqMap.put("hour", 3600);
this.freqMap.put("day", 86400);
}
public void recordTweet(String tweetName, int time) {
if (!this.map.containsKey(tweetName)) {
this.map.put(tweetName, new TreeMap<>());
}
TreeMap<Integer, Integer> tweetMap = this.map.get(tweetName);
tweetMap.put(time, tweetMap.getOrDefault(time, 0) + 1);
}
public List<Integer> getTweetCountsPerFrequency(String freq, String tweetName, int startTime, int endTime) {
if (!this.map.containsKey(tweetName)) {
return null;
}
List<Integer> res = new LinkedList<>();
int interval = this.freqMap.get(freq);
int size = ((endTime - startTime) / interval) + 1;
int[] buckets = new int[size];
TreeMap<Integer, Integer> tweetMap = this.map.get(tweetName);
for (Map.Entry<Integer, Integer> entry : tweetMap.subMap(startTime, endTime + 1).entrySet()) {
int idx = (entry.getKey() - startTime) / interval;
buckets[idx] += entry.getValue();
}
for (int num : buckets) {
res.add(num);
}
return res;
}
}
Using TreeSet data structure
public class TweetCounts {
private Map<String, TreeSet<Integer>> map = new HashMap<>();
private Map<String, Integer> meta = new HashMap<>();
public TweetCounts() {
this.meta.put("minute", 60);
this.meta.put("hour", 3600);
this.meta.put("day", 86400);
}
public void recordTweet(String tweetName, int time) {
this.map.computeIfAbsent(tweetName, t -> new TreeSet<Integer>()).add(time);
}
public List<Integer> getTweetCountsPerFrequency(String freq, String tweetName, int startTime, int endTime) {
TreeSet<Integer> subset = (TreeSet<Integer>) map.getOrDefault(tweetName, new TreeSet<Integer>())
.subSet(startTime, true, endTime, true);
List<Integer> out = new ArrayList<>();
int chunk = this.meta.get(freq);
for (int i = startTime; i <= endTime; i = i + chunk) {
final int iEndRange = Math.min(i + chunk - 1, endTime);
out.add(subset.subSet(i, true, iEndRange, true)
.size());
}
return out;
}
}
Wrapping up
Refer: