Table of contents
Description
Question Links: LeetCode 3167, HackerRank
Consider a string, S, that is a series of characters each followed by its frequency as an integer. The string is not compressed correctly, so there may be multiple occurrences of the same character. A properly compressed string will consist of one instance of each character in alphabetical order followed by the total count of that character within the string.
Example 1
The string ‘a3c9b2c1’ has two instances where ‘c’ is followed by a count: once with 9 occurrences, and again with 1. It should be compressed to ‘a3b2c10’.
Example 2
‘a12b56c1’ → ‘a12b56c1’
Explanation: Nothing is changed because each character occurred only once, and they are already sorted ascending.
Example 3
al2c56a1b5 → ‘a13b5c56’ Explanation: ‘a’ occurs twice: 12 times for the first occurrence and 1 time in the second occurrence for a total 13. Sort ‘b’ and ‘c’ in alphabetic order.
Example 4:
Input: compressed = “c2b3a1”
Output: “a1b3c2”
Example 5:
Input: compressed = “a2b4c1”
Output: “a2b4c1”
Function Description
betterCompression has the following parameter:
S: a compressed string
Returns:
string: the properly compressed string
HackerRank Constraints:
- 1 ≤ size of S ≤ 100000
- ‘a’ <= characters in S <= ‘z’
- 1 ≤ frequency of each character in S ≤ 1000
LeetCode Constraints:
1 <= compressed.length <= 6 * 10^4compressedconsists only of lowercase English letters and digits.compressedis a valid compression, i.e., each character is followed by its frequency.- Frequencies are in the range
[1, 10^4]and have no leading zeroes.
Solution
let n=s.length
Idea
We can process the string input and accumulate the count for each of the characters in a-z. And finally we concat the correct count for each character for the result.
If we use a ordered map, the time complexity would be O(nlog26).
Complexity: Time O(n), Space O(26) or O(1).
Possible Bugs:
- The integer count may have multiple digits, do not assume it is single digit.
- Do not include the characters having a zero count.
Python
def better_compression(s: str) -> str:
n, res, i = len(s), [0] * 26, 0
while i < n:
c, cnt = s[i], 0
i += 1
while i < n and s[i].isdigit():
cnt = cnt * 10 + int(s[i])
i += 1
res[ord(c) - ord('a')] += cnt
return ''.join(chr(i + ord('a')) + str(res[i]) for i in range(26) if res[i])
Java
// O(n) time, O(1) space.
class Solution {
public String betterCompression(String s) {
int[] res = new int[26];
int n = s.length(), i = 0;
while (i < n) {
int c = s.charAt(i) - 'a';
i++;
int cnt = 0;
while (i < n && Character.isDigit(s.charAt(i))) {
cnt = cnt * 10 + (s.charAt(i) - '0');
i++;
}
res[c] += cnt;
}
StringBuilder sb = new StringBuilder();
for (int j = 0; j < 26; j++) {
if (res[j] > 0) {
sb.append((char) ('a' + j)).append(res[j]);
}
}
return sb.toString();
}
}
C++
// leet 3167 HackerRank Better Compression. O(n) time, O(1) space.
class Solution3167 {
public:
string betterCompression(const string &s) {
int res[26] = {};
int n = s.size(), i = 0;
while (i < n) {
int c = s[i] - 'a';
i++;
int cnt = 0;
while (i < n && isdigit(s[i])) {
cnt = cnt * 10 + (s[i] - '0');
i++;
}
res[c] += cnt;
}
string result;
for (int j = 0; j < 26; j++) {
if (res[j] > 0) {
result += (char) ('a' + j);
result += to_string(res[j]);
}
}
return result;
}
};
Rust
impl Solution {
pub fn better_compression(s: &str) -> String {
let mut res = [0u32; 26];
let bytes = s.as_bytes();
let n = bytes.len();
let mut i = 0;
while i < n {
let c = (bytes[i] - b'a') as usize;
i += 1;
let mut cnt: u32 = 0;
while i < n && bytes[i].is_ascii_digit() {
cnt = cnt * 10 + (bytes[i] - b'0') as u32;
i += 1;
}
res[c] += cnt;
}
let mut result = String::new();
for j in 0..26 {
if res[j] > 0 {
result.push((b'a' + j as u8) as char);
result.push_str(&res[j].to_string());
}
}
result
}
}
Unit Test
class Test(TestCase):
def test_better_compression(self):
cases = [
('a3c9b2c1', 'a3b2c10'),
('a12b56c1', 'a12b56c1'),
('a12c56a1b5', 'a13b5c56'),
('c2b3a1', 'a1b3c2'),
('a2b4c1', 'a2b4c1'),
]
for s, exp in cases:
with self.subTest(s=s):
self.assertEqual(exp, better_compression(s))