Question
Design an algorithm to encode a list of strings to a string. The encoded string is then sent over the network and is decoded back to the original list of strings.
Machine 1 (sender) has the function:
string encode(vectorstrs) { // ... your code return encoded_string;}
Machine 2 (receiver) has the function:
vectordecode(string s) { //... your code return strs;}
So Machine 1 does:
string encoded_string = encode(strs);
and Machine 2 does:
vectorstrs2 = decode(encoded_string);
strs2
in Machine 2 should be the same as strs
in Machine 1.
Implement the encode
and decode
methods.
Note:
- The string may contain any possible characters out of 256 valid ascii characters. Your algorithm should be generalized enough to work on any possible characters.
- Do not use class member/global/static variables to store states. Your encode and decode algorithms should be stateless.
- Do not rely on any library method such as
eval
or serialize methods. You should implement your own encode/decode algorithm.
Solution 1 -- JSON format
第一种方法是参考的的规则。
Encode: 我们将输入的字符串数组封装成JSON中的array。[ (left bracket) and ] (right bracket) 表示开头的结尾。中间的分隔用, (comma)。然后对于每个字符串,是 wrapped in double quotes。
由于字符串中本来就可能有双引号或是back slash (\),所以我们需要对这两种符号做转义。方法是多加一个back slash
如 原字符串 \"aafg" -> \\\"aafg\"
JSON里还有更复杂的字符串处理方法。但我们这里的目标只是让encode,再decode后的字符串相同,所以不必那么复杂。
Decode处理原则如下
1. 一个boolean的variable记录当前应该是下一个字符串的开头还是当前字符串的结束
2. 碰到bracket,根据是开始/结束,新建一个空字符串/将当前的字符串存入结果中
3. 碰到back slash,看它下一个元素是否是back slash / bracket,如果是,则将它下一个元素加到字符串中,计数加一。
1 public class Codec { 2 private final char start = '['; 3 private final char end = ']'; 4 private final char include = '"'; 5 private final char strSplit = ','; 6 7 // Encodes a list of strings to a single string. 8 public String encode(Liststrs) { 9 StringBuilder sb = new StringBuilder();10 sb.append(start);11 for (String str : strs) {12 sb.append(include);13 int len = str.length();14 for (int i = 0; i < len; i++) {15 char current = str.charAt(i);16 if (current == '"' || current == '\\') {17 sb.append('\\');18 }19 sb.append(current);20 }21 sb.append(include);22 sb.append(strSplit);23 }24 sb.append(end);25 return sb.toString();26 }27 28 // Decodes a single string to a list of strings.29 public List decode(String s) {30 List result = new ArrayList ();31 if (s == null || s.length() < 1) {32 return result;33 }34 int len = s.length();35 if (s.charAt(0) != start || s.charAt(len - 1) != end) {36 return result;37 }38 boolean startSymbol = true;39 StringBuilder sb = new StringBuilder();40 for (int i = 1; i < len - 1; i++) {41 char current = s.charAt(i);42 if (current == include) {43 if (startSymbol) {44 sb = new StringBuilder();45 } else {46 result.add(sb.toString());47 }48 startSymbol = !startSymbol;49 continue;50 }51 if (current == strSplit && startSymbol) {52 continue;53 }54 if (current == '\\') {55 char next = s.charAt(i + 1);56 if (next == '\\' || next == '"') {57 sb.append(next);58 i++;59 continue;60 }61 }62 sb.append(current);63 }64 return result;65 }66 }67 68 // Your Codec object will be instantiated and called as such:69 // Codec codec = new Codec();70 // codec.decode(codec.encode(strs));
Solution 2
利用了Java里String的 int (int ch, int fromIndex)函数。
同时存入字符串和字符串的长度。
1 public class Codec { 2 3 // Encodes a list of strings to a single string. 4 public String encode(Liststrs) { 5 StringBuilder sb = new StringBuilder(); 6 for (String str : strs) { 7 sb.append(str.length()).append('/').append(str); 8 } 9 return sb.toString();10 }11 12 // Decodes a single string to a list of strings.13 public List decode(String s) {14 List result = new ArrayList ();15 int length = s.length();16 int i = 0;17 while (i < length) {18 int slash = s.indexOf('/', i);19 int size = Integer.valueOf(s.substring(i, slash));20 result.add(s.substring(slash + 1, slash + size + 1));21 i = slash + size + 1;22 }23 return result;24 }25 }