Interview question
Here’s a nice simple interview question:
In a given .NET string, assume there are line breaks in standard \r\n form (basically Environment.NewLine).
Write a method that inserts a space between two consecutive line breaks to separate any two line breaks from each other.
Also, anyone venture a guess what is a practical application for such a method?
Comments
Anonymous
March 25, 2010
string.Replace("rnrn", "rn rn");Anonymous
March 25, 2010
Good one Robert, I like it, its simpler than mine. string insertSpaceBetweenLineBreaks(string input) { return Regex.Replace(input, "(rn)(\1)", "$1 $2"); }Anonymous
March 25, 2010
Beaten to it. I was thinking exactly the same as Robert.Anonymous
March 25, 2010
I'm afraid it's not that easy. What about rnrnrn ? :)Anonymous
March 25, 2010
As Ben mentionned, you need to take care of more than 2 consecutive breaks. I think it is necessary to iterate through the chars and create a new string as you go (which isn't an optimum solution): string text = "hello rnrnrn world!"; string textResult = ""; char[] caFromText = text.ToCharArray(); for (int i = 0; i < caFromText.Length; ++i) { textResult += caFromText[i]; if ((int)caFromText[i] == CARRIAGE_RETURN) { // check if not reaching the end of the array if (i + 3 < caFromText.Length) { if ((int)caFromText[i + 1] == NEW_LINE && (int)caFromText[i + 2] == CARRIAGE_RETURN && (int)caFromText[i + 3] == NEW_LINE) { // two consecutive breaks where detected textResult += "n "; // jump to the next break ++i; } } } }Anonymous
March 25, 2010
Hmmm, this version works via loop but I'd like to see a recursive version. This version uses a dash instead of space so its easy to see result. string insertSpaceBetweenLineBreaks(string input) { while (Regex.Match(input, "(rn)(\1)").Success) { input = Regex.Replace(input, "(rn)(\1)", "$1-$1"); } return input; } The test string I used was "ArnBrnrnCrnrnrnDrnrnrnrnE". This is the result A B
C
D
E
Anonymous
March 25, 2010
Recursive version, same result as previous while loop version. string insertSpaceBetweenLineBreaksRec(string input) { if (!Regex.Match(input, "(rn)(\1)").Success) { return input; } else { input = Regex.Replace(input, "(rn)(\1)", "$1-$1"); return insertSpaceBetweenLineBreaksRec(input); } }Anonymous
March 25, 2010
Ben is right, Robert's original way doesn't work. Longer solutions will work, yes, but what about performance? I'm still not happy :) Any other ideas?Anonymous
March 25, 2010
BTW it is amazing how the comments reflect my own train of thought as I was solving it.Anonymous
March 25, 2010
In terms of why you might want this - SMTP?Anonymous
March 25, 2010
Jon - pretty close. Not SMTP, but another web protocol, yes. Thanks for not revealing the answer just yet ;)Anonymous
March 25, 2010
Sorry, not protocol - I should say "markup language"Anonymous
March 25, 2010
Slightly faster : public static string Method3(string input) { int startIndex = 0; var sb = new StringBuilder(input.Length * 2); while (true) { int index = input.IndexOf("rn", startIndex); if (index == -1) { break; } int length = index - startIndex; sb.Append(input.Substring(startIndex, length)); if (length == 1) { sb.Append("-"); } startIndex = index + 1; } return sb.ToString(); } Not much of a difference though.Anonymous
March 25, 2010
straigtforward solution let insertSpaces (s : string) = let rec loop (i : int) acc = let newIndex = s.IndexOf("rnrn", i) if newIndex = -1 then acc + s.Substring(i) else loop (newIndex + 2) (acc + s.Substring(i, newIndex - i) + "rn ") loop 0 "" or let insertSpaces2 (s : string) = let checkCrlf i = if i < s.Length - 1 then s.[i] = 'r' && s.[i+1] = 'n' else false let rec loop i crlf (acc : System.Text.StringBuilder) = if i >= s.Length then acc.ToString() else let found = checkCrlf i match found, crlf with | true, true -> loop i false (acc.Append(' ')) | true, false -> loop (i + 2) true (acc.Append("rn")) | false, _ -> loop (i + 1) false (acc.Append(s.[i])) loop 0 false (new System.Text.StringBuilder())Anonymous
March 25, 2010
Here's my take at it : public static string InsertSpacesBetweenLineBreaks(string s) { StringBuilder sb = new StringBuilder(); bool prevCrLf = false; char prev = '�'; foreach (char c in s) { if (c == 'n' && prev == 'r') { if (prevCrLf) sb.Append(' '); prevCrLf = true; } else if (c != 'r') { prevCrLf = false; } sb.Append(c); prev = c; } return sb.ToString(); }Anonymous
March 25, 2010
Let's take the declarative route ;) . // 4.0 Join overload string.Join(Environment.NewLine, input.Split(new[] { Environment.NewLine }, StringSplitOptions.None).Select(s => s == "" ? " " : s)) Oh, who said it has to be C# anyway? input.Split([| Environment.NewLine |]. StringSplitOptions.None) |> Seq.map (function "" -> " " | s -> s) |> String.concat Environment.NewLineAnonymous
March 25, 2010
Here's my F# solution let addSpaces (data:string) = let addStateToList state list = match state with | 0 -> list | 1 -> 'r'::list | 2 -> 'n'::'r'::list | 3 -> 'r'::'n'::'r'::list | _ -> failwith "invalid" let list,state= data |> Seq.fold ( fun (l,state) cur -> match state,cur with | 0,'r' -> (l,1) | 1,'n' -> (l,2) | 2,'r' -> (l,3) | 3,'n' -> (' '::'n'::'r'::l,2) | _ -> (cur :: (l |> addStateToList state), 0)) (List.empty,0) let list = addStateToList state list let arr = list |> List.rev |> Array.ofList new System.String(arr)Anonymous
March 25, 2010
I vote for Sebastian U, because:
- Efficient enough (if we make some assumptions).
- Takes a very small amount of time to read and comprehend.
Anonymous
March 25, 2010
What about string.Replace("nr", "n r");Anonymous
March 25, 2010
private string Process(string str) { return str .Split(new string[] { Environment.NewLine }, StringSplitOptions.None) .Select(x => x.Length == 0 ? " " : x) .Aggregate((a, x) => a + Environment.NewLine + x); }Anonymous
March 25, 2010
var s = "SIGNLErnDOUBLErnrnTRIPLErnrnrnENDrn**********rn"; System.Diagnostics.Trace.Write( s ); var sa = s.Split( new string[] { Environment.NewLine }, System.StringSplitOptions.None ); s = string.Empty; foreach( string ss in sa ) s += ( ss == string.Empty ? " " : ss ) + Environment.NewLine; System.Diagnostics.Trace.Write( s );Anonymous
March 25, 2010
ha-ha. banko placed same solution while I was testing mine! :)Anonymous
March 25, 2010
string output = Regex.Replace(input, @"(rn)(?=rn)", "$1 ");Anonymous
March 26, 2010
The comment has been removedAnonymous
March 26, 2010
I'm surprised but seems like it's faster than the split/join method : public static string Method4(string input) { string result = input; string tmp = string.Empty; while (result.Length != tmp.Length) { tmp = result; result = tmp.Replace("rnrn", "rn-rn"); } return result; } (but the fastest yet on my machine is Thomas Levesque's )Anonymous
March 26, 2010
Whipped this up in about ten minutes. Seems too simple compared to the other solutions presented. static String ReplaceNewlines(String s) { String origNewlines = Environment.NewLine + Environment.NewLine; String newNewlines = Environment.NewLine + "-" + Environment.NewLine; int offset = Environment.NewLine.Length + 1; int startIndex = 0; int idx = s.IndexOf(origNewlines, startIndex); while (idx >= 0) { s = s.Replace(origNewlines, newNewlines); startIndex += offset; idx = s.IndexOf(origNewlines, startIndex); } return s; }Anonymous
March 26, 2010
The comment has been removedAnonymous
March 26, 2010
The comment has been removedAnonymous
March 26, 2010
Can't you simply run it twice? string.Replace("rnrn", "rn rn"); string.Replace("rnrn", "rn rn"); The first pass would leave a maximum of two consecutive rn sequences and the second pass would clear those up. Might not be the most efficient, but it's certainly easy to understand and read.Anonymous
March 26, 2010
@SebastianU - that is some sweet declarative code, thanks!Anonymous
March 26, 2010
My first version didn't work, but this will. Quite pragmatic, and probably also the fastest. test.Replace("rnrn", "rn rn").Replace("rnrn", "rn rn");Anonymous
March 26, 2010
here's one: static string ReplaceNewlines (string str) { // check for null if (str == null) throw new ArgumentNullException("str"); int cch = str.Length; // simple boundary condition, helps later if (cch < 4) return str; // pre-allocate space StringBuilder sb = new StringBuilder(cch * 4 / 3 + 1); // handle boundary condtion later, instead of in loop cch -= 3; int ich = 0; while (ich < cch) { char ch = str [ich++]; sb.Append (ch); // check for double newlines, won't overflow if (ch == 'r' && str [ich] == 'n' && str [ich + 1] == 'r' && str [ich + 2] == 'n') { sb.Append ("n "); ich++; } } // copy the remaining characters sb.Append (str, ich, str.Length - ich); return sb.ToString(); }Anonymous
March 26, 2010
oops, that should have been: // pre-allocate space StringBuilder sb = new StringBuilder(cch * 3 / 2 + 1);Anonymous
March 26, 2010
The comment has been removedAnonymous
March 26, 2010
My take: static string Transform(string s) { var input = new StringReader(s); var output = new StringBuilder(); string line = null; while ((line = input.ReadLine()) != null) output.AppendLine(line.Equals(string.Empty) ? " " : line); return output.ToString(); }Anonymous
March 26, 2010
I'm curious about your response to Tad Smith's solution, because it certainly seems to work. Starting with string s = "rnrnrnrn"; s.Replace("rnrn", "rn-rn"); // - for clarity gives: "rn-rnrn-rn" repeating the operation gives: "rn-rn-rn-rn" isn't that what's required?Anonymous
March 26, 2010
Actually you're right, my bad, sorry. It works indeed! However it does twice as much work and twice as much allocations then text.Replace("nr", "n r")Anonymous
March 26, 2010
text.Replace("nr", "n r") is very clever, and very efficient. You are making an assumption which wasn't stated in the original problem though: that the only time your string has the characters "nr" is when it's part of the sequence "rnrn". That's probably a reasonable assumption. If there was any doubt though, I'd probably err on the safe side - correctness must trump performance surely?Anonymous
March 26, 2010
The problem is stated ambiguously, since "rn" is not the same as Environment.NewLine, which is platform dependent. Correctness should come first , optimization second. What if the source gets compiled under Mono and Environment.NewLine is suddenly "n" ?Anonymous
March 28, 2010
Kirill - Just a suggestion: you may want to provide some unit tests. Many of the answers given so far have subtle differences. The test should specify how the functionality should handle input like "rnr" or "rnn" (many of the answers would handle these differently). Note: I know this may go above and beyond an "interview question", but I think the conversation has long since passed that point.Anonymous
March 29, 2010
I want to say this has something to do with a multi-part post? Those have some funky rules to them (had to write a method to create one once, was a pain since I had no one to ask questions to)Anonymous
March 29, 2010
Bad performance, but elegant: while(str.Contains("rnrn")) str = str.Replace("rnrn", "rn rn"); If it ends up being on a critical path and too slow, only then I'd try to optimize (loop and use StringBuilder to make a new string).Anonymous
March 29, 2010
What should happens if there are not only complete "rn", but 'r' and 'n' alone, e.g. "ArBCDEnnFGrrH"?Anonymous
March 29, 2010
What hasn't been mentioned yet is that some of these solutions (e.g. the ones using IndexOf) may end up in an infinite loop in the presence of weightless Unicode characters. "nXr".IndexOf("nr") will return 0 if X is a Unicode character that isn't present in the sorting weight table. See http://esmithy.net/2007/10/11/unicode-surprises/ for details. (However, the character used in that blog post was added to the weight tables in .NET 4) string.Replace does not use culture-aware comparisons, so the string will stay unchanged (as expected) and you've got an infinite loop. Solution: 95% of the time, you'll want to pass StringComparison.Ordinal to all string methods. Use CurrentCulture only if you really want culture-specific stuff to be relevant. Don't use InvariantCulture, ever (unless you really, really know what you are doing).Anonymous
March 30, 2010
I've added all the implementations to a solution, with unit tests. I probably haven't thought of all the tests necessary, so if you have any more ideas, please let me know. I've also added simple benchmarking for the implementations which pass the unit tests. http://sites.google.com/site/rikkus/kirill-osenkov-s-interview-questionAnonymous
March 30, 2010
Rik - this is absolutely fantastic! Thanks so much for this!!Anonymous
April 06, 2010
Wow - thanks Rik! Posted about this at: http://blog.jordanterrell.com/post/Never-Underestimate-A-Well-Written-Regular-Expression.aspx