次の方法で共有


Interview question

Here’s a nice simple interview question:

In a given .NET string, assume there are line breaks in standard \r\n form (basically Environment.NewLine).

Write a method that inserts a space between two consecutive line breaks to separate any two line breaks from each other.

Also, anyone venture a guess what is a practical application for such a method?

Comments

  • Anonymous
    March 25, 2010
    string.Replace("rnrn", "rn rn");

  • Anonymous
    March 25, 2010
    Good one Robert, I like it, its simpler than mine. string insertSpaceBetweenLineBreaks(string input) {    return Regex.Replace(input, "(rn)(\1)", "$1 $2"); }

  • Anonymous
    March 25, 2010
    Beaten to it. I was thinking exactly the same as Robert.

  • Anonymous
    March 25, 2010
    I'm afraid it's not that easy. What about rnrnrn ? :)

  • Anonymous
    March 25, 2010
    As Ben mentionned, you need to take care of more than 2 consecutive breaks. I think it is necessary to iterate through the chars and create a new string as you go (which isn't an optimum solution): string text = "hello rnrnrn world!"; string textResult = ""; char[] caFromText = text.ToCharArray(); for (int i = 0; i < caFromText.Length; ++i) { textResult += caFromText[i]; if ((int)caFromText[i] == CARRIAGE_RETURN) {  // check if not reaching the end of the array  if (i + 3 < caFromText.Length)  {   if ((int)caFromText[i + 1] == NEW_LINE   && (int)caFromText[i + 2] == CARRIAGE_RETURN   && (int)caFromText[i + 3] == NEW_LINE)   {    // two consecutive breaks where detected    textResult += "n ";    // jump to the next break    ++i;   }  } } }

  • Anonymous
    March 25, 2010
    Hmmm, this version works via loop but I'd like to see a recursive version. This version uses a dash instead of space so its easy to see result. string insertSpaceBetweenLineBreaks(string input) {    while (Regex.Match(input, "(rn)(\1)").Success)    {        input = Regex.Replace(input, "(rn)(\1)", "$1-$1");    }    return input; } The test string I used was "ArnBrnrnCrnrnrnDrnrnrnrnE". This is the result A B

C

D

E

  • Anonymous
    March 25, 2010
    Recursive version, same result as previous while loop version. string insertSpaceBetweenLineBreaksRec(string input) {    if (!Regex.Match(input, "(rn)(\1)").Success)    {        return input;    }    else    {        input = Regex.Replace(input, "(rn)(\1)", "$1-$1");        return insertSpaceBetweenLineBreaksRec(input);    } }

  • Anonymous
    March 25, 2010
    Ben is right, Robert's original way doesn't work. Longer solutions will work, yes, but what about performance? I'm still not happy :) Any other ideas?

  • Anonymous
    March 25, 2010
    BTW it is amazing how the comments reflect my own train of thought as I was solving it.

  • Anonymous
    March 25, 2010
    In terms of why you might want this - SMTP?

  • Anonymous
    March 25, 2010
    Jon - pretty close. Not SMTP, but another web protocol, yes. Thanks for not revealing the answer just yet ;)

  • Anonymous
    March 25, 2010
    Sorry, not protocol - I should say "markup language"

  • Anonymous
    March 25, 2010
    Slightly faster : public static string Method3(string input)        {            int startIndex = 0;            var sb = new StringBuilder(input.Length * 2);            while (true)            {                int index = input.IndexOf("rn", startIndex);                if (index == -1)                {                    break;                }                int length = index - startIndex;                sb.Append(input.Substring(startIndex, length));                if (length == 1)                {                    sb.Append("-");                }                startIndex = index + 1;            }            return sb.ToString();        } Not much of a difference though.

  • Anonymous
    March 25, 2010
    straigtforward solution let insertSpaces (s : string) =    let rec loop (i : int) acc =        let newIndex = s.IndexOf("rnrn", i)        if newIndex = -1 then acc + s.Substring(i)        else loop (newIndex + 2) (acc + s.Substring(i, newIndex - i) + "rn ")    loop 0 "" or let insertSpaces2 (s : string) =    let checkCrlf i =        if i < s.Length - 1 then s.[i] = 'r' && s.[i+1] = 'n'        else false    let rec loop i crlf (acc : System.Text.StringBuilder) =        if i >= s.Length then acc.ToString()        else            let found = checkCrlf i            match found, crlf with            | true, true -> loop i false (acc.Append(' '))            | true, false -> loop (i + 2) true (acc.Append("rn"))            | false, _ -> loop (i + 1) false (acc.Append(s.[i]))    loop 0 false (new System.Text.StringBuilder())

  • Anonymous
    March 25, 2010
    Here's my take at it :        public static string InsertSpacesBetweenLineBreaks(string s)        {            StringBuilder sb = new StringBuilder();            bool prevCrLf = false;            char prev = '�';            foreach (char c in s)            {                if (c == 'n' && prev == 'r')                {                    if (prevCrLf) sb.Append(' ');                    prevCrLf = true;                }                else if (c != 'r')                {                    prevCrLf = false;                }                sb.Append(c);                prev = c;            }            return sb.ToString();        }

  • Anonymous
    March 25, 2010
    Let's take the declarative route ;) . // 4.0 Join overload string.Join(Environment.NewLine, input.Split(new[] { Environment.NewLine }, StringSplitOptions.None).Select(s => s == "" ? " " : s)) Oh, who said it has to be C# anyway? input.Split([| Environment.NewLine |]. StringSplitOptions.None) |> Seq.map (function "" -> " " | s -> s) |> String.concat Environment.NewLine

  • Anonymous
    March 25, 2010
    Here's my F# solution let addSpaces (data:string) =    let addStateToList state list =        match state with        | 0 -> list        | 1 -> 'r'::list        | 2 -> 'n'::'r'::list        | 3 -> 'r'::'n'::'r'::list        | _ -> failwith "invalid"    let list,state=          data        |> Seq.fold ( fun (l,state) cur ->              match state,cur with            | 0,'r' -> (l,1)            | 1,'n' -> (l,2)            | 2,'r' -> (l,3)            | 3,'n' -> (' '::'n'::'r'::l,2)            | _ -> (cur :: (l |> addStateToList state), 0)) (List.empty,0)    let list = addStateToList state list    let arr = list |> List.rev |> Array.ofList    new System.String(arr)

  • Anonymous
    March 25, 2010
    I vote for Sebastian U, because:

  1. Efficient enough (if we make some assumptions).
  2. Takes a very small amount of time to read and comprehend.
  • Anonymous
    March 25, 2010
    What about string.Replace("nr", "n r");

  • Anonymous
    March 25, 2010
          private string Process(string str)        {            return                str                .Split(new string[] { Environment.NewLine }, StringSplitOptions.None)                .Select(x => x.Length == 0 ? " " : x)                .Aggregate((a, x) => a + Environment.NewLine + x);        }

  • Anonymous
    March 25, 2010
    var s = "SIGNLErnDOUBLErnrnTRIPLErnrnrnENDrn**********rn"; System.Diagnostics.Trace.Write( s ); var sa = s.Split( new string[] { Environment.NewLine }, System.StringSplitOptions.None ); s = string.Empty; foreach( string ss in sa )   s += ( ss == string.Empty ? " " : ss ) + Environment.NewLine; System.Diagnostics.Trace.Write( s );

  • Anonymous
    March 25, 2010
    ha-ha. banko placed same solution while I was testing mine! :)

  • Anonymous
    March 25, 2010
    string output = Regex.Replace(input, @"(rn)(?=rn)", "$1 ");

  • Anonymous
    March 26, 2010
    The comment has been removed

  • Anonymous
    March 26, 2010
    I'm surprised but seems like it's faster than the split/join method : public static string Method4(string input)        {            string result = input;            string tmp = string.Empty;            while (result.Length != tmp.Length)            {                tmp = result;                result = tmp.Replace("rnrn", "rn-rn");            }            return result;        } (but the fastest yet on my machine is Thomas Levesque's )

  • Anonymous
    March 26, 2010
    Whipped this up in about ten minutes.  Seems too simple compared to the other solutions presented. static String ReplaceNewlines(String s) {  String origNewlines = Environment.NewLine + Environment.NewLine;  String newNewlines = Environment.NewLine + "-" + Environment.NewLine;  int offset = Environment.NewLine.Length + 1;  int startIndex = 0;  int idx = s.IndexOf(origNewlines, startIndex);  while (idx >= 0)  {    s = s.Replace(origNewlines, newNewlines);    startIndex += offset;    idx = s.IndexOf(origNewlines, startIndex);  }  return s; }

  • Anonymous
    March 26, 2010
    The comment has been removed

  • Anonymous
    March 26, 2010
    The comment has been removed

  • Anonymous
    March 26, 2010
    Can't you simply run it twice? string.Replace("rnrn", "rn rn"); string.Replace("rnrn", "rn rn"); The first pass would leave a maximum of two consecutive rn sequences and the second pass would clear those up. Might not be the most efficient, but it's certainly easy to understand and read.

  • Anonymous
    March 26, 2010
    @SebastianU - that is some sweet declarative code, thanks!

  • Anonymous
    March 26, 2010
    My first version didn't work, but this will. Quite pragmatic, and probably also the fastest. test.Replace("rnrn", "rn rn").Replace("rnrn", "rn rn");

  • Anonymous
    March 26, 2010
    here's one: static string ReplaceNewlines (string str) { // check for null if (str == null) throw new ArgumentNullException("str"); int cch = str.Length; // simple boundary condition, helps later if (cch < 4) return str; // pre-allocate space StringBuilder sb = new StringBuilder(cch * 4 / 3 + 1); // handle boundary condtion later, instead of in loop cch -= 3; int ich = 0; while (ich < cch) { char ch = str [ich++]; sb.Append (ch); // check for double newlines, won't overflow if (ch == 'r' && str [ich] == 'n' && str [ich + 1] == 'r' && str [ich + 2] == 'n') { sb.Append ("n "); ich++; } } // copy the remaining characters sb.Append (str, ich, str.Length - ich); return sb.ToString(); }

  • Anonymous
    March 26, 2010
    oops, that should have been: // pre-allocate space StringBuilder sb = new StringBuilder(cch * 3 / 2 + 1);

  • Anonymous
    March 26, 2010
    The comment has been removed

  • Anonymous
    March 26, 2010
    My take: static string Transform(string s) {    var input = new StringReader(s);    var output = new StringBuilder();    string line = null;    while ((line = input.ReadLine()) != null)        output.AppendLine(line.Equals(string.Empty) ? " " : line);    return output.ToString(); }

  • Anonymous
    March 26, 2010
    I'm curious about your response to Tad Smith's solution, because it certainly seems to work. Starting with string s = "rnrnrnrn"; s.Replace("rnrn", "rn-rn"); // - for clarity gives: "rn-rnrn-rn" repeating the operation gives: "rn-rn-rn-rn" isn't that what's required?

  • Anonymous
    March 26, 2010
    Actually you're right, my bad, sorry. It works indeed! However it does twice as much work and twice as much allocations then text.Replace("nr", "n r")

  • Anonymous
    March 26, 2010
    text.Replace("nr", "n r") is very clever, and very efficient. You are making an assumption which wasn't stated in the original problem though: that the only time your string has the characters "nr" is when it's part of the sequence "rnrn". That's probably a reasonable assumption. If there was any doubt though, I'd probably err on the safe side - correctness must trump performance surely?

  • Anonymous
    March 26, 2010
    The problem is stated ambiguously, since "rn" is not the same as Environment.NewLine, which is platform dependent. Correctness should come first , optimization second. What if the source gets compiled under Mono and Environment.NewLine is suddenly "n" ?

  • Anonymous
    March 28, 2010
    Kirill - Just a suggestion: you may want to provide some unit tests.  Many of the answers given so far have subtle differences. The test should specify how the functionality should handle input like "rnr" or "rnn" (many of the answers would handle these differently). Note: I know this may go above and beyond an "interview question", but I think the conversation has long since passed that point.

  • Anonymous
    March 29, 2010
    I want to say this has something to do with a multi-part post? Those have some funky rules to them (had to write a method to create one once, was a pain since I had no one to ask questions to)

  • Anonymous
    March 29, 2010
    Bad performance, but elegant: while(str.Contains("rnrn"))   str = str.Replace("rnrn", "rn rn"); If it ends up being on a critical path and too slow, only then I'd try to optimize (loop and use StringBuilder to make a new string).

  • Anonymous
    March 29, 2010
    What should happens if there are not only complete "rn", but 'r' and 'n' alone, e.g. "ArBCDEnnFGrrH"?

  • Anonymous
    March 29, 2010
    What hasn't been mentioned yet is that some of these solutions (e.g. the ones using IndexOf) may end up in an infinite loop in the presence of weightless Unicode characters. "nXr".IndexOf("nr") will return 0 if X is a Unicode character that isn't present in the sorting weight table. See http://esmithy.net/2007/10/11/unicode-surprises/ for details. (However, the character used in that blog post was added to the weight tables in .NET 4) string.Replace does not use culture-aware comparisons, so the string will stay unchanged (as expected) and you've got an infinite loop. Solution: 95% of the time, you'll want to pass StringComparison.Ordinal to all string methods. Use CurrentCulture only if you really want culture-specific stuff to be relevant. Don't use InvariantCulture, ever (unless you really, really know what you are doing).

  • Anonymous
    March 30, 2010
    I've added all the implementations to a solution, with unit tests. I probably haven't thought of all the tests necessary, so if you have any more ideas, please let me know. I've also added simple benchmarking for the implementations which pass the unit tests. http://sites.google.com/site/rikkus/kirill-osenkov-s-interview-question

  • Anonymous
    March 30, 2010
    Rik - this is absolutely fantastic! Thanks so much for this!!

  • Anonymous
    April 06, 2010
    Wow - thanks Rik! Posted about this at: http://blog.jordanterrell.com/post/Never-Underestimate-A-Well-Written-Regular-Expression.aspx