Anchors in Regular Expressions

Article
08/11/2011

Anchors, or atomic zero-width assertions, specify a position in the string where a match must occur. When you use an anchor in your search expression, the regular expression engine does not advance through the string or consume characters; it looks for a match in the specified position only. For example, ^ specifies that the match must start at the beginning of a line or string. Therefore, the regular expression ^http: matches "http:" only when it occurs at the beginning of a line. The following table lists the anchors supported by the regular expressions in the .NET Framework.

Anchor	Description
^	By default, the match must occur at the beginning of the string; in multiline mode, it must occur at the beginning of the line. For more information, see Start of String or Line.
$	By default, the match must occur at the end of the string or before \n at the end of the string; in multiline mode, it must occur at the end of the line or before \n at the end of the line. For more information, see End of String or Line.
\A	The match must occur at the beginning of the string only (no multiline support). For more information, see Start of String Only.
\Z	The match must occur at the end of the string, or before \n at the end of the string. For more information, see End of String or Before Ending Newline.
\z	The match must occur at the end of the string only. For more information, see End of String Only.
\G	The match must start at the position where the previous match ended. For more information, see Contiguous Matches.
\b	The match must occur on a word boundary. For more information, see Word Boundary.
\B	The match must not occur on a word boundary. For more information, see Non-Word Boundary.
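
As a minimal sketch of the ^http: case described in the introduction, the following C# snippet checks two strings and then shows that an anchor on its own matches a position rather than characters. The class name AnchorIntroSketch, the helper StartsWithHttp, and the sample URLs are illustrative assumptions and are not part of the original article.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorIntroSketch
{
    // Returns true only when "http:" occurs at the very start of the input.
    public static bool StartsWithHttp(string input)
    {
        return Regex.IsMatch(input, "^http:");
    }

    public static void Main()
    {
        Console.WriteLine(StartsWithHttp("http://www.contoso.com"));      // True
        Console.WriteLine(StartsWithHttp("See http://www.contoso.com"));  // False

        // An anchor by itself matches a position, so the match is empty.
        Match m = Regex.Match("abc", "^");
        Console.WriteLine("{0}, {1}", m.Index, m.Length);                 // 0, 0
    }
}
```

The second call fails because the engine tests the pattern only at position 0; it does not search the rest of the string for "http:".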

Start of String or Line: ^

The ^ anchor specifies that the following pattern must begin at the first character position of the string. If you use ^ with the RegexOptions.Multiline option (see Regular Expression Options), the match must occur at the beginning of each line.

The following example uses the ^ anchor in a regular expression that extracts information about the years during which some professional baseball teams existed. The example calls two overloads of the Regex.Match method and then iterates with Match.NextMatch:

The call to the Match(String, String) overload finds only the first substring in the input string that matches the regular expression pattern.
The call to the Match(String, String, RegexOptions) overload with the options parameter set to RegexOptions.Multiline finds all five substrings.

Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim startPos As Integer = 0
      Dim endPos As Integer = 70
      Dim input As String = "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957" + vbCrLf + _
                            "Chicago Cubs, National League, 1903-present" + vbCrLf + _
                            "Detroit Tigers, American League, 1901-present" + vbCrLf + _
                            "New York Giants, National League, 1885-1957" + vbCrLf + _
                            "Washington Senators, American League, 1901-1960" + vbCrLf  

      Dim pattern As String = "^((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+"
      Dim match As Match

      ' Provide minimal validation in the event the input is invalid.
      If input.Substring(startPos, endPos).Contains(",") Then
         match = Regex.Match(input, pattern)
         Do While match.Success
            Console.Write("The {0} played in the {1} in", _
                              match.Groups(1).Value, match.Groups(4).Value)
            For Each capture As Capture In match.Groups(5).Captures
               Console.Write(capture.Value)
            Next
            Console.WriteLine(".")
            startPos = match.Index + match.Length 
            endPos = CInt(IIf(startPos + 70 <= input.Length, 70, input.Length - startPos))
            If Not input.Substring(startPos, endPos).Contains(",") Then Exit Do
            match = match.NextMatch()            
         Loop
         Console.WriteLine()                               
      End If      

      startPos = 0
      endPos = 70
      If input.Substring(startPos, endPos).Contains(",") Then
         match = Regex.Match(input, pattern, RegexOptions.Multiline)
         Do While match.Success
            Console.Write("The {0} played in the {1} in", _
                              match.Groups(1).Value, match.Groups(4).Value)
            For Each capture As Capture In match.Groups(5).Captures
               Console.Write(capture.Value)
            Next
            Console.WriteLine(".")
            startPos = match.Index + match.Length 
            endPos = CInt(IIf(startPos + 70 <= input.Length, 70, input.Length - startPos))
            If Not input.Substring(startPos, endPos).Contains(",") Then Exit Do
            match = match.NextMatch()            
         Loop
         Console.WriteLine()                               
      End If      


'       For Each match As Match in Regex.Matches(input, pattern, RegexOptions.Multiline)
'          Console.Write("The {0} played in the {1} in", _
'                            match.Groups(1).Value, match.Groups(4).Value)
'          For Each capture As Capture In match.Groups(5).Captures
'             Console.Write(capture.Value)
'          Next
'          Console.WriteLine(".")
'       Next
   End Sub
End Module
' The example displays the following output:
'    The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
'    
'    The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
'    The Chicago Cubs played in the National League in 1903-present.
'    The Detroit Tigers played in the American League in 1901-present.
'    The New York Giants played in the National League in 1885-1957.
'    The Washington Senators played in the American League in 1901-1960.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      int startPos = 0, endPos = 70;
      string input = "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957\n" +
                     "Chicago Cubs, National League, 1903-present\n" + 
                     "Detroit Tigers, American League, 1901-present\n" + 
                     "New York Giants, National League, 1885-1957\n" +  
                     "Washington Senators, American League, 1901-1960\n";   
      string pattern = @"^((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+";
      Match match;

      if (input.Substring(startPos, endPos).Contains(",")) {
         match = Regex.Match(input, pattern);
         while (match.Success) {
            Console.Write("The {0} played in the {1} in", 
                              match.Groups[1].Value, match.Groups[4].Value);
            foreach (Capture capture in match.Groups[5].Captures)
               Console.Write(capture.Value);

            Console.WriteLine(".");
            startPos = match.Index + match.Length;
            endPos = startPos + 70 <= input.Length ? 70 : input.Length - startPos;
            if (! input.Substring(startPos, endPos).Contains(",")) break;
            match = match.NextMatch();
         }
         Console.WriteLine();
      }

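      // Reset the validation window before the second pass, as the Visual Basic version does.
      startPos = 0;
      endPos = 70;
      if (input.Substring(startPos, endPos).Contains(",")) {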
         match = Regex.Match(input, pattern, RegexOptions.Multiline);
         while (match.Success) {
            Console.Write("The {0} played in the {1} in", 
                              match.Groups[1].Value, match.Groups[4].Value);
            foreach (Capture capture in match.Groups[5].Captures)
               Console.Write(capture.Value);

            Console.WriteLine(".");
            startPos = match.Index + match.Length;
            endPos = startPos + 70 <= input.Length ? 70 : input.Length - startPos;
            if (! input.Substring(startPos, endPos).Contains(",")) break;
            match = match.NextMatch();
         }
         Console.WriteLine();
      }
   }
}
// The example displays the following output:
//    The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
//    
//    The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
//    The Chicago Cubs played in the National League in 1903-present.
//    The Detroit Tigers played in the American League in 1901-present.
//    The New York Giants played in the National League in 1885-1957.
//    The Washington Senators played in the American League in 1901-1960.

The regular expression pattern ^((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+ is defined as shown in the following table.

Pattern	Description
^	Begin the match at the beginning of the input string (or at the beginning of the line if the method is called with the RegexOptions.Multiline option).
((\w+(\s*)){2,})	Match one or more word characters followed by zero or more white-space characters, at least two times. This is the first capturing group. This expression also defines a second and a third capturing group: the second consists of the captured word, and the third consists of the captured white space.
,\s	Match a comma followed by a white-space character.
(\w+\s\w+)	Match one or more word characters, followed by a white-space character, followed by one or more word characters. This is the fourth capturing group.
,	Match a comma.
\s\d{4}	Match a white-space character followed by four decimal digits.
(-(\d{4}|present))*	Match zero or more occurrences of a hyphen followed by four decimal digits or the string "present". This is the sixth capturing group. It also includes a seventh capturing group.
,*	Match zero or more occurrences of a comma.
(\s\d{4}(-(\d{4}|present))*,*)+	Match one or more occurrences of the following: a white-space character, four decimal digits, zero or more occurrences of a hyphen followed by four decimal digits or the string "present", and zero or more commas. This is the fifth capturing group.

End of String or Line: $

The $ anchor specifies that the preceding pattern must occur at the end of the input string, or before \n at the end of the input string.

If you use $ with the RegexOptions.Multiline option, the match can also occur at the end of a line. Note that $ matches before \n but does not match before \r\n (the combination of carriage return and newline characters, or CR/LF). To match the CR/LF character combination, include \r?$ in the regular expression pattern.
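
The CR/LF behavior can be sketched in a few lines of C#. The snippet below is an illustration under stated assumptions: the class EndOfLineSketch, the helper Words, and the three-word input are hypothetical and do not come from the original article. It runs the same multiline match with $ and with \r?$ and collects the first capturing group of each match.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class EndOfLineSketch
{
    // Returns the first capturing group of every match found in multiline mode.
    public static List<string> Words(string input, string pattern)
    {
        var words = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern, RegexOptions.Multiline))
            words.Add(m.Groups[1].Value);
        return words;
    }

    public static void Main()
    {
        string input = "alpha\r\nbeta\r\ngamma\r\n";

        // $ is satisfied before \n, but every word here is followed by \r.
        Console.WriteLine(Words(input, @"^(\w+)$").Count);                            // 0

        // An optional \r lets the pattern reach the position before \n.
        Console.WriteLine(string.Join("|", Words(input, @"^(\w+)\r?$").ToArray()));   // alpha|beta|gamma
    }
}
```

Because \r? is part of the match, the carriage return is consumed by the overall match; placing the word in a capturing group keeps it out of the extracted value.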

The following example adds the $ anchor to the regular expression pattern used in the example in the Start of String or Line section. When used with the original input string, which includes five lines of text, the Regex.Match(String, String) method is unable to find a match, because the end of the first line does not match the $ pattern. When the original input string is split into a string array, the Regex.Match(String, String) method succeeds in matching each of the five lines. When the Regex.Match(String, String, RegexOptions) method is called with the options parameter set to RegexOptions.Multiline, no matches are found because the regular expression pattern does not account for the carriage return character (\u000D). However, when the regular expression pattern is modified by replacing $ with \r?$, calling the Regex.Match(String, String, RegexOptions) method with the options parameter set to RegexOptions.Multiline again finds five matches.

Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim startPos As Integer = 0
      Dim endPos As Integer = 70
      Dim input As String = "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957" + vbCrLf + _
                            "Chicago Cubs, National League, 1903-present" + vbCrLf + _
                            "Detroit Tigers, American League, 1901-present" + vbCrLf + _
                            "New York Giants, National League, 1885-1957" + vbCrLf + _
                            "Washington Senators, American League, 1901-1960" + vbCrLf  

      Dim basePattern As String = "^((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+"
      Dim match As Match

      Dim pattern As String = basePattern + "$"
      Console.WriteLine("Attempting to match the entire input string:")
      ' Provide minimal validation in the event the input is invalid.
      If input.Substring(startPos, endPos).Contains(",") Then
         match = Regex.Match(input, pattern)
         Do While match.Success
            Console.Write("The {0} played in the {1} in", _
                              match.Groups(1).Value, match.Groups(4).Value)
            For Each capture As Capture In match.Groups(5).Captures
               Console.Write(capture.Value)
            Next
            Console.WriteLine(".")
            startPos = match.Index + match.Length 
            endPos = CInt(IIf(startPos + 70 <= input.Length, 70, input.Length - startPos))
            If Not input.Substring(startPos, endPos).Contains(",") Then Exit Do
            match = match.NextMatch()            
         Loop
         Console.WriteLine()                               
      End If      

      Dim teams() As String = input.Split(New String() { vbCrLf }, StringSplitOptions.RemoveEmptyEntries)
      Console.WriteLine("Attempting to match each element in a string array:")
      For Each team As String In teams
         If team.Length > 70 Then Continue For
         match = Regex.Match(team, pattern)
         If match.Success Then
            Console.Write("The {0} played in the {1} in", _
                           match.Groups(1).Value, match.Groups(4).Value)
            For Each capture As Capture In match.Groups(5).Captures
               Console.Write(capture.Value)
            Next
            Console.WriteLine(".")
         End If
      Next
      Console.WriteLine()

      startPos = 0
      endPos = 70
      Console.WriteLine("Attempting to match each line of an input string with '$':")
      ' Provide minimal validation in the event the input is invalid.
      If input.Substring(startPos, endPos).Contains(",") Then
         match = Regex.Match(input, pattern, RegexOptions.Multiline)
         Do While match.Success
            Console.Write("The {0} played in the {1} in", _
                              match.Groups(1).Value, match.Groups(4).Value)
            For Each capture As Capture In match.Groups(5).Captures
               Console.Write(capture.Value)
            Next
            Console.WriteLine(".")
            startPos = match.Index + match.Length 
            endPos = CInt(IIf(startPos + 70 <= input.Length, 70, input.Length - startPos))
            If Not input.Substring(startPos, endPos).Contains(",") Then Exit Do
            match = match.NextMatch()            
         Loop
         Console.WriteLine()                               
      End If      


      startPos = 0
      endPos = 70
      pattern = basePattern + "\r?$" 
      Console.WriteLine("Attempting to match each line of an input string with '\r?$':")
      ' Provide minimal validation in the event the input is invalid.
      If input.Substring(startPos, endPos).Contains(",") Then
         match = Regex.Match(input, pattern, RegexOptions.Multiline)
         Do While match.Success
            Console.Write("The {0} played in the {1} in", _
                              match.Groups(1).Value, match.Groups(4).Value)
            For Each capture As Capture In match.Groups(5).Captures
               Console.Write(capture.Value)
            Next
            Console.WriteLine(".")
            startPos = match.Index + match.Length 
            endPos = CInt(IIf(startPos + 70 <= input.Length, 70, input.Length - startPos))
            If Not input.Substring(startPos, endPos).Contains(",") Then Exit Do
            match = match.NextMatch()            
         Loop
         Console.WriteLine()                               
      End If      
   End Sub
End Module
' The example displays the following output:
'    Attempting to match the entire input string:
'    
'    Attempting to match each element in a string array:
'    The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
'    The Chicago Cubs played in the National League in 1903-present.
'    The Detroit Tigers played in the American League in 1901-present.
'    The New York Giants played in the National League in 1885-1957.
'    The Washington Senators played in the American League in 1901-1960.
'    
'    Attempting to match each line of an input string with '$':
'    
'    Attempting to match each line of an input string with '\r?$':
'    The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
'    The Chicago Cubs played in the National League in 1903-present.
'    The Detroit Tigers played in the American League in 1901-present.
'    The New York Giants played in the National League in 1885-1957.
'    The Washington Senators played in the American League in 1901-1960.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      int startPos = 0, endPos = 70;
      string cr = Environment.NewLine;
      string input = "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957" + cr +
                     "Chicago Cubs, National League, 1903-present" + cr + 
                     "Detroit Tigers, American League, 1901-present" + cr + 
                     "New York Giants, National League, 1885-1957" + cr +  
                     "Washington Senators, American League, 1901-1960" + cr;   
      Match match;

      string basePattern = @"^((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+";
      string pattern = basePattern + "$";
      Console.WriteLine("Attempting to match the entire input string:");
      if (input.Substring(startPos, endPos).Contains(",")) {
         match = Regex.Match(input, pattern);
         while (match.Success) {
            Console.Write("The {0} played in the {1} in", 
                              match.Groups[1].Value, match.Groups[4].Value);
            foreach (Capture capture in match.Groups[5].Captures)
               Console.Write(capture.Value);

            Console.WriteLine(".");
            startPos = match.Index + match.Length;
            endPos = startPos + 70 <= input.Length ? 70 : input.Length - startPos;
            if (! input.Substring(startPos, endPos).Contains(",")) break;
            match = match.NextMatch();
         }
         Console.WriteLine();
      }

      string[] teams = input.Split(new String[] { cr }, StringSplitOptions.RemoveEmptyEntries);
      Console.WriteLine("Attempting to match each element in a string array:");
      foreach (string team in teams)
      {
         if (team.Length > 70) continue;

         match = Regex.Match(team, pattern);
         if (match.Success)
         {
            Console.Write("The {0} played in the {1} in", 
                          match.Groups[1].Value, match.Groups[4].Value);
            foreach (Capture capture in match.Groups[5].Captures)
               Console.Write(capture.Value);
            Console.WriteLine(".");
         }
      }
      Console.WriteLine();

      startPos = 0;
      endPos = 70;
      Console.WriteLine("Attempting to match each line of an input string with '$':");
      if (input.Substring(startPos, endPos).Contains(",")) {
         match = Regex.Match(input, pattern, RegexOptions.Multiline);
         while (match.Success) {
            Console.Write("The {0} played in the {1} in", 
                              match.Groups[1].Value, match.Groups[4].Value);
            foreach (Capture capture in match.Groups[5].Captures)
               Console.Write(capture.Value);

            Console.WriteLine(".");
            startPos = match.Index + match.Length;
            endPos = startPos + 70 <= input.Length ? 70 : input.Length - startPos;
            if (! input.Substring(startPos, endPos).Contains(",")) break;
            match = match.NextMatch();
         }
         Console.WriteLine();
      }

      startPos = 0;
      endPos = 70;
      pattern = basePattern + @"\r?$"; 
      Console.WriteLine(@"Attempting to match each line of an input string with '\r?$':");
      if (input.Substring(startPos, endPos).Contains(",")) {
         match = Regex.Match(input, pattern, RegexOptions.Multiline);
         while (match.Success) {
            Console.Write("The {0} played in the {1} in", 
                              match.Groups[1].Value, match.Groups[4].Value);
            foreach (Capture capture in match.Groups[5].Captures)
               Console.Write(capture.Value);

            Console.WriteLine(".");
            startPos = match.Index + match.Length;
            endPos = startPos + 70 <= input.Length ? 70 : input.Length - startPos;
            if (! input.Substring(startPos, endPos).Contains(",")) break;
            match = match.NextMatch();
         }
         Console.WriteLine();
      }
   }
}
// The example displays the following output:
//    Attempting to match the entire input string:
//    
//    Attempting to match each element in a string array:
//    The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
//    The Chicago Cubs played in the National League in 1903-present.
//    The Detroit Tigers played in the American League in 1901-present.
//    The New York Giants played in the National League in 1885-1957.
//    The Washington Senators played in the American League in 1901-1960.
//    
//    Attempting to match each line of an input string with '$':
//    
//    Attempting to match each line of an input string with '\r?$':
//    The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.
//    The Chicago Cubs played in the National League in 1903-present.
//    The Detroit Tigers played in the American League in 1901-present.
//    The New York Giants played in the National League in 1885-1957.
//    The Washington Senators played in the American League in 1901-1960.

Start of String Only: \A

The \A anchor specifies that a match must occur at the beginning of the input string. It is identical to the ^ anchor, except that \A ignores the RegexOptions.Multiline option. Therefore, it can only match the start of the first line in a multiline input string.
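
The difference from ^ can be reduced to a count of matches. The following C# sketch assumes a hypothetical three-line input and a helper named CountMatches, neither of which appears in the original article; it applies RegexOptions.Multiline to both patterns.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StartOfStringSketch
{
    // Counts the matches of a pattern with RegexOptions.Multiline enabled.
    public static int CountMatches(string input, string pattern)
    {
        return Regex.Matches(input, pattern, RegexOptions.Multiline).Count;
    }

    public static void Main()
    {
        string input = "one\ntwo\nthree";

        // ^ is satisfied at the start of every line in multiline mode.
        Console.WriteLine(CountMatches(input, @"^\w+"));   // 3

        // \A is satisfied only at position 0, regardless of the option.
        Console.WriteLine(CountMatches(input, @"\A\w+"));  // 1
    }
}
```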

The following example is similar to the examples for the ^ and $ anchors. It uses the \A anchor in a regular expression that extracts information about the years during which some professional baseball teams existed. The input string includes five lines. The call to the Regex.Match(String, String, RegexOptions) method finds only the first substring in the input string that matches the regular expression pattern. As the example shows, the Multiline option has no effect.

Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim startPos As Integer = 0
      Dim endPos As Integer = 70
      Dim input As String = "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957" + vbCrLf + _
                            "Chicago Cubs, National League, 1903-present" + vbCrLf + _
                            "Detroit Tigers, American League, 1901-present" + vbCrLf + _
                            "New York Giants, National League, 1885-1957" + vbCrLf + _
                            "Washington Senators, American League, 1901-1960" + vbCrLf  

      Dim pattern As String = "\A((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+"
      Dim match As Match

      ' Provide minimal validation in the event the input is invalid.
      If input.Substring(startPos, endPos).Contains(",") Then
         match = Regex.Match(input, pattern, RegexOptions.Multiline)
         Do While match.Success
            Console.Write("The {0} played in the {1} in", _
                              match.Groups(1).Value, match.Groups(4).Value)
            For Each capture As Capture In match.Groups(5).Captures
               Console.Write(capture.Value)
            Next
            Console.WriteLine(".")
            startPos = match.Index + match.Length 
            endPos = CInt(IIf(startPos + 70 <= input.Length, 70, input.Length - startPos))
            If Not input.Substring(startPos, endPos).Contains(",") Then Exit Do
            match = match.NextMatch()            
         Loop
         Console.WriteLine()                               
      End If      
   End Sub   
End Module
' The example displays the following output:
'    The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      int startPos = 0, endPos = 70;
      string input = "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957\n" +
                     "Chicago Cubs, National League, 1903-present\n" + 
                     "Detroit Tigers, American League, 1901-present\n" + 
                     "New York Giants, National League, 1885-1957\n" +  
                     "Washington Senators, American League, 1901-1960\n";   

      string pattern = @"\A((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+";
      Match match;

      if (input.Substring(startPos, endPos).Contains(",")) {
         match = Regex.Match(input, pattern, RegexOptions.Multiline);
         while (match.Success) {
            Console.Write("The {0} played in the {1} in", 
                              match.Groups[1].Value, match.Groups[4].Value);
            foreach (Capture capture in match.Groups[5].Captures)
               Console.Write(capture.Value);

            Console.WriteLine(".");
            startPos = match.Index + match.Length;
            endPos = startPos + 70 <= input.Length ? 70 : input.Length - startPos;
            if (! input.Substring(startPos, endPos).Contains(",")) break;
            match = match.NextMatch();
         }
         Console.WriteLine();
      }
   }
}
// The example displays the following output:
//    The Brooklyn Dodgers played in the National League in 1911, 1912, 1932-1957.

End of String or Before Ending Newline: \Z

The \Z anchor specifies that a match must occur at the end of the input string, or before \n at the end of the input string. It is identical to the $ anchor, except that \Z ignores the RegexOptions.Multiline option. Therefore, in a multiline string, it can only match the end of the last line, or the end of the last line before \n.

Note that \Z matches before \n but does not match before \r\n (the CR/LF character combination). To match CR/LF, include \r?\Z in the regular expression pattern.

The following example uses the \Z anchor in a regular expression that is similar to the one in the example in the Start of String or Line section, which extracts information about the years during which some professional baseball teams existed. The subexpression \r?\Z in the regular expression ^((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+\r?\Z matches the end of a string, and also matches a string that ends with \n or \r\n. As a result, each element in the array matches the regular expression pattern.

Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim inputs() As String = { "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957",  _
                            "Chicago Cubs, National League, 1903-present" + vbCrLf, _
                            "Detroit Tigers, American League, 1901-present" + vbLf, _
                            "New York Giants, National League, 1885-1957", _
                            "Washington Senators, American League, 1901-1960" + vbCrLf }  
      Dim pattern As String = "^((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+\r?\Z"

      For Each input As String In inputs
         If input.Length > 70 Or Not input.Contains(",") Then Continue For

         Console.WriteLine(Regex.Escape(input))
         Dim match As Match = Regex.Match(input, pattern)
         If match.Success Then
            Console.WriteLine("   Match succeeded.")
         Else
            Console.WriteLine("   Match failed.")
         End If
      Next   
   End Sub
End Module
' The example displays the following output:
'    Brooklyn\ Dodgers,\ National\ League,\ 1911,\ 1912,\ 1932-1957
'       Match succeeded.
'    Chicago\ Cubs,\ National\ League,\ 1903-present\r\n
'       Match succeeded.
'    Detroit\ Tigers,\ American\ League,\ 1901-present\n
'       Match succeeded.
'    New\ York\ Giants,\ National\ League,\ 1885-1957
'       Match succeeded.
'    Washington\ Senators,\ American\ League,\ 1901-1960\r\n
'       Match succeeded.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string[] inputs = { "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957",  
                          "Chicago Cubs, National League, 1903-present" + Environment.NewLine, 
                          "Detroit Tigers, American League, 1901-present" + Regex.Unescape(@"\n"), 
                          "New York Giants, National League, 1885-1957", 
                          "Washington Senators, American League, 1901-1960" + Environment.NewLine}; 
      string pattern = @"^((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+\r?\Z";

      foreach (string input in inputs)
      {
         if (input.Length > 70 || ! input.Contains(",")) continue;

         Console.WriteLine(Regex.Escape(input));
         Match match = Regex.Match(input, pattern);
         if (match.Success)
            Console.WriteLine("   Match succeeded.");
         else
            Console.WriteLine("   Match failed.");
      }   
   }
}
// The example displays the following output:
//    Brooklyn\ Dodgers,\ National\ League,\ 1911,\ 1912,\ 1932-1957
//       Match succeeded.
//    Chicago\ Cubs,\ National\ League,\ 1903-present\r\n
//       Match succeeded.
//    Detroit\ Tigers,\ American\ League,\ 1901-present\n
//       Match succeeded.
//    New\ York\ Giants,\ National\ League,\ 1885-1957
//       Match succeeded.
//    Washington\ Senators,\ American\ League,\ 1901-1960\r\n
//       Match succeeded.

End of String Only: \z

The \z anchor specifies that a match must occur at the end of the input string. Like the \Z language element, \z ignores the RegexOptions.Multiline option. Unlike the \Z language element, \z does not match before a \n character at the end of a string. Therefore, it can only match the end of the last line of the input string.
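
To see the three end anchors side by side, the following C# sketch tests $, \Z, and \z against the same text with different line terminators. The class EndAnchorSketch, the helper EndsWithAbc, and the "abc" inputs are assumptions made for illustration only; no regex options are set.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EndAnchorSketch
{
    // Tests whether "abc" is followed by the position that the given anchor requires.
    public static bool EndsWithAbc(string input, string anchor)
    {
        return Regex.IsMatch(input, "abc" + anchor);
    }

    public static void Main()
    {
        string[] inputs = { "abc", "abc\n", "abc\r\n" };

        foreach (string input in inputs)
        {
            // Regex.Escape makes the trailing control characters visible.
            Console.WriteLine(@"{0}: $={1} \Z={2} \z={3}",
                              Regex.Escape(input),
                              EndsWithAbc(input, "$"),
                              EndsWithAbc(input, @"\Z"),
                              EndsWithAbc(input, @"\z"));
        }
    }
}
// The sketch displays the following output:
//    abc: $=True \Z=True \z=True
//    abc\n: $=True \Z=True \z=False
//    abc\r\n: $=False \Z=False \z=False
```

Only \z rejects the string that ends in \n, and none of the three anchors accepts a trailing \r\n unless the pattern includes \r? before the anchor.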

The following example uses the \z anchor in a regular expression that is otherwise identical to the one in the example in the previous section, which extracts information about the years during which some professional baseball teams existed. The example tries to match each of five elements in a string array with the regular expression pattern ^((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+\r?\z. Two of the strings end with carriage return and line feed characters, one ends with a line feed character, and two end with neither a carriage return nor a line feed character. As the output shows, only the strings without a carriage return or line feed character match the pattern.

Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim inputs() As String = { "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957",  _
                            "Chicago Cubs, National League, 1903-present" + vbCrLf, _
                            "Detroit Tigers, American League, 1901-present" + vbLf, _
                            "New York Giants, National League, 1885-1957", _
                            "Washington Senators, American League, 1901-1960" + vbCrLf }  
      Dim pattern As String = "^((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+\r?\z"

      For Each input As String In inputs
         If input.Length > 70 Or Not input.Contains(",") Then Continue For

         Console.WriteLine(Regex.Escape(input))
         Dim match As Match = Regex.Match(input, pattern)
         If match.Success Then
            Console.WriteLine("   Match succeeded.")
         Else
            Console.WriteLine("   Match failed.")
         End If
      Next   
   End Sub
End Module
' The example displays the following output:
'    Brooklyn\ Dodgers,\ National\ League,\ 1911,\ 1912,\ 1932-1957
'       Match succeeded.
'    Chicago\ Cubs,\ National\ League,\ 1903-present\r\n
'       Match failed.
'    Detroit\ Tigers,\ American\ League,\ 1901-present\n
'       Match failed.
'    New\ York\ Giants,\ National\ League,\ 1885-1957
'       Match succeeded.
'    Washington\ Senators,\ American\ League,\ 1901-1960\r\n
'       Match failed.

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      // Environment.NewLine is "\r\n" on Windows, which the output below assumes.
      string[] inputs = { "Brooklyn Dodgers, National League, 1911, 1912, 1932-1957",
                          "Chicago Cubs, National League, 1903-present" + Environment.NewLine,
                          "Detroit Tigers, American League, 1901-present\n",
                          "New York Giants, National League, 1885-1957",
                          "Washington Senators, American League, 1901-1960" + Environment.NewLine };
      string pattern = @"^((\w+(\s*)){2,}),\s(\w+\s\w+),(\s\d{4}(-(\d{4}|present))*,*)+\r?\z";

      foreach (string input in inputs)
      {
         if (input.Length > 70 || ! input.Contains(",")) continue;

         Console.WriteLine(Regex.Escape(input));
         Match match = Regex.Match(input, pattern);
         if (match.Success)
            Console.WriteLine("   Match succeeded.");
         else
            Console.WriteLine("   Match failed.");
      }   
   }
}
// The example displays the following output:
//    Brooklyn\ Dodgers,\ National\ League,\ 1911,\ 1912,\ 1932-1957
//       Match succeeded.
//    Chicago\ Cubs,\ National\ League,\ 1903-present\r\n
//       Match failed.
//    Detroit\ Tigers,\ American\ League,\ 1901-present\n
//       Match failed.
//    New\ York\ Giants,\ National\ League,\ 1885-1957
//       Match succeeded.
//    Washington\ Senators,\ American\ League,\ 1901-1960\r\n
//       Match failed.
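
The difference between \z and \Z is easiest to see on a much shorter pattern. The following is a minimal sketch, not part of the original sample; the class name EndAnchorSketch, the helper Matches, and the pattern present followed by each anchor are assumptions chosen for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EndAnchorSketch
{
   // Hypothetical helper: reports whether the pattern matches anywhere in the input.
   public static bool Matches(string input, string pattern)
   {
      return Regex.IsMatch(input, pattern);
   }

   public static void Main()
   {
      string[] inputs = { "1901-present", "1901-present\n", "1901-present\r\n" };
      foreach (string input in inputs)
      {
         // \z accepts only the very end of the string;
         // \Z also accepts the position before a final \n.
         Console.WriteLine("{0}: \\z = {1}, \\Z = {2}",
                           Regex.Escape(input),
                           Matches(input, @"present\z"),
                           Matches(input, @"present\Z"));
      }
   }
}
```

Only the first string satisfies \z. The second string satisfies \Z but not \z, and the third satisfies neither, because a \r sits between "present" and the final \n. That is why the pattern in the example above adds \r? before the anchor.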

Contiguous Matches: \G

The \G anchor specifies that a match must occur at the point where the previous match ended. When you use this anchor with the Regex.Matches or Match.NextMatch method, it ensures that all matches are contiguous.

The following example uses a regular expression to extract the names of rodent species from a comma-delimited string.

Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim input As String = "capybara,squirrel,chipmunk,porcupine,gopher," + _
                            "beaver,groundhog,hamster,guinea pig,gerbil," + _
                            "chinchilla,prairie dog,mouse,rat"
      Dim pattern As String = "\G(\w+\s?\w*),?"
      Dim match As Match = Regex.Match(input, pattern)
      Do While match.Success
         Console.WriteLine(match.Groups(1).Value)
         match = match.NextMatch()
      Loop 
   End Sub
End Module
' The example displays the following output:
'       capybara
'       squirrel
'       chipmunk
'       porcupine
'       gopher
'       beaver
'       groundhog
'       hamster
'       guinea pig
'       gerbil
'       chinchilla
'       prairie dog
'       mouse
'       rat

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string input = "capybara,squirrel,chipmunk,porcupine,gopher," + 
                     "beaver,groundhog,hamster,guinea pig,gerbil," + 
                     "chinchilla,prairie dog,mouse,rat";
      string pattern = @"\G(\w+\s?\w*),?";
      Match match = Regex.Match(input, pattern);
      while (match.Success) 
      {
         Console.WriteLine(match.Groups[1].Value);
         match = match.NextMatch();
      } 
   }
}
// The example displays the following output:
//       capybara
//       squirrel
//       chipmunk
//       porcupine
//       gopher
//       beaver
//       groundhog
//       hamster
//       guinea pig
//       gerbil
//       chinchilla
//       prairie dog
//       mouse
//       rat

The regular expression \G(\w+\s?\w*),? is interpreted as shown in the following table.

Pattern	Description
\G	Begin where the last match ended.
\w+	Match one or more word characters.
\s?	Match zero or one white-space character.
\w*	Match zero or more word characters.
(\w+\s?\w*)	Match one or more word characters followed by zero or one white-space character, followed by zero or more word characters. This is the first capturing group.
,?	Match zero or one occurrence of a literal comma character.
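
To see what \G contributes, it helps to run the same pattern with and without the anchor on input that contains a gap. The following is a minimal sketch; the class name ContiguousSketch, the helper Find, and the input "123a45" are assumptions chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ContiguousSketch
{
   // Hypothetical helper: collects the value of every match of the pattern.
   public static List<string> Find(string input, string pattern)
   {
      var values = new List<string>();
      foreach (Match match in Regex.Matches(input, pattern))
         values.Add(match.Value);
      return values;
   }

   public static void Main()
   {
      string input = "123a45";
      // Without \G, the engine skips over "a" and keeps finding digits.
      Console.WriteLine(string.Join(",", Find(input, @"\d")));
      // With \G, each match must start where the previous one ended,
      // so matching stops at the first character that is not a digit.
      Console.WriteLine(string.Join(",", Find(input, @"\G\d")));
   }
}
```

The first line of output lists all five digits, and the second lists only 1,2,3. In the rodent example, the same rule means extraction would stop at the first item that the pattern cannot match.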

Word Boundary: \b

The \b anchor specifies that the match must occur on a boundary between a word character (the \w language element) and a non-word character (the \W language element). Word characters consist of alphanumeric characters and underscores; a non-word character is any character that is not alphanumeric or an underscore. (For more information, see Character Classes.) The match can also occur on a word boundary at the beginning or end of the string.

The \b anchor is frequently used to ensure that a subexpression matches an entire word instead of just the beginning or end of a word. The regular expression \bare\w*\b in the following example illustrates this usage. It matches any word that begins with the substring "are". The output also shows that \b matches at the beginning of the input string, where "area" is found at position 0.

Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim input As String = "area bare arena mare"
      Dim pattern As String = "\bare\w*\b"
      Console.WriteLine("Words that begin with 'are':")
      For Each match As Match In Regex.Matches(input, pattern)
         Console.WriteLine("'{0}' found at position {1}", _
                           match.Value, match.Index)
      Next
   End Sub
End Module
' The example displays the following output:
'       Words that begin with 'are':
'       'area' found at position 0
'       'arena' found at position 10

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string input = "area bare arena mare";
      string pattern = @"\bare\w*\b";
      Console.WriteLine("Words that begin with 'are':");
      foreach (Match match in Regex.Matches(input, pattern))
         Console.WriteLine("'{0}' found at position {1}",
                           match.Value, match.Index);
   }
}
// The example displays the following output:
//       Words that begin with 'are':
//       'area' found at position 0
//       'arena' found at position 10

The regular expression pattern is interpreted as shown in the following table.

Pattern	Description
\b	Begin the match at a word boundary.
are	Match the substring "are".
\w*	Match zero or more word characters.
\b	End the match at a word boundary.
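
Because the underscore counts as a word character, \b does not match between a letter and an underscore. The following is a minimal sketch of whole-word matching that shows this; the class name WordBoundarySketch, the helper WholeWordIndexes, and the input string are assumptions chosen for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordBoundarySketch
{
   // Hypothetical helper: returns the start index of every whole-word "are".
   public static int[] WholeWordIndexes(string input)
   {
      MatchCollection matches = Regex.Matches(input, @"\bare\b");
      int[] indexes = new int[matches.Count];
      for (int i = 0; i < matches.Count; i++)
         indexes[i] = matches[i].Index;
      return indexes;
   }

   public static void Main()
   {
      // "aware" has no boundary before "are"; "are_here" has none after it,
      // because "_" is a word character.
      string input = "are you aware we are_here?";
      foreach (int index in WholeWordIndexes(input))
         Console.WriteLine("'are' found at position {0}", index);
   }
}
```

Only the first word is reported, at position 0.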

Non-Word Boundary: \B

The \B anchor specifies that the match must not occur on a word boundary. It is the opposite of the \b anchor.

The following example uses the \B anchor to locate occurrences of the substring "qu" inside a word. The regular expression pattern \Bqu\w+ matches a substring that begins with "qu", does not start a word, and continues to the end of the word.

Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim input As String = "equity queen equip acquaint quiet"
      Dim pattern As String = "\Bqu\w+"
      For Each match As Match In Regex.Matches(input, pattern)
         Console.WriteLine("'{0}' found at position {1}", _
                           match.Value, match.Index)
      Next
   End Sub
End Module
' The example displays the following output:
'       'quity' found at position 1
'       'quip' found at position 14
'       'quaint' found at position 21

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string input = "equity queen equip acquaint quiet";
      string pattern = @"\Bqu\w+";
      foreach (Match match in Regex.Matches(input, pattern))
         Console.WriteLine("'{0}' found at position {1}", 
                           match.Value, match.Index);
   }
}
// The example displays the following output:
//       'quity' found at position 1
//       'quip' found at position 14
//       'quaint' found at position 21

The regular expression pattern is interpreted as shown in the following table.

Pattern	Description
\B	Do not begin the match at a word boundary.
qu	Match the substring "qu".
\w+	Match one or more word characters.
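
Since \B is the opposite of \b, swapping one anchor for the other on the same input should split the candidate matches into two disjoint sets. The following is a minimal sketch of that comparison; the class name BoundaryContrastSketch and the helper Find are assumptions chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BoundaryContrastSketch
{
   // Hypothetical helper: collects the value of every match of the pattern.
   public static List<string> Find(string input, string pattern)
   {
      var values = new List<string>();
      foreach (Match match in Regex.Matches(input, pattern))
         values.Add(match.Value);
      return values;
   }

   public static void Main()
   {
      string input = "equity queen equip acquaint quiet";
      // \b requires "qu" to start a word.
      Console.WriteLine(@"\bqu\w+: " + string.Join(", ", Find(input, @"\bqu\w+")));
      // \B requires "qu" to appear inside a word.
      Console.WriteLine(@"\Bqu\w+: " + string.Join(", ", Find(input, @"\Bqu\w+")));
   }
}
```

With \b the sketch reports queen and quiet; with \B it reports quity, quip, and quaint.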

See also

Reference

Regular Expression Options

Concepts

Regular Expression Language Elements
