.NET regular expressions

Article
03/26/2025

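Regular expressions provide a powerful, flexible, and efficient way to process text. The extensive pattern-matching notation of regular expressions lets you quickly parse large amounts of text to:

Find specific character patterns.
Validate text to ensure that it matches a predefined pattern (such as an email address).
Extract, edit, replace, or delete text substrings.
Add extracted strings to a collection in order to generate a report.

For many applications that deal with strings or that parse large blocks of text, regular expressions are an indispensable tool.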

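How regular expressions work

The centerpiece of text processing with regular expressions is the regular expression engine, which is represented by the System.Text.RegularExpressions.Regex object in .NET. At a minimum, processing text with regular expressions requires that the engine be given the following two items of information:

The regular expression pattern to identify in the text.

In .NET, regular expression patterns are defined by a special syntax or language that is compatible with Perl 5 regular expressions and adds some additional features, such as right-to-left matching. For more information, see Regular Expression Language - Quick Reference.

The text to parse for the regular expression pattern.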
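
As a small illustration of the right-to-left matching mentioned above, the following sketch passes RegexOptions.RightToLeft so that the engine starts scanning at the end of the input. The class name RightToLeftSketch, the helper LastWord, and the sample string are hypothetical and chosen only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RightToLeftSketch
{
    // Returns the last run of word characters in the input.
    // RegexOptions.RightToLeft makes the engine begin at the end of the string.
    public static string LastWord(string input)
    {
        Match match = Regex.Match(input, @"\w+", RegexOptions.RightToLeft);
        return match.Success ? match.Value : string.Empty;
    }

    public static void Main()
    {
        Console.WriteLine(LastWord("one two three"));   // three
    }
}
```

Without the option, the same call would return the first word ("one") instead.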

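The methods of the Regex class let you perform the following operations:

Determine whether the regular expression pattern occurs in the input text by calling the Regex.IsMatch method. For an example that uses the IsMatch method to validate text, see How to: Verify that Strings Are in Valid Email Format.
Retrieve one or all occurrences of text that matches the regular expression pattern by calling the Regex.Match or Regex.Matches method. The former method returns a System.Text.RegularExpressions.Match object that provides information about the matching text. The latter returns a MatchCollection object that contains one System.Text.RegularExpressions.Match object for each match found in the parsed text.
Replace text that matches the regular expression pattern by calling the Regex.Replace method. For examples that use the Replace method to change date formats and remove invalid characters from a string, see How to: Strip Invalid Characters from a String and Example: Changing Date Formats.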
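
The four operations listed above can be sketched side by side with the static Regex methods. This is a minimal illustration, not taken from the original article: the pattern \d+ (one or more digits), the class name RegexMethodsSketch, and the helper names are assumptions made for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexMethodsSketch
{
    // Hypothetical pattern used by every helper: a run of one or more digits.
    private const string Pattern = @"\d+";

    // IsMatch: does the pattern occur anywhere in the input?
    public static bool ContainsNumber(string input) => Regex.IsMatch(input, Pattern);

    // Match: the first occurrence (Value is empty if there is none).
    public static string FirstNumber(string input) => Regex.Match(input, Pattern).Value;

    // Matches: all non-overlapping occurrences.
    public static int CountNumbers(string input) => Regex.Matches(input, Pattern).Count;

    // Replace: substitute every occurrence with a replacement string.
    public static string MaskNumbers(string input) => Regex.Replace(input, Pattern, "#");

    public static void Main()
    {
        string input = "Order 66 shipped 3 boxes";
        Console.WriteLine(ContainsNumber(input));   // True
        Console.WriteLine(FirstNumber(input));      // 66
        Console.WriteLine(CountNumbers(input));     // 2
        Console.WriteLine(MaskNumbers(input));      // Order # shipped # boxes
    }
}
```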

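For an overview of the regular expression object model, see The Regular Expression Object Model.

For more information about the regular expression language, see Regular Expression Language - Quick Reference or download and print one of the following brochures:

Quick Reference in Word (.docx) format
Quick Reference in PDF (.pdf) format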

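Regular expression examples

The String class includes string search and replacement methods that you can use when you want to locate literal strings in a larger string. Regular expressions are most useful either when you want to locate one of several substrings in a larger string, or when you want to identify patterns in a string, as the following examples illustrate.

Warning

When using System.Text.RegularExpressions to process untrusted input, pass a timeout. A malicious user can provide input to RegularExpressions that causes a denial-of-service attack. ASP.NET Core framework APIs that use RegularExpressions pass a timeout.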
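
One way to follow this advice is to use the Regex.IsMatch overload that accepts a TimeSpan and to handle RegexMatchTimeoutException. The sketch below is an illustration only: the class name TimeoutSketch, the helper TryIsMatch, the simplified pattern, and the 250-millisecond limit are all assumptions, and a suitable limit depends on the application.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TimeoutSketch
{
    // Returns true or false when matching completes in time,
    // or null when the engine exceeds the supplied time limit.
    public static bool? TryIsMatch(string input, string pattern, TimeSpan timeout)
    {
        try
        {
            return Regex.IsMatch(input, pattern, RegexOptions.None, timeout);
        }
        catch (RegexMatchTimeoutException)
        {
            // Treat a timeout as "unknown" rather than letting it take down the caller.
            return null;
        }
    }

    public static void Main()
    {
        // Hypothetical, deliberately simple pattern: text, an @ sign, more text.
        bool? result = TryIsMatch("user@example.com", @"^[^@\s]+@[^@\s]+$",
                                  TimeSpan.FromMilliseconds(250));
        Console.WriteLine(result.HasValue ? result.Value.ToString() : "timed out");
    }
}
```

Returning a nullable value keeps the timeout case distinct from an ordinary non-match, so the caller can decide whether to reject the input or retry.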

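Tip

The System.Web.RegularExpressions namespace contains a number of regular expression objects that implement predefined regular expression patterns for parsing strings from HTML, XML, and ASP.NET documents. For example, the TagRegex class identifies start tags in a string, and the CommentRegex class identifies ASP.NET comments in a string.

Example 1: Replace substrings

Assume that a mailing list contains names that sometimes include a title (Mr., Mrs., Miss, or Ms.) along with a first and last name. Suppose you don't want to include the titles when you generate envelope labels from the list. In that case, you can use a regular expression to remove the titles, as the following example illustrates: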

using System;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      string pattern = "(Mr\\.? |Mrs\\.? |Miss |Ms\\.? )";
      string[] names = { "Mr. Henry Hunt", "Ms. Sara Samuels",
                         "Abraham Adams", "Ms. Nicole Norris" };
      foreach (string name in names)
         Console.WriteLine(Regex.Replace(name, pattern, String.Empty));
   }
}
// The example displays the following output:
//    Henry Hunt
//    Sara Samuels
//    Abraham Adams
//    Nicole Norris

Imports System.Text.RegularExpressions

Module Example
    Public Sub Main()
        Dim pattern As String = "(Mr\.? |Mrs\.? |Miss |Ms\.? )"
        Dim names() As String = {"Mr. Henry Hunt", "Ms. Sara Samuels", _
                                  "Abraham Adams", "Ms. Nicole Norris"}
        For Each name As String In names
            Console.WriteLine(Regex.Replace(name, pattern, String.Empty))
        Next
    End Sub
End Module
' The example displays the following output:
'    Henry Hunt
'    Sara Samuels
'    Abraham Adams
'    Nicole Norris

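The regular expression pattern (Mr\.? |Mrs\.? |Miss |Ms\.? ) matches any occurrence of "Mr ", "Mr. ", "Mrs ", "Mrs. ", "Miss ", "Ms ", or "Ms. ". The call to the Regex.Replace method replaces the matched string with String.Empty; in other words, it removes the matched string from the original string.

Example 2: Identify duplicated words

Accidentally duplicating words is a common error that writers make. Use a regular expression to identify duplicated words, as the following example shows: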

using System;
using System.Text.RegularExpressions;

public class Class1
{
   public static void Main()
   {
      string pattern = @"\b(\w+?)\s\1\b";
      string input = "This this is a nice day. What about this? This tastes good. I saw a a dog.";
      foreach (Match match in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
         Console.WriteLine($"{match.Value} (duplicates '{match.Groups[1].Value}') at position {match.Index}");
   }
}
// The example displays the following output:
//       This this (duplicates 'This') at position 0
//       a a (duplicates 'a') at position 66

Imports System.Text.RegularExpressions

Module modMain
    Public Sub Main()
        Dim pattern As String = "\b(\w+?)\s\1\b"
        Dim input As String = "This this is a nice day. What about this? This tastes good. I saw a a dog."
        For Each match As Match In Regex.Matches(input, pattern, RegexOptions.IgnoreCase)
            Console.WriteLine("{0} (duplicates '{1}') at position {2}", _
                              match.Value, match.Groups(1).Value, match.Index)
        Next
    End Sub
End Module
' The example displays the following output:
'       This this (duplicates 'This') at position 0
'       a a (duplicates 'a') at position 66

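The regular expression pattern \b(\w+?)\s\1\b can be interpreted as follows:

Pattern	Interpretation
`\b`	Start at a word boundary.
`(\w+?)`	Match one or more word characters, but as few characters as possible. Together, they form a group that can be referred to as `\1`.
`\s`	Match a white-space character.
`\1`	Match the substring that's equal to the group named `\1`.
`\b`	Match a word boundary.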

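The Regex.Matches method is called with regular expression options set to RegexOptions.IgnoreCase. Therefore, the match operation is case-insensitive, and the example identifies the substring "This this" as a duplication.

The input string includes the substring "this? This". However, because of the intervening punctuation mark, it isn't identified as a duplication.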
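
The numbered group \1 can also be given a name, which some readers find easier to follow. The sketch below is a variation on the example rather than part of the original article: it uses the named group (?<word>...) and the named backreference \k<word>, and the class name NamedGroupSketch and helper FindDuplicates are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupSketch
{
    // Returns the first word of every "word word" duplication found in the input.
    public static string[] FindDuplicates(string input)
    {
        // Same logic as \b(\w+?)\s\1\b, but the group is named "word".
        string pattern = @"\b(?<word>\w+?)\s\k<word>\b";
        MatchCollection matches = Regex.Matches(input, pattern, RegexOptions.IgnoreCase);

        string[] words = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
            words[i] = matches[i].Groups["word"].Value;
        return words;
    }

    public static void Main()
    {
        foreach (string word in FindDuplicates("This this is a nice day. I saw a a dog."))
            Console.WriteLine(word);
        // Prints:
        //    This
        //    a
    }
}
```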

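Example 3: Dynamically build a culture-sensitive regular expression

The following example illustrates the power of regular expressions combined with the flexibility offered by .NET's globalization features. It uses the NumberFormatInfo object to determine the format of currency values in the system's current culture. It then uses that information to dynamically construct a regular expression that extracts currency values from the text. For each match, it extracts the subgroup that contains only the numeric string, converts it to a Decimal value, and calculates a running total.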

using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      // Define text to be parsed.
      string input = "Office expenses on 2/13/2008:\n" +
                     "Paper (500 sheets)                      $3.95\n" +
                     "Pencils (box of 10)                     $1.00\n" +
                     "Pens (box of 10)                        $4.49\n" +
                     "Erasers                                 $2.19\n" +
                     "Ink jet printer                        $69.95\n\n" +
                     "Total Expenses                        $ 81.58\n";

      // Get current culture's NumberFormatInfo object.
      NumberFormatInfo nfi = CultureInfo.CurrentCulture.NumberFormat;
      // Assign needed property values to variables.
      string currencySymbol = nfi.CurrencySymbol;
      bool symbolPrecedesIfPositive = nfi.CurrencyPositivePattern % 2 == 0;
      string groupSeparator = nfi.CurrencyGroupSeparator;
      string decimalSeparator = nfi.CurrencyDecimalSeparator;

      // Form regular expression pattern. Escape every culture-supplied string so
      // that characters such as "." or "$" are matched literally.
      string pattern = Regex.Escape( symbolPrecedesIfPositive ? currencySymbol : "") +
                       @"\s*[-+]?" + "([0-9]{0,3}(" + Regex.Escape(groupSeparator) + "[0-9]{3})*(" +
                       Regex.Escape(decimalSeparator) + "[0-9]+)?)" +
                       Regex.Escape(! symbolPrecedesIfPositive ? currencySymbol : "");
      Console.WriteLine( "The regular expression pattern is:");
      Console.WriteLine("   " + pattern);

      // Get text that matches regular expression pattern.
      MatchCollection matches = Regex.Matches(input, pattern,
                                              RegexOptions.IgnorePatternWhitespace);
      Console.WriteLine($"Found {matches.Count} matches.");

      // Get numeric string, convert it to a value, and add it to List object.
      List<decimal> expenses = new List<Decimal>();

      foreach (Match match in matches)
         expenses.Add(Decimal.Parse(match.Groups[1].Value));

      // Determine whether total is present and if present, whether it is correct.
      decimal total = 0;
      foreach (decimal value in expenses)
         total += value;

      if (total / 2 == expenses[expenses.Count - 1])
         Console.WriteLine($"The expenses total {expenses[expenses.Count - 1]:C2}.");
      else
         Console.WriteLine($"The expenses total {total:C2}.");
   }
}
// The example displays the following output:
//       The regular expression pattern is:
//          \$\s*[-+]?([0-9]{0,3}(,[0-9]{3})*(\.[0-9]+)?)
//       Found 6 matches.
//       The expenses total $81.58.

Imports System.Collections.Generic
Imports System.Globalization
Imports System.Text.RegularExpressions

Public Module Example
    Public Sub Main()
        ' Define text to be parsed.
        Dim input As String = "Office expenses on 2/13/2008:" + vbCrLf + _
                              "Paper (500 sheets)                      $3.95" + vbCrLf + _
                              "Pencils (box of 10)                     $1.00" + vbCrLf + _
                              "Pens (box of 10)                        $4.49" + vbCrLf + _
                              "Erasers                                 $2.19" + vbCrLf + _
                              "Ink jet printer                        $69.95" + vbCrLf + vbCrLf + _
                              "Total Expenses                        $ 81.58" + vbCrLf
        ' Get current culture's NumberFormatInfo object.
        Dim nfi As NumberFormatInfo = CultureInfo.CurrentCulture.NumberFormat
        ' Assign needed property values to variables.
        Dim currencySymbol As String = nfi.CurrencySymbol
        Dim symbolPrecedesIfPositive As Boolean = CBool(nfi.CurrencyPositivePattern Mod 2 = 0)
        Dim groupSeparator As String = nfi.CurrencyGroupSeparator
        Dim decimalSeparator As String = nfi.CurrencyDecimalSeparator

        ' Form regular expression pattern. Escape every culture-supplied string so
        ' that characters such as "." or "$" are matched literally.
        Dim pattern As String = Regex.Escape(CStr(IIf(symbolPrecedesIfPositive, currencySymbol, ""))) + _
                                "\s*[-+]?" + "([0-9]{0,3}(" + Regex.Escape(groupSeparator) + "[0-9]{3})*(" + _
                                Regex.Escape(decimalSeparator) + "[0-9]+)?)" + _
                                Regex.Escape(CStr(IIf(Not symbolPrecedesIfPositive, currencySymbol, "")))
        Console.WriteLine("The regular expression pattern is: ")
        Console.WriteLine("   " + pattern)

        ' Get text that matches regular expression pattern.
        Dim matches As MatchCollection = Regex.Matches(input, pattern, RegexOptions.IgnorePatternWhitespace)
        Console.WriteLine("Found {0} matches. ", matches.Count)

        ' Get numeric string, convert it to a value, and add it to List object.
        Dim expenses As New List(Of Decimal)

        For Each match As Match In matches
            expenses.Add(Decimal.Parse(match.Groups.Item(1).Value))
        Next

        ' Determine whether total is present and if present, whether it is correct.
        Dim total As Decimal
        For Each value As Decimal In expenses
            total += value
        Next

        If total / 2 = expenses(expenses.Count - 1) Then
            Console.WriteLine("The expenses total {0:C2}.", expenses(expenses.Count - 1))
        Else
            Console.WriteLine("The expenses total {0:C2}.", total)
        End If
    End Sub
End Module
' The example displays the following output:
'       The regular expression pattern is:
'          \$\s*[-+]?([0-9]{0,3}(,[0-9]{3})*(\.[0-9]+)?)
'       Found 6 matches.
'       The expenses total $81.58.

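On a computer whose current culture is English - United States (en-US), the example dynamically builds the regular expression \$\s*[-+]?([0-9]{0,3}(,[0-9]{3})*(\.[0-9]+)?). This regular expression pattern can be interpreted as follows:

Pattern	Interpretation
`\$`	Look for a single occurrence of the dollar symbol (`$`) in the input string. The regular expression pattern string includes a backslash to indicate that the dollar symbol is to be interpreted literally rather than as a regular expression anchor. The `$` symbol alone would indicate that the regular expression engine should try to begin its match at the end of a string. To ensure that the current culture's currency symbol isn't misinterpreted as a regular expression symbol, the example calls the Regex.Escape method to escape the character.
`\s*`	Look for zero or more occurrences of a white-space character.
`[-+]?`	Look for zero or one occurrence of either a positive or a negative sign.
`([0-9]{0,3}(,[0-9]{3})*(\.[0-9]+)?)`	The outer parentheses define this expression as a capturing group or a subexpression. If a match is found, information about this part of the matching string can be retrieved from the second Group object in the GroupCollection object returned by the Match.Groups property. The first element in the collection represents the entire match.
`[0-9]{0,3}`	Look for zero to three occurrences of the decimal digits 0 through 9.
`(,[0-9]{3})*`	Look for zero or more occurrences of a group separator followed by three decimal digits.
`\.`	Look for a single occurrence of the decimal separator.
`[0-9]+`	Look for one or more decimal digits.
`(\.[0-9]+)?`	Look for zero or one occurrence of the decimal separator followed by at least one decimal digit.

If each subpattern is found in the input string, the match succeeds, and a Match object that contains information about the match is added to the MatchCollection object.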
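
To see how the generated pattern changes with the culture settings, the pattern-building step can be isolated in a helper that takes a NumberFormatInfo. The sketch below is an illustration under stated assumptions: the class name CurrencyPatternSketch and the helper BuildPattern are hypothetical, the NumberFormatInfo is filled in by hand (euro symbol after the number, "." as group separator, "," as decimal separator) so that the result doesn't depend on the operating system's culture data, and every culture-supplied string is passed through Regex.Escape.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class CurrencyPatternSketch
{
    // Builds a currency pattern from the supplied format information.
    // Even values of CurrencyPositivePattern place the symbol before the number.
    public static string BuildPattern(NumberFormatInfo nfi)
    {
        bool symbolPrecedes = nfi.CurrencyPositivePattern % 2 == 0;
        string symbol = Regex.Escape(nfi.CurrencySymbol);

        return (symbolPrecedes ? symbol : "") +
               @"\s*[-+]?([0-9]{0,3}(" + Regex.Escape(nfi.CurrencyGroupSeparator) +
               "[0-9]{3})*(" + Regex.Escape(nfi.CurrencyDecimalSeparator) + "[0-9]+)?)" +
               (symbolPrecedes ? "" : symbol);
    }

    public static void Main()
    {
        // Hand-built settings (an assumption for this sketch, not a real culture lookup).
        var nfi = new NumberFormatInfo
        {
            CurrencySymbol = "€",
            CurrencyPositivePattern = 1,      // "n$": symbol directly after the number
            CurrencyGroupSeparator = ".",
            CurrencyDecimalSeparator = ","
        };

        string pattern = BuildPattern(nfi);
        Console.WriteLine(pattern);
        // \s*[-+]?([0-9]{0,3}(\.[0-9]{3})*(,[0-9]+)?)€

        Match match = Regex.Match("Total: 1.234,50€", pattern);
        Console.WriteLine(match.Groups[1].Value);
        // 1.234,50
    }
}
```

Note that the captured text still uses the culture's separators, so it would need to be parsed with the same NumberFormatInfo, for example by passing it to Decimal.Parse.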

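Title	Description
Regular Expression Language - Quick Reference	Provides information on the set of characters, operators, and constructs that you can use to define regular expressions.
The Regular Expression Object Model	Provides information and code examples that illustrate how to use the regular expression classes.
Details of Regular Expression Behavior	Provides information about the capabilities and behavior of .NET regular expressions.
Use regular expressions in Visual Studio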