How to: Reorder the Fields of a Delimited File (LINQ)
A comma-separated value (CSV) file is a text file that is often used to store spreadsheet data or other tabular data that is represented by rows and columns. By using the Split method to separate the fields, it is very easy to query and manipulate CSV files by using LINQ. In fact, the same technique can be used to reorder the parts of any structured line of text; it is not limited to CSV files.
In the following example, assume that the three columns represent students' "last name," "first name", and "ID." The fields are in alphabetical order based on the students' last names. The query produces a new sequence in which the ID column appears first, followed by a second column that combines the student's first name and last name. The lines are reordered according to the ID field. The results are saved into a new file and the original data is not modified.
To create the data file
Create a new Visual C# or Visual Basic project and copy the following lines into a plain text file that is named spreadsheet1.csv. Save the file in your solution folder.
Adams,Terry,120 Fakhouri,Fadi,116 Feng,Hanying,117 Garcia,Cesar,114 Garcia,Debra,115 Garcia,Hugo,118 Mortensen,Sven,113 O'Donnell,Claire,112 Omelchenko,Svetlana,111 Tucker,Lance,119 Tucker,Michael,122 Zabokritski,Eugene,121
Example
Class CSVFiles
Shared Sub Main()
' Create the IEnumerable data source.
Dim lines As String() = System.IO.File.ReadAllLines("../../../spreadsheet1.csv")
' Execute the query. Put field 2 first, then
' reverse and combine fields 0 and 1 from the old field
Dim lineQuery = From line In lines
Let x = line.Split(New Char() {","})
Order By x(2)
Select x(2) & ", " & (x(1) & " " & x(0))
' Execute the query and write out the new file. Note that WriteAllLines
' takes a string array, so ToArray is called on the query.
System.IO.File.WriteAllLines("../../../spreadsheet2.csv", lineQuery.ToArray())
' Keep console window open in debug mode.
Console.WriteLine("Spreadsheet2.csv written to disk. Press any key to exit")
Console.ReadKey()
End Sub
End Class
' Output to spreadsheet2.csv:
' 111, Svetlana Omelchenko
' 112, Claire O'Donnell
' 113, Sven Mortensen
' 114, Cesar Garcia
' 115, Debra Garcia
' 116, Fadi Fakhouri
' 117, Hanying Feng
' 118, Hugo Garcia
' 119, Lance Tucker
' 120, Terry Adams
' 121, Eugene Zabokritski
' 122, Michael Tucker
class CSVFiles
{
static void Main(string[] args)
{
// Create the IEnumerable data source
string[] lines = System.IO.File.ReadAllLines(@"../../../spreadsheet1.csv");
// Create the query. Put field 2 first, then
// reverse and combine fields 0 and 1 from the old field
IEnumerable<string> query =
from line in lines
let x = line.Split(',')
orderby x[2]
select x[2] + ", " + (x[1] + " " + x[0]);
// Execute the query and write out the new file. Note that WriteAllLines
// takes a string[], so ToArray is called on the query.
System.IO.File.WriteAllLines(@"../../../spreadsheet2.csv", query.ToArray());
Console.WriteLine("Spreadsheet2.csv written to disk. Press any key to exit");
Console.ReadKey();
}
}
/* Output to spreadsheet2.csv:
111, Svetlana Omelchenko
112, Claire O'Donnell
113, Sven Mortensen
114, Cesar Garcia
115, Debra Garcia
116, Fadi Fakhouri
117, Hanying Feng
118, Hugo Garcia
119, Lance Tucker
120, Terry Adams
121, Eugene Zabokritski
122, Michael Tucker
*/
Compiling the Code
Create a Visual Studio project that targets the .NET Framework version 3.5. By default, the project has a reference to System.Core.dll and a using directive (C#) or Imports statement (Visual Basic) for the System.Linq namespace. In C# projects, add a using directive for the System.IO namespace.
Copy this code into your project.
Press F5 to compile and run the program.
Press any key to exit the console window.