Read a CSV file into an array in C#


Read a CSV file

The following LoadCsv method reads the CSV file into a two-dimensional array of strings.

// Load a CSV file into an array of rows and columns.
// Assume there may be blank lines but every line has
// the same number of fields.
private string[,] LoadCsv(string filename)
{
    // Get the file's text.
    string whole_file = System.IO.File.ReadAllText(filename);

    // Split into lines.
    whole_file = whole_file.Replace('\n', '\r');
    string[] lines = whole_file.Split(new char[] { '\r' },
        StringSplitOptions.RemoveEmptyEntries);

    // See how many rows and columns there are.
    int num_rows = lines.Length;
    int num_cols = lines[0].Split(',').Length;

    // Allocate the data array.
    string[,] values = new string[num_rows, num_cols];

    // Load the array.
    for (int r = 0; r < num_rows; r++)
    {
        string[] line_r = lines[r].Split(',');
        for (int c = 0; c < num_cols; c++)
        {
            values[r, c] = line_r[c];
        }
    }

    // Return the values.
    return values;
}

The code uses System.IO.File.ReadAllText to read the file’s contents into a string. It then uses Split to break the file into lines, ignoring any blank lines.

The code then loops through the lines using Split to split the lines into fields and adding their values to the array. When it’s done, the method returns the two-dimensional array of strings.

When you click the Go button, the following code calls the LoadCsv method and displays the result in a DataGridView control.

// Read the CSV file.
// Assumes each line has the same number of fields
// and no line is blank.
private void btnGo_Click(object sender, EventArgs e)
{
    // Get the data.
    string[,] values = LoadCsv(txtFile.Text);
    int num_rows = values.GetUpperBound(0) + 1;
    int num_cols = values.GetUpperBound(1) + 1;

    // Display the data to show we have it.

    // Make column headers.
    // For this example, we assume the first row
    // contains the column names.
    dgvValues.Columns.Clear();
    for (int c = 0; c < num_cols; c++)
        dgvValues.Columns.Add(values[0, c], values[0, c]);

    // Add the data.
    for (int r = 1; r < num_rows; r++)
    {
        dgvValues.Rows.Add();
        for (int c = 0; c < num_cols; c++)
        {
            dgvValues.Rows[r - 1].Cells[c].Value = values[r, c];
        }
    }
}

This code calls LoadCsv and saves the returned strings in an array. It uses the array’s GetUpperBound method twice to see how many rows and columns it contains.

Next the program clears the DataGridView control’s columns. It loops through the first values in each of the array’s columns and uses it to build a column in the DataGridView. For each value, it creates a column named after the value and displaying the value as its header text.

Next the code loops through the remaining rows in the table. For each row, it adds the row’s values to the DataGridView control’s cells.


Download Example   Follow me on Twitter   RSS feed   Donate




This entry was posted in files, strings and tagged , , , , , , , , , , , . Bookmark the permalink.

36 Responses to Read a CSV file into an array in C#

  1. Michel Despatie says:

    Could not display the last (longitude) column. Had to modify the C< num_cols; to C<= num_cols; and r

  2. Rod Stephens says:

    Oops! The problem is in the code that displays the values. Instead of this:

    int num_rows = values.GetUpperBound(0);
    int num_cols = values.GetUpperBound(1); 

    It should have used this:

    int num_rows = values.GetUpperBound(0) + 1;
    int num_cols = values.GetUpperBound(1) + 1; 

    The number of rows (and columns) in an array is 1 more than the upper bound.

    Thanks for pointing this out!

  3. Cartoons says:

    Nice blog. Your code helped me out at work.

  4. Adrian says:

    Rather than two statements with “replace” and “split” it may be slightly quicker to use:

    string[] lines = whole_file.Split(new char[] { ‘\r’, ‘\n’ },
    StringSplitOptions.RemoveEmptyEntries);

  5. Habeeb says:

    My favourite csv parser is one built into .net library. This is a hidden treasure inside Microsoft.VisualBasic namespace.
    Below is a sample code:

    using Microsoft.VisualBasic.FileIO;
    
    // Habeeb, "Dubai Media City, Dubai"
    var path = @"C:\Person.csv";
    using (TextFieldParser csvParser = new TextFieldParser(path))
    {
        csvParser.CommentTokens = new string[] { "#" };
        csvParser.SetDelimiters(new string[] { "," });
        csvParser.HasFieldsEnclosedInQuotes = true;
         
        // Skip the row with the column names
        csvParser.ReadLine();
         
        while (!csvParser.EndOfData)
        {
            // Read current line fields,
            // pointer moves to the next line.
            string[] fields = csvParser.ReadFields();
            string Name = fields[0];
            string Address = fields[1];
        }
    }

    More details about the parser is given here: http://codeskaters.blogspot.ae/2015/11/c-easiest-csv-parser-built-in-net.html

  6. Sandhya R says:

    I am a few days old in C#.
    I am expected to read a .csv file, which contains numbers, into a two dimensional array. Can u please tell me how to do it?
    Thank you for the help!

    • RodStephens says:

      This example should work. The only difference is that your values are numbers and this example works with strings. Simply use int.Parse, double.Parse, or a similar Parse method for your data type to convert the strings into numbers.

      • Sandhya R says:

        i tried this code as it is, with one change:

        var filePath = @”C:\MatlabPDNA\csvlist.csv”;
        // Get the data.
        string[,] values = LoadCsv(filePath);

        this is the error i get
        The name ‘dgvValues’ does not exist in the current context

        please help

        • RodStephens says:

          dgvValues would be a DataGridView control that the program is trying to use to display the values.

          If you just want to load the values into an array and not display them in a DataGridView, remove the code in the btnGo_Click event handler after the comment that says, “Display the data to show we have it.”

      • Sandhya R says:

        Sir, as a beginner i find your code to be complex.

        This is my code, it prints the numbers in a particular column from .csv file

        Console.WriteLine(“Prints a particular COLUMN”);
        reader = new StreamReader(filePath);
        while (!reader.EndOfStream)
        {
        line = reader.ReadLine();
        var values = line.Split(‘,’);
        Console.WriteLine(values[6]);
        // prints the values in 7th column
        }
        Console.ReadLine();

        how can i alter this code to convert the string in each column to a number and store them all in a 2D array.
        I code everything in main. I have not been exposed to functions and classes in c#. This is my first program. Please bear with me. Thank you very much.

        • RodStephens says:

          Sorry but, yes, some of the examples assume you have some previous experience.

          You should use the code in the LoadCsv method. All that method does it open the file and copy the values into the values array.

          You can copy that code into your main method. Or better yet, read about methods so you can just call LoadCsv. You would do it like this in your main method:

          string[,] values = LoadCsv(filename)

          Where filename is a variable holding the name of the file. Or you could make it a constant like “C:\\wherever\\values.csv.”

          To use numbers instead of strings, change the arrays to int[,] instead of string[,]. Then when the LoadCsv code saves a value, use int.Parse to convert it into an integer as in:

          values[r, c] = int.Parse(line_r[c]);

  7. J.claude says:

    hello sir ,
    how to remove empty row in array list?
    thanks!!!

    • RodStephens says:

      This line of code:

      string[] lines = whole_file.Split(new char[] { '\r' },
          StringSplitOptions.RemoveEmptyEntries);

      Should remove any lines that are completely empty. If you want to also remove lines that contain two empty values separated by a comma, that will take a bit more work.

      One approach would be to add the rows to a List instead of an array. Then you can skip any rows that don’t contain any data.

      Another approach would be to create the lines array as before. Then loop through it and count the rows that have data. Use the count to size the values array.

      Next loop through the entries and add them to the values array, skipping any that don’t contain data.

      All in all, quite a bit of extra work.

  8. s . chits says:

    hello sir. where in the code do i convert the array to an integer.

    • RodStephens says:

      This example reads the values as strings. You can either:

      1. Write another method that converts the array of strings into an array of integers

      2. Use integers instead of strings and then change this line:

      values[r, c] = line_r[c];

      Into this:

      values[r, c] = int.Parse(line_r[c]);
  9. DWalker07 says:

    The LoadCSV module code does a Split on the comma character. That doesn’t handle commas within quoted fields.

    • RodStephens says:

      True. Handling quoted fields that may contain commas is a bit more involved than simply using Split.

      To handle that situation, keep track of whether a quoted field is open. Create a character variable quote. When the first character in a field is a ” or ‘, set quote equal to that character. After that the field extends until you find a matching close quote.

      Habeeb’s suggestion of using a TextFieldParser might handle this for you.

  10. ameet says:

    Great Blog !!!!

    How can I calculate SUM on specific column I have same csv file and I want to calculate sum, as per your previous comment I tried” values[r, c] = int.Parse(line_r[c]); ” But getting Error Can not convert implicitly ……. ! Please tell me how to do it in a proper way.
    Thanks in advance.

    • RodStephens says:

      That looks like the right approach. You should check (possibly in the debugger) to see what value is in line_r[c] when you get the error. If it’s not an integer, then you would get problems. You may need to clean up the data or add code to check for that.

  11. Joe Pallagi says:

    hello Sir,
    I straggling on an issue for hours i cant find how to accomplish to whole column’s values in a single array, if you know the solution i would really really really appreciate 😀

    have fun!

    • RodStephens says:

      I don’t know if I understand what you need. Do you want to read the values from one column into an array instead of reading the whole page? If so, try using the code above with this loop:

      // Allocate the array for column col_num.
      string[] values = new string[num_rows];
      
      // Load the array.
      for (int r = 0; r < num_rows; r++)
      {
          string[] line_r = lines[r].Split(',');
          values[r] = line_r[col_num];
      }
  12. yen says:

    mine appeared just in one column instead of the two-dimensional array. please help

    • RodStephens says:

      Make sure you have the code correct. For example, if your data uses a different delimiter instead of commas, then it would put all of the values on a line in the first column.

  13. yen says:

    hi sir,
    thanks…secondly the string[,] values contains all sorts of data types…
    1. i need to use them individually with their own data types.
    2 for example putting them in a list
    3. I also want to put different items there into different variables inside different classes
    4. how can i store it in a dictionary

    thanks

    • RodStephens says:

      You can use Parse methods such as int.Parse and float.Parse to convert text values into other data types. You can use TryParse if you’re not sure whether a value has the right format.

      After you convert the items into the right data types, you should be able to save them in lists, arrays, variables, members of a class, etc.

  14. yen says:

    sir,
    Thank you very much for your good help.
    Assuming there are many columns such as:
    (i) subjects, (ii) classroom, and (iii) finished time……means that there are many subjects different classrooms and all the associated times. Subjects, classrooms and finished time can appear multiple times throughout the array.
    Please how do I search through the an array [ , ] to know which particular classroom corresponds to the last finished time for a particular subject.
    Please which approach and method can be used for this?
    many thanks
    yen.

  15. yen says:

    Also is it possible to parse a whole column array [1,N] at once without having to loop through each element one by one?

    • RodStephens says:

      You can either loop through the array or use LINQ. LINQ code is shorter so it’s easier to debug, but you need to learn a bit about LINQ to use it so that can make getting started a bit harder.

      If I were you, I would use LINQ. It can essentially look through a column and find the latest date/time, add up a column, and do just about anything for the values in one or more columns.

  16. yen says:

    Sir,
    Please kindly give me a simple example for using Linq to query the .csv file/array[N,N] and then send the result to a class.
    thanks

    • RodStephens says:

      If the data is in an string[,] then try something like this (without LINQ):

      private class ValuePair
      {
          public int X, Y;
      }
      
      private void Form1_Load(object sender, EventArgs e)
      {
          // Load some values.
          string[,] values =
          {
              {"A", "1", "1"},
              {"B", "2", "22"},
              {"C", "3", "333"},
          };
      
          // Loop through the array building the results.
          List pairs = new List();
          for (int i = 0; i < = values.GetUpperBound(0); i++)
          {
              int x = int.Parse(values[i, 1]);
              int y = int.Parse(values[i, 2]);
              ValuePair new_pair = new ValuePair() { X = x, Y = y };
              pairs.Add(new_pair);
          }
      
          // Convert the list into an array.
          ValuePair[] results = pairs.ToArray();
      
          // Display the results.
          for (int i = 0; i < results.Length; i++)
              Console.WriteLine(
                  results[i].X + ", " + results[i].Y);
      }
  17. yen says:

    Happy new year. Thank you very much.

  18. yen says:

    Please how can you show a 3 dimensional array in a DataGridView.
    thanks

  19. yen says:

    OK, Many thanks.
    Let me make it simpler, I am trying to load the data below into a 3 dimensional list/Jagged array. e.g converting [21, 7] into [x, y, z]
    where:
    x -there are just two dates for x but different dimensions -1st column
    y -this varies
    z -there are just two flights for z different dimensions -2nd column

    final output will be :
    2 pages:
    page 1: 5×5
    2×5
    and
    page 2: 5×5
    7×5

    Date flight from Start time to finish time distance(km)
    01/01/2017 FLI 101 country 1 2pm country 7 3pm 1000
    01/01/2017 FLI 101 country 2 4pm country 8 5pm 1500
    01/01/2017 FLI 101 country 3 9am country 9 10am 1000
    01/01/2017 FLI 101 country 4 12noon country 10 1pm 1500
    01/01/2017 FLI 101 country 5 10am country 11 11am 2000
    01/01/2017 FLI 102 country 6 1pm country 12 2pm 1000
    01/01/2017 FLI 102 country 7 4pm country 1 5pm 1000
    02/01/2017 FLI 101 country 8 9am country 2 10am 1500
    02/01/2017 FLI 101 country 9 3pm country 3 4pm 2000
    02/01/2017 FLI 101 country 10 2pm country 4 3pm 1000
    02/01/2017 FLI 101 country 11 4pm country 5 5pm 1000
    02/01/2017 FLI 101 country 12 9am country 6 10am 2000
    02/01/2017 FLI 102 country 1 12noon country 7 1pm 1000
    02/01/2017 FLI 102 country 2 10am country 8 11am 1000
    02/01/2017 FLI 102 country 3 1pm country 9 2pm 1500
    02/01/2017 FLI 102 country 4 4pm country 10 5pm 1000
    02/01/2017 FLI 102 country 5 9am country 11 10am 1500
    02/01/2017 FLI 102 country 6 3pm country 12 4pm 2000
    02/01/2017 FLI 102 country 1 12noon country 7 1pm 1000

    the actual data is above but for better understanding, the above is simplified to :

    Date flight from Start time to finish time distance(km)
    01/01/2017 FLI 101 country 1 2pm country 7 3pm 1000
    country 2 4pm country 8 5pm 1500
    country 3 9am country 9 10am 1000
    country 4 12noon country 10 1pm 1500
    country 5 10am country 11 11am 2000
    FLI 102 country 6 1pm country 12 2pm 1000
    country 7 4pm country 1 5pm 1000
    02/01/2017 FLI 101 country 8 9am country 2 10am 1500
    country 9 3pm country 3 4pm 2000
    country 10 2pm country 4 3pm 1000
    country 11 4pm country 5 5pm 1000
    country 12 9am country 6 10am 2000
    FLI 102 country 1 12noon country 7 1pm 1000
    country 2 10am country 8 11am 1000
    country 3 1pm country 9 2pm 1500
    country 4 4pm country 10 5pm 1000
    country 5 9am country 11 10am 1500
    country 6 3pm country 12 4pm 2000
    country 1 12noon country 7 1pm 1000

    • RodStephens says:

      If I understand (still not guaranteed), you might want some sort of tree/grid combination control. Unfortunately Visual Studio doesn’t have one.

      You could use a TreeView if you don’t need the data to be in a grid. I.e. if you’re okay having it in text as you shown above: country 6 3pm country 12 4pm 2000.

      Or I still think you could use master-detail in a DataGrid.

      One approach would be to lay out the interface you want using Word or something just so you can see exactly what you want. Then you can start looking for the controls you need to build it.

      In the very worst case, you could draw the data on something like a FlowDocument or FixedDocument. That would be a lot more work, but you get exactly what you want.

Leave a Reply

Your email address will not be published. Required fields are marked *