While working on some data and importing a few files from an s3 bucket, Snowflake encountered an error that took me too long to figure out: Field delimiter ‘,’ found while expecting record delimiter ‘\n’

To start of , I’m using a simple csv file format :

create or replace file format my_csv_format
  type = csv
  record_delimiter = '\n'
  field_delimiter = ','
  skip_header = 1
  null_if = ('NULL', 'null')
  empty_field_as_null = true

The files that I’m importing are structured like this example :

74942,”Trading Alliance”,1492176193,80,Cooperative,,,,1
Line example from the csv file

As you can see, the field is delimited by a ‘,’, empty fields are present and should be treated as NULL values, and from reviewing the file, I can see that every record has its own line.

With everything set , I started the imported but was quickly hit by the dreaded error :

Field delimiter ‘,’ found while expecting record delimiter ‘\n’ File ‘factions.csv’, line 759, character 85 Row 758, column “FACTIONS”[“IS_PLAYER_FACTION”:9]

the corresponding line :

74927,”Strategic Wings, Order and Retaliation Division”,1475489397,80,Cooperative,,,,1
Line giving the error

We can simply read that “Snowflake is finding a field delimiter while expecting a record delimiter” but what does it mean? It simply means that after an X number of Field Delimiters, Snowflake expects the next delimiter to be a record delimiter. The X here is the number of columns in the table (which is 8).

If we count the number of “,” we can spot the error ; this line has 9 instead of 8! One of the fields contains an extra : “Strategic Wings, Order and Retaliation Division”. The field has a “,” in its contents but shouldn’t be used as a delimiter.

This is where a change to the file format comes in :

FIELD_OPTIONALLY_ENCLOSED_BY makes it possible to handle these kind of field values. We should let Snowflake know that the fields as a whole can be enclosed by double quotes (“).
Our first instinct would tell us to use FIELD_OPTIONALLY_ENCLOSED_BY = ” ” ” or FIELD_OPTIONALLY_ENCLOSED_BY ‘ ” ‘ to make this happen but this is the second reason for writing this piece. Snowflake also references this in its documentation:

Character used to enclose strings. Value can be NONE, single quote character ('), or double quote character ("). To use the single quote character, use the octal or hex representation (0x27)

The hex representation of a double quote is 0x22 , making the new csv file format :

create or replace file format my_csv_format
  type = csv
  record_delimiter = '\n'
  field_delimiter = ','
  skip_header = 1
  null_if = ('NULL', 'null')
  empty_field_as_null = true
  FIELD_OPTIONALLY_ENCLOSED_BY = '0x22'

In need of further explanations or help? Check out our trainings page and find the one that suits your needs best. You can also hire me as a consultant!

Tags:

Bel ons

Afspraak

Mail ons