CSV to JSON: Parsing Quoted Fields, Detecting Types, and Avoiding Pitfalls
Why CSV to JSON Is Harder Than It Looks
Reading a CSV with `split(",")` works fine — until your data contains a name like `Smith, John`, a description with a line break in it, or a quoted string with a quote inside. Suddenly your "5-column CSV" becomes a 7-column one, every row from there on misaligns, and your pipeline silently produces garbage.
Real CSV parsing has to follow RFC 4180. That means understanding **quoted fields**, **escaped quotes** (`""` inside a quoted field means a literal quote), **newlines inside quoted values**, and the fact that the delimiter can be a comma, semicolon, tab or pipe depending on where the file came from.
The Three Things Every CSV Parser Has to Get Right
1. Quoted fields
This row has three columns, not four:
```
1,"Smith, John",Engineer
```
A naive split on comma sees four. A proper parser recognizes the opening quote at `"Smith` and keeps consuming characters (including the comma) until it sees the closing quote.
2. Escaped quotes
```
1,"She said ""hi""",Engineer
```
The value of column 2 is `She said "hi"`. Inside a quoted field, two consecutive double quotes (`""`) represent one literal double quote. Get this wrong and you'll either truncate fields or include extra quotes in your output.
3. Newlines inside quoted values
```
1,"Line one
Line two",Engineer
```
That's **one row** with three columns, even though it spans two lines of text. A line-based parser that reads file.split("\n") will split it incorrectly. A real CSV parser tracks an in-quotes state and only treats newlines as row terminators when it's outside quotes.
Our [CSV to JSON converter](/csv-to-json) implements RFC 4180 properly, so all three cases just work.
The Delimiter Trap
In the US and UK, "CSV" means comma-separated. In Germany, France, Italy and most of continental Europe, Excel exports use **semicolons** because the comma is reserved as the decimal separator (`3,14` instead of `3.14`).
If you assume comma and your user uploads a German Excel export, every row lands in column A. The fix is to auto-detect: count delimiters in the header row and pick the most common one. Our tool does this automatically — and lets you override if detection guesses wrong.
Type Coercion: Useful, but Dangerous
When you load CSV into JSON, every cell starts as a string. Usually you want `42` to become the number 42, `true` to become the boolean true, and `null` (or empty cells) to become null. Smart parsers do this for you.
But beware:
The right default is **type detection on** for analytics data and **off** for IDs and ZIP codes. Our tool gives you a single toggle.
Unflatten: The Mirror of Flatten
If you used our [JSON to CSV converter](/json-to-csv) to flatten nested objects into dot-notation columns, you can round-trip back. A CSV like:
```
id,address.city,address.country
1,Paris,FR
2,Berlin,DE
```
With "Unflatten dot.keys" turned on, becomes:
```json
[
{"id": 1, "address": {"city": "Paris", "country": "FR"}},
{"id": 2, "address": {"city": "Berlin", "country": "DE"}}
]
```
This makes the JSON shape match what an API typically expects (nested objects) rather than a flat dump.
A Practical Workflow
The pattern we recommend for one-off conversions:
When Not to Use CSV at All
CSV is great for tabular, flat, human-readable data. It's a poor fit for: deeply nested structures, mixed-shape rows, binary blobs, or data where types matter precisely (dates with timezones, decimal currencies, very large integers). For those, JSON Lines (one JSON object per line) or Parquet are usually better source formats.
Conclusion
A solid CSV-to-JSON converter has to handle quoted fields, escaped quotes, multi-line values, multiple delimiters, and optional type coercion — and do it without uploading your data to a server. Our [CSV to JSON converter](/csv-to-json) does exactly that: paste, click, get clean typed JSON. Round-trip with our [JSON to CSV tool](/json-to-csv) when you need to go the other way.