I want to create, sort, filter, query, update, etc. hierarchical data like JSON or XML or YAML with the same ease as a spreadsheet. Does such a thing exist?
Everyone saying relational databases but MongoDB is literally a json document database
So you can bring json into a spreadsheet as ling as you are cool with some if your cells containing multi-valued entries. The question is how do you want those multi valued fields to o be handled? How are they displayed? How would sorting and filtering on these fields work?
One of the best organising tools I ever found. And still under active development.
I just checked this out. It’s not quite what I’m looking for right now but it does answer my question as asked. I can see it coming in handy later.
jq might work.
That’s generally what relational databases are for. You might try LibreOffice Base or sqlite.
As for JSON, XML, and YAML files in particular, there are tools for doing one-off queries/transformations against them, like jq and yq.
That’s generally what relational databases are for.
But they need rectangular structure. How do they work on tree structures, like OP has asked?
one-off queries/transformations
Again, that wasn’t the question.
The question reads like an XY problem, they describe DB functions for data structures so unless there’s some specific reason they can’t use a DB that’s the right answer. A “spreadsheet for data structures” describes a relational database.
But they need rectangular structure. How do they work on tree structures, like OP has asked?
Relationships. You don’t dump all your data in a single table. Take for instance the following sample JSON:
JSON
"users": [ { "id": 1, "name": "Alice", "email": "alice@example.com", "favorites": { "games": [ { "title": "The Witcher 3", "platforms": [ { "name": "PC", "release_year": 2015, "rating": 9.8 }, { "name": "PS4", "release_year": 2015, "rating": 9.5 } ], "genres": ["RPG", "Action"] }, { "title": "Minecraft", "platforms": [ { "name": "PC", "release_year": 2011, "rating": 9.2 }, { "name": "Xbox One", "release_year": 2014, "rating": 9.0 } ], "genres": ["Sandbox", "Survival"] } ] } }, { "id": 2, "name": "Bob", "email": "bob@example.com", "favorites": { "games": [ { "title": "Fortnite", "platforms": [ { "name": "PC", "release_year": 2017, "rating": 8.6 }, { "name": "PS5", "release_year": 2020, "rating": 8.5 } ], "genres": ["Battle Royale", "Action"] }, { "title": "Rocket League", "platforms": [ { "name": "PC", "release_year": 2015, "rating": 8.8 }, { "name": "Switch", "release_year": 2017, "rating": 8.9 } ], "genres": ["Sports", "Action"] } ] } } ] }
You’d structure that in SQL tables something like this:
Tables
dbo.users
user_id name email 1 Alice alice@example.com 2 Bob bob@example.com dbo.games
game_id title genre 1 The Witcher 3 RPG 2 Minecraft Sandbox 3 Fortnite Battle Royale 4 Rocket League Sports dbo.favorites
user_id game_id 1 1 1 2 2 3 2 4 dbo.platforms
platform_id game_id name release_year rating 1 1 PC 2015 9.8 2 1 PS4 2015 9.5 3 2 PC 2011 9.2 4 2 Xbox One 2014 9.0 5 3 PC 2017 8.6 6 3 PS5 2020 8.5 7 4 PC 2015 8.8 8 4 Switch 2017 8.9 The dbo.favorites table handles the many-to-many relationship between users and games; users can have as many favourite games as they want, and multiple users can have the same favourite game. The dbo.platforms handles one-to-many relationships; each record in this table represents a single release, but each game can have multiple releases on different platforms.
So the real question was, which tool to use in order to transform the JSON’s tree into these tables & relations?
(hopefully you didn’t just write this all up manually! :-))
I’ll second the SQL database here. Especially since most people who use a spreadsheet actually treat it as a database in the first place, and not as a way to lay out data in a 2D table.
But if a hierarchical table is really what’s desired, any visual database interface should do the trick.
There’s a whole range of cli tools to extract and query structured data like that, but you might consider loading it into something like sqlite3 and treating it as a database because those formats are really not intended for queries, they’re designed for sharing data.
TreeSheets?
Closest thing I know is Grist (https://www.getgrist.com/), because columns can contain lists of values or lists of references.
Probably not exactly what you’re looking for, but it served both my wife and I very well.
Visidata, maybe.
Visual Studio Code with a jq plugin maybe
Ms Access?