| 1 | +# 5 Levels of data usability |
| 2 | + |
| 3 | +Not all data is created equal. |
| 4 | +Datasets differ notably in how much you can do with them and how flexible they are. |
| 5 | +The more usable data is, the easier it is for a developer, researcher, or any other data user to re-use it. |
| 6 | + |
| 7 | +_This list is inspired by Tim Berners-Lee's [5-star open data](https://5stardata.info/en/)_. |
| 8 | + |
| 9 | +## Level 1: unstructured data |
| 10 | + |
| 11 | +_Examples: images, videos, plain text_ |
| 12 | + |
| 13 | +Unstructured data is the least usable. |
| 14 | +Humans can read it, and AI / machine learning systems can draw more conclusions from it than ever, |
| 15 | +but it's hard to build an actual application or graphic from unstructured data alone. |
| 16 | + |
| 17 | +``` |
| 18 | +Hi! I'm Joep, I'm born in 1991. |
| 19 | +``` |
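
To make the difficulty concrete, here is a minimal Python sketch that pulls a name and a birth year out of the sentence above with hand-written patterns. The function name `extract_person` and the rephrased sentence are assumptions made for illustration; the point is only that the patterns stop working as soon as the wording changes.

```python
import re

text = "Hi! I'm Joep, I'm born in 1991."


def extract_person(sentence: str) -> dict:
    """Pull a name and birth year out of free text with hand-written patterns."""
    name = re.search(r"I'm (\w+),", sentence)
    year = re.search(r"born in (\d{4})", sentence)
    return {
        "name": name.group(1) if name else None,
        "birthYear": int(year.group(1)) if year else None,
    }


# Works for the exact phrasing the patterns were written for...
person = extract_person(text)

# ...but the same facts, worded differently, are no longer found.
rephrased = extract_person("My name is Joep and 1991 is my birth year.")

print(person)
print(rephrased)
```

Every new phrasing needs a new pattern, which is why unstructured data alone is a weak basis for an application.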
| 20 | + |
| 21 | +## Level 2: structured data |
| 22 | + |
| 23 | +_Examples: CSV, XML, JSON, TOML, Excel_ |
| 24 | + |
| 25 | +Structured data can be read by machines, and this allows us to do all sorts of useful things. |
| 26 | +We can _query_, _sort_ and _filter_. |
| 27 | +Still, this type of data often requires human input before it can be processed. |
| 28 | +A human needs to make sense of what each property represents. |
| 29 | + |
| 30 | + |
| 31 | +- Requires human interpretation |
| 32 | +- No semantic definitions of what properties represent |
| 33 | +- Can be read by machines if mapped correctly |
| 34 | +- Often requires handling invalid data |
| 35 | + |
| 36 | +```json |
| 37 | +{ |
| 38 | + "name": "Joep", |
| 39 | + "birthYear": "" |
| 40 | +} |
| 41 | +``` |
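
As a rough illustration of both the benefits and the limits, the Python sketch below filters and sorts a few records shaped like the JSON above. The records for `Alice` and `Bob` and the helper `parse_year` are hypothetical and not part of the original example. Note that the code has to guess that `birthYear` is meant to be a year, and has to deal with the empty string itself.

```python
import json

# Hypothetical records in the same shape as the example above.
raw = """[
  {"name": "Joep", "birthYear": ""},
  {"name": "Alice", "birthYear": "1985"},
  {"name": "Bob", "birthYear": "1993"}
]"""

records = json.loads(raw)


def parse_year(value):
    """Return the year as an int, or None if the value is not a valid integer."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


# Handle invalid data: split records into usable and unusable ones.
valid = [r for r in records if parse_year(r["birthYear"]) is not None]
invalid = [r["name"] for r in records if parse_year(r["birthYear"]) is None]

# Sort by birth year, then filter on it.
by_year = sorted(valid, key=lambda r: parse_year(r["birthYear"]))
born_after_1990 = [r["name"] for r in by_year if parse_year(r["birthYear"]) > 1990]

print(invalid)
print(born_after_1990)
```

Nothing in the data itself says that `birthYear` should be an integer, so that knowledge lives in the consuming code.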
| 42 | + |
| 43 | +## Level 3: type-safe data |
| 44 | + |
| 45 | +_Examples: SQL + DB schema, JSON + JSON Schema, XML + XSD, RDF + SHACL_ |
| 46 | + |
| 47 | +Type-safe data means that every value has an explicit datatype, and that these datatypes can be constrained. |
| 48 | +Someone re-using this data can therefore be certain that it conforms to a specification: a set of rules. |
| 49 | +The shape of the data is predictable. |
| 50 | + |
| 51 | + |
| 52 | +```json |
| 53 | +{ |
| 54 | + "https://atomicdata.dev/properties/name": "Joep", |
| 55 | + "https://atomicdata.dev/properties/birthYear": 1991 |
| 56 | +} |
| 57 | +``` |
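
The idea of checking data against a set of rules can be sketched in a few lines of Python. This is not how JSON Schema, SHACL or Atomic Data actually validate; the `SCHEMA` mapping from property URL to Python type and the `validate` function are assumptions made purely for illustration.

```python
# Assumed mapping of each property URL to the datatype its value must have.
SCHEMA = {
    "https://atomicdata.dev/properties/name": str,
    "https://atomicdata.dev/properties/birthYear": int,
}


def validate(resource: dict, schema: dict) -> list:
    """Return a list of error messages; an empty list means the resource conforms."""
    errors = []
    for prop, expected in schema.items():
        if prop not in resource:
            errors.append(f"missing property: {prop}")
            continue
        value = resource[prop]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(
                f"{prop}: expected {expected.__name__}, got {type(value).__name__}"
            )
    for prop in resource:
        if prop not in schema:
            errors.append(f"unknown property: {prop}")
    return errors


good = {
    "https://atomicdata.dev/properties/name": "Joep",
    "https://atomicdata.dev/properties/birthYear": 1991,
}
bad = {
    "https://atomicdata.dev/properties/name": "Joep",
    "https://atomicdata.dev/properties/birthYear": "",
}

print(validate(good, SCHEMA))
print(validate(bad, SCHEMA))
```

Once data has passed a check like this, downstream code can rely on its shape instead of re-validating every value.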
| 58 | + |
| 59 | +## Level 4: browsable data |
| 60 | + |
| 61 | +_Examples: Atomic Data_ |
| 62 | + |
| 63 | +If your data is _connected_ to other pieces of machine-readable data, it becomes browsable, similar to how websites link to each other. |
| 64 | +This effectively creates a _web of data_, and allows for a whole new way to think about the internet. |
| 65 | +This is what enables decentralized applications, true data ownership, and a new class of applications. |
| 66 | + |
| 67 | +- Is connected to other pieces of machine-verifiable data |
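
What "browsable" means in practice can be sketched as follows: a value may itself be the URL of another resource, so a program can hop from one resource to the next. In this Python sketch the `example.com` URLs, the `employer` property, and the in-memory `STORE` that stands in for real HTTP requests are all hypothetical.

```python
NAME = "https://atomicdata.dev/properties/name"
EMPLOYER = "https://example.com/properties/employer"  # hypothetical property

# Hypothetical resources, keyed by their own URL.
STORE = {
    "https://example.com/people/joep": {
        NAME: "Joep",
        EMPLOYER: "https://example.com/orgs/acme",  # a link to another resource
    },
    "https://example.com/orgs/acme": {
        NAME: "Acme",
    },
}


def fetch(url: str) -> dict:
    """Stand-in for an HTTP request: look the resource up in the local store."""
    return STORE[url]


def follow(start: str, *properties: str):
    """Start at a resource URL and follow each property in turn."""
    value = start
    for prop in properties:
        value = fetch(value)[prop]
    return value


# Browse from a person to the name of their employer.
employer_name = follow("https://example.com/people/joep", EMPLOYER, NAME)
print(employer_name)
```

Because every link is a URL, the second resource could just as well live on a different server than the first.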