Information? Data? Which is which?
Information vs. data: I think they are the same thing in practice, even if not in theory.
Even Wikipedia agrees with me: “However, in academic treatments of the subject data are simply units of information.”
One of the big problems is that most people think data is only something you can do math on: numbers and spreadsheets that require a lot of statistics. That is somewhat true, but it discounts non-numeric, or qualitative, data. Or is it qualitative information? A qualitative observation is a single point among many, and looking at that single point in context turns it from data into information. The same is true of a number. And once there is so much data that you need a machine to read it, or a visual of some kind, it is still just information at a different scale.
At the end of the day, what I am saying is that, in practice, data and information are the same thing. People just use the word “data” to mean numbers and algorithms, and treat “information” as something confusing, hard to organize, and not worth the time of “technology.”
But since I think they are the same thing, they have the same problems and a lot of the same solutions. The big thing most people do not realize is that they need to solve the information problems of their data before they can do the fun things they want to do (hello, Big Data, Machine Learning, AI). This is the fundamental problem that keeps getting bigger in AI ethics, and in tech ethics in general: at their root, these are standard information problems. (Standard, but not easy.) To paraphrase Dr. Timnit Gebru in a talk I saw her give: “These problems are now happening at scale!” They are the same problems, exacerbated by how much technology we can bring to bear on the information. The fundamental information problems are not sexy or cool, but they need to be solved so that all of the big technology that wants to work at scale can do the really cool stuff, and do it better.
I think if the library world stopped talking about information, and just called everything they did “data,” they would get a lot more respect and a lot more work really fast. Same stuff, just use a different word.