One idea I have had for a while is to create a Wikibase of consumer products: electronics, appliances, things of that nature. The scope of such a database would likely be large enough to warrant its own Wikibase, and the opportunity would be tremendous. It could serve as a data backend for product review sites like https://lib.reviews/, or for product search across websites. I think it would also provide a fun entry point into the Wikibase ecosystem.
What I am wondering is how such a database could be bootstrapped and then built upon. Amazon’s product inventory is probably not public information. Would a good source be government agencies like the U.S. Consumer Product Safety Commission?
Government agencies like the U.S. Consumer Product Safety Commission might provide valuable data on regulated products. However, their focus is primarily on safety-related aspects rather than a comprehensive product database.
Some manufacturers provide comprehensive product information on their websites. However, accessing and scraping data from websites should be done ethically and within the bounds of their terms of service.
That would be awesome, but why not also include services?
Many products are sold with services, and some are offered as a ‘rental’ service.
OMG - yes - I keep forgetting this exists!
I think it would make sense to include services as well.
OpenFoodFacts (OFF) has the ambition to become OpenProductFacts. OFF is objectively a very successful project, with a large community of contributors. They’re now exploring a homemade “Folksonomy Engine” to expand the use cases, but I think Wikibase would definitely be the best option for this kind of crowdsourced project. Merging their existing community with the power of Wikibase would be awesome.
I would like that very much! How would we do that?
I previously (2021) tried to convince them to use Wikibase but my understanding is that their perception of Wikibase is quite dated and they decided that it didn’t fit their needs.
I think the best way to convince them would be to build a small proof of concept, with the minimal amount of work (which is still free work, I know):
- Using wikibase.cloud (to avoid deploying a new instance)
- Choosing a specific scope of data from the OFF database (to focus the effort on something minimal)
- Mapping the OFF data to a Wikibase data model (i.e. entities, properties, qualifiers…)
- Importing the OFF data into Wikibase with the chosen data model
- Building Wikibase queries showcasing the data (e.g. number of products in a category, map, etc.)
- Showing use cases, for example:
  - A contributor adds the country of origin to several products (item pages), another user can see them on a map (with a query)
  - A contributor adds an image to a product…
  - A contributor wants to correct an error on a product…
  - A contributor wants to add a property to describe a product in a way that is not yet possible…
This would make the advantages of using Wikibase obvious.
Based on a PoC that addresses all their needs, and considering the downsides of their current solution (complexity / cost), I think OFF would be willing to go forward with this.
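As a rough illustration of the mapping step in the list above, here is a minimal Python sketch that turns one OFF-style record into the JSON shape Wikibase uses for entities (labels plus claims). The property IDs (`P1`, `P2`, `P3`), the helper names, and the choice of fields are assumptions for illustration only. A real instance assigns its own property IDs, and a real data model would probably use item-valued properties for brands and countries instead of plain strings.

```python
import json

# Hypothetical property IDs; a real Wikibase assigns these when the
# properties are created, so this table would have to be adjusted.
PROPERTY_MAP = {
    "code": "P1",     # barcode
    "brands": "P2",   # brand name(s)
    "origins": "P3",  # country of origin
}


def string_claim(property_id, value):
    """Build one statement with a string value in Wikibase JSON form."""
    return {
        "mainsnak": {
            "snaktype": "value",
            "property": property_id,
            "datavalue": {"value": value, "type": "string"},
        },
        "type": "statement",
        "rank": "normal",
    }


def off_record_to_entity(record, language="en"):
    """Map a flat OFF-style record (a dict of strings) to an entity dict."""
    entity = {"labels": {}, "claims": {}}

    name = (record.get("product_name") or "").strip()
    if name:
        entity["labels"][language] = {"language": language, "value": name}

    for field, property_id in PROPERTY_MAP.items():
        raw = record.get(field) or ""
        # Assumes multi-valued fields are comma-separated; one claim per value.
        values = [v.strip() for v in raw.split(",") if v.strip()]
        if values:
            entity["claims"][property_id] = [
                string_claim(property_id, v) for v in values
            ]
    return entity


if __name__ == "__main__":
    # Made-up sample record, not taken from the OFF database.
    sample = {
        "code": "0000000000000",
        "product_name": "Example Jam",
        "brands": "Brand A, Brand B",
        "origins": "",
    }
    print(json.dumps(off_record_to_entity(sample), indent=2))
```

The resulting dict could then be passed to whatever import tool is chosen for the proof of concept. Empty fields are skipped so that no empty statements are created.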
Let’s start with this. (I’m happy to set up the Wikibase once we have a project in mind.)
What area of OFF (or OPF) would benefit the most from Wikibase? If we simply make another copy of OFF inside Wikibase, that doesn’t add much value.
Here are the questions I would ask:
- What areas of OFF have poor data coverage and would benefit from being connected to more databases, including Wikidata?
- What areas of OFF are inaccurate or have unrefined data, and would benefit from crowdsourced improvements?
- What does OFF not cover yet, that is covered in one or more (public domain) datasets that can be merged together and annotated through Wikibase?
I would start with something very small, like a single product category. What is our opportunity here?
> If we simply make another copy of OFF inside Wikibase, that doesn’t add much value.
I agree but importing existing data would require less work, which IMHO should be the priority for a proof of concept.
I agree that crowdsourcing totally new product data would be a very effective way to convince the OFF/OPF community that Wikibase is the better tool, but it would require more effort, time and coordination with other people.
> What does OFF not cover yet, that is covered in one or more (public domain) datasets that can be merged together and annotated through Wikibase?
Unfortunately, I don’t know of any open data source that covers products in a way similar to OFF.
I think I understand your point better.
I downloaded the dataset, and it should not be too much work for me to begin importing it into a Wikibase. My question for you is, once I set up this data in a Wikibase, what would the next course of action be?
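To keep a first import small, along the lines of the "single product category" idea earlier in the thread, one option is to filter the downloaded dump before loading anything. The sketch below assumes the dump is a tab-separated file with a header row and a `categories_tags` column holding comma-separated tags such as `en:jams`; the column name, the tag format, and the file name in the usage note are assumptions to check against the actual download.

```python
import csv
import io


def filter_by_category(lines, category_tag, column="categories_tags"):
    """Yield rows (as dicts) whose tag column contains exactly category_tag.

    `lines` is any iterable of text lines, e.g. an open file object.
    """
    # Assumes a tab-separated export without quoted fields.
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        tags = [t.strip() for t in (row.get(column) or "").split(",")]
        if category_tag in tags:
            yield row


if __name__ == "__main__":
    # Tiny made-up sample standing in for the real dump.
    sample = io.StringIO(
        "code\tproduct_name\tcategories_tags\n"
        "1\tExample Jam\ten:spreads,en:jams\n"
        "2\tExample Cola\ten:beverages\n"
    )
    for product in filter_by_category(sample, "en:jams"):
        print(product["code"], product["product_name"])
```

With the real file, the same function could be called on `open("products.csv", encoding="utf-8", newline="")` (hypothetical file name). Because rows are yielded one at a time, the whole dump never has to fit in memory.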
I just registered for Lib.Reviews and wanna start using it this winter break…anyone else?