This is where data specialist expertise comes into play.
The description of a tyre sold online is one of the elements that can vary significantly from one e-commerce website to another.
The attributes that make up a tyre description, and that can vary, are as follows:
- The brand
- The product name
- The technical markings (runflat, OE Marking)
- The tyre manufacturer code
To attach the collected prices to the right tyre, it is necessary to “understand” and decipher the key elements displayed that identify the product, then compare them with a reference database of existing tyres (current or discontinued), each validated against an official source. This is the data unification step.
This reference database is the cornerstone of an efficient matching system that delivers data of a very high level of quality.
Comparing the collected data with this reference database checks that the tyre described online actually exists: does this brand make this tyre? With these technical attributes? And so on.
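As a rough illustration of what unification can involve, the sketch below normalises the brand, product name and runflat marking of one scraped listing. The alias table, the marking pattern and every name in it (`BRAND_ALIASES`, `RUNFLAT_PATTERN`, `unify`) are hypothetical and chosen for the example; a production system would rely on far richer rules.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical alias table: spellings seen online -> canonical brand name.
BRAND_ALIASES = {
    "michelin": "Michelin",
    "conti": "Continental",
    "continental": "Continental",
}

# Hypothetical marking pattern; real sites use many more variants.
RUNFLAT_PATTERN = re.compile(r"\b(run[\s-]?flat|rft|rof)\b", re.IGNORECASE)


@dataclass(frozen=True)
class TyreDescription:
    brand: Optional[str]  # None when the brand spelling is not recognised
    product_name: str
    runflat: bool


def unify(raw_brand: str, raw_name: str) -> TyreDescription:
    """Map one scraped listing to a normalised description."""
    brand = BRAND_ALIASES.get(raw_brand.strip().lower())
    runflat = bool(RUNFLAT_PATTERN.search(raw_name))
    # Remove the marking from the name, then collapse whitespace and case.
    name = RUNFLAT_PATTERN.sub(" ", raw_name)
    name = " ".join(name.split()).upper()
    return TyreDescription(brand=brand, product_name=name, runflat=runflat)
```

With this sketch, two listings that spell the same tyre differently (for example "CONTI" with "PremiumContact 6 Run-Flat" and "Continental" with "premiumcontact 6 RFT") end up with the same normalised description, which is what makes the comparison with a reference database possible.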
This allows data collected online to be systematically and automatically categorised:
- Unknown product: is it new?
- Known product: attachment of the collected price, its source and the collection date
- “False” product: the combination of information is not possible - the product does not exist
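The three categories above can be sketched as a lookup against a reference table. This is only a minimal illustration: the `REFERENCE` content is made-up sample data, and the rule used to separate "unknown" from "false" (a product name that exists in the reference but under another brand, or with a marking it does not carry, is treated as a contradiction) is an assumption made for the example.

```python
from enum import Enum


class Category(Enum):
    UNKNOWN = "unknown product"
    KNOWN = "known product"
    FALSE = "false product"


# Hypothetical reference data: (brand, product name) -> valid markings.
# The empty string stands for "no special marking".
REFERENCE = {
    ("Michelin", "PRIMACY 4"): {"", "runflat"},
    ("Continental", "PREMIUMCONTACT 6"): {""},
}


def categorise(brand: str, product_name: str, marking: str = "") -> Category:
    """Classify one normalised listing against the reference table."""
    valid_markings = REFERENCE.get((brand, product_name))
    if valid_markings is not None:
        # The tyre exists; the marking must be one it is actually sold with.
        return Category.KNOWN if marking in valid_markings else Category.FALSE
    # The name exists in the reference, but under another brand: contradiction.
    if any(name == product_name for _, name in REFERENCE):
        return Category.FALSE
    # Nothing in the reference contradicts it: possibly a new product.
    return Category.UNKNOWN
```

Only listings classified as `Category.KNOWN` would then have the collected price, its source and the collection date attached; unknown products would be queued for review.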
Finally, the most implausible prices are also filtered out (e.g. a touring tyre sold for less than €20 is unlikely to be real).
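A price filter of this kind can be as simple as a floor per tyre segment. In the sketch below, the €20 floor for touring tyres comes from the example above; the table name, the function name and the decision to keep prices for segments without a rule are assumptions made for illustration.

```python
# Minimum plausible price per segment, in euros.
# Only the touring floor comes from the text; other segments are left out.
MIN_PLAUSIBLE_PRICE_EUR = {"touring": 20.0}


def is_plausible_price(segment: str, price_eur: float) -> bool:
    """Return False when a price falls below the floor for its segment."""
    floor = MIN_PLAUSIBLE_PRICE_EUR.get(segment)
    if floor is None:
        return True  # no rule for this segment: keep the price
    return price_eur >= floor


# Hypothetical collected prices: (segment, price in euros).
collected = [("touring", 15.0), ("touring", 74.9), ("winter", 5.0)]
kept = [item for item in collected if is_plausible_price(*item)]
```

A fixed floor is the simplest possible rule; a statistical threshold computed per product from the prices already collected would be a natural refinement.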
The combination of a high-quality tyre database, matching technologies, Machine Learning algorithms and product marketing expertise provides quality data with a high level of completeness.
This allows you to focus on your core business (the analysis) and work more efficiently. Indeed, analysts and data scientists are often reported to spend between 50% and 80% of their time cleaning data before they can start working with it!