Our automated data collection engine captures data from multiple sources and formats (online and offline), while maintaining privacy and security. Data from e-shops, websites and marketplaces is collected using the latest ethical scraping and machine learning technologies, which keeps it comprehensive and fresh. Collection is monitored at every stage, and all sources are tracked and saved.
Lizeo.data collection services allow you to collect data from different sources: online (websites, e-shops, marketplaces), offline (product catalogs) and digital (spreadsheets, PDF, etc.). You can retrieve the data at different stages in the data value chain: raw, parsed or matched.
Data collected online is parsed after being captured in order to integrate the most relevant information. The parsing step organizes data by identifying and extracting all textual inputs from HTML pages. The text is then analyzed, and the named entities (product name, attributes, etc.) are extracted.
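The parsing step can be illustrated with a minimal sketch that uses only the Python standard library. It is not Lizeo's implementation: the names `TextExtractor`, `extract_text` and `extract_attributes` are hypothetical, and the "Key: value" convention used to pick out attributes is an assumption made for the example, standing in for real named entity extraction.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text chunks, ignoring <script> and <style> content."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def extract_text(html):
    """Return the list of non-empty text chunks found in an HTML page."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


def extract_attributes(chunks):
    """Assumed convention: chunks shaped like 'Key: value' are attributes."""
    attributes = {}
    for chunk in chunks:
        key, sep, value = chunk.partition(":")
        if sep and key.strip() and value.strip():
            attributes[key.strip()] = value.strip()
    return attributes


page = (
    "<html><head><style>p{}</style></head><body>"
    "<h1>Winter Tyre 205/55 R16</h1><p>Brand: Acme</p>"
    "<script>var a = 1;</script></body></html>"
)
chunks = extract_text(page)
attributes = extract_attributes(chunks)
```

Here `chunks` holds the product title and the attribute line, while the script and style content is dropped. A production parser would need site-specific rules or a trained model rather than a fixed text convention.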
Names, descriptions and product attributes may vary from one e-shop to another. To create one unified view, the parsed data is matched to a known reference product in our database. Lizeo achieves an exceptional matching rate by combining machine learning technologies with human expertise.
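As a rough illustration of what matching a parsed name to a reference product involves, the sketch below uses plain string similarity from the standard library. This is a deliberately simple stand-in, not the machine learning and human review process described above; `normalise`, `match_product`, the sample catalog and the 0.8 threshold are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def normalise(name):
    """Lowercase and collapse whitespace so trivial variations do not matter."""
    return " ".join(name.lower().split())


def match_product(parsed_name, reference, threshold=0.8):
    """Return (reference_id, score) for the closest reference product.

    reference maps a product id to its reference name. If no candidate
    reaches the threshold, the id is None so the item can be sent to
    manual review.
    """
    best_id, best_score = None, 0.0
    target = normalise(parsed_name)
    for ref_id, ref_name in reference.items():
        score = SequenceMatcher(None, target, normalise(ref_name)).ratio()
        if score > best_score:
            best_id, best_score = ref_id, score
    if best_score >= threshold:
        return best_id, best_score
    return None, best_score


# Hypothetical reference catalog, for illustration only.
reference = {
    "P1": "Acme Winter Tyre 205/55 R16",
    "P2": "Acme Summer Tyre 195/65 R15",
}
matched_id, score = match_product("ACME  winter tyre 205/55 R16", reference)
unmatched_id, _ = match_product("garden hose", reference)
```

Returning `None` below the threshold, instead of forcing the nearest candidate, leaves room for the human validation step: uncertain items are flagged rather than silently mismatched.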
Lizeo.data matching services clean data (internal, external, tiered data) and convert it into a unified view. Matching can be based on the Lizeo.product catalog or on your own reference data.
Data preparation turns unified data into smart data. By processing the data with dedicated algorithms, we provide you with a dataset that integrates your business rules and requirements as well as your market vision and segmentation. You get the right set of data, with the quality and usability you need to perform your analytics. This is the final step before data consumption begins.