Data on LEGO set release dates and retail prices, combined with aftermarket transaction prices from June 2018 to June 2023.
Description
The dataset contains piece counts and pricing history of LEGO sets, intended for AI-based set price prediction.
The data was downloaded from three sources. LEGO set retail prices, release dates, and IDs were downloaded from Brickset.com, which lists the ID number and retail prices of each set released by LEGO. The current status of the sets was downloaded from lego.com, while retail prices for Poland and aftermarket transaction prices were downloaded from promoklocki.pl. The data was merged based on the official LEGO set ID.
The data consists of one aggregated table stored in an XLSX file named lego_final_data.xlsx. All data was scraped from the lego.com, brickset.com, and promoklocki.pl websites. The table contains the following columns:
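The merge on the official set ID can be illustrated with a minimal sketch. It is not the authors' actual pipeline: the function name `merge_sources`, the record layout, and the choice to drop price records with no matching catalogue entry are assumptions made for illustration, and the sample records are placeholders rather than values from the dataset.

```python
def merge_sources(brickset_sets, lego_status, monthly_prices):
    """Join three lists of records on the official set number.

    Returns one row per (set, month), which is an assumed layout
    suggested by the Date and priceMonthPLN columns.
    """
    sets_by_number = {row["number"]: row for row in brickset_sets}
    status_by_number = {row["number"]: row["status"] for row in lego_status}

    merged = []
    for price in monthly_prices:
        base = sets_by_number.get(price["number"])
        if base is None:
            # Assumption: prices for sets missing from the catalogue are dropped.
            continue
        row = dict(base)  # copy so the catalogue record is not mutated
        row["status"] = status_by_number.get(price["number"])  # None if unknown
        row["Date"] = price["Date"]
        row["priceMonthPLN"] = price["priceMonthPLN"]
        merged.append(row)
    return merged


# Hypothetical placeholder records, not values from the dataset.
brickset_sets = [
    {"number": "00001", "name": "Placeholder set A", "year": 2019},
    {"number": "00002", "name": "Placeholder set B", "year": 2020},
]
lego_status = [{"number": "00001", "status": "placeholder status"}]
monthly_prices = [
    {"number": "00001", "Date": "2020-01", "priceMonthPLN": 199.0},
    {"number": "00001", "Date": "2020-02", "priceMonthPLN": 210.0},
    {"number": "99999", "Date": "2020-01", "priceMonthPLN": 50.0},
]

merged = merge_sources(brickset_sets, lego_status, monthly_prices)
for row in merged:
    print(row["number"], row["Date"], row["priceMonthPLN"], row["status"])
```

Keying every source by the same set number keeps the join a simple dictionary lookup; a different join policy (for example, keeping sets that have no aftermarket prices) would only change the loop.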
- setID – internal Brickset.com LEGO set identification number,
- number – official LEGO set ID,
- numberVariant – official LEGO set sub variant (e.g. different minifigure hidden in a random bag),
- name – official LEGO set name,
- year – the set release year,
- theme – official name of the set theme,
- themeGroup – official name of the set theme group (if available),
- subtheme – official name of the set sub-theme (if available),
- category – brickset.com internal set type,
- released – indicates whether the set was officially released (1) or not (0),
- pieces – number of parts in the set,
- minifigs – number of minifigures in the set,
- ownedBy – number of brickset.com users who report owning the set,
- wantedBy – number of brickset.com users who report wanting to buy the set,
- rating – average set rating according to brickset.com users,
- reviewCount – number of reviews of the set written by brickset.com users,
- packagingType – type of packaging for the set (if specified),
- availability – indicates whether the set was available in retail shops or only in the official LEGO online shop,
- instructionsCount – number of books with building instructions added to the set,
- minAge – LEGO recommended minimal user age for the set,
- maxAge – LEGO recommended maximal user age for the set (either not specified or 99),
- tags – list of brickset.com assigned set tags,
- LastUpdated – the date and time of the last update of the data in brickset.com in ISO 8601 format,
- urlRetailPriceCheckPLN – URL where retail price in PLN was downloaded from,
- US_retailPrice – retail price in United States in US dollars,
- US_dateFirstAvailable – date and time when the set became available in United States in ISO 8601 format,
- US_dateLastAvailable – date and time when the set stopped being officially available in United States in ISO 8601 format,
- UK_retailPrice – retail price in United Kingdom in GBP,
- UK_dateFirstAvailable – date and time when the set became available in United Kingdom in ISO 8601 format,
- UK_dateLastAvailable – date and time when the set stopped being officially available in United Kingdom in ISO 8601 format,
- CA_retailPrice – retail price in Canada in Canadian dollars,
- CA_dateFirstAvailable – date and time when the set became available in Canada in ISO 8601 format,
- CA_dateLastAvailable – date and time when the set stopped being officially available in Canada in ISO 8601 format,
- DE_retailPrice – retail price in Germany in EUR,
- DE_dateFirstAvailable – date and time when the set became available in Germany in ISO 8601 format,
- DE_dateLastAvailable – date and time when the set stopped being officially available in Germany in ISO 8601 format,
- PL_retailPrice – retail price in Poland in PLN,
- Date – year and month for which the priceMonthPLN is given,
- priceMonthPLN – price in PLN read from promoklocki.pl for year and month specified in Date column,
- status – official status of the set (if available) in LEGO web shop,
- urlRetailPriceHistoryPLN – URL containing retail and aftermarket price changes from the day of the release of the set, in PLN.
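As an illustration of how the price columns relate, the following minimal sketch computes the relative difference between the monthly price (priceMonthPLN) and the Polish retail price (PL_retailPrice) for each row. It assumes the table has been loaded into a list of dictionaries keyed by the column names above, for example with `pandas.read_excel("lego_final_data.xlsx").to_dict("records")`; the helper names and the sample rows are hypothetical and are not taken from the dataset.

```python
import math
from collections import defaultdict


def _is_missing(value):
    """True for None and for NaN, which spreadsheet loaders often use for blanks."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def aftermarket_premium(row):
    """Relative difference between the monthly price and the Polish retail price.

    Returns None when either price is missing or the retail price is zero.
    """
    retail = row.get("PL_retailPrice")
    monthly = row.get("priceMonthPLN")
    if _is_missing(retail) or _is_missing(monthly) or retail == 0:
        return None
    return monthly / retail - 1.0


def premiums_by_set(rows):
    """Group (Date, premium) pairs by the official set number."""
    grouped = defaultdict(list)
    for row in rows:
        premium = aftermarket_premium(row)
        if premium is not None:
            grouped[row["number"]].append((row["Date"], premium))
    return dict(grouped)


# Hypothetical placeholder rows; the Date format shown is an assumption.
rows = [
    {"number": "00001", "Date": "2020-01", "PL_retailPrice": 100.0, "priceMonthPLN": 150.0},
    {"number": "00001", "Date": "2020-02", "PL_retailPrice": 100.0, "priceMonthPLN": 200.0},
    {"number": "00002", "Date": "2020-01", "PL_retailPrice": float("nan"), "priceMonthPLN": 80.0},
]
print(premiums_by_set(rows))
```

A premium of 0.5 means the monthly price was 50% above the Polish retail price; rows with a missing price are skipped rather than imputed.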
Research data file
hexmd5(md5(part1)+md5(part2)+...)-{parts_count}
where a single file part is 512 MB in size. Example script for computing it:
https://github.com/antespi/s3md5
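The multipart checksum formula above can be sketched in Python using only the standard library. This is an independent illustration, not the linked script. It assumes that "512 MB" means 512 × 1024 × 1024 bytes and that the per-part MD5 values are concatenated as raw binary digests before the final hash; the function name `multipart_md5` and the `block_size` parameter are introduced here for illustration.

```python
import hashlib

# Assumption: "512 MB" is interpreted as 512 * 1024 * 1024 bytes.
PART_SIZE = 512 * 1024 * 1024


def multipart_md5(path, part_size=PART_SIZE, block_size=1024 * 1024):
    """Return hexmd5(md5(part1)+md5(part2)+...)-{parts_count} for a file.

    The file is read in small blocks so a whole part never has to be
    held in memory at once.
    """
    part_digests = []
    with open(path, "rb") as f:
        while True:
            part = hashlib.md5()
            remaining = part_size
            read_any = False
            while remaining > 0:
                block = f.read(min(block_size, remaining))
                if not block:
                    break  # end of file
                part.update(block)
                remaining -= len(block)
                read_any = True
            if not read_any:
                break  # no data left, so no further part
            part_digests.append(part.digest())  # raw 16-byte digest (assumed)
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{combined}-{len(part_digests)}"


# Usage (the file name comes from the dataset description):
# print(multipart_md5("lego_final_data.xlsx"))
```

If the result does not match the published checksum, the part size interpretation (decimal versus binary megabytes) is the first assumption to revisit.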
File details
- License:
- CC BY (Attribution)
Details
- Year of publication:
- 2023
- Approval date:
- 2023-10-24
- Research data language:
- English
- Disciplines:
- technical computer science and telecommunications (engineering and technical sciences)
- computer science (exact and natural sciences)
- DOI:
- 10.34808/s25h-sx91
- Series:
- Verified by:
- Politechnika Gdańska