Download Open Datasets on 1000s of Projects + Share Projects on One Platform. Explore Popular Topics Like Government, Sports, Medicine, Fintech, Food, More. Flexible Data Ingestion.
provided, including statistics from 444 datasets, covering 8 language categories and spanning 32 domains. Information from 20 dimensions is incorporated into the dataset statistics. The total data size surveyed surpasses 774.5 TB for pre-training corpora and 700M instances for other datasets. We aim ...
The data is sourced from this World Bank dataset, which in turn lists as sources: (1) United Nations Population Division. World Population Prospects, (2) United Nations Statistical Division. Population and Vital Statistics Report (various years), (3) Census reports and other statistical publications from...
The vast majority of the participants completed the survey in less than 15 min; however, in a few cases, the completion time exceeded one hour. Three hundred and nine participants started the survey, 107 participants did not finish it, and 4 opted for withdrawal post-completion. Hence, the f...
Diamond Bay Data: Chinese counties, census statistics and Digital Chart of the World China GIS layers. China Historical GIS: Historical boundaries, tribal areas, etc. for China from 1820 back to 222 BCE. Registration required. China High Air Pollutants: Daily high resolution (1 km) air pollution dat...
In this work, we target this data documentation debt by surveying over two hundred datasets employed in algorithmic fairness research, and producing standardized and searchable documentation for each of them. Moreover, we rigorously identify the three most popular fairness datasets, namely Adult, COMPAS...
Includes annual gross domestic product (GDP) data for countries around the world and all provinces in the Chinese mainland. National statistics dataset national_data TPC performance data TPC-DS TPC-DS is a decision support benchmark that models several generally applicable aspects of a decision suppo...
usage, we will enrich metadata by registering ETDs that are selected for different datasets and papers. For example, we will mark ETDs used in the ETD500, ScanBank, and ChapterParse datasets. This will allow future projects to access the enriched metadata and other derived data (see below)...
3. Page view statistics for Wikimedia projects http://dammit.lt/wikistats/
4. AOL Search Query Logs - RP http://www.researchpipeline.com/mediawiki/index.php?title=AOL_Search_Query_Logs
5. livedoor gourmet http://blog.livedoor.jp/techblog/archives/65836960.html
...
It helps performance to update statistics, indexes and constraints, and in some cases, to sort the data appropriately. To avoid any accidental kills and because not all RDBMS support the disabling of all constraints, kill and fill are operations that are best done in the right dependency...
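As a rough illustration of running kill and fill in dependency order, the sketch below assumes the truncated phrase refers to foreign-key dependencies between tables. The table names and the `dependencies` mapping are hypothetical, and ordering them with Python's standard `graphlib.TopologicalSorter` is one possible approach, not necessarily the one the source describes.

```python
from graphlib import TopologicalSorter

# Hypothetical schema: each table maps to the set of tables it references
# through foreign keys (its parents). These names are illustrative only.
dependencies = {
    "customers": set(),
    "products": set(),
    "orders": {"customers"},
    "order_items": {"orders", "products"},
}


def fill_order(deps):
    """Return tables parents-first, so referenced rows exist before children load."""
    return list(TopologicalSorter(deps).static_order())


def kill_order(deps):
    """Return tables children-first (the reverse of the fill order)."""
    return list(reversed(fill_order(deps)))


if __name__ == "__main__":
    print("fill:", fill_order(dependencies))
    print("kill:", kill_order(dependencies))
```

Deleting children before parents means no foreign-key constraint has to be disabled during the kill, and loading parents before children gives the same guarantee for the fill.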