main (26 files)
README.md | 2.16kB
fandom_25.parquet | 250.43MB
fandom_24.parquet | 250.85MB
fandom_23.parquet | 251.93MB
fandom_22.parquet | 250.47MB
fandom_21.parquet | 248.87MB
fandom_20.parquet | 247.68MB
fandom_19.parquet | 247.21MB
fandom_18.parquet | 249.77MB
fandom_17.parquet | 248.20MB
fandom_16.parquet | 246.68MB
fandom_15.parquet | 247.91MB
fandom_14.parquet | 250.01MB
fandom_13.parquet | 245.79MB
fandom_12.parquet | 249.68MB
fandom_11.parquet | 249.52MB
fandom_10.parquet | 249.11MB
fandom_09.parquet | 244.70MB
fandom_08.parquet | 250.38MB
fandom_07.parquet | 248.41MB
fandom_06.parquet | 249.04MB
fandom_05.parquet | 246.37MB
fandom_04.parquet | 250.50MB
fandom_03.parquet | 249.66MB
fandom_02.parquet | 252.00MB
fandom_01.parquet | 249.48MB
Type: Dataset
Tags:
Bibtex:
@article{,
title= {Fandom.com Community Database Dumps Dataset},
journal= {},
author= {nyuuzyou},
year= {},
url= {},
abstract= {# Dataset Card for Fandom.com Community Database Dumps
### Dataset Summary
This dataset contains 7,040,984 current pages from all available [Fandom.com community wiki dumps](https://community.fandom.com/wiki/Help:Database_download) as of February 18, 2025. It was created by processing the "Current pages" database dump of each available Fandom.com wiki. These dumps contain only the current version of each page, without edit history, and include article text, metadata, and structural information across multiple languages.
### Languages
The dataset is multilingual, covering [40+ languages](https://community.fandom.com/wiki/Help:Language_codes).
## Dataset Structure
### Data Fields
This dataset includes the following fields:
- `id`: Unique identifier for the article (string)
- `title`: Title of the article (string)
- `text`: Main content of the article (string)
- `metadata`: Dictionary containing:
  - `templates`: List of templates used in the article
  - `categories`: List of categories the article belongs to
  - `wikilinks`: List of internal wiki links and their text
  - `external_links`: List of external links
  - `sections`: List of section titles and their levels
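As a rough illustration of the record layout described above, the sketch below counts the entries in each `metadata` list for a single record. The helper name `summarize_record`, the `text_chars` key, and the example record are hypothetical and not part of the dataset; the card does not say whether metadata lists can be missing or null, so treating them as empty is an assumption.

```python
from typing import Any, Dict

# Metadata keys as listed in the Data Fields section.
METADATA_KEYS = ("templates", "categories", "wikilinks", "external_links", "sections")


def summarize_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return id, title, text length, and the number of entries in each metadata list."""
    metadata = record.get("metadata")
    if metadata is None:
        metadata = {}
    summary: Dict[str, Any] = {
        "id": record["id"],
        "title": record["title"],
        "text_chars": len(record["text"]),
    }
    for key in METADATA_KEYS:
        values = metadata.get(key)
        # Assumption: a missing or null list is counted as empty.
        summary[key] = 0 if values is None else len(values)
    return summary


# Hypothetical record for illustration only; the values are not taken from the dataset.
example = {
    "id": "1",
    "title": "Example page",
    "text": "Example text.",
    "metadata": {
        "templates": ["Infobox"],
        "categories": ["Examples"],
        "wikilinks": [],
        "external_links": [],
        "sections": [],
    },
}

summary = summarize_record(example)
```

Only the lengths of the lists are inspected, so the sketch makes no assumption about how individual `wikilinks` or `sections` entries are encoded.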
### Data Splits
All examples are in a single split.
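Because the single split is stored as 25 Parquet shards (`fandom_01.parquet` through `fandom_25.parquet`), one way to read it is to walk the shards in order. The following is a minimal sketch, assuming the files have been downloaded into one local directory and that `pyarrow` is installed; the helper names `shard_names` and `iter_records` are hypothetical.

```python
from typing import Any, Dict, Iterator, List


def shard_names(count: int = 25) -> List[str]:
    """Build the shard file names fandom_01.parquet ... fandom_25.parquet."""
    return [f"fandom_{i:02d}.parquet" for i in range(1, count + 1)]


def iter_records(directory: str, batch_size: int = 1000) -> Iterator[Dict[str, Any]]:
    """Yield one record (as a dict) at a time from every shard in `directory`."""
    import os

    import pyarrow.parquet as pq  # assumption: pyarrow is available locally

    for name in shard_names():
        parquet_file = pq.ParquetFile(os.path.join(directory, name))
        # Read in batches rather than loading a whole shard at once.
        for batch in parquet_file.iter_batches(batch_size=batch_size):
            yield from batch.to_pylist()
```

Iterating in batches keeps memory use bounded, which matters here since each shard is roughly 250 MB on disk.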
## Additional Information
### License
This dataset inherits the licenses from the source Fandom communities, which use Creative Commons Attribution-ShareAlike 3.0 (CC-BY-SA 3.0).
To learn more about CC-BY-SA 3.0, visit: https://creativecommons.org/licenses/by-sa/3.0/},
keywords= {},
terms= {},
license= {CC-BY-SA 3.0},
superseded= {}
}