main (16 files)
data/train-00000-of-00015.parquet | 201.65MB
data/train-00001-of-00015.parquet | 207.60MB
data/train-00002-of-00015.parquet | 230.08MB
data/train-00003-of-00015.parquet | 216.15MB
data/train-00004-of-00015.parquet | 201.78MB
data/train-00005-of-00015.parquet | 234.70MB
data/train-00006-of-00015.parquet | 257.54MB
data/train-00007-of-00015.parquet | 233.57MB
data/train-00008-of-00015.parquet | 229.98MB
data/train-00009-of-00015.parquet | 205.83MB
data/train-00010-of-00015.parquet | 203.11MB
data/train-00011-of-00015.parquet | 193.10MB
data/train-00012-of-00015.parquet | 208.23MB
data/train-00013-of-00015.parquet | 191.64MB
data/train-00014-of-00015.parquet | 38.32MB
README.md | 1.54kB
Type: Dataset
Metadata:
@article{,
title= {PPT Online},
journal= {},
author= {nyuuzyou},
year= {},
url= {https://huggingface.co/datasets/nyuuzyou/pptonline},
abstract= {### Dataset Summary
This dataset contains metadata about 1,418,349 PowerPoint (.ppt) files hosted on the ppt-online.org platform. PPT Online is a service designed to display PowerPoint presentations. The dataset includes information such as presentation titles, categories, file sizes, and content snippets. The majority of the presentations are in Russian, Ukrainian, Belarusian, Kazakh, and English, but other languages are also present.
### Languages
The dataset is multilingual, with the primary languages being Russian, Ukrainian, Belarusian, Kazakh, and English. However, presentations in other languages are also included.
## Dataset Structure
### Data Fields
This dataset includes the following fields:
- `id`: Unique identifier for the presentation (integer)
- `title`: Title of the PowerPoint presentation (string)
- `category`: Category or topic of the presentation (string)
- `file_size`: Size of the PowerPoint file (string)
- `body_content`: Snippet or summary of the presentation content, generated automatically by a service and of fairly low quality (string)
### Data Splits
All examples are in a single split.},
keywords= {},
terms= {},
license= {},
superseded= {}
}
Citation:
nyuuzyou. (2026). PPT Online [Data set]. Academic Torrents. https://academictorrents.com/details/9a63af9f7305cbf9f060f1e4080ef5d703a3a4f5
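
The shard naming in the file listing and the field list under "Data Fields" can be sketched in a few lines of Python. This is a minimal illustration, not part of the dataset's tooling: the helper names `shard_paths` and `schema_problems` are hypothetical, the mapping of the documented field types to Python types is an assumption, and treating `None` as an acceptable value is also an assumption about how missing values might appear.

```python
from typing import Any

# Field names and types as listed under "Data Fields". Mapping them to
# Python types is an assumption (file_size is documented as a string).
EXPECTED_FIELDS: dict[str, type] = {
    "id": int,
    "title": str,
    "category": str,
    "file_size": str,
    "body_content": str,
}

NUM_SHARDS = 15  # from the file listing: train-00000 through train-00014


def shard_paths(num_shards: int = NUM_SHARDS) -> list[str]:
    """Return relative shard paths following the listing's naming scheme."""
    return [
        f"data/train-{i:05d}-of-{num_shards:05d}.parquet"
        for i in range(num_shards)
    ]


def schema_problems(record: dict[str, Any]) -> list[str]:
    """Return human-readable problems found in one record (empty if none)."""
    problems = []
    for name, expected in EXPECTED_FIELDS.items():
        if name not in record:
            problems.append(f"missing field: {name}")
            continue
        value = record[name]
        # Assumption: None stands for a missing value and is not flagged.
        if value is not None and not isinstance(value, expected):
            problems.append(
                f"{name}: expected {expected.__name__}, "
                f"got {type(value).__name__}"
            )
    return problems
```

Each path returned by `shard_paths()` could then be passed to a Parquet reader of your choice, with `schema_problems` applied to rows converted to dictionaries.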