main (4 files)
README.md | 1.49kB
questions-00002-of-00003.parquet | 1.11GB
questions-00001-of-00003.parquet | 852.82MB
questions-00000-of-00003.parquet | 974.38MB
Type: Dataset
Bibtex:
@article{,
title= {9111.ru Questions Dataset},
journal= {},
author= {nyuuzyou},
year= {},
url= {https://huggingface.co/datasets/nyuuzyou/9111-questions},
abstract= {# Dataset Card for 9111.ru Questions
### Dataset Summary
This dataset includes legal questions and answers from the Russian law forum [9111.ru](https://9111.ru). It contains inquiries from users and corresponding responses from lawyers. The dataset was created by processing around 21 million questions, providing a significant corpus of legal discussions.
### Languages
The dataset is mostly in Russian, but there may be other languages present.
## Dataset Structure
### Data Fields
This dataset includes the following fields:
- `id`: Identifier for the item (integer)
- `title`: Title of the question (string)
- `description`: Description of the question (string)
- `answers`: An array of answer objects (array)
  - `user_name`: Name of the user who answered (string)
  - `status`: Status of the user (string)
  - `rating`: Rating of the user (integer)
  - `text`: Text of the answer (string)
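
To make the nesting concrete, here is a minimal sketch of how one record might look as a Python dictionary, with a small helper that picks the answer written by the highest-rated user. The sample values (names, statuses, ratings, texts) and the helper name `top_rated_answer` are invented for illustration and are not taken from the dataset.

```python
from typing import Optional

# Made-up sample record that follows the field list above.
# All values are illustrative assumptions, not real dataset content.
record = {
    "id": 1,
    "title": "Example title",
    "description": "Example description",
    "answers": [
        {"user_name": "lawyer_a", "status": "lawyer", "rating": 10, "text": "First answer"},
        {"user_name": "lawyer_b", "status": "lawyer", "rating": 25, "text": "Second answer"},
    ],
}


def top_rated_answer(rec: dict) -> Optional[dict]:
    """Return the answer whose author has the highest rating, or None if there are no answers."""
    answers = rec.get("answers") or []
    # Treat a missing or null rating as 0 so the comparison never fails.
    return max(answers, key=lambda a: a.get("rating") or 0, default=None)


best = top_rated_answer(record)
```

Note that `rating` describes the answering user, not the individual answer, so this helper ranks answers by author rating only.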
### Data Format
The dataset is stored as Apache Parquet files with zstd compression (level 19), split into 3 shards:
- `questions-00000-of-00003.parquet`
- `questions-00001-of-00003.parquet`
- `questions-00002-of-00003.parquet`
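
Because each shard is close to 1 GB, reading it in batches may be preferable to loading it whole. The sketch below assumes the shards have been downloaded to a local directory and that the `pyarrow` package is installed; the helper names `shard_names` and `iter_questions` and the batch size are hypothetical choices, not part of the dataset.

```python
from pathlib import Path
from typing import Iterator, List


def shard_names(total: int = 3) -> List[str]:
    """Build the shard file names following the pattern listed above."""
    return [f"questions-{i:05d}-of-{total:05d}.parquet" for i in range(total)]


def iter_questions(data_dir: str, batch_size: int = 1000) -> Iterator[dict]:
    """Yield one question record at a time from every local shard.

    Assumes pyarrow is installed and the shards sit directly in data_dir.
    """
    import pyarrow.parquet as pq  # imported lazily so shard_names works without pyarrow

    for name in shard_names():
        parquet_file = pq.ParquetFile(Path(data_dir) / name)
        # iter_batches streams row groups instead of materializing the whole file.
        for batch in parquet_file.iter_batches(batch_size=batch_size):
            for row in batch.to_pylist():
                yield row
```

A typical use would be `for question in iter_questions("path/to/shards"): ...`, where the path is a placeholder for wherever the files were saved.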
### Data Splits
All examples are in the train split; there is no validation split.},
keywords= {},
terms= {},
license= {},
superseded= {}
}