9111.ru Questions Dataset
nyuuzyou

main (4 files):
- README.md (1.49 kB)
- questions-00000-of-00003.parquet (974.38 MB)
- questions-00001-of-00003.parquet (852.82 MB)
- questions-00002-of-00003.parquet (1.11 GB)
Type: Dataset

Bibtex:
@article{,
title= {9111.ru Questions Dataset},
journal= {},
author= {nyuuzyou},
year= {},
url= {https://huggingface.co/datasets/nyuuzyou/9111-questions},
abstract= {# Dataset Card for 9111.ru Questions

### Dataset Summary

This dataset contains legal questions and answers from the Russian legal forum [9111.ru](https://9111.ru): inquiries from users paired with the corresponding responses from lawyers. It was created by processing around 21 million questions, yielding a large corpus of legal discussions.

### Languages

The dataset is mostly in Russian, but there may be other languages present.

## Dataset Structure

### Data Fields

This dataset includes the following fields:

- `id`: Identifier for the item (integer)
- `title`: Title of the question (string)
- `description`: Description of the question (string)
- `answers`: An array of answer objects (array)
  - `user_name`: Name of the user who answered (string)
  - `status`: Status of the user (string)
  - `rating`: Rating of the user (integer)
  - `text`: Text of the answer (string)
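
The field list above can be mirrored as typed records. The following is a minimal sketch, not part of the dataset tooling: the class names `Answer` and `Question`, the helper `top_rated_answer`, and the sample record are all hypothetical, and the sketch assumes every `rating` is a non-null integer.

```python
from typing import List, Optional, TypedDict


class Answer(TypedDict):
    user_name: str  # name of the user who answered
    status: str     # status of the user
    rating: int     # rating of the user (assumed non-null here)
    text: str       # text of the answer


class Question(TypedDict):
    id: int
    title: str
    description: str
    answers: List[Answer]


def top_rated_answer(question: Question) -> Optional[Answer]:
    """Return the answer written by the highest-rated user, or None if there are no answers."""
    answers = question["answers"]
    if not answers:
        return None
    return max(answers, key=lambda answer: answer["rating"])


# Made-up record for illustration only; it is not taken from the dataset.
example = Question(
    id=1,
    title="Example title",
    description="Example description",
    answers=[
        Answer(user_name="A", status="lawyer", rating=10, text="First"),
        Answer(user_name="B", status="lawyer", rating=25, text="Second"),
    ],
)
```

Note that `rating` describes the answering user rather than the individual answer, so this helper ranks answers by author reputation only.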
  
### Data Format

The dataset is stored as Apache Parquet files with zstd compression (level 19), split into 3 shards:

- `questions-00000-of-00003.parquet`
- `questions-00001-of-00003.parquet`
- `questions-00002-of-00003.parquet`
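
Because each shard is around 1 GB, it may be preferable to stream records in batches rather than load a whole file at once. The sketch below assumes the third-party `pyarrow` package is installed and that the shards sit in a local directory; the function names `shard_names` and `iter_questions` and the `data_dir` parameter are hypothetical.

```python
from pathlib import Path
from typing import Iterator, List

NUM_SHARDS = 3  # taken from the shard list above


def shard_names(num_shards: int = NUM_SHARDS) -> List[str]:
    """Build shard file names following the questions-XXXXX-of-XXXXX.parquet pattern."""
    return ["questions-%05d-of-%05d.parquet" % (i, num_shards) for i in range(num_shards)]


def iter_questions(data_dir: str, batch_size: int = 1024) -> Iterator[dict]:
    """Yield question records one at a time, reading each shard in batches."""
    # Third-party dependency, imported lazily so shard_names() works without it.
    import pyarrow.parquet as pq

    for name in shard_names():
        parquet_file = pq.ParquetFile(Path(data_dir) / name)
        for batch in parquet_file.iter_batches(batch_size=batch_size):
            # Each row becomes a plain dict keyed by field name.
            yield from batch.to_pylist()
```

A call such as `next(iter_questions("path/to/shards"))` would then return the first record as a dictionary, where the path is a placeholder for wherever the files were downloaded.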

### Data Splits

All examples are in the train split; there is no validation split.},
keywords= {},
terms= {},
license= {},
superseded= {}
}

