<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:academictorrents="https://academictorrents.com" version="2.0">
<channel>
<title>Academic Torrents</title>
<description>Recent Torrents</description>
<link>https://academictorrents.com/</link>
<item>
<title>Wikipedia European languages 2026-06-01</title>
<category>Dataset</category>
<infohash>04c55531c1617744bd76a3b126a19c4ba48cb2a2</infohash>
<guid>https://academictorrents.com/details/04c55531c1617744bd76a3b126a19c4ba48cb2a2</guid>
<link>https://academictorrents.com/details/04c55531c1617744bd76a3b126a19c4ba48cb2a2</link>
<description>Wikipedia database dumps of European language wikis of 10k articles or more. enwiki excluded. Wikipedia Multistream 2026-06-01. These 67 languages are included: Albanian, Alemannic, Aragonese, Asturian, Basque, Bavarian, Belarusian, Bosnian, Breton, Bulgarian, Catalan, Croatian, Czech, Danish, Dutch, Emilian-Romagnol, Esperanto, Estonian, Faroese, Finnish, French, Galician, German, Greek, Hungarian, Icelandic, Irish, Italian, Ladin, Latin, Latvian, Ligurian, Limburgish, Lithuanian, Lombard, Low German, Macedonian, Maltese, Neapolitan, North Frisian, Norwegian, Nynorsk, Occitan, Piedmontese, Polish, Portuguese, Romanian, Romansh, Rusyn, Samogitian, Scots, Scottish Gaelic, Serbian, Serbo-Croatian, Sicilian, Silesian, Slovak, Slovenian, Spanish, Swedish, Ukrainian, Upper Sorbian, Venetian, Walloon, Welsh, West Frisian, Yiddish.</description>
<size>53381860475</size>
</item><item>
<title>Reddit comments/submissions 2026-05</title>
<category>Dataset</category>
<infohash>55199eff9368cde1f5c1262dd7c1af09f7503ea5</infohash>
<guid>https://academictorrents.com/details/55199eff9368cde1f5c1262dd7c1af09f7503ea5</guid>
<link>https://academictorrents.com/details/55199eff9368cde1f5c1262dd7c1af09f7503ea5</link>
<description>Reddit comments and submissions from 2026-05. Documentation, JSON schemas and more can be found at https://github.com/ArthurHeitmann/arctic_shift Helper scripts for processing files can be found at https://github.com/Watchful1/PushshiftDumps</description>
<size>71668069069</size>
</item><item>
<title>Crossref Event Data Archive</title>
<category>Dataset</category>
<infohash>16396475b640d8487a6b723eada8a440fb33d3ce</infohash>
<guid>https://academictorrents.com/details/16396475b640d8487a6b723eada8a440fb33d3ce</guid>
<link>https://academictorrents.com/details/16396475b640d8487a6b723eada8a440fb33d3ce</link>
<description># Crossref Event Data archive

This is an archive of all events collected by selected Crossref Event Data agents between its launch on 2017/02/17 and its deprecation on 2026/04/23. The DOI of this dataset is https://doi.org/10.13003/wjyr-rv9j

## File format

The data are provided in [JSONL](https://jsonlines.org/) format. Each data file has a .jsonl file extension and contains up to 5000 entries (lines). Filenames follow this pattern: agent-nnnn.jsonl

## Structure of the archive

The data are hierarchically grouped in directories by agent, year, month, day and a 4-digit directory ID. For example:

    $ tree -L 6 data | head -n 18
    data
    ├── crossref
    │   ├── 2021
    │   │   ├── 01
    │   │   │   └── 01
    │   │   │       └── 0000
    │   │   │           └── crossref-0001.jsonl
    │   │   ├── 04
    │   │   │   ├── 28
    │   │   │   │   └── 0000
    │   │   │   │       └── crossref-0001.jsonl
    │   │   │   ├── 29
    │   │   │   │   └── 0000
    │   │   │   │       ├── crossref-0001.jsonl
    │   │   │   │       └── crossref-0002.jsonl
    │   │   │   └── 30
    │   │   │       └── 0000
    │   │   │           └── crossref-0001.jsonl

Each daily directory contains one or more 4-digit directories containing JSONL data files. There can be at most 1000 files within each 4-digit directory.
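
The directory layout described above can be walked with a short script. The following is a minimal sketch, not part of the archive's own tooling: the function name iter_events and its parameters are hypothetical, and it assumes the archive has been unpacked so that agent directories sit directly under the given root, with UTF-8 encoded files.

```python
import json
from pathlib import Path
from typing import Iterator, Optional


def iter_events(root: str, agent: Optional[str] = None) -> Iterator[dict]:
    """Yield one parsed event per non-empty JSONL line under root.

    Assumed layout (taken from the description above):
    ROOT/AGENT/YEAR/MONTH/DAY/DIR_ID/AGENT-nnnn.jsonl
    """
    agent_glob = agent if agent is not None else "*"
    # Five directory levels (agent, year, month, day, id), then the data file.
    pattern = f"{agent_glob}/*/*/*/*/*.jsonl"
    for path in sorted(Path(root).glob(pattern)):
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if line:  # tolerate blank lines between entries
                    yield json.loads(line)
```

For example, iter_events("data", agent="crossref") would stream only the events captured by the crossref agent. Because it is a generator, files are read one line at a time rather than loaded into memory at once.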
## Agents

The export contains data for the following agents:

| Agent name | Description |
| --- | --- |
| crossref | Relationships and references to datasets and DOI registration agencies other than Crossref (e.g. DataCite) |
| f1000 | Recommendations of research publications |
| facultyopinions | Recommendations of research publications (formerly F1000) |
| hypothesis | Annotations in Hypothes.is |
| newsfeed | Discussed in blogs and media |
| reddit | Discussed on Reddit |
| reddit-links | Discussed on sites linked to in subreddits |
| stackexchange | Discussed on StackExchange sites |
| web | Discussed on selected webpages |
| wikipedia | References on Wikipedia pages |
| wordpressdotcom | Discussed on WordPress.com sites |

## Event data structure

The main purpose of each event is to capture a relationship between a subject and an object, a triplet of [subject, relationship, object]. The event data structure has a few properties but the most important ones are:

| Property | Description |
| --- | --- |
| license | The license per event. Different agents may have a different license and that may also change over time. It is best to consider the license per event. |
| obj_id | The id of the object |
| subj_id | The id of the subject |
| occurred_at | When was the event observed? |
| id | A unique id, can be helpful when processing the archive |
| subj:pid | Same as subj_id |
| obj:pid | Same as obj_id |
| subj:url | The source of the event |
| obj:url | Same as obj:pid |
| subj:title | Some agents (for example Wikipedia) capture a title for the subject's url |
| subj/obj:work_type_id | The type of the identified pid |
| source_id | The agent that captured this event |
| relation_type_id | The relation type of the relationship between the subject and the object |

One thing to note: Depending on the agent the subj:url may or may not be equal to the subj:pid. In any case the subj:url should be treated as an independent value.

An example of a Crossref agent event:

    {
      "license": "https://creativecommons.org/publicdomain/zero/1.0/",
      "obj_id": "https://doi.org/10.14383/cri.2017.12.2.149",
      "source_token": "36c35e23-8757-4a9d-aacf-345e9b7eb50d",
      "occurred_at": "2025-01-01T11:10:07.000Z",
      "subj_id": "https://doi.org/10.3390/vetsci11020090",
      "id": "9d5be3e3-3141-4965-836a-8831178726b2",
      "action": "add",
      "subj": {
        "pid": "https://doi.org/10.3390/vetsci11020090",
        "url": "https://doi.org/10.3390/vetsci11020090",
        "work_type_id": "journal-article"
      },
      "source_id": "crossref",
      "obj": {
        "pid": "https://doi.org/10.14383/cri.2017.12.2.149",
        "url": "https://doi.org/10.14383/cri.2017.12.2.149",
        "method": "doi-literal",
        "verification": "literal"
      },
      "relation_type_id": "references"
    }

An example of a Wikipedia agent event:

    {
      "license": "https://creativecommons.org/publicdomain/zero/1.0/",
      "obj_id": "https://doi.org/10.2307/2128863",
      "source_token": "36c35e23-8757-4a9d-aacf-345e9b7eb50d",
      "occurred_at": "2025-01-01T17:02:25Z",
      "subj_id": "https://en.wikipedia.org/api/rest_v1/page/html/Mohammad_Reza_Pahlavi/1266654138",
      "id": "6618b137-da03-4f15-ac86-ca3283c28cb6",
      "action": "add",
      "subj": {
        "pid": "https://en.wikipedia.org/wiki/Mohammad_Reza_Pahlavi"</description>
<size>107590238836</size>
</item><item>
<title>GTDB R09-RS220</title>
<category>Dataset</category>
<infohash>d4056fe87d24aaed9d366453f17abb08f7c4c62d</infohash>
<guid>https://academictorrents.com/details/d4056fe87d24aaed9d366453f17abb08f7c4c62d</guid>
<link>https://academictorrents.com/details/d4056fe87d24aaed9d366453f17abb08f7c4c62d</link>
<description>Release 09-RS220 (24th April 2024) of the Genome Taxonomy Database (GTDB), an initiative to establish a standardised microbial taxonomy based on genome phylogeny.</description>
<size>410496532480</size>
</item><item>
<title>GTDB R07-RS207</title>
<category>Dataset</category>
<infohash>13e25c59c31920abce5599a071991a8d8ca94e89</infohash>
<guid>https://academictorrents.com/details/13e25c59c31920abce5599a071991a8d8ca94e89</guid>
<link>https://academictorrents.com/details/13e25c59c31920abce5599a071991a8d8ca94e89</link>
<description>Release 07-RS207 (8th April 2022) of the Genome Taxonomy Database (GTDB), an initiative to establish a standardised microbial taxonomy based on genome phylogeny.</description>
<size>317416538112</size>
</item><item>
<title>GTDB R06-RS202</title>
<category>Dataset</category>
<infohash>4ad45a3bcd78a36f700530060ae7839638b09840</infohash>
<guid>https://academictorrents.com/details/4ad45a3bcd78a36f700530060ae7839638b09840</guid>
<link>https://academictorrents.com/details/4ad45a3bcd78a36f700530060ae7839638b09840</link>
<description>Release 06-RS202 (27th April 2021) of the Genome Taxonomy Database (GTDB), an initiative to establish a standardised microbial taxonomy based on genome phylogeny.</description>
<size>191990071296</size>
</item><item>
<title>GTDB R04-RS89</title>
<category>Dataset</category>
<infohash>fe8c256fd07464c365c2403b6f34be3ef510aae1</infohash>
<guid>https://academictorrents.com/details/fe8c256fd07464c365c2403b6f34be3ef510aae1</guid>
<link>https://academictorrents.com/details/fe8c256fd07464c365c2403b6f34be3ef510aae1</link>
<description>Release 04-RS89 (19th June 2019) of the Genome Taxonomy Database (GTDB), an initiative to establish a standardised microbial taxonomy based on genome phylogeny.</description>
<size>103494451200</size>
</item><item>
<title>GTDB R03-RS86.2</title>
<category>Dataset</category>
<infohash>937539d2e0c9d774c63625af5106c0e6d4bc5a7a</infohash>
<guid>https://academictorrents.com/details/937539d2e0c9d774c63625af5106c0e6d4bc5a7a</guid>
<link>https://academictorrents.com/details/937539d2e0c9d774c63625af5106c0e6d4bc5a7a</link>
<description>Release 03-RS86.2 (15th January 2019) of the Genome Taxonomy Database (GTDB), an initiative to establish a standardised microbial taxonomy based on genome phylogeny.</description>
<size>27277656064</size>
</item><item>
<title>GTDB R05-RS95</title>
<category>Dataset</category>
<infohash>4793809ff9e07b0217baed7a4fe6980a0444ac20</infohash>
<guid>https://academictorrents.com/details/4793809ff9e07b0217baed7a4fe6980a0444ac20</guid>
<link>https://academictorrents.com/details/4793809ff9e07b0217baed7a4fe6980a0444ac20</link>
<description>Release 05-RS95 (17th July 2020) of the Genome Taxonomy Database (GTDB), an initiative to establish a standardised microbial taxonomy based on genome phylogeny.</description>
<size>132191879168</size>
</item><item>
<title>GTDB R03-RS86</title>
<category>Dataset</category>
<infohash>7ced89299d243e10b37664a5ee1799444eee8c5b</infohash>
<guid>https://academictorrents.com/details/7ced89299d243e10b37664a5ee1799444eee8c5b</guid>
<link>https://academictorrents.com/details/7ced89299d243e10b37664a5ee1799444eee8c5b</link>
<description>Release 03-RS86 (19th August 2018) of the Genome Taxonomy Database (GTDB), an initiative to establish a standardised microbial taxonomy based on genome phylogeny.</description>
<size>29144121344</size>
</item><item>
<title>GTDB R01-RS80</title>
<category>Dataset</category>
<infohash>8020b6f824e38c9937d669c52cd7a577dca63f65</infohash>
<guid>https://academictorrents.com/details/8020b6f824e38c9937d669c52cd7a577dca63f65</guid>
<link>https://academictorrents.com/details/8020b6f824e38c9937d669c52cd7a577dca63f65</link>
<description>Release 01-RS80 (1st November 2017) of the Genome Taxonomy Database (GTDB), an initiative to establish a standardised microbial taxonomy based on genome phylogeny.</description>
<size>3929014272</size>
</item><item>
<title>GTDB R02-RS83</title>
<category>Dataset</category>
<infohash>3b2c4748e377147559c0a9fc6b2e30c40aed166f</infohash>
<guid>https://academictorrents.com/details/3b2c4748e377147559c0a9fc6b2e30c40aed166f</guid>
<link>https://academictorrents.com/details/3b2c4748e377147559c0a9fc6b2e30c40aed166f</link>
<description>Release 02-RS83 (8th March 2018) of the Genome Taxonomy Database (GTDB), an initiative to establish a standardised microbial taxonomy based on genome phylogeny.</description>
<size>28099739648</size>
</item><item>
<title>GTDB R11-RS232</title>
<category>Dataset</category>
<infohash>642f971d1229cbf0cbe1de5903c969fb82cc4365</infohash>
<guid>https://academictorrents.com/details/642f971d1229cbf0cbe1de5903c969fb82cc4365</guid>
<link>https://academictorrents.com/details/642f971d1229cbf0cbe1de5903c969fb82cc4365</link>
<description>Release 11-RS232 (15th April 2026) of the Genome Taxonomy Database (GTDB), an initiative to establish a standardised microbial taxonomy based on genome phylogeny.</description>
<size>611596632064</size>
</item><item>
<title>enwiki-20260601-pages-articles-multistream.xml.bz2</title>
<category>Dataset</category>
<infohash>bac5df1f39fd83fc87826a8dc546e56db34f2322</infohash>
<guid>https://academictorrents.com/details/bac5df1f39fd83fc87826a8dc546e56db34f2322</guid>
<link>https://academictorrents.com/details/bac5df1f39fd83fc87826a8dc546e56db34f2322</link>
<description>English Wikipedia Multistream 2026-06-01 https://en.wikipedia.org/wiki/Wikipedia:Database_download Corresponding index file: https://academictorrents.com/details/c1236c4d35b6d2adcba502e3271d6a3c5261b1ab</description>
<size>26437250146</size>
</item><item>
<title>enwiki-20260601-pages-articles-multistream-index.txt.bz2</title>
<category>Dataset</category>
<infohash>c1236c4d35b6d2adcba502e3271d6a3c5261b1ab</infohash>
<guid>https://academictorrents.com/details/c1236c4d35b6d2adcba502e3271d6a3c5261b1ab</guid>
<link>https://academictorrents.com/details/c1236c4d35b6d2adcba502e3271d6a3c5261b1ab</link>
<description>English Wikipedia Multistream Index 2026-06-01 https://en.wikipedia.org/wiki/Wikipedia:Database_download Corresponding multistream file: https://academictorrents.com/details/bac5df1f39fd83fc87826a8dc546e56db34f2322</description>
<size>281979710</size>
</item><item>
<title>Places in the Wild: Ecologically-sampled RAW photographs</title>
<category>Dataset</category>
<infohash>a1d810fbb54c21dfe8a68538d917c668e48663de</infohash>
<guid>https://academictorrents.com/details/a1d810fbb54c21dfe8a68538d917c668e48663de</guid>
<link>https://academictorrents.com/details/a1d810fbb54c21dfe8a68538d917c668e48663de</link>
<description>Places in the Wild comprises over 67,000 RAW-format images, each captured with a 45-megapixel Canon EOS R5 full-frame mirrorless camera at 5-degree intervals, providing 360-degree coverage across over 800 unique locations. These locations span 260 basic-level scene categories, including both indoor and outdoor environments such as bedrooms, train stations, forests, and parking garages.</description>
<size>2618655068160</size>
</item><item>
<title>Edus2 Ultrasounds</title>
<category>Dataset</category>
<infohash>e59a4244be98b0123c47f4205c94c95123318935</infohash>
<guid>https://academictorrents.com/details/e59a4244be98b0123c47f4205c94c95123318935</guid>
<link>https://academictorrents.com/details/e59a4244be98b0123c47f4205c94c95123318935</link>
<description>Ultrasound Videos Database: Collection of 32 medical ultrasound video files for simulations, case discussions, and training. Includes cardiac normal, tamponade, FAST exams (RUQ free fluid), AAA, and Edus2 open-source set. Free for non-commercial educational use. This license applies to all video in this directory. Copyright 2011,2012 Paul Kulyk and Paul Olszynski All videos made available under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. http://creativecommons.org/licenses/by-nc-sa/3.0/</description>
<size>121597065</size>
</item><item>
<title>Reddit comments/submissions 2026-04</title>
<category>Dataset</category>
<infohash>85d017ddd06920534187e7d45f21c7cec90c9bca</infohash>
<guid>https://academictorrents.com/details/85d017ddd06920534187e7d45f21c7cec90c9bca</guid>
<link>https://academictorrents.com/details/85d017ddd06920534187e7d45f21c7cec90c9bca</link>
<description>Reddit comments and submissions from 2026-04. Documentation, JSON schemas and more can be found at https://github.com/ArthurHeitmann/arctic_shift Helper scripts for processing files can be found at https://github.com/Watchful1/PushshiftDumps</description>
<size>68544608390</size>
</item><item>
<title>Wikipedia Asian languages 2026-05-01</title>
<category>Dataset</category>
<infohash>0d6c1cb68beb572c88f302048be4f2917d226168</infohash>
<guid>https://academictorrents.com/details/0d6c1cb68beb572c88f302048be4f2917d226168</guid>
<link>https://academictorrents.com/details/0d6c1cb68beb572c88f302048be4f2917d226168</link>
<description>Wikipedia database dumps of Asian language wikis of 10k articles or more. Wikipedia Multistream 2026-05-01. These 85 languages are included: Acehnese, Armenian, Assamese, Azerbaijani, Balinese, Bangla, Banjar, Banyumasan, Bashkir, Bishnupriya, Buginese, Burmese, Cantonese, Cebuano, Central Bikol, Central Kurdish, Chechen, Chinese, Chuvash, Classical Chinese, Dimli, Eastern Mari, Georgian, Gilaki, Gorontalo, Gujarati, Hakka, Hebrew, Hindi, Iloko, Indonesian, Japanese, Javanese, Kannada, Kara-Kalpak, Kazakh, Khmer, Korean, Kurdish, Kyrgyz, Maithili, Malay, Malayalam, Manipuri, Marathi, Mazanderani, Minangkabau, Mindong, Mingrelian, Minnan, Mongolian, Nepali, Newari, Odia, Ossetic, Pampangan, Pashto, Persian, Punjabi, Russian, Sanskrit, Santali, Saraiki, Shan, Sindhi, Sinhala, South Azerbaijani, Sundanese, Tagalog, Tajik, Talysh, Tamil, Tatar, Telugu, Thai, Turkish, Urdu, Uzbek, Vietnamese, Waray, Western Armenian, Western Mari, Western Punjabi, Wu, Yakut.</description>
<size>31820494896</size>
</item><item>
<title>Stack Exchange Data Dump (2026-03-31)</title>
<category>Dataset</category>
<infohash>95d3cd024872ccb240b867afb7ba4ea275a9d7a8</infohash>
<guid>https://academictorrents.com/details/95d3cd024872ccb240b867afb7ba4ea275a9d7a8</guid>
<link>https://academictorrents.com/details/95d3cd024872ccb240b867afb7ba4ea275a9d7a8</link>
<description>This data dump is sourced from the various sites in the Stack Exchange network of Q&amp;A sites. It contains data up to and including 2026-03-31. The exact license for each piece of content is embedded in each entry. For license date ranges, see the root-level license.txt, or https://stackoverflow.com/help/licensing. For the schema, see the sede-and-data-dump-schema.md file within each .7z archive. This torrent has also been archived at https://archive.org/details/stackexchange_20260331</description>
<size>98446659547</size>
</item><item>
<title>IUGC: A benchmark of landmark detection in end-to-end intrapartum ultrasound biometry</title>
<category>Dataset</category>
<infohash>f7b5259dfeadf9869d919276006f53c1969c74cd</infohash>
<guid>https://academictorrents.com/details/f7b5259dfeadf9869d919276006f53c1969c74cd</guid>
<link>https://academictorrents.com/details/f7b5259dfeadf9869d919276006f53c1969c74cd</link>
<description>In 2018, the World Health Organization (WHO) published 56 recommendations to improve the quality of intrapartum care and enhance women’s childbirth experiences. In response, the WHO developed the Labour Care Guide (LCG) in 2020, a next-generation tool designed to promote evidence-based, respectful, and woman-centered care during labor and delivery. The LCG was created through expert consultations, primary research with maternity healthcare providers, and usability studies across multiple countries. It serves as a practical tool for monitoring labor progress and maternal and fetal well-being by recording key clinical parameters. When deviations from normal labor progression are detected, the LCG highlights these issues, prompting timely interventions to ensure safe and effective care. Intrapartum ultrasound for labor progression analysis is a crucial examination in labor management. The core operation in this analysis is the identification of landmarks from intrapartum ultrasound images. These landmarks serve as the basis for subsequent qualitative evaluations of angles and distances, which offer valuable diagnostic information regarding labor arrest and influence decisions about the timing and type of intervention. However, obtaining reliable landmark annotations typically demands experienced physicians, and even for proficient obstetricians, manual landmark identification is a time-consuming and labor-intensive endeavor. Consequently, the development of fully automatic and precise landmark localization techniques has been an area of significant and persistent need. 
The Intrapartum Ultrasound Grand Challenge (IUGC) 2025 is a collaborative initiative involving the "Deep Learning in Intrapartum Ultrasound Image Analysis" cooperative group and prominent clinical societies such as the International Society of Ultrasound in Obstetrics &amp; Gynecology (ISUOG), the World Association of Perinatal Medicine (WAPM), the Perinatal Medicine Foundation (PMF), and the National Institute for Health and Care Excellence (NICE). The objective of this partnership is to formulate and promote clinically relevant challenges, thereby maximizing the potential clinical impact of innovative algorithmic contributions from participating teams. Since its inception at MICCAI 2023, the IUGC has advanced the Pubic Symphysis - Fetal Head Segmentation (PSFHS) by facilitating and benchmarking algorithmic progress and providing high-quality annotated image datasets. In MICCAI 2024, the IUGC expanded to incorporate multiple benchmarks: (1) The analysis objects were extended from images to videos; (2) The tasks were augmented from image segmentation to classification, segmentation, and biometric parameter measurement; (3) The quantitative parameters were increased from one (i.e., Angle of Progression (AOP)) to two (i.e., AOP and head-symphysis distance (HSD)); and (4) The data sources were broadened from being solely from Asia to include Asia, Europe, and Africa. This novel and inventive design has established a benchmarking ecosystem for the systematic comparison of algorithms across diverse tasks and clinical challenges. 
The significance of the IUGC 2025 lies in its concentration on addressing the actual clinical assessment of labor progress, covering (1) end-to-end measurements (which are currently indirect measurements based on segmentation results); (2) all fetal descent stations during the childbirth process (comprising five “minus”, one “zero”, and three “plus” stations for reliable longitudinal assessment of labor progress); (3) computational tasks (such as regression, detection); and (4) learning methods (semi-supervised, weakly-supervised, and barely-supervised learning). In line with the IUGC's goal of addressing clinical requirements, authoritative and leading clinical organizations have allied with the IUGC. We have extended the IUGC 2024 Challenge from an indirect ultrasound measurement based on segmentation results to an end-to-end measurement based on landmarks. Specifically, we provide 300 labeled cases and 31,421 unlabeled cases in the training set, 100 visible cases for validation, and 501 hidden cases for testing. The targets are the coordinates of three landmarks and the corresponding biometric parameter. In addition to the typical Mean Radial Error (MRE) and the absolute difference between predicted and manually measured parameters, our evaluation metrics also emphasize inference speed. In summary, the IUGC 2025 challenge exhibits four primary characteristics: (1) Task: Employing semi-supervised landmark detection. (2) Dataset: Curating a large-scale and diverse fetal ultrasound dataset that accounts for all fetal descent stations during the childbirth process. It comprises 28,919 ultrasound images from over 20 medical groups. (3) Evaluation measures: Focusing on detection accuracy. (4) Multiple raters independently annotate a subset of test cases to compare algorithmic performance against human expert inter-rater variability. 
https://pubmed.ncbi.nlm.nih.gov/41604894/ Bai J, Tang Y, Liu X, Hu J, Li Y, Chen X, Wang Y, Ma C, Li Y, Guo B, Jiao J, Huang Y, Wang K, Li L, Ma Y, Han X, Shao H, Yang Z, Liu Q, Hu Y, Kuang J, Song S, Krishna A, Khan ZA, Li Z, Zhang Z, Zhang H, Cheng Y, Zhang X, Chen X, Yan H, Tong L, Du B, Deng B, Chen Y, Peng Z, Rezaei S, Gan J, Cai W, Wang F, Curran KM, Silvestre G, Khobo I, Lu Y, Ni D, Huang Y, Yaqub M, Ma J, Lekadir K, Li S. IUGC: A benchmark of landmark detection in end-to-end intrapartum ultrasound biometry. Med Image Anal. 2026 May;110:103960. doi: 10.1016/j.media.2026.103960. Epub 2026 Jan 23. PMID: 41604894.</description>
<size>1204118461</size>
</item><item>
<title>Annotated Ultrasound Liver images</title>
<category>Dataset</category>
<infohash>cf97c52651867d2e78d234aebb1fa45432ddbe9a</infohash>
<guid>https://academictorrents.com/details/cf97c52651867d2e78d234aebb1fa45432ddbe9a</guid>
<link>https://academictorrents.com/details/cf97c52651867d2e78d234aebb1fa45432ddbe9a</link>
<description>We publish ultrasound liver images, which were annotated to show the outlines, livers, and liver mass regions. Xu Yiming, Zheng Bowen, Liu Xiaohong, Wu Tao, Ju Jinxiu, Wang Shijie, Lian Yufan, Zhang Hongjun, Liang Tong, Sang Ye, Jiang Rui, Wang Guangyu, Ren Jie, &amp; Chen Ting. (2022). Annotated Ultrasound Liver images [Data set]. Zenodo. https://doi.org/10.5281/zenodo.7272660</description>
<size>70369237</size>
</item><item>
<title>TRUSTED: The Paired 3D Ultrasound and CT Human Data for Kidney Segmentation and Registration Research</title>
<category>Dataset</category>
<infohash>aff393b0688d0bf396407cb8d02b4284879c1cf7</infohash>
<guid>https://academictorrents.com/details/aff393b0688d0bf396407cb8d02b4284879c1cf7</guid>
<link>https://academictorrents.com/details/aff393b0688d0bf396407cb8d02b4284879c1cf7</link>
<description>Inter-modal image registration (IMIR) and image segmentation with abdominal Ultrasound (US) data have many important clinical applications, including image-guided surgery, automatic organ measurement, and robotic navigation. However, research is severely limited by the lack of public datasets. We propose TRUSTED (the Tridimensional Renal Ultra Sound TomodEnsitometrie Dataset), comprising paired transabdominal 3DUS and CT kidney images from 48 human patients (96 kidneys), including segmentation, and anatomical landmark annotations by two experienced radiographers. Inter-rater segmentation agreement was over 93% (Dice score), and gold-standard segmentations were generated using the STAPLE algorithm. Seven anatomical landmarks were annotated, for IMIR systems development and evaluation. To validate the dataset’s utility, 4 competitive Deep-Learning models for kidney segmentation were benchmarked, yielding average DICE scores from 79.63% to 90.09% for CT, and 70.51% to 80.70% for US images. Four IMIR methods were benchmarked, and Coherent Point Drift performed best with an average Target Registration Error of 4.47 mm and Dice score of 84.10%. The TRUSTED dataset may be used freely to develop and validate segmentation and IMIR methods. Ndzimbong, W., Fourniol, C., Themyr, L. et al. TRUSTED: The Paired 3D Transabdominal Ultrasound and CT Human Data for Kidney Segmentation and Registration Research. Sci Data 12, 615 (2025). https://doi.org/10.1038/s41597-025-04467-1</description>
<size>15987228105</size>
</item><item>
<title>Abdominal Ultrasound Image Dataset for Organ Classification and Disease Detection</title>
<category>Dataset</category>
<infohash>fb252da63e5ba59bea91821018e5c83d172346ba</infohash>
<guid>https://academictorrents.com/details/fb252da63e5ba59bea91821018e5c83d172346ba</guid>
<link>https://academictorrents.com/details/fb252da63e5ba59bea91821018e5c83d172346ba</link>
<description>This is a dataset of Ultrasound (US) images of abdominal organs. US imaging is widely accessible and a very common diagnostic tool, as it is non-invasive and does not involve radiation risk. This dataset was curated solely for research in deep learning, with potential applications in supervised, semi-supervised, and unsupervised learning to support disease detection in resource-constrained settings. The dataset comprises 5,005 images of different abdominal organs, namely: Abdominal Aorta (0), Gallbladder (1), Hepatic Vein (2), Kidneys (3), Liver (4), Ovaries (5), Pancreas (6), Portal Vein (7), Spleen (8), and the Urinary System (9), which includes the Urinary Bladder, Prostate, and Uterus. Images were collected from 563 patients at MH Samorita Medical College and Hospital in Dhaka, Bangladesh. The author tried her best to curate this dataset systematically and organize it uniquely. Every folder and subfolder has correctly numbered, serially ordered images, making it the first dataset of its kind in terms of structure and usability. This ensures reproducibility and reliability for researchers worldwide. In total, the dataset is organized into five distinct formats/folders, described below. Two radiologists examined the patients. ## Radiologist one: - organ_classification_1: Contains 2,784 images of the 10 organs listed above. Designed for classification tasks. - anomaly_detection_1: Contains two sub-folders: normal (2,014 images) and abnormal (799 images). Designed for anomaly detection tasks. - organ_classification+anomaly_detection: Contains 2 sub-folders. One represents the normal organs (1,948 images) and one represents abnormal cases (981 images) including a newly added ascites folder. Ascites is a condition where the abdominal cavity becomes overly filled with fluid, and it was separated as a distinct abnormal class. This hybrid dataset (organ_classification+anomaly_detection) is an experimental extension combining both tasks. 
Although the dataset was curated carefully, users are advised to double-check labels for their specific tasks. ## Radiologist two: - anomaly_detection_2: Contains two sub-folders: normal (656 images) and abnormal (269 images). This batch was collected last and was used for semi-supervised anomaly detection tasks. - organ_classification_2: Contains 10 sub-folders representing the 10 different organs, with a total of 1,293 images. - Patient_Wise: This folder contains 170 patient images, their diagnosis as metadata in an xlsx file and a text file, Update Version 01_USG.txt. ## Acknowledgements: This dataset was developed as part of the author's research under the supervision of Dr. John E. Ball, Mississippi State University. The author is grateful for his guidance, encouragement, and support throughout the course of this work. The author would like to thank her father, Dr. Md. Enayet Karim, for his logistical support in coordinating with MH Samorita Medical College and Hospital to obtain the ultrasound images and metadata for this dataset. The author gratefully acknowledges the contributions of the radiologists, sonographers, and staff at MH Samorita Medical College and Hospital for their assistance in conducting the ultrasound examinations and providing access to the imaging data. Their efforts in patient care and technical support were invaluable in making this dataset possible. ## Ethics and Data Access Permissions: This dataset was collected with formal approval from the Institutional Ethical Review Board (IERB) of MH Samorita Medical College and Hospital, Dhaka, Bangladesh. The ethical clearance was obtained before data collection and written institutional permission was granted before sharing. All images were anonymized and de-identified before inclusion in this dataset, ensuring patient privacy and compliance with ethical research standards. ## License: This dataset is released under the Creative Commons Attribution 4.0 International License (CC BY 4.0).
Users are required to cite this dataset when using it in publications, research, or derivative works. This dataset is openly available for research and academic purposes, supporting reproducibility and transparency in medical AI research. https://orcid.org/0009-0005-8413-4149</description>
<size>1491269509</size>
</item><item>
<title>Reddit comments/submissions 2026-03</title>
<category>Dataset</category>
<infohash>668087bb8c8c9c763b27a1a4c5e7fcb6add25f2c</infohash>
<guid>https://academictorrents.com/details/668087bb8c8c9c763b27a1a4c5e7fcb6add25f2c</guid>
<link>https://academictorrents.com/details/668087bb8c8c9c763b27a1a4c5e7fcb6add25f2c</link>
<description>Reddit comments and submissions from 2026-03. Documentation, JSON schemas and more can be found at https://github.com/ArthurHeitmann/arctic_shift Helper scripts for processing files can be found at https://github.com/Watchful1/PushshiftDumps</description>
<size>68404242141</size>
</item><item>
<title>UIdataGB Gallblader Diseases Dataset</title>
<category>Dataset</category>
<infohash>aa2092fb6910c90c467a62293f74042a6ff7d251</infohash>
<guid>https://academictorrents.com/details/aa2092fb6910c90c467a62293f74042a6ff7d251</guid>
<link>https://academictorrents.com/details/aa2092fb6910c90c467a62293f74042a6ff7d251</link>
<description>The dataset is composed of ultrasound images of the GB organ from inside the gastrointestinal tract. The dataset includes 9 classes according to anatomical landmarks. Each class represents a GB disease. Published: 23 January 2024 | Version 1 | DOI: 10.17632/r6h24d2d3y.1 Turki, Amina; Mahdi Obaid, Ahmed; Bellaaj, Hatem; Ksantini, Mohamed; Altaee, Abdulla (2024), “Gallblader Diseases Dataset”, Mendeley Data, V1, doi: 10.17632/r6h24d2d3y.1 "The UIdataGB dataset consists of 10692 images, annotated, and verified by medical doctors and experienced radiologists. It includes 9 classes according to anatomical landmarks. Each class contains nearly 1200 images. Therefore, the dataset is balanced in terms of diseases. In total, 1782 patients were involved in the data collection; the number of female images was 6246, with an average age of 63.4, while the number of male images was 4446, with an average age of 59.6. The number of images is sufficient to be used for different tasks, e.g., image retrieval, ML, DL, and transfer learning (TL), etc. The anatomical landmark of the GB determines the pathological finding like cholecystitis, stone of the GB and polyps. The dataset consists of images with a resolution of 900×1200 pixels and they are sorted into separate nine folders named according to the content. Tables 1 and 2 show the distribution of diseases in terms of images and patients’ numbers as well as the distribution of images according to gender." https://www.sciencedirect.com/science/article/pii/S2352340924003950</description>
<size>2042724460</size>
</item><item>
<title>enwiki-20260401-pages-articles-multistream-index.txt.bz2</title>
<category>Dataset</category>
<infohash>8ed9dcc05b0fdecb47f998186e9e5c30f8212cfc</infohash>
<guid>https://academictorrents.com/details/8ed9dcc05b0fdecb47f998186e9e5c30f8212cfc</guid>
<link>https://academictorrents.com/details/8ed9dcc05b0fdecb47f998186e9e5c30f8212cfc</link>
<description>English Wikipedia Multistream Index 2026-04-01 https://en.wikipedia.org/wiki/Wikipedia:Database_download Corresponding multistream file: https://academictorrents.com/details/2b2ffc80941b61dcfe55fc444a6a78a60eef5944</description>
<size>280349081</size>
</item><item>
<title>enwiki-20260401-pages-articles-multistream.xml.bz2</title>
<category>Dataset</category>
<infohash>2b2ffc80941b61dcfe55fc444a6a78a60eef5944</infohash>
<guid>https://academictorrents.com/details/2b2ffc80941b61dcfe55fc444a6a78a60eef5944</guid>
<link>https://academictorrents.com/details/2b2ffc80941b61dcfe55fc444a6a78a60eef5944</link>
<description>English Wikipedia Multistream 2026-04-01 https://en.wikipedia.org/wiki/Wikipedia:Database_download Corresponding index file: https://academictorrents.com/details/8ed9dcc05b0fdecb47f998186e9e5c30f8212cfc</description>
<size>26207949994</size>
</item><item>
<title>EchoCP_dataset</title>
<category>Dataset</category>
<infohash>22508526b4bf4b08641b3dbba010eb083388322c</infohash>
<guid>https://academictorrents.com/details/22508526b4bf4b08641b3dbba010eb083388322c</guid>
<link>https://academictorrents.com/details/22508526b4bf4b08641b3dbba010eb083388322c</link>
<description>A dataset of contrast transthoracic echocardiography (cTTE), EchoCP, for patent foramen ovale (PFO) diagnosis is published. We present EchoCP, the first dataset for cTTE-based PFO diagnosis. EchoCP contains both Valsalva maneuver (VM) and rest echocardiography videos captured from 30 patients. Data annotation, including diagnosis annotation and segmentation annotation, was performed by four experienced cardiovascular sonographers. As there are more than a thousand images in each patient's video, sparse labeling (selecting only representative frames) of the segmentation is adopted. For each patient, two videos corresponding to the rest and VM states are captured. In the rest state, patients just relax and breathe normally. In the VM state, patients close their mouths and pinch their noses shut while expelling air as if blowing up a balloon. The video is captured in the apical-4-chamber view and contains at least ten cardiac cycles. For the VM state, the action is performed three to five times during acquisition, and we selected the most representative one. If you use our dataset, please consider citing our paper in MICCAI 2021: Tianchen Wang, Zhihe Li, Shanshan Bi, Meiping Huang, Jiawei Zhang, Jian Zhuang, Yiyu Shi, Hongwen Fei, Xiaowei Xu, "ImageCHD: A 3D Computed Tomography Image Dataset for Classification of Congenital Heart Disease," in Proc. of Medical Image Computing and Computer Assisted Interventions (MICCAI), Online, 2021. https://arxiv.org/abs/2101.10799 HIGHLIGHT 20231101: We have deployed the dataset on Kaggle! Please email me at xiao.wei.xu@foxmail.com if you have any questions about the dataset and the benchmark.</description>
<size>5554695068</size>
</item><item>
<title>cardiacUDC_dataset</title>
<category>Dataset</category>
<infohash>55cce2068badb8204b5de896922afc301c37a691</infohash>
<guid>https://academictorrents.com/details/55cce2068badb8204b5de896922afc301c37a691</guid>
<link>https://academictorrents.com/details/55cce2068badb8204b5de896922afc301c37a691</link>
<description>We collected CardiacUDA from our two hospitals: site G and site R. To guarantee that all echocardiogram videos are standards-compliant, all cases in CardiacUDA were collected, annotated and approved by 5-6 experienced physicians. To address ethical concerns, we obtained approval from the medical institutions. Each patient was scanned in four views: parasternal left ventricle long axis (LVLA), pulmonary artery long axis (PALA), left ventricular short-axis (LVSA), and apical four-chamber heart (A4C), resulting in four videos per patient. The resolution of each video was either 800x600 or 1024x768, depending on the scanner used (Philips or HITACHI). A total of 516 and 476 videos were collected from Site G and Site R, respectively, from approximately 100 different patients. Each video consists of over 100 frames, covering at least one heartbeat cycle. We have provided pixel-level annotations for each view, including masks for the left ventricle (LV) and right ventricle (RV) in the LVLA view, masks for the pulmonary artery (PA) in the PALA view, masks for the LV and RV in the LVSA view, and masks for the LV, RV, left atrium (LA), and right atrium (RA) in the A4C view. The videos in both Site R and Site G were divided in a ratio of 8:1:1 for training, validation, and testing, respectively. To lower annotation costs in the training set, only five frames per video are provided with pixel-level annotation masks. To better measure model performance, we provide pixel-level annotations for every frame in each video in the validation and testing sets. HIGHLIGHT 20231101: We have deployed the dataset on Kaggle! Please refer to the code (https://github.com/xmed-lab/GraphEcho) and our ICCV paper (https://arxiv.org/abs/2309.11145) for more details. Please email me at xiao.wei.xu@foxmail.com if you have any questions about the dataset and the benchmark.</description>
<size>4547904911</size>
</item></channel>
</rss>
