Splits

The standard train / dev / test splits are used for the corpus:

dev

  • GUM_academic_exposure
  • GUM_academic_librarians
  • GUM_bio_byron
  • GUM_bio_emperor
  • GUM_conversation_grounded
  • GUM_conversation_risk
  • GUM_court_loan
  • GUM_essay_evolved
  • GUM_fiction_beast
  • GUM_fiction_lunre
  • GUM_interview_cyclone
  • GUM_interview_gaming
  • GUM_letter_arendt
  • GUM_news_homeopathic
  • GUM_news_iodine
  • GUM_podcast_wrestling
  • GUM_reddit_macroeconomics
  • GUM_reddit_pandas
  • GUM_speech_impeachment
  • GUM_speech_inauguration
  • GUM_textbook_governments
  • GUM_textbook_labor
  • GUM_vlog_portland
  • GUM_vlog_radiology
  • GUM_voyage_athens
  • GUM_voyage_coron
  • GUM_whow_joke
  • GUM_whow_overalls

test

  • GUM_academic_discrimination
  • GUM_academic_eegimaa
  • GUM_bio_dvorak
  • GUM_bio_jespersen
  • GUM_conversation_lambada
  • GUM_conversation_retirement
  • GUM_court_mitigation
  • GUM_essay_fear
  • GUM_fiction_falling
  • GUM_fiction_teeth
  • GUM_interview_hill
  • GUM_interview_libertarian
  • GUM_letter_mandela
  • GUM_news_nasa
  • GUM_news_sensitive
  • GUM_podcast_bezos
  • GUM_reddit_escape
  • GUM_reddit_monsters
  • GUM_speech_austria
  • GUM_speech_newzealand
  • GUM_textbook_chemistry
  • GUM_textbook_union
  • GUM_vlog_london
  • GUM_vlog_studying
  • GUM_voyage_oakland
  • GUM_voyage_vavau
  • GUM_whow_cactus
  • GUM_whow_mice

train

  • GUM_academic_art
  • GUM_academic_census
  • GUM_academic_economics
  • GUM_academic_enjambment
  • GUM_academic_epistemic
  • GUM_academic_games
  • GUM_academic_huh
  • GUM_academic_implicature
  • GUM_academic_lighting
  • GUM_academic_mutation
  • GUM_academic_replication
  • GUM_academic_salinity
  • GUM_academic_theropod
  • GUM_academic_thrones
  • GUM_bio_bernoulli
  • GUM_bio_chao
  • GUM_bio_enfant
  • GUM_bio_fillmore
  • GUM_bio_galois
  • GUM_bio_goode
  • GUM_bio_gordon
  • GUM_bio_hadid
  • GUM_bio_higuchi
  • GUM_bio_holt
  • GUM_bio_jerome
  • GUM_bio_marbles
  • GUM_bio_moreau
  • GUM_bio_nida
  • GUM_bio_padalecki
  • GUM_bio_theodorus
  • GUM_conversation_atoms
  • GUM_conversation_blacksmithing
  • GUM_conversation_christmas
  • GUM_conversation_erasmus
  • GUM_conversation_family
  • GUM_conversation_gossip
  • GUM_conversation_scientist
  • GUM_conversation_toys
  • GUM_conversation_vet
  • GUM_conversation_zero
  • GUM_court_carpet
  • GUM_court_equality
  • GUM_court_fire
  • GUM_court_prince
  • GUM_essay_distraction
  • GUM_essay_dividends
  • GUM_essay_sexlife
  • GUM_fiction_claus
  • GUM_fiction_error
  • GUM_fiction_frankenstein
  • GUM_fiction_garden
  • GUM_fiction_giants
  • GUM_fiction_honour
  • GUM_fiction_moon
  • GUM_fiction_oversite
  • GUM_fiction_pag
  • GUM_fiction_pixies
  • GUM_fiction_rose
  • GUM_fiction_sneeze
  • GUM_fiction_time
  • GUM_fiction_veronique
  • GUM_fiction_wedding
  • GUM_interview_ants
  • GUM_interview_brotherhood
  • GUM_interview_chomsky
  • GUM_interview_cocktail
  • GUM_interview_daly
  • GUM_interview_dungeon
  • GUM_interview_herrick
  • GUM_interview_licen
  • GUM_interview_mcguire
  • GUM_interview_mckenzie
  • GUM_interview_messina
  • GUM_interview_onion
  • GUM_interview_peres
  • GUM_interview_shalev
  • GUM_interview_stardust
  • GUM_letter_flood
  • GUM_letter_gorbachev
  • GUM_letter_roomers
  • GUM_letter_zora
  • GUM_news_afghan
  • GUM_news_asylum
  • GUM_news_clock
  • GUM_news_crane
  • GUM_news_defector
  • GUM_news_election
  • GUM_news_expo
  • GUM_news_flag
  • GUM_news_hackers
  • GUM_news_ie9
  • GUM_news_imprisoned
  • GUM_news_korea
  • GUM_news_lanterns
  • GUM_news_soccer
  • GUM_news_stampede
  • GUM_news_taxes
  • GUM_news_warhol
  • GUM_news_warming
  • GUM_news_worship
  • GUM_podcast_addiction
  • GUM_podcast_brave
  • GUM_podcast_collaboration
  • GUM_reddit_bobby
  • GUM_reddit_callout
  • GUM_reddit_card
  • GUM_reddit_conspiracy
  • GUM_reddit_gender
  • GUM_reddit_introverts
  • GUM_reddit_polygraph
  • GUM_reddit_racial
  • GUM_reddit_ring
  • GUM_reddit_social
  • GUM_reddit_space
  • GUM_reddit_steak
  • GUM_reddit_stroke
  • GUM_reddit_superman
  • GUM_speech_albania
  • GUM_speech_data
  • GUM_speech_destiny
  • GUM_speech_floyd
  • GUM_speech_humanitarian
  • GUM_speech_maiden
  • GUM_speech_nixon
  • GUM_speech_remarks
  • GUM_speech_school
  • GUM_speech_telescope
  • GUM_speech_trump
  • GUM_textbook_alamo
  • GUM_textbook_anthropology
  • GUM_textbook_artwork
  • GUM_textbook_cognition
  • GUM_textbook_entrepreneurship
  • GUM_textbook_evoethics
  • GUM_textbook_grit
  • GUM_textbook_history
  • GUM_textbook_sociology
  • GUM_textbook_spacetime
  • GUM_textbook_stats
  • GUM_vlog_appearance
  • GUM_vlog_college
  • GUM_vlog_covid
  • GUM_vlog_exams
  • GUM_vlog_hair
  • GUM_vlog_hiking
  • GUM_vlog_lipstick
  • GUM_vlog_mermaid
  • GUM_vlog_pizzeria
  • GUM_vlog_pregnant
  • GUM_vlog_wine
  • GUM_voyage_chatham
  • GUM_voyage_cleveland
  • GUM_voyage_cuba
  • GUM_voyage_fortlee
  • GUM_voyage_guadeloupe
  • GUM_voyage_isfahan
  • GUM_voyage_lodz
  • GUM_voyage_merida
  • GUM_voyage_phoenix
  • GUM_voyage_socotra
  • GUM_voyage_sydfynske
  • GUM_voyage_thailand
  • GUM_voyage_tulsa
  • GUM_voyage_york
  • GUM_whow_arrogant
  • GUM_whow_ballet
  • GUM_whow_basil
  • GUM_whow_chicken
  • GUM_whow_cupcakes
  • GUM_whow_elevator
  • GUM_whow_flirt
  • GUM_whow_glowstick
  • GUM_whow_languages
  • GUM_whow_packing
  • GUM_whow_parachute
  • GUM_whow_procrastinating
  • GUM_whow_quidditch
  • GUM_whow_quinoa
  • GUM_whow_skittles
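
The document names above appear to follow a `GUM_<genre>_<name>` pattern. Assuming that convention holds for every entry, the following minimal sketch shows one way to tally genres per split and to check that no document is assigned to more than one split. The `splits` dictionary holds only a short excerpt of the lists above, and the helper names (`genre_of`, `genre_counts`, `find_overlaps`) are hypothetical, not part of any corpus tooling.

```python
from collections import Counter


def genre_of(doc_name: str) -> str:
    """Return the genre field of a name such as 'GUM_academic_exposure'.

    Assumes the 'GUM_<genre>_<name>' pattern seen in the lists above.
    """
    parts = doc_name.split("_", 2)
    if len(parts) != 3 or parts[0] != "GUM":
        raise ValueError(f"unexpected document name: {doc_name!r}")
    return parts[1]


def genre_counts(doc_names):
    """Count how many documents of each genre occur in one split."""
    return Counter(genre_of(name) for name in doc_names)


def find_overlaps(splits):
    """Return the set of document names that occur in more than one split."""
    first_seen = {}
    overlaps = set()
    for split_name, doc_names in splits.items():
        for name in doc_names:
            if name in first_seen and first_seen[name] != split_name:
                overlaps.add(name)
            first_seen.setdefault(name, split_name)
    return overlaps


# Excerpt only: a few names copied from each list above, for illustration.
splits = {
    "dev": ["GUM_academic_exposure", "GUM_bio_byron", "GUM_whow_joke"],
    "test": ["GUM_academic_discrimination", "GUM_bio_dvorak", "GUM_whow_mice"],
    "train": ["GUM_academic_art", "GUM_academic_census", "GUM_news_ie9"],
}

for split_name, doc_names in splits.items():
    print(split_name, dict(genre_counts(doc_names)))

print("overlapping documents:", sorted(find_overlaps(splits)))
```

To run the check over the whole corpus, replace the excerpt with the complete lists; an empty overlap set indicates that the three splits are disjoint.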