Showing 11 changed files with 358 additions and 256 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,24 +1,24 @@ | ||
Package: sweary | ||
Type: Package | ||
Title: Database of Swear Words in R | ||
Version: 0.0.0.9000 | ||
Authors@R: person("Patrik", "Drhlik", email = "patrik.drhlik@gmail.com", | ||
role = c("aut", "cre")) | ||
Description: The package tries to offer an extensive list of swear | ||
words from different languages, cherry picked by native speakers. | ||
It should then be used in different text analyses. | ||
URL: https://github.com/pdrhlik/sweary | ||
BugReports: https://github.com/pdrhlik/sweary/issues | ||
Depends: R (>= 2.10) | ||
License: MIT + file LICENSE | ||
Encoding: UTF-8 | ||
LazyData: true | ||
RoxygenNote: 6.0.1 | ||
Suggests: | ||
testthat, | ||
devtools, | ||
dplyr, | ||
purrr, | ||
rmarkdown, | ||
stringr | ||
Imports: glue | ||
Package: sweary | ||
Type: Package | ||
Title: Database of Swear Words in R | ||
Version: 0.0.0.9000 | ||
Authors@R: person("Patrik", "Drhlik", email = "patrik.drhlik@gmail.com", | ||
role = c("aut", "cre")) | ||
Description: The package tries to offer an extensive list of swear | ||
words from different languages, cherry picked by native speakers. | ||
It should then be used in different text analyses. | ||
URL: https://github.com/pdrhlik/sweary | ||
BugReports: https://github.com/pdrhlik/sweary/issues | ||
Depends: R (>= 2.10) | ||
License: MIT + file LICENSE | ||
Encoding: UTF-8 | ||
LazyData: true | ||
RoxygenNote: 6.1.0 | ||
Suggests: | ||
testthat, | ||
devtools, | ||
dplyr, | ||
purrr, | ||
rmarkdown, | ||
stringr | ||
Imports: glue |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,3 @@ | ||
# Generated by roxygen2: do not edit by hand | ||
|
||
export(get_swearwords) | ||
# Generated by roxygen2: do not edit by hand | ||
export(get_swearwords) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,116 +1,118 @@ | ||
|
||
<!-- README.md is generated from README.Rmd. Please edit that file. --> | ||
|
||
[![Join the chat at | ||
https://gitter.im/pdrhlik/sweary](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/swearyr) | ||
[![Build | ||
Status](https://travis-ci.org/pdrhlik/sweary.svg?branch=master)](https://travis-ci.org/pdrhlik/sweary) | ||
|
||
# sweary <img src="sticker/sweary-sticker.png" align="right" width="150" /> | ||
|
||
Sweary is an R package that contains a database of swear words from | ||
different languages, cherry picked by native speakers. | ||
|
||
## Installation | ||
|
||
The development version of this package can be installed using | ||
[devtools](https://github.com/r-lib/devtools): | ||
|
||
devtools::install_github("pdrhlik/sweary") | ||
|
||
## Current swear word lists | ||
|
||
| Language | Language code | Number of swear words | | ||
| --------------- | ------------- | --------------------- | | ||
| Czech | cs | 57 | | ||
| English | en | 39 | | ||
| French (Canada) | fr-CA | 20 | | ||
| Polish | pl | 41 | | ||
| Romanian | ro | 38 | | ||
| Slovak | sk | 28 | | ||
| **Total** | **6 langs** | **223** | | ||
|
||
## Examples | ||
|
||
All languages are stored in a `swear_words` data frame. | ||
|
||
``` r | ||
library(sweary) | ||
head(swear_words) | ||
``` | ||
|
||
## # A tibble: 6 x 2 | ||
## word language | ||
## <chr> <chr> | ||
## 1 buzerant cs | ||
## 2 čubka cs | ||
## 3 čurák cs | ||
## 4 čůrák cs | ||
## 5 debil cs | ||
## 6 dement cs | ||
|
||
You can also extract only the language you are interested in. | ||
|
||
``` r | ||
en_swear_words <- get_swearwords("en") | ||
head(en_swear_words) | ||
``` | ||
|
||
## # A tibble: 6 x 2 | ||
## word language | ||
## <chr> <chr> | ||
## 1 arse en | ||
## 2 arsehole en | ||
## 3 ass en | ||
## 4 asshole en | ||
## 5 bitch en | ||
## 6 bollocks en | ||
|
||
## Add (modify) a language | ||
|
||
If you are not comfortable with `git` and pull requests, you can just | ||
follow steps **1-3**. After you create the file, send it to me via | ||
[email](mailto:patrik.drhlik@gmail.com) with a subject **New sweary | ||
language: {LANG\_CODE}**. We will acknowledge you in the README after we | ||
approve of the changes. | ||
|
||
1. **Choose a new language.** | ||
Find its two letter [ISO 639-1 | ||
code](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes). | ||
2. **Create a language file.** | ||
Place the file in `data-raw/swear-word-lists/{LANG_CODE}`. | ||
Example for English: `data-raw/swear-word-lists/en`. | ||
3. **Fill in the file with swear-words.** The following rules must apply: | ||
- **One** swear-word per line. | ||
- All words must be **lowercase**. | ||
- The list must only contain **unique** words. | ||
- The list must be **sorted** alphabetically. | ||
4. **Make sure all the tests pass.** | ||
You can do that using a development function called | ||
`build_sweary()`. It becomes available when you `git clone` the | ||
repository and call `devtools::load_all()`, or press | ||
`Ctrl+Shift+L` in RStudio. Learn more about calling this function | ||
using `?build_sweary`. | ||
5. **Update README.Rmd**. | ||
Update the `langs` data frame in README.Rmd by adding a new row to | ||
it. More precise instructions are in the raw file itself. | ||
6. **Create a pull request.** | ||
|
||
## Origin | ||
|
||
The idea first appeared after the [South Park text analysis lightning | ||
talk](https://github.com/pdrhlik/southparktalk-whyr2018) at the [Why | ||
R? 2018 conference](http://whyr2018.pl/) in Wrocław. All the | ||
contributors will be acknowledged as the work progresses. | ||
|
||
## Acknowledgments | ||
|
||
We would like to say a **BIG THANKS** to the native speakers who help us | ||
with the swear word dictionaries: | ||
|
||
- Czech - [Patrik Drhlík](https://github.com/pdrhlik) | ||
- English - [Patrik Drhlík](https://github.com/pdrhlik) | ||
- French (Canada) - [Marc-André Désautels](https://github.com/desautm) | ||
- Polish - [Michal Czyz](https://github.com/mczyzj) | ||
- Romanian - Alexandru Supeanu | ||
- Slovak - Šimon Žďárský | ||
|
||
<!-- README.md is generated from README.Rmd. Please edit that file. --> | ||
|
||
[![Join the chat at | ||
https://gitter.im/pdrhlik/sweary](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/swearyr) | ||
[![Build | ||
Status](https://travis-ci.org/pdrhlik/sweary.svg?branch=master)](https://travis-ci.org/pdrhlik/sweary) | ||
|
||
# sweary <img src="sticker/sweary-sticker.png" align="right" width="150" /> | ||
|
||
Sweary is an R package that contains a database of swear words from | ||
different languages, cherry picked by native speakers. | ||
|
||
## Installation | ||
|
||
The development version of this package can be installed using | ||
[devtools](https://github.com/r-lib/devtools): | ||
|
||
devtools::install_github("pdrhlik/sweary") | ||
|
||
## Current swear word lists | ||
|
||
| Language | Language code | Number of swear words | | ||
| --------------- | ------------- | --------------------- | | ||
| Czech | cs | 57 | | ||
| German | de | 99 | | ||
| English | en | 39 | | ||
| French (Canada) | fr-CA | 20 | | ||
| Polish | pl | 41 | | ||
| Romanian | ro | 38 | | ||
| Slovak | sk | 28 | | ||
| **Total** | **7 langs** | **322** | | ||
|
||
## Examples | ||
|
||
All languages are stored in a `swear_words` data frame. | ||
|
||
``` r | ||
library(sweary) | ||
head(swear_words) | ||
``` | ||
|
||
## # A tibble: 6 x 2 | ||
## word language | ||
## <chr> <chr> | ||
## 1 buzerant cs | ||
## 2 čubka cs | ||
## 3 čurák cs | ||
## 4 čůrák cs | ||
## 5 debil cs | ||
## 6 dement cs | ||
|
||
You can also extract only the language you are interested in. | ||
|
||
``` r | ||
en_swear_words <- get_swearwords("en") | ||
head(en_swear_words) | ||
``` | ||
|
||
## # A tibble: 6 x 2 | ||
## word language | ||
## <chr> <chr> | ||
## 1 arse en | ||
## 2 arsehole en | ||
## 3 ass en | ||
## 4 asshole en | ||
## 5 bitch en | ||
## 6 bollocks en | ||
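
A word list like the one above could be used in a simple text analysis along the following lines. This is only a sketch in base R: the small data frame is a stand-in for the result of `get_swearwords("en")` so that the example runs without the package installed, and `count_swear_words()` is a hypothetical helper, not a function exported by sweary.

```r
# Stand-in for get_swearwords("en"): same two columns as the tibble above.
# (Assumption: a plain data frame is used here to keep the sketch self-contained.)
en_swear_words <- data.frame(
  word = c("arse", "ass", "bitch"),
  language = "en",
  stringsAsFactors = FALSE
)

# Hypothetical helper: count how many tokens of a text appear in a word list.
count_swear_words <- function(text, words) {
  # Lowercase the text and split it on runs of non-letter characters.
  tokens <- strsplit(tolower(text), "[^[:alpha:]]+")[[1]]
  sum(tokens %in% words)
}

n_hits <- count_swear_words("What an arse. Such an ARSE!", en_swear_words$word)
print(n_hits)
```

Lowercasing the text first matches the convention that all words in the lists are stored in lowercase.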
|
||
## Add (modify) a language | ||
|
||
If you are not comfortable with `git` and pull requests, you can just | ||
follow steps **1-3**. After you create the file, send it to me via | ||
[email](mailto:patrik.drhlik@gmail.com) with a subject **New sweary | ||
language: {LANG\_CODE}**. We will acknowledge you in the README after we | ||
approve of the changes. | ||
|
||
1. **Choose a new language.** | ||
Find its two letter [ISO 639-1 | ||
code](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes). | ||
2. **Create a language file.** | ||
Place the file in `data-raw/swear-word-lists/{LANG_CODE}`. | ||
Example for English: `data-raw/swear-word-lists/en`. | ||
3. **Fill in the file with swear-words.** The following rules must apply: | ||
- **One** swear-word per line. | ||
- All words must be **lowercase**. | ||
- The list must only contain **unique** words. | ||
- The list must be **sorted** alphabetically. | ||
4. **Make sure all the tests pass.** | ||
You can do that using a development function called | ||
`build_sweary()`. It becomes available when you `git clone` the | ||
repository and call `devtools::load_all()`, or press | ||
`Ctrl+Shift+L` in RStudio. Learn more about calling this function | ||
using `?build_sweary`. | ||
5. **Update README.Rmd**. | ||
Update the `langs` data frame in README.Rmd by adding a new row to | ||
it. More precise instructions are in the raw file itself. | ||
6. **Create a pull request.** | ||
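
The rules from step 3 can be checked before sending a file or opening a pull request. The following is a minimal sketch in base R, assuming the word list has already been read into a character vector; `check_word_list()` is a hypothetical helper and is not the check that `build_sweary()` or the package tests actually run.

```r
# Hypothetical helper (not part of sweary): returns the names of the rules
# from step 3 that a character vector of words violates.
check_word_list <- function(words) {
  problems <- character(0)
  # "One swear-word per line" is approximated here by rejecting blank lines.
  if (any(!nzchar(trimws(words)))) problems <- c(problems, "blank line")
  if (any(words != tolower(words))) problems <- c(problems, "not lowercase")
  if (anyDuplicated(words) > 0) problems <- c(problems, "duplicates")
  # sort() uses the current collation locale, so the result for words with
  # diacritics can differ between systems.
  if (!identical(words, sort(words))) problems <- c(problems, "not sorted")
  problems
}

good <- c("arse", "arsehole", "ass")
bad <- c("bitch", "Ass", "bitch")

print(check_word_list(good))
print(check_word_list(bad))
```

For a real file, the vector could be obtained with something like `readLines("data-raw/swear-word-lists/en", encoding = "UTF-8")`, using the path from step 2.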
|
||
## Origin | ||
|
||
The idea first appeared after the [South Park text analysis lightning | ||
talk](https://github.com/pdrhlik/southparktalk-whyr2018) at the [Why | ||
R? 2018 conference](http://whyr2018.pl/) in Wrocław. All the | ||
contributors will be acknowledged as the work progresses. | ||
|
||
## Acknowledgments | ||
|
||
We would like to say a **BIG THANKS** to the native speakers who help us | ||
with the swear word dictionaries: | ||
|
||
- Czech - [Patrik Drhlík](https://github.com/pdrhlik) | ||
- English - [Patrik Drhlík](https://github.com/pdrhlik) | ||
- French (Canada) - [Marc-André Désautels](https://github.com/desautm) | ||
- German - [Peter Meißner](https://github.com/petermeissner) | ||
- Polish - [Michal Czyz](https://github.com/mczyzj) | ||
- Romanian - Alexandru Supeanu | ||
- Slovak - Šimon Žďárský |