Commit

ar-latn: add Latin-script Arabic
Related to words/cuss#16.
wooorm committed Oct 31, 2018
1 parent ed99a06 commit 3787ffe
Showing 3 changed files with 262 additions and 11 deletions.
249 changes: 249 additions & 0 deletions ar-latn.json
@@ -0,0 +1,249 @@
[
"'attaï",
"2adeeb",
"3abeet",
"3agala",
"3ahera",
"3akroot",
"3arbagy",
"3ars",
"3as",
"3asba",
"3ass",
"3attay",
"3ayba",
"3eer",
"3el2",
"3ezgetek",
"5anith",
"5anza",
"5anzeer",
"5anzer",
"5ara",
"5awal",
"5oneeth",
"5oneth",
"5orm",
"5ra",
"7akeer",
"7alama",
"7amka",
"7aqeer",
"7aywan",
"7omar",
"9a7ba",
"Kerfa",
"Kol Airi",
"Kol Khara",
"Sowa",
"a7a",
"ahbal",
"airi",
"asba",
"ayaar",
"ayre",
"ba8l",
"badeen",
"baghl",
"bandook",
"barbook",
"batrona",
"bayra",
"bazr",
"bedan",
"berkaka",
"besht",
"beskeletta",
"betetrakab",
"beyetnak",
"beyetrakab",
"bez",
"bezaz",
"boofta",
"btetnak",
"bzazlha",
"chalb",
"coso5tak",
"cosomak",
"da3ara",
"da3era",
"dayer ki zebbi",
"dayooth",
"dayoth",
"dayouth",
"deen abook",
"deen omak",
"deiouse",
"e8tesab",
"eacha",
"eghtesab",
"ethwel",
"far5",
"farasa",
"farg",
"fash5",
"fash5a",
"gamoosa",
"gamosa",
"gawad",
"gema3",
"hakeer",
"hamama",
"haqeer",
"hatchoun yemak",
"hawi",
"haygan",
"hbeela",
"hbila",
"homar",
"hway",
"hwiha",
"ibn al 3ahra",
"jamal",
"jarrar",
"jru",
"ka7ba",
"kaboul",
"kalb",
"karrak",
"kawwad",
"kawwazi",
"kes ekhtak",
"kes emmak",
"khanez",
"khanith",
"khanza",
"khanzeer",
"khanzer",
"koll zakk",
"koso5tak",
"kosomak",
"la39",
"la7s",
"labawy",
"labwa",
"lehbeela",
"lehbila",
"louat",
"louty",
"m3arras",
"mahbool",
"makwa",
"malhat",
"mamsoo5",
"mamsookh",
"manyak",
"manyok",
"manyook",
"mara",
"marioah",
"mas5a",
"mas5ota",
"matny",
"maybon",
"maybona",
"mayboon",
"mayboona",
"mazloula",
"menaswen",
"metnak",
"metnaka",
"metrama",
"mfal2as",
"mfal2asa",
"mfannes",
"mfannesa",
"miboun",
"mkauda",
"mo3ak",
"mo5anath",
"mok 9a7ba",
"mok ka7ba",
"mok kahba",
"moss",
"mota5alef",
"motakhalef",
"motasawel",
"mozza",
"ms5ota",
"naa'hachoun yemak",
"nahwik",
"namm",
"naqal",
"naqsh",
"nayek Emmak",
"nayek hareemek",
"neek",
"neka7",
"nikmok",
"nouna",
"qlawi",
"rass el zebb",
"rooh etnak",
"salgot",
"shahwa",
"shalaga",
"shanbora",
"sharag",
"sharameet",
"sharmoot",
"sharmoota",
"sharmot",
"sharmota",
"shayet",
"shayta",
"shaz",
"sheikha",
"skayrey",
"skayreyya",
"so7ak",
"so7akeyya",
"sobisa",
"sorm",
"sormaha",
"sormak",
"spontchi",
"ssossi",
"ta77an",
"tabon",
"tabonak",
"tabonek",
"taboon",
"taboonak",
"taboonek",
"tahhan",
"tarma",
"teez",
"teezak",
"teezha",
"teezhom",
"teezo",
"telhas ras airi",
"terma",
"tezak",
"tina",
"tkaffat",
"w3ra",
"wa3ra",
"zaaka",
"zabbour",
"zac",
"zacomak",
"zaghnaboot",
"zak",
"zakomak",
"zanboor",
"zebak",
"zebala",
"zeby",
"zeg",
"zega",
"zegg",
"zegga",
"zippak",
"zippy",
"zobr",
"zobrak",
"zobry",
"zwamel"
]
1 change: 1 addition & 0 deletions package.json
@@ -18,6 +18,7 @@
],
"main": "index.json",
"files": [
"ar-latn.json",
"index.json"
],
"dependencies": {},
23 changes: 12 additions & 11 deletions readme.md
@@ -31,21 +31,18 @@ console.log(typeof profanities[0]) // 'string'

## Support

**profanities** supports 1772 English profane words and phrases.
For a complete list, see [support.md][support] (:warning: **this file
contains (very) offensive terms**).
**profanities** supports many profane words and phrases in different languages.

Note that the words listed in **profanities** might **not** be profane
in certain contexts.

## Data

Part of the list is scraped from [Luis von Ahn’s Research Group (Carnegie
Mellon)][luis-von-ahn]. I could not find
any license information on that page.

Another list is based on the [`List of ethnic slurs` from
WikiPedia][racial-slurs].
* [`index.json`](index.json) — ± 1772 English profane words and phrases from
[Luis von Ahn’s Research Group (Carnegie Mellon)][luis-von-ahn], the [`List
of ethnic slurs` from WikiPedia][racial-slurs], and more (see
[support.md][support])
* [`ar-latn.json`](ar-latn.json) — ± 250 Arabic (Latin-Script) profane words
and phrases from [`naughty-words`][ar-source-naughty-words] and
[`youswear`][ar-source-youswear]

## Contributing

@@ -100,3 +97,7 @@ and open a Pull Request.
[racial-list]: script/racial.txt

[rest-list]: script/rest.txt

[ar-source-naughty-words]: https://github.com/LDNOOBW/List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words/blob/master/ar

[ar-source-youswear]: http://www.youswear.com/index.asp?language=Arabic
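
A note on using the new file: `ar-latn.json` is a flat JSON array of strings, and because this commit lists it under `files` in `package.json`, it should be loadable on its own, presumably as `require('profanities/ar-latn.json')` (an assumption based on the package name used in the readme). The sketch below uses a hypothetical helper, `findListedTerms`, and a three-entry stand-in array instead of the real file so that it runs without installing anything. It lowercases both sides because the list contains mixed-case entries such as `Kol Airi` and `Sowa`.

```javascript
// Stand-in for the real list. With the package installed this would
// presumably be: const list = require('profanities/ar-latn.json')
// (assumed path, based on the `files` entry added in this commit).
const list = ['jamal', 'kalb', 'Sowa']

// Hypothetical helper, not part of the package: return the entries of
// `terms` that occur in `text` as whitespace-delimited words or phrases.
function findListedTerms(text, terms) {
  // Collapse runs of whitespace and pad with spaces so that every word,
  // including the first and last, is surrounded by single spaces.
  const padded = ' ' + text.toLowerCase().split(/\s+/).filter(Boolean).join(' ') + ' '

  return terms.filter(function (term) {
    return padded.includes(' ' + term.toLowerCase() + ' ')
  })
}

console.log(findListedTerms('Ya  Kalb sowa', list)) // [ 'kalb', 'Sowa' ]
```

Matching on whitespace rather than a `\b` regular expression is a deliberate simplification: some entries start with an apostrophe or contain non-ASCII letters, where `\b` does not behave as a word boundary. The trade-off is that a word with punctuation attached will not match. As the readme already warns, a match does not mean the word is profane in its context.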
