GitHub’s policy on offensive terms #24014
-
Hey! I just published a new repo on GitHub. It is currently private, as I am still working on it and wondering about this question. Essentially, I am creating a moderation program. Part of this program is the automatic removal of certain terms that could be regarded as derogatory and/or offensive. To accomplish this, the repository includes an uncensored list of these terms that the program reads from. To clarify, these words are not aimed at anyone; they are just in a newline-separated document. Before I go public, I would like confirmation that this sort of thing does not violate GitHub’s community guidelines, specifically GitHub Community Guidelines - GitHub Docs. As a further measure, I have also included a file called PLEASEREADMEFIRST in the directory with the wordlist that warns of the content ahead. Any insight into this issue from someone well versed in GitHub policy is appreciated. Cheers!
Replies: 6 comments
-
I’m not a lawyer nor a GitHub staff member, but I think you’re fine. As the policy you referred to states: As a matter of fact, there seem to be similar repos around that exist without problems:
-
Yeah, that was part of my thought process as well: other repositories containing similar content exist, seemingly with no issue. I do suppose one could make the argument that
also handles code, as many of the words in my list do contain speech, words, or whatever that attacks a group on the basis of who they are. I will likely go forward and assume that that policy, although not explicitly stated, does not apply in this circumstance. If a GitHub staff member is reading and has any input on this policy, it is much appreciated; I have no intention of violating any rules.
-
Hi 👋 GitHub Staff - but not a legal representative, nor am I able to give consent on behalf of GitHub. Now that we have that covered: In my experience, such content in that context has typically held up when it’s had to be investigated. This does not mean it always will, but that is my experience, if you find that helpful. Obviously, when we are storing and publishing words that are used to perpetuate inequalities or otherwise cause harm, we want to be mindful of how we do it, how visible it is, and how it may be used by others. That said, there is a clear need for these in the software world, if only to prevent their more widespread use. For instance, if you can operate without that list being public, then that is ideal. If it must be public for your project to be, then it might be appropriate to nest it in your repository so it’s as unlikely as possible that someone may open it in error. If at some point your list becomes a problem, you will be able to discuss the situation with a human. I hope that helps.
-
I have no pony in this race, but this is a good case for ROT13 or other simple obfuscation. To the casual observer the data looks like gibberish, but it is trivial to make changes. We used this once in a project I was working on: the IDE could ROT13 on check-in as part of a formatting step and ROT13 again on checkout to restore. It’s been more than 10 years, so I don’t know the specific tools any longer. Good luck.
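For anyone unfamiliar with it, here is a minimal sketch of the ROT13 round trip in Python. It is not the IDE tooling mentioned above, and the sample words are harmless placeholders I made up for illustration. The point it shows is that ROT13 is its own inverse, so the same function both obfuscates and restores a newline-separated list.

```python
import codecs


def rot13(text: str) -> str:
    # ROT13 rotates each ASCII letter by 13 places; newlines, digits,
    # and punctuation pass through unchanged.
    return codecs.encode(text, "rot_13")


# Placeholder wordlist (hypothetical content), one term per line.
plain = "alpha\nbravo\n"

obfuscated = rot13(plain)     # what would be committed to the repository
restored = rot13(obfuscated)  # applying ROT13 again gives back the original

print(obfuscated)
print(restored == plain)
```

This is obfuscation only, not encryption: it keeps the list from being read by accident, which is all that is being asked for here.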
-
Thank you for the note on the human discussion; it is useful to know that it can be resolved with a real person and not some robot, no matter how cool robots are 🤖! Your idea of not storing the data in the repository is interesting, and makes sense, as the list is not needed for contribution and debugging of the main code - our good friends. That being said, it is always useful to have the words public to some degree, so that one can contribute words to said list. @davidintelinair’s idea seems perfect for my use case - contain a lightly obfuscated list that can easily be deobfuscated as needed - so that the unsuspecting public should not need their eyes bleached. Thanks again to everyone who provided input on this; it helps me out a lot going forward.
-
Couldn’t find an edit button, so just replying here. ROT13 ended up being perfect for this project: it looks like gibberish and is easy to decode using only one line in Python! Cheers to everyone who helped!
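In case it helps anyone who lands here later, this is roughly what the decode-and-filter step can look like. It is only a sketch, not the actual project code: the helper names `load_terms` and `redact` and the sample terms are made up for illustration, and whole-word, case-insensitive matching is an assumption about how the filtering works.

```python
import codecs
import re


def load_terms(rot13_text: str) -> list[str]:
    # The one-line decode: ROT13 is undone by the same codec.
    decoded = codecs.decode(rot13_text, "rot_13")
    # One term per line; skip blank lines.
    return [line.strip() for line in decoded.splitlines() if line.strip()]


def redact(message: str, terms: list[str]) -> str:
    # Replace each whole-word, case-insensitive match with asterisks.
    if not terms:
        return message
    alternatives = "|".join(re.escape(term) for term in terms)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: "*" * len(m.group(0)), message)


# "sbb" and "one" are the ROT13 forms of the placeholder terms "foo" and "bar".
terms = load_terms("sbb\none\n")
print(redact("Foo walks into a bar.", terms))
```

In a real setup the ROT13 text would come from the wordlist file in the repository rather than a string literal.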